Jenkins / JENKINS-6781

Speed up git changelog generation

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Component/s: git-plugin
    • Labels:
      None
      Description

      Generating the change log takes a significant amount of time when there are many commits between builds.

      According to the build logs, Hudson first runs

      [workspace] $ git log --pretty=format:%H 961016ffca61b0cf4ef8cc33816b8c3496ac73b2..1074f07c642d6f9f5259c2092751a424ed6c4667
      

      and then for every commit

      [workspace] $ git log -M --summary --pretty=raw -n 1 1074f07c642d6f9f5259c2092751a424ed6c4667
      [workspace] $ git diff-tree -M -r 1074f07c642d6f9f5259c2092751a424ed6c4667
      

      In my case, a special job that built minor changes rebased onto a long history of unrelated commits took 2 minutes to build and 1 hour 40 minutes to compute the change log for 900 commits.
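
To make the cost concrete, here is a minimal sketch that lists the commands shown in the build log above for a given set of commits and counts them. The helper name `legacy_changelog_commands` and the generated placeholder hashes are illustrative assumptions; only the command lines themselves come from the log. The sketch builds the argument lists and does not run git.

```python
from typing import List


def legacy_changelog_commands(prev_sha: str, curr_sha: str,
                              commit_shas: List[str]) -> List[List[str]]:
    """Return the git invocations seen in the build log for one change log."""
    # One command to list the hashes in the range ...
    commands = [["git", "log", "--pretty=format:%H", f"{prev_sha}..{curr_sha}"]]
    # ... then two more commands for every commit in that range.
    for sha in commit_shas:
        commands.append(["git", "log", "-M", "--summary", "--pretty=raw", "-n", "1", sha])
        commands.append(["git", "diff-tree", "-M", "-r", sha])
    return commands


# 900 placeholder hashes standing in for the commits between the two builds.
placeholder_shas = [f"{i:040x}" for i in range(900)]
commands = legacy_changelog_commands(
    "961016ffca61b0cf4ef8cc33816b8c3496ac73b2",
    "1074f07c642d6f9f5259c2092751a424ed6c4667",
    placeholder_shas,
)

total_seconds = 100 * 60  # the reported 1 hour 40 minutes
average_seconds = total_seconds / len(commands)
print(len(commands), "commands, about", round(average_seconds, 2), "seconds each")
```

For 900 commits this comes to 1 + 2 × 900 = 1801 separate git processes, so the reported 1 hour 40 minutes works out to a few seconds per invocation on average.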

      Could this be improved, e.g. by introducing a detail-less changelog, so that a single "git log" command returns hashes, messages, and authors, and Hudson shows no "details" link (a gitweb link alone is enough)?

          Activity

          pagrus pagrus created issue -
          pamdirac John McNair added a comment -

          Even better than a change log with no details would be to avoid forking a process 2x per commit.

          I haven't gone through all the details. Is something relying on the specific format produced by the sequence of those two commands? Would it be possible to get the same information, though perhaps in a slightly different format, via a single git log command executed over a commit range?

          pamdirac John McNair made changes -
          Field      Original Value        New Value
          Assignee   magnayn [ magnayn ]   pamdirac [ pamdirac ]
          pamdirac John McNair added a comment -

          I fixed this by switching to "git whatchanged" to generate the change log. A single git command over the commit range now produces a slightly different but compatible change log, replacing the two git commands that were previously run for every commit. The problem is more severe than just the forking alluded to above, since each command is also a separate remoting call.

          I set up a test with 5,000 commits in the change log and tied the job to a slave. Under git-0.9, it would take several hours to generate the change log. With my fix, the change log is generated before I can switch to the console output after starting a new build.

          The fix is here: http://github.com/pamdirac/Hudson-GIT-plugin/commit/00585801d959c3b7782a56466afe3b1c3dcfb645
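
As a rough illustration of the single-command approach (not the code from the linked commit), the sketch below builds one "git whatchanged" invocation for a commit range and splits its combined output into per-commit entries. The flags passed to whatchanged, the helper names, and the sample output are assumptions made for this example. The split relies on raw-format output indenting commit messages, so only header lines begin with "commit " in the first column.

```python
from typing import List


def whatchanged_command(prev_sha: str, curr_sha: str) -> List[str]:
    """Build one command covering the whole range (flags are an assumption)."""
    return ["git", "whatchanged", "--no-abbrev", "-M", "--pretty=raw",
            f"{prev_sha}..{curr_sha}"]


def split_commits(output: str) -> List[List[str]]:
    """Group the combined output into one list of lines per commit."""
    entries: List[List[str]] = []
    for line in output.splitlines():
        if line.startswith("commit "):
            entries.append([line])      # a new commit header starts an entry
        elif entries:
            entries[-1].append(line)    # everything else belongs to the current commit
    return entries


# Hand-written sample in the general shape of raw-format output; not a real log.
SHA_A = "a" * 40
SHA_B = "b" * 40
sample_output = "\n".join([
    f"commit {SHA_A}",
    "author Example Author <author@example.com> 0 +0000",
    "",
    "    commit message lines are indented, so this is not a header",
    "",
    ":100644 100644 ec3d6ab1bfe8c6305ca417e8cb05e93a4f9002b8 "
    "49c5764d08e9ea890864493984daa85d8cb49849 M\tsome/file1",
    f"commit {SHA_B}",
    "author Example Author <author@example.com> 0 +0000",
    "",
    "    second hypothetical message",
])

entries = split_commits(sample_output)
print(len(entries), "commits parsed from one command's output")
```

With this shape, the number of git processes (and remoting calls) stays at one regardless of how many commits are in the range.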

          abayer Andrew Bayer added a comment -

          Ok, one thing I'd like to see covered before merging this in - if the format of the output is slightly different, are we sure the changelog parser will spit out the same results for both old and new formats? For something like this, which could break changelog parsing for older builds, I'd like to see unit tests to verify the backwards compatibility, just to be safe.

          pamdirac John McNair added a comment - - edited

          The change log parser did not change. That's what I meant by compatible. The existing tests are still valid. In fact, for reasons unknown to me, the existing tests are run against the "git whatchanged" format instead of the "git log; git diff-tree" format that was there. So I didn't change the tests.

          The differences are:

          1. git log adds lines like this:
          create mode 100644 some/file

          for both create and delete but not for modify. These lines do not exist in whatchanged. These are ignored in 0.9 anyway. The parser looks for these diff-tree lines:
          :100644 100644 ec3d6ab1bfe8c6305ca417e8cb05e93a4f9002b8 49c5764d08e9ea890864493984daa85d8cb49849 M some/file1
          :000000 100644 0000000000000000000000000000000000000000 17c44b2efa46ef79330291af1c99436534315077 A other/file2

          2. git diff-tree emits the sha1 by itself on the first line. This is missing in whatchanged, but the parser already ignores it. The sha1 is determined by the first line (of "git log" in 0.9 and "git whatchanged" in this patch):
          commit <sha1>
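
The two differences above can be checked with a small sketch. It is written in Python for illustration and is not the plugin's actual parser; the function names are hypothetical. It keeps only the diff-tree style lines that start with ":" and skips everything else, so the legacy output (with its "create mode" line and bare sha1 line) and the whatchanged output yield the same file list.

```python
from typing import List, Optional, Tuple


def parse_raw_diff_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (status, path) for a ':mode mode sha sha STATUS path' line, else None."""
    if not line.startswith(":"):
        return None                      # e.g. ' create mode ...' or a bare sha1
    fields = line[1:].split(None, 4)     # src mode, dst mode, src sha, dst sha, rest
    if len(fields) < 5:
        return None
    rest = fields[4]
    # git separates status and path with a tab; the lines quoted in this issue show a space.
    separator = "\t" if "\t" in rest else " "
    status, _, path = rest.partition(separator)
    return status, path.strip()


def changed_files(lines: List[str]) -> List[Tuple[str, str]]:
    """Collect (status, path) pairs, ignoring every line that is not a raw diff line."""
    parsed = (parse_raw_diff_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None]


raw_lines = [
    ":100644 100644 ec3d6ab1bfe8c6305ca417e8cb05e93a4f9002b8 "
    "49c5764d08e9ea890864493984daa85d8cb49849 M some/file1",
    ":000000 100644 0000000000000000000000000000000000000000 "
    "17c44b2efa46ef79330291af1c99436534315077 A other/file2",
]

# Legacy shape: extra summary line from 'git log' and a bare sha1 from 'git diff-tree'.
legacy_lines = [
    " create mode 100644 other/file2",
    "1074f07c642d6f9f5259c2092751a424ed6c4667",
] + raw_lines

print(changed_files(legacy_lines) == changed_files(raw_lines))
```

A comparison like this is the kind of backwards-compatibility check requested earlier in the thread: both formats should produce identical results from the same parser.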

          pamdirac John McNair added a comment -

          I pushed one more commit which adds a new test for the legacy format.

          abayer Andrew Bayer added a comment -

          Thanks - I just wanted to be safe, to make sure we didn't break anything in the process. I'm going to give it a day on my testbed, just to be safe, and then I'll merge in - I may give it another week before releasing, though, to see if any 0.9-related bugs come in.

          abayer Andrew Bayer added a comment -

          Looks good so far - and yeah, this isn't particularly risky the more I look at it. I may push this out as 0.9.1 tomorrow morning if nothing goes weird in the meantime, so that it doesn't get delayed by any 0.9 bugs.

          pamdirac John McNair added a comment -

          Much appreciated.

          abayer Andrew Bayer added a comment -

          Marking as resolved, since I just released 0.9.1 with this fix. =) Thanks for fixing this!

          abayer Andrew Bayer made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          mrobinet mrobinet added a comment -

          Excellent! I'm anxious to try this. I added the initial GitWeb support. I scoured man pages for a single command that could be used to get all of this information and finally resorted to using the combination of diff-tree and log commands. I did not know about the "whatchanged" command. Thanks for fixing this.

          abayer Andrew Bayer made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          rtyler R. Tyler Croy made changes -
          Workflow JNJira [ 136876 ] JNJira + In-Review [ 204262 ]

            People

            • Assignee:
              pamdirac John McNair
              Reporter:
              pagrus pagrus
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: