Jenkins / JENKINS-31731

Git plugin runs duplicate builds and can theoretically rebuild the same commit indefinitely

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • git-plugin
    • None
    • git-plugin 2.4.0

      Summary

      There is a race condition when "Execute concurrent builds" is checked. When two or more builds are scheduled at the same time, they all retrieve the same BuildData (found by a linear search back through previous builds until the first BuildData is found), and thus they all choose the same commit to build. If there are multiple branches to build, each build also schedules another build. As long as the latest build has not yet determined which commit to build, each newly scheduled build also uses that same stale BuildData, which can be passed along to newly scheduled builds indefinitely.

      To break this chain, the latest build must finish determining the commit to build before any new build performs "checkout()", which is where the previous build's BuildData is retrieved. With the environment below, I watched my instance build the same commit ~50 times across 2 slaves before I turned it off.
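The chain described above can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions, not the git-plugin code: the classes `BuildData` and `Build` and the methods `findPrevious`, `startCheckout`, and `determineRevision` are hypothetical stand-ins, and the interleaving of concurrent builds is written out by hand so the result is deterministic.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the race; none of these types are the plugin's real classes.
final class BuildDataRaceSketch {
    /** Simplified stand-in for BuildData: the commits already recorded as built. */
    static final class BuildData {
        final Set<String> builtCommits = new LinkedHashSet<>();

        BuildData copy() {
            BuildData c = new BuildData();
            c.builtCommits.addAll(builtCommits);
            return c;
        }
    }

    static final class Build {
        final int number;
        BuildData data; // attached at the start of checkout, before any decision

        Build(int number) { this.number = number; }
    }

    /** Linear search from the newest build back to the first one carrying BuildData. */
    static BuildData findPrevious(List<Build> builds) {
        for (int i = builds.size() - 1; i >= 0; i--) {
            if (builds.get(i).data != null) return builds.get(i).data;
        }
        return new BuildData();
    }

    /** Start of checkout: copy the most recent BuildData and attach it immediately. */
    static Build startCheckout(List<Build> builds) {
        Build b = new Build(builds.size() + 1);
        b.data = findPrevious(builds).copy();
        builds.add(b);
        return b;
    }

    /** Choose the first candidate not yet recorded as built, and record it. */
    static String determineRevision(Build b, List<String> candidates) {
        for (String c : candidates) {
            if (b.data.builtCommits.add(c)) return c;
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> branches = List.of("commitA", "commitB", "commitC");
        List<Build> builds = new ArrayList<>();
        Build previous = startCheckout(builds);
        for (int i = 0; i < 4; i++) {
            // The next build starts before the previous one has decided,
            // so it copies BuildData that does not contain any choice yet.
            Build next = startCheckout(builds);
            System.out.println("build #" + previous.number + " -> "
                    + determineRevision(previous, branches));
            previous = next;
        }
        // Chain broken: the latest build decides before any new build starts.
        System.out.println("build #" + previous.number + " -> "
                + determineRevision(previous, branches));
        Build after = startCheckout(builds);
        System.out.println("build #" + after.number + " -> "
                + determineRevision(after, branches));
    }
}
```

In this model, builds #1 through #5 all report commitA, and only build #6, which starts after build #5 has recorded its choice, moves on to commitB.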

      Minimum environment to reproduce

      1. Git repo with at least 3 unbuilt buildable branches
      2. Jenkins job with execute concurrent builds checked
      3. 2 slaves with 1 executor each (or possibly 1 slave with 2 executors, but I did not try this)
      4. Add sleep(30 seconds) to the top of determineRevisionToBuild() to mimic realistic timing and reveal the race condition

      Expected behavior (with the bug present)

      1. Send a notifyCommit request. This schedules polling, which finds changes and schedules build #1.
      2. While build #1 is running (i.e. sleeping 30s), send a second notifyCommit request so that polling runs once build #1 finishes.
      3. Build #1 chooses buildable branch #1 and schedules another build because 2 buildable branches remain. Build #1 finishes successfully.
      4. Build #2 and polling start at the same time. Polling finds 2 buildable branches and schedules a build. Build #2 is sleeping and has not yet determined which commit to build.
      5. Build #3 starts and uses the BuildData from build #2 (which is still the BuildData from build #1, because build #2 has not determined anything yet). Build #3 chooses the same commit as build #2. Builds #2 and #3 each schedule another build because there is 1 more buildable branch.
      6. Build #2 finishes and build #4 starts. Build #4 uses the BuildData from build #3 (which has not been updated with its commit choice yet and is equivalent to the BuildData from build #1).
      7. Repeat step 6.
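Steps 1 and 2 trigger polling through the git plugin's notifyCommit endpoint (`/git/notifyCommit?url=<repository URL>`). A minimal sketch of sending that request from Java follows. The Jenkins base URL and repository URL are placeholder assumptions, the class and method names are hypothetical, and newer plugin versions may additionally require an access token that is not shown here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reproducing the issue; requires Java 11 or later.
final class NotifyCommitSketch {
    /** Builds the notifyCommit URI for a given Jenkins base URL and repository URL. */
    static URI notifyCommitUri(String jenkinsBaseUrl, String repositoryUrl) {
        String encoded = URLEncoder.encode(repositoryUrl, StandardCharsets.UTF_8);
        return URI.create(jenkinsBaseUrl + "/git/notifyCommit?url=" + encoded);
    }

    /** Sends the GET request and returns the HTTP status code. */
    static int send(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values: replace with the real Jenkins and repository URLs.
        URI uri = notifyCommitUri("http://localhost:8080", "https://example.com/repo.git");
        System.out.println(uri);
        // Only contact the server when explicitly asked to, e.g. `java NotifyCommitSketch send`.
        if (args.length > 0 && args[0].equals("send")) {
            System.out.println("HTTP status: " + send(uri));
        }
    }
}
```

To follow the steps above, the request would be sent once to start build #1 and a second time while build #1 is still sleeping.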

      Solutions

      Quick fix: BuildData should not be added to the build's actions until the build has determined what to build. For example: https://github.com/bjacklyn/git-plugin/commit/8d11e349f9fc04f7385ce6ffc8db772ce82c89c4. This eliminates the possibility of infinite rebuilds. However, duplicate builds will still occur until one of the builds determines a commit.
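The idea behind the quick fix can be sketched with a simplified model. This is not the linked commit and does not use the plugin's real classes: `Build`, `findPrevious`, `startCheckout`, and `determineAndAttach` are hypothetical names, and the set of built commits stands in for BuildData. Each build works on a private copy and publishes it only after it has made its choice, so later builds never see undecided data.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of deferred attachment; not the plugin's actual implementation.
final class DeferredAttachSketch {
    static final class Build {
        final int number;
        Set<String> working;  // private to this build while it decides
        Set<String> attached; // visible to later builds; null until a commit is chosen

        Build(int number) { this.number = number; }
    }

    /** Linear search that skips builds which have not published their data yet. */
    static Set<String> findPrevious(List<Build> builds) {
        for (int i = builds.size() - 1; i >= 0; i--) {
            if (builds.get(i).attached != null) return builds.get(i).attached;
        }
        return new LinkedHashSet<>();
    }

    /** Start of checkout: copy the latest published data, but do not publish anything. */
    static Build startCheckout(List<Build> builds) {
        Build b = new Build(builds.size() + 1);
        b.working = new LinkedHashSet<>(findPrevious(builds));
        builds.add(b);
        return b;
    }

    /** Choose a commit first, then publish the updated data. */
    static String determineAndAttach(Build b, List<String> candidates) {
        String chosen = null;
        for (String c : candidates) {
            if (b.working.add(c)) { chosen = c; break; }
        }
        b.attached = b.working;
        return chosen;
    }

    public static void main(String[] args) {
        List<String> branches = List.of("commitA", "commitB", "commitC");
        List<Build> builds = new ArrayList<>();
        Build b1 = startCheckout(builds);
        Build b2 = startCheckout(builds); // starts before b1 has decided
        System.out.println("build #1 -> " + determineAndAttach(b1, branches));
        Build b3 = startCheckout(builds); // skips undecided b2, sees b1's choice
        System.out.println("build #2 -> " + determineAndAttach(b2, branches));
        System.out.println("build #3 -> " + determineAndAttach(b3, branches));
    }
}
```

In this model builds #1 and #2 still both choose commitA, which matches the caveat that duplicates remain possible, but build #3 moves on to commitB because it never copies data from a build that has not decided yet.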

      Medium fix: Builds could be queued internally in the Git plugin so that later builds wait for earlier builds to determine which commit they are building. This forces later builds to use the correct BuildData. Everything else (fetching/cloning) can happen concurrently; only the determination needs to be sequential.
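One way to picture the medium fix is a lock that covers only the determination step. The sketch below is an assumption about how this could look, not the plugin's design: the class `SerializedDeterminationSketch`, its methods, and the shared `builtCommits` set are hypothetical. Note that a plain lock gives mutual exclusion but does not guarantee that builds decide in build-number order; a real queue would be needed for that.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: fetch concurrently, serialize only the commit determination.
final class SerializedDeterminationSketch {
    private final Object determinationLock = new Object();
    private final Set<String> builtCommits = new HashSet<>(); // guarded by determinationLock

    /** Runs one simulated build and returns the commit it chose, or null if none is left. */
    String runBuild(List<String> candidates) throws InterruptedException {
        fetch(); // slow work that may overlap freely between builds
        synchronized (determinationLock) {
            // Only one build at a time decides, and it always sees earlier decisions.
            for (String c : candidates) {
                if (builtCommits.add(c)) return c;
            }
            return null;
        }
    }

    /** Stand-in for fetching/cloning. */
    private void fetch() throws InterruptedException {
        Thread.sleep(50);
    }

    /** Starts n simulated builds at once and collects their chosen commits. */
    static List<String> runConcurrently(int n, List<String> candidates) throws Exception {
        SerializedDeterminationSketch job = new SerializedDeterminationSketch();
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                Callable<String> task = () -> job.runBuild(candidates);
                futures.add(pool.submit(task));
            }
            List<String> chosen = new ArrayList<>();
            for (Future<String> f : futures) chosen.add(f.get());
            return chosen;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> chosen = runConcurrently(3, List.of("commitA", "commitB", "commitC"));
        System.out.println(chosen); // three distinct commits; the order may vary between runs
    }
}
```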

      Long-term fix: BuildData needs to be replaced with something else, but it is unclear what the actual requirements are. I see that this pull request from 2013 was rejected (https://github.com/jenkinsci/git-plugin/pull/163) and that this one hasn't been worked on since March 2015 (https://github.com/jenkinsci/git-plugin/pull/313). Is there an in-depth writeup somewhere of the exact requirements?

            Assignee: Unassigned
            Reporter: Brandon Jacklyn (bjacklyn)
            Votes: 1
            Watchers: 5
