Jenkins / JENKINS-30873

Jenkins' Git plugin reparses all previous build.xml on restart

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Component/s: git-plugin
    • Labels:
      None

      Description

      edit: I originally thought this was related to renaming a project, but it seems to be a general issue when restarting Jenkins. When we upgraded to 1.634 we simply restarted Jenkins with the new .war file, and even after 12h it hadn't finished parsing the XML files from our tens of thousands of archived builds. We ended up moving all the old build directories out of the way, which immediately resolved the problem for us (we can no longer access these old builds, but that's less problematic than Jenkins itself being stuck on certain projects).


      I renamed a job with the following steps:

      1. Shut down Jenkins
      2. mv ~jenkins/jobs/prevname ~jenkins/jobs/newname
      3. Adjust any other references to the job name
      4. Start Jenkins

      (This isn't the first time I've done this, and I've never had any issues in the past.)

      This time I renamed a couple of jobs with "lots" of previous builds (26k and 5k, respectively). When I launched the first build of either job after restarting Jenkins, the build didn't start, couldn't be stopped, and appeared to be hung. Checking /threadDump and a bit of strace'ing showed that Jenkins was loading every single prior build.xml for these jobs.

      Common part of the stack trace:

      	at jenkins.model.lazy.LazyBuildMixIn.loadBuild(LazyBuildMixIn.java:158)
      	at jenkins.model.lazy.LazyBuildMixIn$1.create(LazyBuildMixIn.java:135)
      	at hudson.model.RunMap.retrieve(RunMap.java:224)
      	at hudson.model.RunMap.retrieve(RunMap.java:57)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:465)
      	-  locked hudson.model.RunMap@2cedf74
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:448)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:356)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:332)
      	at jenkins.model.lazy.LazyBuildMixIn$RunMixIn.getPreviousBuild(LazyBuildMixIn.java:357)
      	at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:199)
      	at hudson.model.AbstractBuild.getPreviousBuild(AbstractBuild.java:107)
      	at hudson.plugins.git.GitSCM.getBuildData(GitSCM.java:1553)
      	at hudson.plugins.git.GitSCM.buildEnvVars(GitSCM.java:1161)
      	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:940)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.decideWorkspace(AbstractBuild.java:481)
      	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
      	at hudson.model.Run.execute(Run.java:1741)
      	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      	at hudson.model.ResourceController.execute(ResourceController.java:98)
      	at hudson.model.Executor.run(Executor.java:408)
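
The trace above suggests that GitSCM.getBuildData walks back through previous builds, and that each step forces the lazy run map to load another build record. The following is a toy Python model of that interaction, not the Jenkins or Git plugin code: the class `LazyRunMap`, the function `find_build_data`, and the `build_data` key are all hypothetical names chosen for illustration, and the exact condition under which the real plugin stops walking is an assumption.

```python
import bisect


class LazyRunMap:
    """Toy stand-in for a lazily loaded build map (not the Jenkins class)."""

    def __init__(self, build_records):
        # build_records: {build number: dict standing in for a parsed build.xml}
        self._on_disk = build_records
        self._numbers = sorted(build_records)
        self._loaded = {}
        self.loads = 0  # number of simulated build.xml parses

    def get(self, number):
        if number not in self._loaded:
            self.loads += 1  # simulates opening and parsing one build.xml
            self._loaded[number] = self._on_disk[number]
        return self._loaded[number]

    def previous(self, number):
        # Largest known build number strictly below `number`, or None.
        index = bisect.bisect_left(self._numbers, number)
        return self._numbers[index - 1] if index > 0 else None


def find_build_data(run_map, start):
    """Walk backwards from `start` until a build carrying 'build_data' is found."""
    number = start
    while number is not None:
        record = run_map.get(number)  # forces a load if not cached
        if record.get("build_data") is not None:
            return number
        number = run_map.previous(number)
    return None
```

In this model, if no earlier build carries the data being searched for, a single lookup loads every record in the job, which would be consistent with the strace output below. If the immediately preceding build has it, only two records are loaded.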
      

      strace output:

      ~/jobs/foo-master @sw-pm1.sjc> strace -fp 1964 -e trace='!futex' |& fgrep arista
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/6461/build.xml", O_RDONLY) = 506
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/6459/build.xml", O_RDONLY) = 500
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/6458/build.xml", O_RDONLY) = 506
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/6457/build.xml", O_RDONLY) = 500
      [... several minutes later...]
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/1089/build.xml", O_RDONLY) = 506
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/1088/build.xml", O_RDONLY) = 506
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/1087/build.xml", O_RDONLY) = 506
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/1086/build.xml", O_RDONLY) = 502
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/1085/build.xml", O_RDONLY) = 502
      [pid  2216] open("/home/jenkins/jobs/foo-master/builds/1084/build.xml", O_RDONLY) = 502
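
To gauge how far such a scan has progressed, the captured strace lines can be summarized per job. This is a minimal sketch that assumes the line format shown above; the function name `summarize_build_xml_opens` and the regular expression are illustrative, not part of any existing tool.

```python
import re
from collections import defaultdict

# Matches open() calls on <job dir>/builds/<number>/build.xml, as in the output above.
BUILD_XML_OPEN = re.compile(
    r'open\("(?P<job_dir>[^"]+)/builds/(?P<number>\d+)/build\.xml"'
)


def summarize_build_xml_opens(lines):
    """Return {job_dir: (opens, distinct_builds, lowest, highest)}."""
    seen = defaultdict(list)
    for line in lines:
        match = BUILD_XML_OPEN.search(line)
        if match:
            seen[match.group("job_dir")].append(int(match.group("number")))
    return {
        job: (len(nums), len(set(nums)), min(nums), max(nums))
        for job, nums in seen.items()
    }
```

Feeding it a saved capture, for example `summarize_build_xml_opens(open("strace.log"))` with a hypothetical `strace.log`, shows how many distinct build records were opened for each job and the range of build numbers covered so far.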
      

      I've also seen it load other files, e.g.:

      [pid 14094] open("/home/jenkins/jobs/foo/builds/5266/archive", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
      [pid 14094] open("/home/jenkins/jobs/foo/builds/5268/log", O_RDONLY) = 500
      [pid 14094] open("/home/jenkins/jobs/foo/builds/5268/log", O_RDONLY) = 500
      [pid 18327] open("/home/jenkins/jobs/foo/builds/5177/build.xml", O_RDONLY) = 492
      [pid 18328] open("/home/jenkins/jobs/foo/builds/5178/build.xml", O_RDONLY) = 492
      

            Activity

            markewaite Mark Waite added a comment -

            Git plugin 3.1.0 (released on 4 Mar 2017) logs a warning when many BuildData instances are attached to a build.

            markewaite Mark Waite added a comment -

            Benoit Sigoure: archiving (or removing) builds is the only way I know of for people to deal with it. I intentionally limit the number of builds I retain for the hundreds of jobs in my environment so that I don't have to fight this problem as much.

            Pull requests that resolve the problem without breaking compatibility are certainly welcome. Nicolas De Loof attempted at least two different ways to remove that dependency (BuildData). Unfortunately, it is an especially challenging area of the plugin code.

            tsuna Benoit Sigoure added a comment -

            How do people cope with this bug? We have Jenkins jobs with tens of thousands of builds, and Jenkins becomes unbearably slow because of it. We regularly run a script to archive old builds (simply by moving some of the directories under jobs/foo/1234 to jobs/foo/oldbuilds/1234), but that's really just a kludge.
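
An archiving script of the kind described here could look roughly like the sketch below. It is not the reporter's script: the function name `archive_old_builds`, the `keep` threshold, and the dry-run default are assumptions made for illustration, and it assumes build directories are named by build number (as the strace paths in the description suggest). How a running Jenkins reacts to directories being moved underneath it is not covered here.

```python
import shutil
from pathlib import Path


def archive_old_builds(builds_dir, archive_dir, keep=100, dry_run=True):
    """Move all but the newest `keep` numbered build directories to archive_dir.

    Returns the list of (source, destination) pairs that were, or would be, moved.
    """
    builds_dir = Path(builds_dir)
    archive_dir = Path(archive_dir)

    # Only real directories with purely numeric names; symlinks are left alone.
    numbered = sorted(
        (
            p
            for p in builds_dir.iterdir()
            if p.is_dir() and not p.is_symlink() and p.name.isdigit()
        ),
        key=lambda p: int(p.name),
    )
    to_move = numbered[:-keep] if keep > 0 else numbered
    moves = [(src, archive_dir / src.name) for src in to_move]

    if not dry_run:
        archive_dir.mkdir(parents=True, exist_ok=True)
        for src, dst in moves:
            shutil.move(str(src), str(dst))
    return moves
```

Calling it first with `dry_run=True` returns the planned moves without touching the disk, which makes it easy to check which builds would disappear from the job before committing to the change.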

            markewaite Mark Waite added a comment -

            I believe this is another case of a BuildData-related problem in the plugin, as described in JENKINS-19022. It is a hard problem to solve without risking major compatibility breaks in the plugin. Refer to pull request 313 for some of the experiments that have been attempted and the challenges that remain.

            danielbeck Daniel Beck added a comment -

            The current Jenkins design basically relies on plugins not loading an excessive number of builds; otherwise the lazy loading of build records is defeated. It looks like the Git plugin is responsible in this case.

            Note that some of this can be mitigated with enough RAM (to prevent GC from discarding already loaded build records) and fast disks; but if the problem occurs only once per job after startup, more RAM won't help much.

            tsuna Benoit Sigoure added a comment -

            I originally thought this was related to renaming a project, but it seems to be a general issue when restarting Jenkins. When we upgraded to 1.634 we simply restarted Jenkins with the new .war file, and even after 12h it hadn't finished parsing the XML files from our tens of thousands of archived builds. We ended up moving all the old build directories out of the way, which immediately resolved the problem for us (we can no longer access these old builds, but that's less problematic than Jenkins itself being stuck on certain projects).

            tsuna Benoit Sigoure added a comment -

            This doesn't seem to be specific to project renames; we're also seeing it after simply restarting Jenkins to upgrade to 1.634.


              People

              • Assignee: Unassigned
              • Reporter: Benoit Sigoure (tsuna)
              • Votes: 2
              • Watchers: 7
