Jenkins / JENKINS-56838

pipeline job hangs forever at checkout GitSCM

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Component/s: git-plugin, pipeline

      Description

      This issue closely resembles JENKINS-43106.

      We have a pipeline job that is run in parallel on 10 different executors every night, multiple times.

      For the past week, the jobs have been getting stuck on the GitSCM checkout, which is configured as follows:

      dir(sourceDir) {
          deleteDir()
          echo "Checking out ${commitId} from ${url}"
          checkout changelog: updateChanges, scm: [
              $class: 'GitSCM', branches: [[name: commitId]],
              userRemoteConfigs: [[url: url]]]
          // trim() drops the trailing newline that sh returnStdout includes
          commitId = sh(returnStdout: true, script: "git rev-parse HEAD").trim()
          echo "Checked out ${commitId} from ${url}"
      }

      When this code runs in another job, it never fails.

      The difference is that in the failing case we have a "manager" job that runs the following:

      def call(testRunners, maxNumberOfTests, buildId) {
          def parallelRuns = [:]
          def numberOfRuns = 0
          // availableExecutors, randomBit and maxNumberOfRuns are not defined in this excerpt
          for (int i = 0; i < availableExecutors; i++) {
              parallelRuns[i] = {
                  waitUntil {
                      build job: 'TestRunner', parameters: [
                          string(name: 'sessionId', value: buildId),
                          string(name: 'randomBit', value: "${randomBit}")
                      ], propagate: false
                      return (numberOfRuns > maxNumberOfRuns)
                  }
              }
          }
          parallel parallelRuns
      }

      By contrast, it passes when a single pipeline runs the same function in parallel.

      The TestRunner job hangs on checkout at random: I can see the first echo in the log, but according to the git-daemon logs the git server is never accessed.
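
      One way to make such a hang surface as a failed build instead of a stuck executor is to wrap the checkout in the Pipeline timeout step. The following is only a minimal sketch and not part of the original job: the helper name and the 15-minute limit are assumptions, and whether the interrupt actually stops a hung checkout may depend on where it is blocked.

      ```groovy
      // Hypothetical helper (not from the original job): fails the build if the
      // checkout has not finished within the limit. 15 minutes is an arbitrary value.
      def checkoutWithTimeout(String url, String commitId, boolean updateChanges) {
          timeout(time: 15, unit: 'MINUTES') {
              checkout changelog: updateChanges, scm: [
                  $class: 'GitSCM', branches: [[name: commitId]],
                  userRemoteConfigs: [[url: url]]]
          }
      }
      ```

      It would be called from inside the dir(sourceDir) block in place of the bare checkout step.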

            Activity

            markewaite Mark Waite added a comment -

            The conditions that caused JENKINS-46106 seemed to be specifically connected to Jira plugin version 2.5.0. Since your list of installed plugins does not include Jira plugin 2.5.0, I assume this is not the same condition as JENKINS-46106. I don't have any suggestions for experiments that might help isolate the problem.

            Others might be able to analyze a thread dump from the Jenkins server when the checkout hangs. I don't have that skill.

            tsvi Tsvi Mostovicz added a comment - edited

            I found the following hint in the support logs:

            2019-04-02 02:42:31.157+0000 [id=10666] WARNING hudson.Proc$LocalProc#join: Process leaked file descriptors. See https://jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors for more information
            java.lang.Exception
              at hudson.Proc$LocalProc.join(Proc.java:334)
              at hudson.Proc.joinWithTimeout(Proc.java:170)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2311)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2248)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2244)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1777)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1789)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.sparseCheckout(CliGitAPIImpl.java:2675)
              at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.execute(CliGitAPIImpl.java:2595)
              at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1228)
              at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:120)
              at org.jenkinsci.plugins.workflow.libs.SCMSourceRetriever.lambda$doRetrieve$1(SCMSourceRetriever.java:147)
              at org.jenkinsci.plugins.workflow.libs.SCMSourceRetriever.retrySCMOperation(SCMSourceRetriever.java:98)
              at org.jenkinsci.plugins.workflow.libs.SCMSourceRetriever.doRetrieve(SCMSourceRetriever.java:146)
              at org.jenkinsci.plugins.workflow.libs.SCMSourceRetriever.retrieve(SCMSourceRetriever.java:87)
              at org.jenkinsci.plugins.workflow.libs.LibraryAdder.retrieve(LibraryAdder.java:157)
              at org.jenkinsci.plugins.workflow.libs.LibraryAdder.add(LibraryAdder.java:138)
              at org.jenkinsci.plugins.workflow.libs.LibraryDecorator$1.call(LibraryDecorator.java:125)
              at org.codehaus.groovy.control.CompilationUnit.applyToPrimaryClassNodes(CompilationUnit.java:1065)
              at org.codehaus.groovy.control.CompilationUnit.doPhaseOperation(CompilationUnit.java:603)
              at org.codehaus.groovy.control.CompilationUnit.processPhaseOperations(CompilationUnit.java:581)
              at org.codehaus.groovy.control.CompilationUnit.compile(CompilationUnit.java:558)
              at groovy.lang.GroovyClassLoader.doParseClass(GroovyClassLoader.java:298)
              at groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:268)
              at groovy.lang.GroovyShell.parseClass(GroovyShell.java:688)
              at groovy.lang.GroovyShell.parse(GroovyShell.java:700)
              at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.lambda$doParse$0(CpsGroovyShell.java:135)
              at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:136)
              at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.doParse(CpsGroovyShell.java:132)
              at org.jenkinsci.plugins.workflow.cps.CpsGroovyShell.reparse(CpsGroovyShell.java:127)
              at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.parseScript(CpsFlowExecution.java:560)
              at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.start(CpsFlowExecution.java:521)
              at org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:320)
              at hudson.model.ResourceController.execute(ResourceController.java:97)
              at hudson.model.Executor.run(Executor.java:429)

            tsvi Tsvi Mostovicz added a comment -

            This appears to happen because GitSCM iterates over all of the older builds when checking out.

            (See https://github.com/jenkinsci/git-plugin/blob/34fa174566716c8c6a1ab392a0cf3b5c05fc4d41/src/main/java/hudson/plugins/git/GitSCM.java#L1136)

            Because we were not clearing out old builds, the cost of iterating over all of them grew over time until the checkout would time out after 1 hour.

            Clearing out old builds resolved the issue for us.
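
            For anyone who wants to keep the build history from growing back, a build discarder can be set from the Pipeline itself. This is a minimal sketch for a Scripted Pipeline, not the exact configuration we used; the limit of 50 builds is an arbitrary assumption.

            ```groovy
            // Keep only the most recent builds of this job.
            // The value 50 is an assumed example; pick a limit that suits your retention needs.
            properties([
                buildDiscarder(logRotator(numToKeepStr: '50'))
            ])
            ```

            The same limit can also be set in the job configuration page under the option to discard old builds.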

             

            Note that the changelog File argument passed to the function was null, because the checkout step that calls it had updateChanges set to false.

            A possible enhancement to the git plugin would be to skip the history calculation when the changelog file passed to checkout is null.

            As I'm not really sure how this might affect other parts of the code, I'm wary of implementing it.

            tsvi Tsvi Mostovicz added a comment -

            Mark Waite, please see the comment I added. I solved the issue for us, but I wonder whom I should tag for the enhancement I proposed. (I don't mind putting in a PR myself, but I have never developed for Jenkins, so I'll need some handholding.)

            markewaite Mark Waite added a comment -

            I suspect this is related to JENKINS-19022, "git plugin mistakenly retains list of SHA1's of all preceding builds in later builds, bloating memory use". We've attempted at least 3 times to resolve JENKINS-19022, each time abandoning the attempt because of incompatibilities introduced by the proposed changes.

            Reducing the number of builds retained in history is the most direct workaround for the problem. Other workarounds exist as well, like groovy scripts that will remove old BuildData records.

            Unless you're ready for months and months of work to implement the fix without causing compatibility problems, I'd recommend retaining fewer builds rather than attempting a code change in the git plugin for this case.

            tsvi Tsvi Mostovicz added a comment -

            Based on my reading of JENKINS-19022, it seems this is caused by the same issue.


              People

              • Assignee: Unassigned
              • Reporter: Tsvi Mostovicz
              • Votes: 0
              • Watchers: 2