JENKINS-42940

Timeout step hangs after restart if timeout occurred, but enclosed block did not exit yet

    Details

    • Released As:
      workflow-basic-steps 2.20

      Description

      If the timeout occurs and Jenkins is restarted during the grace period in which it waits for the inner block to terminate, the build hangs forever with this exception in the Jenkins log:

      2020-03-13 02:09:40.575+0000 [id=1502]  WARNING o.j.p.w.f.FlowExecutionList$ItemListenerImpl$1#onFailure: Failed to load CpsFlowExecution[Owner[devops-gate/master/blackbox-self-service/25907:devops-gate/master/blackbox-self-service #25907]]
      java.lang.NullPointerException
              at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.cancel(TimeoutStepExecution.java:151)
              at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.setupTimer(TimeoutStepExecution.java:139)
              at org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution.onResume(TimeoutStepExecution.java:90)
              at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:185)
              at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl$1.onSuccess(FlowExecutionList.java:180)
              ...
      

      Reproducing this issue relies on a block that does not exit immediately. For example:

      node {
        timeout(time: 10, unit: 'SECONDS') {
          build job: 'hang2', parameters: [ new StringParameterValue('A','B') ], quietPeriod: 0
        }
      }
      

      with a second Pipeline Job hang2:

      retry(3) {
           sleep 300
       }
      

      This produces the following console log:

      Started by user RK
       [Pipeline] node
       Running on host in /$JENKINS_HOME/workspace/hang
       [Pipeline] {
       [Pipeline] timeout
       Timeout set to expire in 10 sec
       [Pipeline] {
       [Pipeline] build (Building hang2)
       Scheduling project: hang2
       Starting building: hang2 #1
       Cancelling nested steps due to timeout
       Resuming build at Fri Mar 10 15:49:00 CET 2017 after Jenkins restart
       Waiting to resume hang #1|: ???
       Waiting to resume hang #1|: host is offline
       Waiting to resume hang #1|: host is offline
      
      Ready to run at Fri Mar 10 15:49:10 CET 2017
      
      Timeout expired 3.7 sec ago
      

      ... and then it hangs forever.

      Reason: when onResume() is called, the timer has already expired, so cancel() is called. Because a cancel was already attempted before the restart, forcible is true; killer, however, is null at that point, which causes the NPE.

      Fix: Check killer for null on line 94 in cancel() of TimeoutStepExecution.
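
      The reason and the proposed null check can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: TimeoutSketch, Handle, and the log messages are hypothetical names invented for illustration, and the code is not the plugin's actual TimeoutStepExecution source. It only mirrors the described state, namely a forcible flag that survives the restart and a killer handle that does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of a timeout step; NOT the plugin's real code. */
class TimeoutSketch {
    /** Stand-in for the scheduled kill task; an in-memory handle that is lost on restart. */
    interface Handle { void cancel(); }

    Handle killer;      // assumed to be null after a restart
    boolean forcible;   // assumed to be persisted: true once a first cancel was attempted
    final List<String> log = new ArrayList<>();

    /** Called when the timer fires, and again from onResume() if it has already expired. */
    void cancel() {
        if (forcible) {
            // Second attempt. Without this null check, killer.cancel() throws an NPE
            // after a restart, and the build never finishes.
            if (killer != null) {
                killer.cancel();
                killer = null;
            }
            log.add("force-interrupt body");
        } else {
            forcible = true;
            log.add("interrupt body");
        }
    }

    /** Simulates resuming after a restart with the timeout already expired. */
    void onResume() {
        killer = null;  // the handle did not survive the restart
        cancel();
    }
}
```

      In this model, calling cancel() once and then onResume() completes without an exception and still records a second, forcible interruption of the body, which matches the intent of a fix that keeps cancelling the enclosed block.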

      Rationale for Major rather than Minor: this breaks restart resilience.

            Activity

            jglick Jesse Glick added a comment -

            Sounds like a bug. Would need to spend time reproducing in a functional test.

            dnusbaum Devin Nusbaum added a comment -

            This was also reported as JENKINS-61019. I reproduced it in a test and filed a PR to fix this in a way that still results in the body being cancelled, see jenkinsci/workflow-basic-steps-plugin#112.

            dnusbaum Devin Nusbaum added a comment -

            A fix for this issue was just released in version 2.20 of Pipeline: Basic Steps Plugin.


              People

              • Assignee:
                dnusbaum Devin Nusbaum
                Reporter:
                rk R K