Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-45107

NullPointerException in lockable-resources when resuming pipeline after restart, with a lock statement which was opened pre-restart

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Not A Defect
    • Icon: Major Major
    • None
    • Jenkins LTS 2.46.3, lockable-resources 2.0

      I have to caveat this upfront that I have two Jenkins masters, both on the same LTS version, with the same version of lockable-resources and the same versions of all the pipeline plugins (latest as available to 2.46.3 today, please ask if you need versions), and this bug only occurs on one of them, but I can't see why.

      Script used to repro:

      def builders = [:]
      
      def stages = [
      	[name:'first', lockName:'lock-first'],
      	[name:'second', lockName:'lock-second'],
      	[name:'first-two', lockName:'lock-first'],
      	[name:'second-two', lockName:'lock-second']
      ]
      
      stages.each { stage ->
      	builders[stage.name] = {
      		lock(stage.lockName) {
                  node {
      				commonBuildCommand()
      			}
      		}
      	}
      }
      
      parallel builders
      
      def commonBuildCommand() {
      	sleep 10
      	sleep 10
      }
      

      When I run this script (and restart during it, as per the log), I will see the following output:

      [Pipeline] parallel
      [Pipeline] [first] { (Branch: first)
      [Pipeline] [second] { (Branch: second)
      [Pipeline] [first-two] { (Branch: first-two)
      [Pipeline] [second-two] { (Branch: second-two)
      [Pipeline] [first] lock
      [first] Trying to acquire lock on [lock-first]
      [first] Lock acquired on [lock-first]
      [Pipeline] [first] {
      [Pipeline] [second] lock
      [second] Trying to acquire lock on [lock-second]
      [second] Lock acquired on [lock-second]
      [Pipeline] [second] {
      [Pipeline] [first-two] lock
      [first-two] Trying to acquire lock on [lock-first]
      [first-two] Found 0 available resource(s). Waiting for correct amount: 1.
      [first-two] [lock-first] is locked, waiting...
      [Pipeline] [second-two] lock
      [second-two] Trying to acquire lock on [lock-second]
      [second-two] Found 0 available resource(s). Waiting for correct amount: 1.
      [second-two] [lock-second] is locked, waiting...
      [Pipeline] [first] node
      [first] Running on <node> in <workspace>/Test_Restart
      [Pipeline] [second] node
      [second] Running on <node> in <workspace>/Test_Restart@2
      [Pipeline] [first] {
      [Pipeline] [second] {
      [Pipeline] [first] sleep
      [first] Sleeping for 10 sec
      [Pipeline] [second] sleep
      [second] Sleeping for 10 sec
      Resuming build at Fri Jun 23 15:15:52 BST 2017 after Jenkins restart
      Waiting to resume part of Experiments ยป Test_Restart #24: ???
      Ready to run at Fri Jun 23 15:15:53 BST 2017
      [Pipeline] [first] sleep
      [first] Sleeping for 10 sec
      [Pipeline] [second] sleep
      [second] Sleeping for 10 sec
      [Pipeline] [first] }
      [Pipeline] [first] // node
      [Pipeline] [first] }
      [first] Lock released on resource [lock-first]
      [Pipeline] [second] }
      [Pipeline] [first] // lock
      [Pipeline] [second] // node
      [Pipeline] [first] }
      [Pipeline] [second] }
      [second] Lock released on resource [lock-second]
      [Pipeline] [second] // lock
      [Pipeline] [second] }
      

      The script will now stall at this point. The following will be output to the Jenkins log at this time (when FINE logging is available for lockable-resources):

      Jun 23, 2017 3:16:03 PM WARNING org.jenkins.plugins.lockableresources.LockableResourcesManager unlockNames
      Skipping queued context for lock. Can not get the Run object from the context to proceed with lock, this could be a legitimate status if the build waiting for the lock was deleted or hard killed. More information at Level.FINE if debug is needed.
      Jun 23, 2017 3:16:03 PM FINE org.jenkins.plugins.lockableresources.LockableResourcesManager
      Can not get the Run object from the context to proceed with lock
      java.lang.NullPointerException
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getFlowExecution(CpsStepContext.java:472)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getThreadGroupSynchronously(CpsStepContext.java:243)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getThreadSynchronously(CpsStepContext.java:237)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:294)
      	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:61)
      	at org.jenkins.plugins.lockableresources.LockableResourcesManager.unlockNames(LockableResourcesManager.java:387)
      	at org.jenkins.plugins.lockableresources.LockStepExecution$Callback.finished(LockStepExecution.java:84)
      	at org.jenkinsci.plugins.workflow.steps.BodyExecutionCallback$TailCall.onSuccess(BodyExecutionCallback.java:114)
      	at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$SuccessAdapter.receive(CpsBodyExecution.java:362)
      	at com.cloudbees.groovy.cps.Outcome.resumeFrom(Outcome.java:73)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:165)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:162)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:122)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:261)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:162)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:19)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:35)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:32)
      	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:32)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:330)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:242)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:230)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      
      Jun 23, 2017 3:16:03 PM WARNING org.jenkins.plugins.lockableresources.LockableResourcesManager unlockNames
      Skipping queued context for lock. Can not get the Run object from the context to proceed with lock, this could be a legitimate status if the build waiting for the lock was deleted or hard killed. More information at Level.FINE if debug is needed.
      Jun 23, 2017 3:16:03 PM FINE org.jenkins.plugins.lockableresources.LockableResourcesManager
      Can not get the Run object from the context to proceed with lock
      java.lang.NullPointerException
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getFlowExecution(CpsStepContext.java:472)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getThreadGroupSynchronously(CpsStepContext.java:243)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getThreadSynchronously(CpsStepContext.java:237)
      	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:294)
      	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:61)
      	at org.jenkins.plugins.lockableresources.LockableResourcesManager.unlockNames(LockableResourcesManager.java:387)
      	at org.jenkins.plugins.lockableresources.LockStepExecution$Callback.finished(LockStepExecution.java:84)
      	at org.jenkinsci.plugins.workflow.steps.BodyExecutionCallback$TailCall.onSuccess(BodyExecutionCallback.java:114)
      	at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution$SuccessAdapter.receive(CpsBodyExecution.java:362)
      	at com.cloudbees.groovy.cps.Outcome.resumeFrom(Outcome.java:73)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:165)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:162)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:122)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:261)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:162)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:19)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:35)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:32)
      	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:32)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:330)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:242)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:230)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      

      When I then abort the job I'll get this output in the Jenkins log:

      Jun 23, 2017 3:16:25 PM WARNING org.jenkins.plugins.lockableresources.LockStepExecution stop
      Cannot remove context from lockable resource witing list. The context is not in the waiting list.
      Jun 23, 2017 3:16:25 PM WARNING org.jenkins.plugins.lockableresources.LockStepExecution stop
      Cannot remove context from lockable resource witing list. The context is not in the waiting list.
      

      And this in the script:

      Aborted by Phil McArdle (phil.mcardle)
      [Pipeline] [second-two] // lock
      [Pipeline] [first-two] // lock
      [Pipeline] [first-two] }
      [first-two] Failed in branch first-two
      [Pipeline] [second-two] }
      [second-two] Failed in branch second-two
      [Pipeline] // parallel
      [Pipeline] End of Pipeline
      Exception: null
      org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
      	at org.jenkinsci.plugins.workflow.cps.CpsBodyExecution.cancel(CpsBodyExecution.java:242)
      	at org.jenkinsci.plugins.workflow.steps.BodyExecution.cancel(BodyExecution.java:76)
      	at org.jenkinsci.plugins.workflow.cps.steps.ParallelStepExecution.stop(ParallelStepExecution.java:66)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.stop(CpsThread.java:296)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:966)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:955)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:744)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Finished: ABORTED
      

      There are no other warnings or errors in the Jenkins log throughout this, but immediately after the above, it will echo:

      Jun 23, 2017 3:22:41 PM INFO org.jenkinsci.plugins.workflow.job.WorkflowRun finish
      Experiments/Test_Restart #26 completed: ABORTED
      Jun 23, 2017 3:22:41 PM WARNING org.jenkinsci.plugins.workflow.flow.FlowExecutionList unregister
      Owner[Experiments/Test_Restart/26:Experiments/Test_Restart #26] was not in the list to begin with: []
      

      I'm not sure if there's anything relevant there, but including just in case.

            Unassigned Unassigned
            philmcardlecg Phil McArdle
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: