Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-53775

FileNotFoundException for program.dat when running a Pipeline Job concurrently with the Job DSL plugin

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Minor
    • Resolution: Fixed
    • Labels:
      None
    • Environment:
    • Similar Issues:
    • Released As:
      job-dsl 1.76

      Description

      I unfortunately don't have an easy reproduer, but this bug happens to me fairly regularly (at least a few times a month). For this failure to occur:

      • A Freestyle job is running and performing a "Process Job DSLs" step. Part of this involves updating an existing Pipeline job.
      • That existing Pipeline job is also running (or starting) at around the same time as the "Process Job DLSs" step is trying to update it.

      When the timing is just right, the Pipeline job fails with:

      java.io.FileNotFoundException: /var/jenkins_home/jobs/devops-gate/jobs/projects/jobs/dx4linux/jobs/delphix-build-and-snapshots/jobs/ami-snapshots/builds/1173/program.dat (No such file or directory)
      	at java.io.FileInputStream.open0(Native Method)
      	at java.io.FileInputStream.open(FileInputStream.java:195)
      	at java.io.FileInputStream.<init>(FileInputStream.java:138)
      	at org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverReader.openStreamAt(RiverReader.java:188)
      	at org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverReader.restorePickles(RiverReader.java:136)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.loadProgramAsync(CpsFlowExecution.java:773)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:739)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:875)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:745)
      	at hudson.model.RunMap.retrieve(RunMap.java:225)
      	at hudson.model.RunMap.retrieve(RunMap.java:57)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:499)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:481)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:379)
      	at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      	at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:234)
      	at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:105)
      	at hudson.model.Run.fromExternalizableId(Run.java:2436)
      	at org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper.getRawBuild(RunWrapper.java:71)
      	at org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper.build(RunWrapper.java:75)
      	at org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper.setResult(RunWrapper.java:87)
      	at sun.reflect.GeneratedMethodAccessor820.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
      	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
      	at groovy.lang.MetaClassImpl.setProperty(MetaClassImpl.java:2725)
      	at groovy.lang.MetaClassImpl.setProperty(MetaClassImpl.java:3770)
      	at org.codehaus.groovy.runtime.InvokerHelper.setProperty(InvokerHelper.java:201)
      	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.setProperty(ScriptBytecodeAdapter.java:484)
      	at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:347)
      	at org.kohsuke.groovy.sandbox.GroovyInterceptor.onSetProperty(GroovyInterceptor.java:84)
      	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onSetProperty(SandboxInterceptor.java:197)
      	at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:344)
      	at org.kohsuke.groovy.sandbox.impl.Checker.checkedSetProperty(Checker.java:351)
      	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.setProperty(SandboxInvoker.java:33)
      	at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawSet(PropertyAccessBlock.java:24)
      	at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.set(PropertyishBlock.java:88)
      	at com.cloudbees.groovy.cps.impl.AssignmentBlock$ContinuationImpl.assignAndDone(AssignmentBlock.java:70)
      	at sun.reflect.GeneratedMethodAccessor706.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
      	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
      	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:122)
      	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:261)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$101(SandboxContinuable.java:34)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.lambda$run0$0(SandboxContinuable.java:59)
      	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:58)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:332)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:83)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:244)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:232)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
      Caused: java.io.IOException: Failed to load build state
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:854)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:852)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:906)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Finished: FAILURE
      [Pipeline] stage
      [Pipeline] { (Cloning repos)
      [Pipeline] node
      19:23:07 Running on jenkins-agent3 in /var/tmp/jenkins_slaves/jenkins-ops/workspace/devops-gate/projects/dx4linux/delphix-build-and-snapshots/ami-snapshots
      [Pipeline] {
      [Pipeline] checkout
      19:23:07 Wiping out workspace first.
      19:23:08 Cloning the remote Git repository
      [...]
      [Pipeline] // timestamps
      [Pipeline] End of Pipeline
      Finished: SUCCESS
      

      Several things are unusual:

      • The job continues running, even after the Pipeline plugin printed "Finished: FAILURE."
      • If the job was running from a parent job, the parent job shows the job as failed (even though the child job is still running).
      • The job gets to the finally block of my pipeline and sends me a notification that it failed, however, the value of currentBuild.result is paradoxically SUCCESS, which leads me to believe that somehow the pipeline got to my finally block but not my catch block (which sets currentBuild.result to FAILURE on any exception). While the details of this behavior might be specific to my pipeline, the general behavior is very wonky.

        Attachments

          Activity

          Hide
          mbochenk Nikita Bochenko added a comment -

          P.S.

          Had a quick look and in many places we are using setConcurrentProperty which is not marked deprecated in docs. This is not related to DSL, however it seems to me that this is the same bug, just in a different context.

          Show
          mbochenk Nikita Bochenko added a comment - P.S. Had a quick look and in many places we are using setConcurrentProperty which is not marked deprecated in docs. This is not related to DSL, however it seems to me that this is the same bug, just in a different context.
          Hide
          mbochenk Nikita Bochenko added a comment -

          P.P.S.

          For freestyle jobs I have to set up {{concurrentBuild(true).}}I think by default freestyle jobs are not allowed concurrent execution. Not related to pipeline, of course, but feels inconsistent and may cause confusion?

          Show
          mbochenk Nikita Bochenko added a comment - P.P.S. For freestyle jobs I have to set up {{concurrentBuild(true).}}I think by default freestyle jobs are not allowed concurrent execution. Not related to pipeline, of course, but feels inconsistent and may cause confusion?
          Hide
          basil Basil Crow added a comment -

          I'm not surprised Throttle Concurrent Builds is on the scene of the crime here, but I don't want to jump to any conclusions yet without knowing more details about your configuration. It may or may not be related to the Pipeline race being described here. As an aside, if you're using Throttle Concurrent Builds with categories and Pipeline, you really should be using my patch from throttle-concurrent-builds-plugin#57, which improves CPU usage drastically and also has some correctness benefits. But that may or may not be related to this bug.

          Show
          basil Basil Crow added a comment - I'm not surprised Throttle Concurrent Builds is on the scene of the crime here, but I don't want to jump to any conclusions yet without knowing more details about your configuration. It may or may not be related to the Pipeline race being described here. As an aside, if you're using Throttle Concurrent Builds with categories and Pipeline, you really should be using my patch from throttle-concurrent-builds-plugin#57 , which improves CPU usage drastically and also has some correctness benefits. But that may or may not be related to this bug.
          Hide
          basil Basil Crow added a comment -

          PS if you use that Throttle Concurrent Builds patch and find that it helps, please comment on the PR. I've been using it successfully for over 6 months and trying to get it merged/released (including asking for the privileges to merge/release it myself) but so far have received no response.

          Show
          basil Basil Crow added a comment - PS if you use that Throttle Concurrent Builds patch and find that it helps, please comment on the PR. I've been using it successfully for over 6 months and trying to get it merged/released (including asking for the privileges to merge/release it myself) but so far have received no response.
          Hide
          basil Basil Crow added a comment -

          I've been running with the new syntax described in my previous comment for about a month, and this issue hasn't occurred again. Previously it occurred almost every time I deployed changes to a Job DSL pipeline.

          Show
          basil Basil Crow added a comment - I've been running with the new syntax described in my previous comment for about a month, and this issue hasn't occurred again. Previously it occurred almost every time I deployed changes to a Job DSL pipeline.

            People

            • Assignee:
              daspilker Daniel Spilker
              Reporter:
              basil Basil Crow
            • Votes:
              3 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: