JENKINS-51539

A paused Workflow job does not resume after safeExit when parallel step is wrapped by a node step

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Environment:
    • Similar Issues:
    • Released As:
      workflow-cps 2.56

      Description

      Hi,

      We have a Jenkins instance running many pipeline jobs that wait for promotion using an input step at the end of the pipeline.

      The input step is NOT tied to a node.

      We also use a Docker-based cloud plugin to spawn slaves and terminate them based on labels. No executor is defined on the master, so a node('label'){} declaration spawns a slave, runs on it, and terminates it afterwards.
       
      When we upgrade or restart Jenkins from time to time, most of our jobs that are waiting for promotion resume correctly, and deployment to prod and staging can continue from where we left off.

      However, several jobs do not resume correctly, and during startup their logs look like this:

      Waiting to resume part of #JOB-NAME: There are no nodes with the label ‘<UNIQUE-DOCKER-LABEL>’
      ...
      

      After about 6 minutes they fail.

      While trying to understand the issue, I managed to create a very simple pipeline that behaves the same way:

      stage('build'){
          node('generic'){
              parallel (
                  a: {
                      echo 'inside a'
                  },
                  b: {
                      echo 'inside b'
                  }
              )
          }
      }
      stage('wait'){
          input message: 'wait??', id: 'wait-for-job'
      }
      

      The 'generic' label is defined in docker-plugin with the jnlp-slave Docker image, but I managed to reproduce the issue with ecs-plugin and kubernetes-plugin as well.

      To reproduce the issue, run this pipeline and wait until it reaches the wait stage. Then, while it is waiting for input, restart the Jenkins instance gracefully; you will see that the job cannot resume.

      If you move the parallel outside the node, it will resume correctly.
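
For illustration, one way to read "moving the parallel outside the node" is to let each parallel branch allocate its own node instead of wrapping the whole parallel step in a single node block. This is only a sketch of that rearrangement applied to the reproduction pipeline above, not a confirmed fix. It assumes the branches do not need to share a workspace, since each branch now requests its own 'generic' agent.

```groovy
// Sketch: parallel is now the outer step, and each branch
// requests its own 'generic' agent (assumption: no shared workspace needed).
stage('build'){
    parallel (
        a: {
            node('generic'){
                echo 'inside a'
            }
        },
        b: {
            node('generic'){
                echo 'inside b'
            }
        }
    )
}
// Unchanged from the original reproduction: input is not tied to a node.
stage('wait'){
    input message: 'wait??', id: 'wait-for-job'
}
```

With this layout no node block is still open around the parallel step when the pipeline reaches the input step, which matches the arrangement reported to resume correctly.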

            Activity

            dnusbaum Devin Nusbaum added a comment (edited)

            I believe this was fixed in version 2.56 of the Pipeline Groovy plugin, see JENKINS-53709 which looks like an exact dupe.


              People

              • Assignee: Unassigned
              • Reporter: odavid Ohad David
              • Votes: 1
              • Watchers: 4
