JENKINS-61779

Regression: Job stuck in queue waiting forever after upgrade

      Description

      I ran "apt dist-upgrade" on my Jenkins build server, which updated Jenkins from 2.204.4 to 2.222.1, and I am now running into issues.

      The major issue is that my throttled builds do not work. I have a build (see attached screenshot)
      that is configured with throttling:

      <hudson.plugins.throttleconcurrents.ThrottleJobProperty plugin="throttle-concurrents@2.0.2">
      <maxConcurrentPerNode>1</maxConcurrentPerNode>
      <maxConcurrentTotal>1</maxConcurrentTotal>
      <categories class="java.util.concurrent.CopyOnWriteArrayList">
      <string>vm-installer</string>
      <string>kickstart-repo</string>
      </categories>
      <throttleEnabled>true</throttleEnabled>
      <throttleOption>category</throttleOption>
      <limitOneJobWithMatchingParams>true</limitOneJobWithMatchingParams>
      <matrixOptions>
      <throttleMatrixBuilds>false</throttleMatrixBuilds>
      <throttleMatrixConfigurations>true</throttleMatrixConfigurations>
      </matrixOptions>
      <paramsToUseForLimit></paramsToUseForLimit>
      <configVersion>1</configVersion>
      </hudson.plugins.throttleconcurrents.ThrottleJobProperty>

      It is set up to build on a single node:

      <hudson.matrix.LabelAxis>
      <name>label</name>
      <values>
      <string>kickstartbuild</string>
      </values>
      </hudson.matrix.LabelAxis>

      The pop-up says that the job is waiting for the next available executor on "master", which is the only node satisfying this label. That node is idle, but Jenkins keeps waiting.

      The configuration worked fine before upgrading. Restarting Jenkins does not help.

          Activity

          acampeau Alain Campeau added a comment -

          Ran into the same problem. Everything is fine using version 2.204.6 but the problem shows up with version 2.205. So as previously suggested, https://github.com/jenkinsci/jenkins/pull/3983 seems the likely cause.

          After some investigation, the cause appears to be jobs that refer to the master node through a label rather than by its name (for node restriction purposes). For instance, we have the "DISPATCH" label configured for the "master" node and use it in multi-configuration jobs, which are the ones that get stuck.

          If I modify a stuck job to use "master" instead of the "DISPATCH" label, the job gets triggered as before. The same is true if I configure the job to have no node restriction at all.
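
          In config.xml terms, the workaround described above would amount to changing a single element of the job. This is only a sketch: the element name assumes the usual serialization of the "Restrict where this project can run" setting, and the rest of the job configuration is omitted.

          ```xml
          <!-- Before (job gets stuck on 2.205 and later): label assigned to the master node -->
          <assignedNode>DISPATCH</assignedNode>

          <!-- After (job is triggered as before): the master node referenced by name -->
          <assignedNode>master</assignedNode>
          ```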

          danielbeck Daniel Beck added a comment -

          I tried to reproduce this issue, but failed to do so.

          If any of you can work out how to reproduce this problem from scratch, please provide detailed and complete instructions.

          acampeau Alain Campeau added a comment -

          I've managed to reproduce this from scratch on a newly set up Jenkins server (Windows) configured with a single node/agent (Windows). This strips away most of our production setup and eliminates as many other possible causes as I could.

          Here are the minimum steps I needed to repro:

          • Install latest Jenkins LTS release (2.222.3) on a Windows machine with the default set of plugins
          • Configure the master node with a label such as DISPATCH
          • Add a new node and configure it as a new agent. Due to limited resources I configured the new node/agent to run on the same Windows machine. Configure it with its own label such as WINDOWS and make sure "Usage" configuration is set to "Only build jobs with label expressions matching the node"
          • Create a new Multi-configuration project job:
            • Configure this job's "Restrict where this project can run" setting so its "Label Expression" value is the one specified for master, so DISPATCH
            • Configure this job's "Configuration Matrix" by adding:
              • a "User-defined matrix" axis with a "Name" of "TARGET" and "Values" of "XboxOne PS4 Switch" (any strings to mimic building for various platforms)
              • a "Slaves" axis with a "Name" of "TARGET_POOL" and make sure to check the "WINDOWS" checkbox
            • Configure this job's "Build" section by adding a dummy "Execute Windows batch command" whose content is simply an "@echo Hello world!"
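
          For reference, the job described by these steps would correspond roughly to a config.xml like the one below. This is a sketch reconstructed from the steps, not a copy of the file from the test server: the element names assume the usual serialization of a multi-configuration project, and a real config.xml contains many more elements that are omitted here.

          ```xml
          <matrix-project>
            <!-- "Restrict where this project can run": the label assigned to the master node -->
            <assignedNode>DISPATCH</assignedNode>
            <canRoam>false</canRoam>
            <axes>
              <!-- "User-defined matrix" axis -->
              <hudson.matrix.TextAxis>
                <name>TARGET</name>
                <values>
                  <string>XboxOne</string>
                  <string>PS4</string>
                  <string>Switch</string>
                </values>
              </hudson.matrix.TextAxis>
              <!-- "Slaves" axis restricted to the agent's label -->
              <hudson.matrix.LabelAxis>
                <name>TARGET_POOL</name>
                <values>
                  <string>WINDOWS</string>
                </values>
              </hudson.matrix.LabelAxis>
            </axes>
            <builders>
              <!-- Dummy "Execute Windows batch command" build step -->
              <hudson.tasks.BatchFile>
                <command>@echo Hello world!</command>
              </hudson.tasks.BatchFile>
            </builders>
          </matrix-project>
          ```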

          If I launch this job using Jenkins 2.222.3, 2.205 or anything in between, the job is stuck waiting for an executor on master when there are 2 available and the agent using the WINDOWS label is free with at least a single executor.

          If I launch this job using Jenkins 2.204.6 or earlier, the job successfully launches and sequentially runs all three XboxOne, PS4 and Switch configurations on the sole agent using the WINDOWS label while the job itself "runs" on the master node (even though all it does is dispatch really).

          On our production server we use the "Dynamic Axis" plugin to dynamically build an axis of all platforms to build, and we have multiple Windows, Linux and Mac build machines using OS-specific labels. To keep these repro steps simple, I've dropped all but one OS and removed the "Dynamic Axis" plugin usage. The resulting job doesn't make much sense logically, but it shows the behavior difference starting with the Jenkins 2.205 release.

          danielbeck Daniel Beck added a comment -

          Alain Campeau Thanks for these steps, I'll try to reproduce them when I have some time.

          About

          • Configure this job's "Restrict where this project can run" setting so its "Label Expression" value is the one specified for master, so DISPATCH

          What happens when you don't check that box, or specify "master" here? Would that be a viable workaround for this problem, and if not, why not?

          sleestack Chris McAfee added a comment - - edited

          We are seeing this problem on LTS 2.222.4.

          Oleg Nenashev, can you take a look at this one?


            People

            • Assignee: Unassigned
            • Reporter: nafmo Peter Krefting
            • Votes: 3
            • Watchers: 6
