  1. Jenkins
  2. JENKINS-60507

Pipeline stuck when allocating machine | node block appears to be neither running nor scheduled

      Description

       Our build system sometimes shows the following in the thread dump of a Pipeline that is waiting for a free executor:

      Thread #94
      at DSL.node(node block appears to be neither running nor scheduled)
      at WorkflowScript.runOnNode(WorkflowScript:1798)
      at DSL.timeout(body has another 3 hr 14 min to run)
      at WorkflowScript.runOnNode(WorkflowScript:1783)
      at DSL.retry(Native Method)
      at WorkflowScript.runOnNode(WorkflowScript:1781)
      at WorkflowScript.getClosure(WorkflowScript:1901)
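
      Read from the bottom up, the frames suggest a scripted Pipeline helper that wraps node inside timeout inside retry. The sketch below is only a reconstruction for illustration, not the reporter's actual script: the parameter names, the retry count, and the timeout value are assumptions, and only the step nesting and the label expression are taken from this report.

      ```groovy
      // Hypothetical reconstruction of the nesting seen in the thread dump:
      // retry -> timeout -> node. Counts and durations are assumed values.
      def runOnNode(String label, Closure body) {
          retry(2) {                               // assumed retry count
              timeout(time: 4, unit: 'HOURS') {    // assumed limit
                  node(label) {                    // the step reported as "neither running nor scheduled"
                      body()
                  }
              }
          }
      }

      // Label expression as shown in the Blue Ocean message in this report.
      runOnNode('pr&&prod&&mac&&build') {
          echo 'build steps would run here'
      }
      ```

      With this shape, a node step that never gets an executor is only released when the enclosing timeout expires or the build is aborted.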

       
      Blue Ocean shows the message below, even though the build queue is empty and executors with those labels are available:

      Still waiting to schedule task
      Waiting for next available executor on pr&&prod&&mac&&build

       

      The job only finishes if we abort it or wait for the timeout step to do its work.

      We first observed this on v2.121.3 (workflow-durable-task-step v2.19). We recently updated to v2.190.1 (workflow-durable-task-step v2.28) and still see Pipelines getting stuck while waiting for executors.

      The only reference I could find is the last comment on this issue: https://issues.jenkins-ci.org/browse/JENKINS-42556, and we have no way to reproduce the problem. We noticed this fix made by Jesse Glick, but we are not sure whether it will help us. We tried turning on Anonymous for a week and still saw the problem.

      Please let me know if I can provide more information or logs to help track down the cause. Thanks.

      I've attached FINEST-level logs for hudson.model.Queue, though I'm not sure how much they will help.
      Our Jenkins runs on RedHat, on Tomcat/9.0.14 and Java 1.8.0_171.

        Attachments

        1. plugins_versions.txt
          5 kB
        2. queue.logs.zip
          1.26 MB
        3. screenshot-1.png
          40 kB

          Activity

          kdemenkov Konstantin Demenkov added a comment - - edited

          I have the same issue on the latest LTS, 2.204.1. It happens fairly often (10% of jobs) when working with Proxmox slaves over the Proxmox cloud plugin and JNLP. I suspect some incompatibility in the timeout/connection logic between the master and the Proxmox slaves, but I really don't know why it happens.


            People

            • Assignee:
              Unassigned
              Reporter:
              stoiky Mihai Stoichitescu
            • Votes:
              0
              Watchers:
              2
