Jenkins / JENKINS-70034

Stopped builds keep displaying in executor status


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: core
    • None

      Regularly, builds that have already stopped (succeeded, failed, or canceled) remain listed in the executor status pane as if they were still running.

      This has two direct effects:

      1. The executors stay locked and can't be used by other builds.
      2. We use cloud nodes (Google Compute Engine) with a retention time. Because the executors still look busy, the VMs are not reclaimed, which unnecessarily increases cost by keeping unused cloud resources running.

      I don't know what is causing the issue, but here are some notable points:

      • If I click the red cross in the executor status pane, the confirmation dialog reads: "Are you sure you want to abort null"
      • The red cross links to <JENKINS_URL>/computer/<NODE_NAME>/executors/0/stopBuild?runExtId= (the runExtId parameter is empty)
      • After deleting the node from the node management UI, it shows as disconnected instead of being removed (the VM itself is properly destroyed), and it keeps an executor line "Unknown Pipeline node step" with the same red cross URL as above
      • Opening "Pipeline Steps" for the build fails with a request processing error (see attached log: 2022-11-07_08-13-55_starving_nodes.log)
      • Dumping the running threads from the Script Console (script and output below) shows several org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#<NUMBER>] threads in a waiting state:
      // Print the name, state, and stack trace of every live thread in the controller JVM
      Thread.getAllStackTraces().each { thread, traces ->
        println "\n ================================ ${thread.name} [${thread.state}] ================================"
        traces.each { println it }
      }
      
       ================================ org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep [#91456] [TIMED_WAITING] ================================
      java.base@11.0.16.1/jdk.internal.misc.Unsafe.park(Native Method)
      java.base@11.0.16.1/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
      java.base@11.0.16.1/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
      java.base@11.0.16.1/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:1218)
      java.base@11.0.16.1/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:899)
      java.base@11.0.16.1/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1053)
      java.base@11.0.16.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
      java.base@11.0.16.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      java.base@11.0.16.1/java.lang.Thread.run(Thread.java:829)
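
      To see how many of these threads exist and in which state, the full dump can be narrowed to the DurableTaskStep threads only. The following is a minimal sketch for the Script Console that uses only JDK calls; `summarize` is a hypothetical helper introduced here for illustration, and the prefix is copied from the thread name in the output above. A worker parked in DelayedWorkQueue.poll via ThreadPoolExecutor.getTask is normally an idle pool thread waiting for its next scheduled task, so these counts alone may not point at the stuck executors.

```groovy
// Thread name prefix, copied from the dump output above.
def prefix = 'org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep'

// Hypothetical helper (not part of Jenkins): keeps the threads whose name
// starts with namePrefix and counts them per Thread.State.
def summarize = { Collection<Thread> threads, String namePrefix ->
    threads.findAll { it.name.startsWith(namePrefix) }
           .groupBy { it.state }
           .collectEntries { state, group -> [(state): group.size()] }
}

// Apply the helper to every live thread in the current JVM.
def summary = summarize(Thread.getAllStackTraces().keySet(), prefix)
println "Threads named ${prefix}*: ${summary ?: 'none'}"
```

      Comparing the counts before and after a build gets stuck would show whether the number of waiting threads actually changes with the problem.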
      

            Assignee: Unassigned
            Reporter: Logan Mzz (loganmzz)