Jenkins / JENKINS-38514

CauseOfBlockage from QueueTaskDispatcher.canTake discarded

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core

      Description

      If you have a QueueTaskDispatcher which returns a CauseOfBlockage from canRun, that becomes BlockedItem.getCauseOfBlockage, which is displayed in the queue widget.

      But if it returns a CauseOfBlockage from canTake (AFAICT the same for Node.canTake), JobOffer.canTake sees that it is non-null, throws out the actual object with all of its diagnostics, and you wind up with a BuildableItem with CauseOfBlockage.BecauseNodeIsBusy which tells you nothing and may be totally misleading.

      By asking an implementation to return a @CheckForNull CauseOfBlockage rather than a simple boolean, the implication is that a non-null return value will be displayed to the user. Currently this is not the case.
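The difference between discarding and retaining the returned cause can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: the class `CanTakeSketch`, its methods, the use of plain strings for causes, and the sample dispatchers are all hypothetical stand-ins, not Jenkins types or the actual `JobOffer.canTake` code.

```java
import java.util.List;
import java.util.function.Function;

/** Simplified stand-in model; none of these types are Jenkins classes. */
public class CanTakeSketch {
    // Generic text shown in the queue, as quoted later in this issue.
    static final String GENERIC = "Waiting for next available executor";

    /** Reported behavior: only null versus non-null is inspected. */
    public static String discarding(List<Function<String, String>> dispatchers, String node) {
        for (Function<String, String> d : dispatchers) {
            if (d.apply(node) != null) {
                return GENERIC; // the specific reason is dropped here
            }
        }
        return null; // nothing objected, so the offer could take the item
    }

    /** Expected behavior: pass the first specific reason through unchanged. */
    public static String retaining(List<Function<String, String>> dispatchers, String node) {
        for (Function<String, String> d : dispatchers) {
            String cause = d.apply(node);
            if (cause != null) {
                return cause;
            }
        }
        return null;
    }

    /** Hypothetical dispatchers: one has no objection, one vetoes with a reason. */
    public static List<Function<String, String>> sampleDispatchers() {
        return List.of(
            node -> null,
            node -> "hypothetical veto: maintenance window on " + node);
    }

    public static void main(String[] args) {
        System.out.println(discarding(sampleDispatchers(), "agent-1"));
        System.out.println(retaining(sampleDispatchers(), "agent-1"));
    }
}
```

In the first method the user only ever sees the generic text, whatever the dispatcher said; in the second the dispatcher's own message survives to the caller.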

      To add insult to injury, Support Core does not report the result of canTake.

            Activity

            jglick Jesse Glick added a comment -

            Not clear that support-core can do anything, since canTake requires a specific Node.

            jglick Jesse Glick added a comment -

            Unfortunately it is not obvious how the relevant CauseOfBlockage can be identified: there can be numerous JobOffers which are considered, yet we would expect most of them to refuse canTake, for example because of Node.LabelMissing. The “buildable” item stays in queue when all of the offers are rejected, but how do we identify the one which we expected to be accepted?

            Can certainly improve detail-level logging to allow the issue to be tracked down, but it is less clear that BuildableItem.getWhy can be improved to display the ultimate problem in the UI (or in support bundles without a custom logger).
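One way to avoid having to pick a single "expected" offer is to keep every distinct refusal reason and show them together. The sketch below illustrates that idea only. The class `RefusalSummarySketch`, its methods, and the sample messages are hypothetical, causes are modeled as plain strings, and this is not the implementation of the `CompositeCauseOfBlockage` class touched by the eventual commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

/** Simplified stand-in model; not the CompositeCauseOfBlockage implementation. */
public class RefusalSummarySketch {
    /** Collapses per-node refusal messages into one summary of distinct reasons. */
    public static String summarize(Map<String, String> refusalByNode) {
        // TreeSet removes duplicate messages and gives a stable, sorted order.
        TreeSet<String> distinct = new TreeSet<>(refusalByNode.values());
        return String.join("; ", distinct);
    }

    /** Hypothetical refusals: two nodes lack the label, one refuses for another reason. */
    public static Map<String, String> sampleRefusals() {
        Map<String, String> refusals = new LinkedHashMap<>();
        refusals.put("agent-1", "label missing");
        refusals.put("agent-2", "label missing");
        refusals.put("agent-3", "hypothetical: user lacks permission");
        return refusals;
    }

    public static void main(String[] args) {
        System.out.println(summarize(sampleRefusals()));
    }
}
```

This does not answer the question of which offer was expected to succeed. It only ensures that the unusual reason is not hidden behind the common ones.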

            jglick Jesse Glick added a comment -

            For those running current core builds who wish to diagnose such issues, try running in /script:

            for (i in Jenkins.instance.queue.buildableItems) {
              println "considering ${i}"
              for (c in Jenkins.instance.computers) {
                println "found computer ${c}"
                EXEC: for (e in c.executors) {
                  if (e.interrupted || !e.parking) continue
                  println "with executor ${e}"
                  def o = new Queue.JobOffer(Jenkins.instance.queue, e, null)
                  if (!o.canTake(i)) {
                    println "${o} refused ${i}"
                    def node = o.node
                    if (node == null) {
                      println "no node associated with ${c}"
                      continue
                    }
                    def cob = node.canTake(i)
                    if (cob != null) {
                      println "because of ${cob}"
                      continue
                    }
                    for (d in hudson.model.queue.QueueTaskDispatcher.all()) {
                      cob = d.canTake(node, i)
                      if (cob != null) {
                        println "because of ${cob} from ${d}"
                        continue EXEC
                      }
                    }
                    if (!o.available) {
                      println "${o} not available"
                      if (o.workUnit != null) println "has a workUnit ${o.workUnit}"
                      if (c.offline) println "${c} is offline"
                      if (!c.acceptingTasks) println "${c} is not accepting tasks"
                    }
                  }
                }
              }
            }
            

            In one reported case, the root issue was that the Authorize Project plugin was configured, so Node.canTake was returning “anonymous doesn’t have a permission to run on” [sic]; yet the build queue (and support bundle) displayed only “Waiting for next available executor”.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            core/src/main/java/hudson/model/Node.java
            core/src/main/java/hudson/model/Queue.java
            core/src/main/java/hudson/model/queue/CauseOfBlockage.java
            core/src/main/java/jenkins/model/queue/CompositeCauseOfBlockage.java
            core/src/main/resources/hudson/model/Messages.properties
            core/src/main/resources/jenkins/model/queue/CompositeCauseOfBlockage/summary.jelly
            test/src/test/java/hudson/model/queue/QueueTaskDispatcherTest.java
            test/src/test/java/hudson/slaves/NodeCanTakeTaskTest.java
            http://jenkins-ci.org/commit/jenkins/8d23041d4b785947dee1bc02f54a41d86b59bdda
            Log:
            JENKINS-38514 Retain CauseOfBlockage from JobOffer (#2651)

            • Converted to JenkinsRule.
            • Improved messages from Node.canTake.
            • [FIXED JENKINS-38514] BuildableItem needs to retain information from JobOffer about why it is neither blocked nor building.
            • Converted to JenkinsRule.
            • Found an existing usage of BecauseNodeIsNotAcceptingTasks.
            • Original JENKINS-6598 test was checking behavior we want amended by JENKINS-38514.
            • Ensure that a BuildableItem which is simply waiting for a free executor reports that as its CauseOfBlockage.
            • Review comments from @oleg-nenashev.

              People

              • Assignee:
                jglick Jesse Glick
                Reporter:
                jglick Jesse Glick
              • Votes:
                1
                Watchers:
                4

                Dates

                • Created:
                  Updated:
                  Resolved: