JENKINS-6598

Let Node and NodeProperty have more control over whether a node can run a task

    Details

    • Type: Patch
    • Status: Closed
    • Priority: Minor
    • Resolution: Done
    • Component/s: core
    • Labels:
      None

      Description

      Right now, the only logic to determine whether a Node can run a particular Queue.Task is in JobOffer.canTake(Task). The logic is as follows:

      1. Check if the task has an assigned label; if it does and this node is not in the label, the node can't take the task
      2. If the task does not have an assigned label and this node only allows tied jobs (Mode.EXCLUSIVE), the node can't take the task
      3. If the node is offline or not accepting tasks, the node can't take the task

      I would like to add Node.canTake(Task) and NodeProperty.canTake(Task) methods. The JobOffer.canTake(Task) method would be changed to call Node.canTake(), moving checks #1 and #2 into the Node.canTake() implementation. Node.canTake() would then call NodeProperty.canTake(Task) on all of its assigned properties; if any of them return false, Node.canTake(Task) will also return false. The default implementation in the NodeProperty base class will return true.
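The delegation chain described above could be sketched roughly as follows. This is a self-contained illustration, not the attached patch: the nested types are simplified stand-ins for hudson.model.Node, hudson.slaves.NodeProperty, Queue.Task, and JobOffer, and the field names (assignedLabel, labels, exclusive, online) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for the Hudson types named in this issue; not the real API.
final class CanTakeSketch {

    /** Stand-in for Queue.Task: only the assigned label matters here (null = not tied). */
    static final class Task {
        final String assignedLabel;
        Task(String assignedLabel) { this.assignedLabel = assignedLabel; }
    }

    /** Stand-in for NodeProperty: the default implementation accepts every task. */
    static class NodeProperty {
        boolean canTake(Task task) { return true; }
    }

    /** Stand-in for Node, holding checks #1 and #2 plus the property veto. */
    static class Node {
        final Set<String> labels = new HashSet<>();
        final List<NodeProperty> properties = new ArrayList<>();
        boolean exclusive; // models Mode.EXCLUSIVE

        boolean canTake(Task task) {
            // Check #1: a labelled task only runs on nodes carrying that label.
            if (task.assignedLabel != null && !labels.contains(task.assignedLabel)) return false;
            // Check #2: an exclusive node only accepts tied tasks.
            if (task.assignedLabel == null && exclusive) return false;
            // New behaviour: any single property can veto the task.
            for (NodeProperty p : properties) {
                if (!p.canTake(task)) return false;
            }
            return true;
        }
    }

    /** Stand-in for JobOffer: keeps check #3 and delegates the rest to the node. */
    static final class JobOffer {
        final Node node;
        boolean online = true; // hypothetical flag covering "offline or not accepting tasks"
        JobOffer(Node node) { this.node = node; }

        boolean canTake(Task task) {
            return online && node.canTake(task);
        }
    }
}
```

A custom NodeProperty subclass would override canTake(Task) to reject tasks it does not want on its node; every other property keeps the permissive default.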

      This allows Node subclasses and custom NodeProperties to control whether or not a particular Task should go to a particular Node, making it possible to do things like capabilities-based job assignment as opposed to the manually-intensive use of tying and node labels.

      I'm attaching a patch I've made to our internal copy of Hudson to make this change. I believe I have commit privileges to commit this if nobody objects to this change, otherwise I can get one of the other Yahoo! folks to do it.

          Activity

          mdillon mdillon created issue -
          mdillon mdillon added a comment -

          I just wanted to point out one way this could be improved that I didn't include in the patch.

          As it stands, if all nodes reject a task, it will sit in the queue as a BuildableItem (as it should), but its cause of blockage will be the generic message "Waiting for next available executor". The problem is that the existing JobOffer.canTake() only returns a boolean, so the code assumes that if there is no assigned label for the job and it was not taken by an online node, then it must be waiting for an executor.

          One approach to fixing this would be to have Node.canTake(Task) and in turn NodeProperty.canTake(Task) return a CauseOfBlockage. I don't think that it's possible in general to use this CauseOfBlockage as the queue item tooltip, because that would involve folding together multiple CauseOfBlockage instances from all blocking nodes, but it would be possible to show a message like "Rejected by all available executors". The same thing could also be accomplished by adding a Node.getCauseOfBlockage(Task) method, but then BuildableItem.getCauseOfBlockage() would have to call it on all nodes and the settings of the node could have changed since canTake() was called.
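As a rough illustration of that idea, the sketch below has each node return a reason instead of a boolean and folds the answers into one queue message. The types are hypothetical stand-ins rather than Hudson's real classes, and the convention that null means "not blocked" is an assumption made for this example.

```java
import java.util.List;

final class BlockageSketch {

    /** Stand-in for CauseOfBlockage: just a human-readable reason. */
    static final class CauseOfBlockage {
        final String description;
        CauseOfBlockage(String description) { this.description = description; }
    }

    /** Stand-in for a node whose canTake returns a reason instead of a boolean. */
    interface Node {
        // Assumed convention: null means the node accepts the task.
        CauseOfBlockage canTake(String taskName);
    }

    /** Summarises the per-node answers into a single message for the queue item. */
    static String summarize(List<Node> nodes, String taskName) {
        for (Node n : nodes) {
            if (n.canTake(taskName) == null) {
                // At least one node would accept the task, so it is only waiting for a free executor.
                return "Waiting for next available executor";
            }
        }
        // No node accepted the task (this is also the result for an empty node list).
        return "Rejected by all available executors";
    }
}
```

The individual reasons are deliberately discarded here, which matches the point above that folding several CauseOfBlockage instances into one tooltip is not possible in general.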

          abayer Andrew Bayer added a comment -

          +1 - this would make JENKINS-6586 much, much easier. Well, ok, it'd make it work, is probably the more accurate way to put it, given the bizarre problems I'm having with dynamically adding/removing labels and the resulting changes not actually mattering in terms of whether a job gets run. The code looks good to me, and the functionality will be an excellent addition.

          The CauseOfBlockage stuff could either go in the same change as this, or perhaps more cleanly, a separate change. I'd tend towards the latter.

          kohsuke Kohsuke Kawaguchi added a comment -

          For the JENKINS-6586 use case, this change by itself does not suffice. You'd need an extension point not scoped to a node, something like:

          interface QueueTaskDispatcher extends ExtensionPoint {
            // Return false to veto running the task on the given node.
            boolean canTake(Node node, Task task);
          }
          

          Technically speaking, this would make it possible for custom Node implementations and NodeProperty implementations to insert the canTake logic without the proposed changes, although there's not much harm in leaving it in, either.
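To make the difference in scope concrete, here is a minimal sketch of how a dispatcher that is not tied to any node might be consulted. Node and Task are reduced to hypothetical stand-ins, the ExtensionPoint supertype is omitted so the example compiles on its own, and the DISPATCHERS list is an assumed placeholder for Hudson's extension lookup.

```java
import java.util.ArrayList;
import java.util.List;

final class DispatcherSketch {

    /** Minimal stand-ins for the real Node and Queue.Task types. */
    static final class Node { final String name; Node(String name) { this.name = name; } }
    static final class Task { final String name; Task(String name) { this.name = name; } }

    /** Mirrors the interface proposed above, minus the ExtensionPoint supertype. */
    interface QueueTaskDispatcher {
        boolean canTake(Node node, Task task);
    }

    /** Hypothetical registry standing in for Hudson's extension lookup. */
    static final List<QueueTaskDispatcher> DISPATCHERS = new ArrayList<>();

    /** A node/task pairing is allowed only if no registered dispatcher vetoes it. */
    static boolean allowed(Node node, Task task) {
        for (QueueTaskDispatcher d : DISPATCHERS) {
            if (!d.canTake(node, task)) return false;
        }
        return true;
    }
}
```

Because a dispatcher sees every node/task pair, a plugin can apply one rule across all nodes without attaching a property to each of them.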

          kohsuke Kohsuke Kawaguchi made changes -
          Field Original Value New Value
          Assignee kohsuke [ kohsuke ]
          mdillon mdillon added a comment -

          I'd be happy with either approach. Dean Yu actually suggested an approach similar to this when we discussed the idea of involving Node and NodeProperty in the canTake decision. The reason I went with the approach I did was by analogy with the extensions that have been added to JobProperty over the years to allow it to participate more fully in the build lifecycle.

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: kohsuke
          Path:
          trunk/hudson/main/core/src/main/java/hudson/model/Node.java
          trunk/hudson/main/core/src/main/java/hudson/model/Queue.java
          trunk/hudson/main/core/src/main/java/hudson/model/queue/QueueTaskDispatcher.java
          trunk/hudson/main/core/src/main/java/hudson/slaves/NodeProperty.java
          trunk/hudson/main/core/src/main/resources/hudson/model/Messages.properties
          trunk/hudson/main/test/src/test/java/hudson/slaves/NodeCanTakeTaskTest.java
          trunk/www/changelog.html
          http://jenkins-ci.org/commit/31304
          Log:
          [FIXED JENKINS-6598] applied a patch from Mike Dillon, plus the separate independent extension point. In 1.360.

          scm_issue_link SCM/JIRA link daemon made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          abayer Andrew Bayer made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          rtyler R. Tyler Croy made changes -
          Resolution Fixed [ 1 ]
          Status Closed [ 6 ] Reopened [ 4 ]
          rtyler R. Tyler Croy made changes -
          Reporter mdillon [ mdillon ] Mike Dillon [ md5 ]
          rtyler R. Tyler Croy made changes -
          Status Reopened [ 4 ] Closed [ 6 ]
          Resolution Done [ 10000 ]
          rtyler R. Tyler Croy made changes -
          Workflow JNJira [ 136693 ] JNJira + In-Review [ 204175 ]

            People

            • Assignee:
              kohsuke Kohsuke Kawaguchi
              Reporter:
              md5 Mike Dillon
