Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-36547

Queue.Task.getFullDisplayName is a poor choice of key for LoadBalancer.CONSISTENT_HASH

    XMLWordPrintable

    Details

    • Similar Issues:

      Description

      The default LoadBalancer, CONSISTENT_HASH, tries to reassign the same node to repeated builds of a given project, so as to reuse workspaces whenever possible. However to identify the project it uses Queue.Task.getFullDisplayName as seen here. For an AbstractProject this works OK, since this method will be implemented in AbstractItem and remains pretty stable across builds—at least so long as you do not rename the project or a containing folder!

      For Pipeline builds it does not work at all, since the PlaceholderTask implementation encodes the build number. Thus, the key to the load balancer changes with every build, and there is no affinity.

      Queue.Task.getName used to be documented to serve this function but it is no different in the case of PlaceholderTask, and worse in the case of AbstractProject (since it throws out the project path, though it is more resilient to display name changes). getUrl sounds like a better key, but again for PlaceholderTask it is intentionally sensitive to the Run's build number, so that you can click on a queue item and see which Pipeline build it was coming from, not just which job. And Task has no getFullName, oddly.

      The load balancer could perhaps call getOwnerTask before using getFullDisplayName. This would have no effect on AbstractProject, but for PlaceholderTask would let it delegate to WorkflowJob, so the affinity would work tolerably well, at least so long as your build has only a single node block—for multiple blocks it will tend to break down. There is also some risk of mangling affinity calculations for other SubTask implementations with different needs.

      What we really need is an extended API for a Task which can specify a dedicated affinity key directly, not overloaded by UI elements. AbstractProject should implement it to return getFullName. PlaceholderTask could implement this at a minimum to return WorkflowJob.getFullName, but could also do better by looking up an enclosing label for the node as in JENKINS-26132 and concatenating that with the job name, so that for example a script like

      stage('Build') {
        node('unix') {
          git '.../main-sources'
          sh './build.sh'
          stash 'binaries'
        }
      }
      stage('Test') {
        parallel unixTests: {
          node('unix') {
            git '.../tests'
            unstash 'binaries'
            sh './test.sh'
          }
        }, windowsTests: {
          node('windows') {
            git '.../tests'
            unstash 'binaries'
            bat 'test.bat'
          }
        }
      }
      stage('Deploy') {
        node('unix') {
          git '.../tools'
          unstash 'binaries'
          sh './deploy.sh'
        }
      }
      

      would result in four affinity keys being used: prjname#Build, prhname#unixTests, prjname#windowsTests, prjname#Deploy, each resulting in distinct persistent workspaces.

        Attachments

          Issue Links

            Activity

            Hide
            johncalsbeek John Calsbeek added a comment -

            I'll throw in a +1 for this! We similarly had a "regression" switching to pipeline because of the lack of stable assignment.

            Show
            johncalsbeek John Calsbeek added a comment - I'll throw in a +1 for this! We similarly had a "regression" switching to pipeline because of the lack of stable assignment.
            Hide
            paulvanderende Paul van der Ende added a comment - - edited

            Same issue here. Besides that our incremental builds suffer from poor locality because the pipeline always start on a different node then last time.

            Show
            paulvanderende Paul van der Ende added a comment - - edited Same issue here. Besides that our incremental builds suffer from poor locality because the pipeline always start on a different node then last time.
            Hide
            jglick Jesse Glick added a comment -

            As mentioned parenthetically in the PR for JENKINS-48348, the implementation of getFullDisplayName may also be more expensive than we would want for calls from Queue.maintain. A dedicated method can be optimized to distinct constraints.

            Show
            jglick Jesse Glick added a comment - As mentioned parenthetically in the PR for  JENKINS-48348 , the implementation of getFullDisplayName may also be more expensive than we would want for calls from Queue.maintain . A dedicated method can be optimized to distinct constraints.
            Hide
            macdrega Joerg Schwaerzler added a comment -

            Jesse Glick: Thanks for mentioning. We can see in our instance allocating a node takes between 10 - 30s this may be caused by this. I'd like to give it a try - however I'm unsure how to improve the algorithm for speed. Should I open a separate ticket for this?

            Maybe in a first step there could be the fix for the affinity and in the seconds step there could be a performance improvement.

            Show
            macdrega Joerg Schwaerzler added a comment - Jesse Glick : Thanks for mentioning. We can see in our instance allocating a node takes between 10 - 30s this may be caused by this. I'd like to give it a try - however I'm unsure how to improve the algorithm for speed. Should I open a separate ticket for this? Maybe in a first step there could be the fix for the affinity and in the seconds step there could be a performance improvement.
            Hide
            macdrega Joerg Schwaerzler added a comment -

            For backward compatibility of the jenkins core to older plugins I'd propose to do a default implementation of - let's say - getAffinityKey() in Queue.Task.

            Show
            macdrega Joerg Schwaerzler added a comment - For backward compatibility of the jenkins core to older plugins I'd propose to do a default implementation of - let's say -  getAffinityKey() in Queue.Task.
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Core part was released in Jenkins 2.145

            Show
            oleg_nenashev Oleg Nenashev added a comment - Core part was released in Jenkins 2.145

              People

              • Assignee:
                Unassigned
                Reporter:
                jglick Jesse Glick
              • Votes:
                7 Vote for this issue
                Watchers:
                9 Start watching this issue

                Dates

                • Created:
                  Updated: