Jenkins / JENKINS-36285

Tasks leaving the queue can be slow with a massive queue

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • core
    • None
    • 1.651

      In scale-testing experiments, we found that a very large queue (on the order of 1,000 or more tasks) can delay tasks leaving the queue and being assigned to an executor. To reproduce this, I created a Pipeline job that forks three parallel branches, each allocating a node, and triggered that job 1,000 times:

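      // Returns a closure for one parallel branch; each branch requests its own executor.
      def branch(name) {
        return {
          node {
            // Wait 30 seconds, then write 50 MiB (52428800 bytes) of random data.
            sh "sleep 30 && head -c 52428800 /dev/urandom > ${name}.bin"
            //archive "${name}.bin"
            // Stash the generated file under the branch name.
            stash includes: "${name}.bin", name: "${name}"
          }
        }
      }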
      
      
      stage "Thinking"
      
      for (i = 0; i < 5; i++) {
        sleep time: 250, unit: 'MILLISECONDS'
        echo "Thinking $i"
      }
      
      stage "Working"
      
      def branches = [:]
      branches["b1"] = branch("b1")
      branches["b2"] = branch("b2")
      branches["b3"] = branch("b3")
      
      parallel branches
      
      stage "Resting"
      
      for (i = 0; i < 5; i++) {
        sleep time: 150, unit: 'MILLISECONDS'
        echo "Resting $i"
      }
      

      I triggered the builds from a loop in another Pipeline job that calls the build step, so the queue did not fill immediately. At first, every task entering the queue was allocated almost instantly to one of the 60 executors available. Once the queue had filled up, I added another 40 executors. By then there were over 2,000 tasks in the queue, and it took a couple of minutes for those 40 executors to be allocated.
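      A driver job of the kind described above might look roughly like the following. This is only a sketch, not the script actually used: the job name 'queue-scale-test' is hypothetical, and passing wait: false to the build step is an assumption about how the loop was written.

```groovy
// Sketch of a driver Pipeline job (not the original script).
// 'queue-scale-test' is a hypothetical name for the job under test.
for (int i = 0; i < 1000; i++) {
  // wait: false (an assumption) returns once the downstream build has been
  // scheduled, so the loop does not block until each build finishes.
  build job: 'queue-scale-test', wait: false
}
```

      With wait left at its default of true, each iteration would block until the triggered build completed, so the loop would never build up a large queue.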

      I then created and added another agent with 100 executors. The queue held around 2,300 tasks at that point, and it again took a few minutes to fill the executors. Executors freed on the existing agent also took some time to fill. I grabbed a thread dump while this was happening; it is attached.

      I'm going to try this again with freestyle jobs rather than Pipeline jobs once the build queue has cleared.

            Assignee: Unassigned
            Reporter: Andrew Bayer (abayer)