JENKINS-27650

Page loads slow with hundreds of throttled builds in queue

      Description

      When there are hundreds of throttled builds in the queue, page load times increase by an order of magnitude.

      Steps to reproduce:

      1. Run Jenkins 1.580.2 and latest throttle-concurrent-builds plugin
      2. Create a matrix job with 200 combinations (attached)
      3. In the same job, select "Throttle Concurrent Builds" with a maximum of 7 builds throttled as part of a category called 'semaphore'
      4. Set the number of executors on the 'master' node to 200
      5. Run the job. There should only be 7 builds running due to the throttling

      Page load times will increase by an order of magnitude. I observed 10 seconds from:

      time curl http://localhost:8080/jenkins/ajaxBuildQueue

      If you remove the throttling in the job configuration, the page load times will be under 50 ms.

            Activity

            jglick Jesse Glick added a comment -

            Experimenting, no luck so far.

            jglick Jesse Glick added a comment -

            JENKINS-19623 apparently was not enough.

            jglick Jesse Glick added a comment -

            Tried various things in PR 27, but as explained there, the result is not satisfactory. I suspect JENKINS-27708 needs to be addressed first.

            My fear is that the current basic design of ThrottleQueueTaskDispatcher just cannot be made to scale well. I wonder if it would be better to invert the logic: implement ExecutorListener (as a second extension) to track what is running in each category, keeping a map from nodes to a histogram of task counts running by category (WeakHashMap<Node,HashMap<String,Integer>>?). Then canTake/canRun would only need to look up configuration for the proposed job, and do a table lookup to see the current count and compare that to the configured limit.
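
The inverted design proposed above can be sketched roughly as follows. This is only an illustration under assumptions: the class `CategoryLoadTracker` and its methods are hypothetical, the type parameter `N` stands in for Jenkins' `Node`, and the `taskStarted`/`taskFinished` calls are assumed to come from some listener that reliably reports executor activity. It is not the plugin's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/**
 * Hypothetical sketch: a per-node histogram of running task counts by category.
 * N stands in for Jenkins' Node type; none of this is the plugin's real API.
 */
class CategoryLoadTracker<N> {
    // Weak keys, so counts for a node that no longer exists can be collected.
    private final Map<N, Map<String, Integer>> runningByNode = new WeakHashMap<>();

    /** Record that a task in the given category started on the node. */
    synchronized void taskStarted(N node, String category) {
        runningByNode.computeIfAbsent(node, k -> new HashMap<>())
                     .merge(category, 1, Integer::sum);
    }

    /** Record that a task in the given category finished on the node. */
    synchronized void taskFinished(N node, String category) {
        Map<String, Integer> counts = runningByNode.get(node);
        if (counts == null) {
            return;
        }
        // Decrement, dropping the entry once the count reaches zero.
        counts.computeIfPresent(category, (k, v) -> v > 1 ? v - 1 : null);
    }

    /** Number of tasks of this category currently running on one node. */
    synchronized int runningOnNode(N node, String category) {
        Map<String, Integer> counts = runningByNode.get(node);
        return counts == null ? 0 : counts.getOrDefault(category, 0);
    }

    /** Number of tasks of this category currently running on all nodes. */
    synchronized int runningTotal(String category) {
        int total = 0;
        for (Map<String, Integer> counts : runningByNode.values()) {
            total += counts.getOrDefault(category, 0);
        }
        return total;
    }

    /** The cheap check a dispatcher would make: a table lookup and a comparison. */
    synchronized boolean canRun(String category, int maxTotal) {
        return runningTotal(category) < maxTotal;
    }
}
```

With the limit of 7 from the reproduction steps, `canRun("semaphore", 7)` would return false once seven tasks have been recorded as started, and true again after one is recorded as finished. The cost of each check depends on the number of nodes, not on the number of queued items.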

            I am not sure how that would relate to JENKINS-27708. ExecutorListener seems to be called with the Queue lock held, which is good, but that problem seems to stem from QueueTaskDispatcher being asked to make decisions about multiple jobs before any of them are actually scheduled. The call to taskAccepted does come from new WorkUnitContext, within maintain, so the question is whether this is interleaved with QueueTaskDispatcher calls, or after all of them have completed.

            oleg_nenashev Oleg Nenashev added a comment -

            /** Update to the previous comment:

            • ExecutorListener is not an extension point, so we cannot make this approach work
            • There are no listeners in Jenkins core that could reliably deliver the info
              */

            I've tried to introduce a light-weight off-the-queue caching in PR #28. The result was not satisfactory either. The performance of canTake() improved by up to 10 times in my local benchmarks, but it is still not enough to resolve the issue.
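
The code of PR #28 is not reproduced in this thread, so the following is only a generic sketch of what caching an expensive count for a short time can look like. The class `CachedCount`, its constructor parameters, and the time-to-live approach are all assumptions made for illustration, not a description of what the PR does.

```java
import java.util.function.IntSupplier;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch: remembers the result of an expensive count for a
 * fixed time-to-live, so repeated checks do not recompute it every time.
 */
class CachedCount {
    private final IntSupplier expensiveCount; // e.g. a scan over running builds
    private final LongSupplier clockMillis;   // injectable clock, eases testing
    private final long ttlMillis;

    private int cached;
    private long computedAt;
    private boolean valid;

    CachedCount(IntSupplier expensiveCount, long ttlMillis, LongSupplier clockMillis) {
        this.expensiveCount = expensiveCount;
        this.ttlMillis = ttlMillis;
        this.clockMillis = clockMillis;
    }

    /** Returns the cached value, recomputing it only when it has expired. */
    synchronized int get() {
        long now = clockMillis.getAsLong();
        if (!valid || now - computedAt >= ttlMillis) {
            cached = expensiveCount.getAsInt();
            computedAt = now;
            valid = true;
        }
        return cached;
    }
}
```

In general, a value cached this way can be stale for up to the time-to-live, so a limit checked against it may briefly be off by whatever changed in the meantime.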

            We could somehow merge PRs #27 and #28, but I'm afraid the solution would stay unreliable. Additional synchronisation would be required in that case, so scheduling behaviour would be impacted by the injected quietTimes.

            Hacking the load balancer could help, but that would conflict with other plugins.

            oleg_nenashev Oleg Nenashev added a comment -

            Not in progress anymore.

            Some improvement bits have been integrated into the plugin, but they are not enough IMHO.


              People

              • Assignee: Unassigned
              • Reporter: recampbell Ryan Campbell
              • Votes: 12
              • Watchers: 20
