Jenkins / JENKINS-20967

Cloud provisioning called when Jenkins is quieting down

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: core
    • Labels:
      None

      Description

      If Jenkins is quieting down and there are builds in the queue, nodes are still provisioned from any configured clouds.

      Ideally, Jenkins would not provision new slaves when it is supposed to be quieting down.
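
      For illustration only, here is a minimal, self-contained Java sketch of the behaviour requested above. It is not Jenkins code: the class, field, and method names are hypothetical, and the workload arithmetic is deliberately simplified. The only real API it mirrors is the quieting-down flag, which Jenkins core exposes as Jenkins.isQuietingDown().

```java
// Standalone model of the requested guard; NOT taken from Jenkins core.
// All names below are hypothetical and exist only for this illustration.
public class QuietDownProvisioningSketch {

    /** Minimal stand-in for the controller state that matters to this issue. */
    public static final class ControllerState {
        final boolean quietingDown; // assumed to mirror Jenkins.isQuietingDown()
        final int queuedBuilds;     // buildable items waiting in the queue
        final int idleExecutors;    // executors already free to take them

        public ControllerState(boolean quietingDown, int queuedBuilds, int idleExecutors) {
            this.quietingDown = quietingDown;
            this.queuedBuilds = queuedBuilds;
            this.idleExecutors = idleExecutors;
        }
    }

    /** Returns how many extra executors to request from clouds. */
    public static int executorsToProvision(ControllerState state) {
        // The guard this issue asks for: no new nodes while quieting down,
        // regardless of how many builds are still queued.
        if (state.quietingDown) {
            return 0;
        }
        // Simplified demand estimate: queued work not covered by idle executors.
        return Math.max(0, state.queuedBuilds - state.idleExecutors);
    }

    public static void main(String[] args) {
        // Normal operation: 3 queued builds, 1 idle executor.
        System.out.println(executorsToProvision(new ControllerState(false, 3, 1))); // prints 2
        // Same queue while quieting down: nothing should be provisioned.
        System.out.println(executorsToProvision(new ControllerState(true, 3, 1)));  // prints 0
    }
}
```

      The point of the sketch is only that the quieting-down check comes before any demand calculation, so a non-empty queue cannot trigger provisioning during shutdown preparation.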

            Activity

            stephenconnolly Stephen Connolly added a comment -

            Removing myself as assignee. My current work assignments do not leave enough bandwidth to review these issues, and in the majority of cases I am only assigned by virtue of being the default assignee. For the credentials-api and scm-api related plugins, I have permission to allocate time to reviewing changes to the APIs themselves to ensure they remain cohesive, but that can be handled through PR reviews rather than by assigning issues in JIRA.

            pjdarton pjdarton added a comment -

            I've just hit this issue in my own working environment. Fortunately I found this report first: I was thinking of coding the workaround described in Ryan's initial comment, and I hadn't considered Thomas's concerns.

            SitRep:
            So, back in 2015, Jesse said to wait for PR 1596 - that was merged in early 2016.
            Thomas's PR is still readable, but it was closed due to inactivity early this year (2018).
            Looking at the history for NodeProvisioner, Stephen wrote most of it - kinda ironic that Stephen un-assigned this only a week ago :-/

            TL;DR: That PR needs a lot of tidying up to extract the core intended changes, followed by a review by folks who know this code.

            jglick Jesse Glick added a comment -

            The PR being linked to is for JENKINS-27034, which sounds unrelated. I think Thomas Suckow was merely saying that the fixes for both would touch similar areas of code, so he wanted to serialize them. If there is a PR open for this issue, it is not mentioned here.

            I would not be inclined to waste much more time on Queue + Cloud + NodeProvisioner when there is a more straightforward way of provisioning a “one-shot” agent on demand for a particular build, exemplified by the dockerNode step in docker-plugin.

            pjdarton pjdarton added a comment -

            From what I've read, it's the incorrect counting of the runnable workload that's causing this issue - it may well be that the fix for JENKINS-27034 will help fix this issue (or perhaps even fix this problem entirely).
            i.e. This issue may just be a symptom of JENKINS-27034.

            Also, I would not consider time spent fixing Queue/Cloud/NodeProvisioner as time wasted - that's all core cloud functionality that's used to provide executors by all cloud plugins (e.g. we use docker, vSphere and OpenStack; there are others).

            I appreciate that dockerNode is useful, but pipeline-specified one-shot nodes aren't the answer to everything. When it takes a long time for a node to start up (e.g. fully featured VMs rather than lightweight containers), it's important to have clouds configured to supply nodes (with a retention strategy that is not "one shot") in order to maintain build throughput.

            FYI I didn't encounter this issue via the docker-plugin; I noticed this because the Jenkins core was asking the vsphere-plugin for new nodes (where dockerNode isn't a viable replacement) and I was monitoring my vSphere cloud at the time. There may well have been OpenStack and Docker nodes being created as well (but I wasn't monitoring those at the time).

            jglick Jesse Glick added a comment -

            "This issue may just be a symptom of JENKINS-27034."

            Might be. A functional test ought to be able to find out.

            "it's important to have clouds configured to supply nodes (with a retention strategy that is not "one shot") in order to maintain build throughput"

            Well, there is nothing stopping an implementation from keeping a pool of booted and warm VMs ready for use. But yes, this was off-topic.


              People

              • Assignee:
                Unassigned
                Reporter:
                recampbell Ryan Campbell
              • Votes:
                2
                Watchers:
                6
