JENKINS-41127

Multiple pipeline instances running concurrently when concurrent execution disabled


    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Critical
    • Resolution: Fixed
    • Component/s: core
    • Environment:
      Jenkins 2.7.1
      Jenkins 2.49
      Pipeline plugin 2.4
    • Released As:
      Jenkins 2.136

      Description

      I have configured a Jenkins pipeline to disable concurrent builds:

      properties([
          disableConcurrentBuilds()
      ])
      

      However, I have noticed on some occasions the next 2 builds are pulled from the pipeline's queue and executed concurrently. Why this occurs is not obvious at all.

Attachments

Issue Links

Activity

            boon Joe Harte created issue -
boon Joe Harte made changes -
Field: Description
Original Value:
I have configured a Jenkins pipeline to disabled concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently. This seems to happen when the pipeline is triggered by an upstream job at the same time that a user manually triggers it - possible race condition?
New Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently. This seems to happen when the pipeline is triggered by an upstream job at the same time that a user manually triggers it - possible race condition?
boon Joe Harte made changes -
Field: Description
Original Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently. This seems to happen when the pipeline is triggered by an upstream job at the same time that a user manually triggers it - possible race condition?
New Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently. This seems to happen when the pipeline is triggered by an upstream job at the same time that a user manually triggers it - possible race condition?

Will attach logs when this occurs again, don't currently have them available.
boon Joe Harte added a comment -

Bump. Seeing this again. My pipeline is configured to disable concurrent builds, yet I saw two instances running when they were both triggered at exactly the same time, down to the second. Jenkins version is now 2.49.

boon Joe Harte added a comment (edited) -

Saw this yet again. I hadn't seen the problem in hundreds of builds since my last comment above, and today I saw that Jenkins pulled the next 2 builds from the queue and ran them concurrently, even though concurrent building is explicitly disabled.

Using the latest version of the Pipeline plugins available at the time of writing, and Jenkins version 2.49.

Jesse Glick FYI

boon Joe Harte made changes -
Field: Environment
Original Value:
Jenkins 2.7.1
Pipeline plugin 2.4
New Value:
Jenkins 2.7.1
Jenkins 2.49
Pipeline plugin 2.4
jglick Jesse Glick added a comment -

Without a way to reproduce, there is nothing to go on, I'm afraid.

jglick Jesse Glick made changes -
Component/s: pipeline [ 21692 ] → workflow-job-plugin [ 21716 ]
jglick Jesse Glick made changes -
Labels: bug concurrent pipeline → concurrent
nlassai LAKSHMI ANANTHA NALLAMOTHU added a comment -

Seeing this again, more frequently, today. Not quite sure how to reproduce it.

boon Joe Harte made changes -
Component/s: workflow-job-plugin [ 21716 ] → throttle-concurrent-builds-plugin [ 15745 ]
boon Joe Harte made changes -
Component/s: throttle-concurrent-builds-plugin [ 15745 ] → workflow-job-plugin [ 21716 ]
boon Joe Harte added a comment -

            Jesse Glick Is there anything I can do to help diagnose the problem? This is becoming a serious issue for us, as it is critical we only push one build at a time through certain pipelines.

The pipeline works fine 90% of the time, and then, when a build completes, Jenkins will (seemingly randomly) pull the next 2 builds from the queue at once and start executing both concurrently, which totally messes up our pipeline environment.

boon Joe Harte made changes -
Field: Description
Original Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently. This seems to happen when the pipeline is triggered by an upstream job at the same time that a user manually triggers it - possible race condition?

Will attach logs when this occurs again, don't currently have them available.
New Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently for no apparent reason; the next 2 builds are pulled from the queue and executed at the same time.
boon Joe Harte made changes -
Field: Description
Original Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions multiple instances of the pipeline execute concurrently for no apparent reason; the next 2 builds are pulled from the queue and executed at the same time.
New Value:
I have configured a Jenkins pipeline to disable concurrent builds:

{code}
properties([
    disableConcurrentBuilds()
])
{code}

However, I have noticed on some occasions the next 2 builds are pulled from the pipeline's queue and executed concurrently. Why this occurs is not obvious at all.
jglick Jesse Glick added a comment -

{quote}Is there anything I can do to help diagnose the problem?{quote}

            I guess set breakpoints in, or add logging to, WorkflowJob.isConcurrentBuild or Queue.allowNewBuildableTask.

A workaround would be to use the lock step instead of job-level granularity; see the sketch below.

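A minimal sketch of that suggested workaround, assuming the Lockable Resources plugin (which provides the lock step) is installed; the resource name 'my-deploy-env' is a hypothetical placeholder:

{code}
// Serialize builds with an explicit named lock instead of relying on the
// job-level disableConcurrentBuilds() property.
lock('my-deploy-env') {
    node {
        stage('Build') {
            // Only one build can hold 'my-deploy-env' at a time, even if the
            // queue erroneously promotes two items concurrently.
            echo 'building'
        }
    }
}
{code}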
cjbosnic Cameron Bosnic added a comment -

This has just happened to one of my builds. I'll try adding the logging you suggested and look into using the lock step.

mstave mstave added a comment -

There seems to be some sort of race condition, as we're seeing this intermittently. It may be that it's only happening when there are two queued instances of a job with different parameters.

ataylor Alex Taylor added a comment -

            Jesse Glick I have a consistent way to reproduce this if it helps:

1. Create a Freestyle job (just to catch when the error happens) called "FreestyleTest" with string parameters "CurrentBuild", "PreviousBuild", and "QueueStatus".
2. Create a Pipeline job with "disable concurrent builds" turned on and a string parameter called "TestString", using the following script:
  lock('TestResource') {
      def item = Jenkins.instance.getItem("PipelineTest")
      if (item.getLastSuccessfulBuild().number == (currentBuild.number.toInteger() - 1)) { // finding previous number
          def number = params.TestString.toInteger() + 1
          node() {
              stage('Build') {
                  sleep 2
                  build job: 'PipelineTest', parameters: [string(name: 'TestString', value: "${number}")], wait: false
              }
              stage('Again') {
                  sleep 2
                  number = number + number
                  build job: 'PipelineTest', parameters: [string(name: 'TestString', value: "${number}")], wait: false
              }
          }
      }
      else {
          currentBuild.result = "SUCCESS" // '=' here; the original '==' was a comparison with no effect
          def RunningBuildsString = ""
          Jenkins.instance.getAllItems(Job).each {
              def jobBuilds = it.getBuilds()
              jobBuilds.each {
                  if (it.isBuilding()) { RunningBuildsString = (RunningBuildsString + it.toString() + " ") }
              }
          }
          build job: 'FreestyleTest', parameters: [string(name: 'PreviousBuild', value: "${item.getLastSuccessfulBuild().number}"), string(name: 'CurrentBuild', value: "${currentBuild.number.toInteger()}"), string(name: 'QueueStatus', value: "${RunningBuildsString}")]
      }
  }
3. You will have to run this a couple of times, since there is some script approval you will have to do (the Groovy scripting I am doing is not recommended, but it is needed to check the queue status and getLastSuccessfulBuild from the filesystem; I wanted to see if it was just not updating the filesystem in time).
4. Once it is ready to run, you will get one "success", and then you will have to start it one more time, at which point it will trigger infinite downstream builds. You just need to wait for the next build of FreestyleTest, which will show you when the previous build was not one behind, along with the queue status. This process takes around 800 builds for me locally, but does not use very much memory or resources, which is nice.

I am still testing whether this same issue can happen with freestyle builds. Additionally, the lock is not needed, but you will see that the lock does not seem to matter either. I can also enable throttle concurrent builds to limit the number of builds per minute, and it will still reproduce.

jglick Jesse Glick made changes -
Priority: Major [ 3 ] → Critical [ 2 ]
abayer Andrew Bayer added a comment -

            I'm running a tweaked version of that now to see what happens - had to make some changes due to serialization.

@NonCPS
def getLastNum() {
    def item = Jenkins.instance.getItemByFullName("bug-reproduction/jenkins-41127")
    echo "${item}"
    return item.getLastSuccessfulBuild().number
}

def lastNum = getLastNum()
if (lastNum == (currentBuild.number.toInteger() - 1)) { // finding previous number
    def number = params.TestString.toInteger() + 1
    node() {
        stage('Build') {
            sleep 2
            build job: 'jenkins-41127', parameters: [string(name: 'TestString', value: "${number}")], wait: false
        }
        stage('Again') {
            sleep 2
            number = number + number
            build job: 'jenkins-41127', parameters: [string(name: 'TestString', value: "${number}")], wait: false
        }
    }
}
else {
    currentBuild.result = "SUCCESS" // '=' here; the original '==' was a comparison with no effect
    def RunningBuildsString = getRunStr()
    build job: 'jenkins-41127-fs', parameters: [string(name: 'PreviousBuild', value: "${lastNum}"), string(name: 'CurrentBuild', value: "${currentBuild.number.toInteger()}"), string(name: 'QueueStatus', value: "${RunningBuildsString}")]
}

@NonCPS
def getRunStr() {
    def RunningBuildsString = ""
    Jenkins.instance.getAllItems(Job).each {
        def jobBuilds = it.getBuilds()
        jobBuilds.each {
            if (it.isBuilding()) { RunningBuildsString = (RunningBuildsString + it.toString() + " ") }
        }
    }
    return RunningBuildsString
}
abayer Andrew Bayer added a comment -

Got it to reproduce eventually, while I had some extra logging in Queue#getCauseOfBlockageForItem (and some other places, but that's the one that gave me something interesting). For the first few hundred jobs, everything was consistent: none of the pending items were blocked by either Queue#getCauseOfBlockageForTask or a QueueTaskDispatcher, none of them were BuildableItems, and they all had isConcurrentBuild() == false. The first item would not find its task in either buildables or pendings, and so would kick off. All the other pending items would find their tasks in pendings and so would stay queued. Yay, that's how it's supposed to be.

            But eventually...first item fine, many consecutive items fine, and then...one of them couldn't find its task in pendings and so kicked off too. That was followed by the rest of the queued items behaving like normal. I haven't yet navigated the Queue#maintain code enough to be sure what exactly the code path here is, but I'm fairly sure that the first item got removed from pendings before the queue processing was complete. I'm trying it again with some additional logging to try to make it more clear what's happening when.

abayer Andrew Bayer added a comment -

So Queue#maintain() is, in some cases, running twice, one run immediately after the other - probably a race; I'm not yet sure how the two calls are triggered. Anyway, the first run makes the first item in the queue buildable and calls makeBuildable on it, removing said item from blockedProjects and, via makeFlyweightTaskBuildable and createFlyWeightTaskRunnable, starting the flyweight task and adding the first item to pendings. All is well and good. But then the next run of maintain starts - and it can't find the task for the item we just (theoretically) started and put in pendings on any executor, so it removes the item from pendings. Then it gets to checking the queue again, and the new first item doesn't have anything blocking it (i.e., nothing in buildables or pendings), so it goes through the same process the previous item did in the previous maintain run. End result: two builds get started at the same time.

            So - definitely a race condition.

abayer Andrew Bayer added a comment -

FWIW, I think this will likely only happen with a flyweight task - so you could probably brew up a reproduction case with a matrix job, but I doubt you could do so with a freestyle job.

svanoort Sam Van Oort added a comment -

            Andrew Bayer Could we recategorize to core on the basis of your analysis?

abayer Andrew Bayer made changes -
Component/s: workflow-job-plugin [ 21716 ] → core [ 15593 ]
            jamesdumay James Dumay made changes -
            Remote Link This issue links to "CloudBees Internal CD-436 (Web Link)" [ 20566 ]
recampbell Ryan Campbell added a comment -

            Noting the relationship to JENKINS-30231

            recampbell Ryan Campbell made changes -
            Link This issue relates to JENKINS-30231 [ JENKINS-30231 ]
            rkivisto Ray Kivisto made changes -
            Remote Link This issue links to "CloudBees Internal CD-437 (Web Link)" [ 21211 ]
dnusbaum Devin Nusbaum added a comment -

            In my reproductions, the call to Queue#maintain that kicks off the second job concurrently has the following abbreviated state in its initial snapshot:

            Queue.Snapshot { 
                waitingList=[...], 
                blocked=[pipeline #2, ...],
                buildables=[],
                pendings=[pipeline #1]
            }
            

            Interestingly, this is the only call to Queue#maintain out of ~250 builds where pendings is not an empty list.

            Inside of Queue#maintain, pipeline #1 (which is pending) gets removed from pendings, and because the result of makeBuildable on the next line is ignored, pipeline #1 is no longer part of the queue at all, and so nothing is blocking pipeline #2 from being built later on in Queue#maintain.

I'm not exactly sure why pipeline #1 is removed from the pendings list. Maybe the lostPendings logic is messed up for flyweight tasks? For now I am looking at that logic to see if anything looks wrong. If it looks fine, then I'll try to understand why pipeline #1 is in pendings (maybe the flyweight task is half-started and gets blocked waiting for the Queue lock or something).

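To make the failure mode concrete, here is a minimal, self-contained Groovy model of the sequence described above; plain lists stand in for the Queue's internal collections, and the item names are illustrative, not core source:

{code}
// Model of the suspect Queue#maintain pass: pipeline #1 is pending, but its
// flyweight executor is not yet visible anywhere, so it is treated as "lost".
def pendings = ['pipeline #1']
def blocked  = ['pipeline #2']

// lostPendings pass: pipeline #1 is dropped from pendings, and the failed
// attempt to put it back into the queue (the ignored makeBuildable result)
// means it is no longer part of the queue at all.
pendings.remove('pipeline #1')

// Later in the same maintain() call: nothing in pendings or buildables now
// matches pipeline #2's task, so it is promoted and started concurrently.
if (pendings.isEmpty()) {
    blocked.remove('pipeline #2')
    println 'pipeline #2 starts while pipeline #1 is still running'
}
{code}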
dnusbaum Devin Nusbaum added a comment (edited) -

            Ok, I think the issue with lostPendings and flyweight tasks is that we loop through executors but not oneOffExecutors, which is where flyweight tasks are executed.

            I will test out looping through both tomorrow to see if that fixes it.

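As a quick way to see the distinction, a hedged script-console sketch (using only the public Computer and Executor APIs) that lists work units on regular executors versus one-off executors; a running flyweight Pipeline task should appear only in the second loop:

{code}
import jenkins.model.Jenkins

Jenkins.instance.computers.each { c ->
    // Regular executors: where heavyweight (node-bound) work runs.
    c.executors.each { e ->
        if (e.currentWorkUnit != null) {
            println "executor:         ${e.currentWorkUnit}"
        }
    }
    // One-off executors: where flyweight tasks (e.g. Pipeline builds) run,
    // and which the lostPendings scan reportedly did not check.
    c.oneOffExecutors.each { e ->
        if (e.currentWorkUnit != null) {
            println "one-off executor: ${e.currentWorkUnit}"
        }
    }
}
{code}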
svanoort Sam Van Oort added a comment -

            If that fixes it, it will probably be a very welcome change.

            dnusbaum Devin Nusbaum made changes -
            Remote Link This issue links to "jenkinsci/jenkins#3562 (Web Link)" [ 21228 ]
            dnusbaum Devin Nusbaum made changes -
            Assignee Devin Nusbaum [ dnusbaum ]
            dnusbaum Devin Nusbaum made changes -
Status: Open [ 1 ] → In Progress [ 3 ]
            dnusbaum Devin Nusbaum made changes -
Status: In Progress [ 3 ] → In Review [ 10005 ]
dnusbaum Devin Nusbaum added a comment -

            PR is up: https://github.com/jenkinsci/jenkins/pull/3562. Still looking into creating a regression test for it. I verified the change by running the same reproduction case as Alex/Andrew. Previously, a concurrent build would occur after ~250-750 builds, but after my fix I was able to run 3200 builds without any of them running concurrently.

dnusbaum Devin Nusbaum added a comment -

My best guess as to why this happens so infrequently: normally, after a call to Queue#maintain, the executor owning the flyweight task is the next thread to acquire the Queue's lock (in Executor#run), so Queue.pendings is cleared before the next call to Queue#maintain. In the problematic case, two calls to Queue#maintain happen consecutively before Executor#run has executed, so the task is still in Queue.pendings during the second call to Queue#maintain.

            I wonder if using a fair ordering policy for the Queue's lock would make this less likely, or if the Executor's run method isn't even waiting on the lock yet in the problematic case.

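For reference, here is what a fair ordering policy means, shown with java.util.concurrent directly; this is purely illustrative, since the Queue's real lock is managed inside Jenkins core:

{code}
import java.util.concurrent.locks.ReentrantLock

// fair = true: the longest-waiting thread acquires the lock next, so an
// Executor#run thread already queued behind one maintain() call would get
// the lock before a second maintain() could sneak in.
def queueLock = new ReentrantLock(true)

queueLock.lock()
try {
    // critical section: queue state stays consistent for this thread
} finally {
    queueLock.unlock()
}
{code}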
oleg_nenashev Oleg Nenashev made changes -
Status: In Review [ 10005 ] → Resolved [ 5 ]
Resolution: Fixed [ 1 ]
Released As: Jenkins 2.136
dnusbaum Devin Nusbaum made changes -
Labels: concurrent → concurrent lts-candidate
dnusbaum Devin Nusbaum added a comment -

            Fixed in Jenkins 2.136. I am marking this as an LTS candidate given the impact and simplicity of the fix, although we will have to give it some time to make sure there are no regressions.

olivergondza Oliver Gondža made changes -
Labels: concurrent lts-candidate → concurrent

              People

              • Assignee:
                dnusbaum Devin Nusbaum
                Reporter:
                boon Joe Harte
• Votes:
  3
  Watchers:
  14

                Dates

                • Created:
                  Updated:
                  Resolved: