  2. JENKINS-28183

Hard killed job's stage blocks stage in following jobs

      Description

      Build #480 of my job hung in Stage 'Imaging' and I had to do a hard kill using BUILD_URL/doDelete

      Now all builds of my job hang on:

      Running: Imaging
      Entering stage Imaging
      Waiting for builds [480]

      Restarting Jenkins does not help.

            Activity

            jglick Jesse Glick added a comment -

            Hmm, you might need to edit $JENKINS_HOME/org.jenkinsci.plugins.workflow.support.steps.StageStep.xml as a last resort.

            I am surprised at this report since there is a cleanUp function specifically to remove entries corresponding to builds that disappeared somehow without a clean exit. For some reason it is not getting called here, or is failing to detect the deleted build. It should have printed a warning to your log at some point saying Cleaning up apparently deleted job-name#480. I wonder if your job was renamed/moved at any point, or anything else unusual occurred?
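A minimal sketch of how one might prepare for that last-resort edit, assuming `JENKINS_HOME` is set in the shell and Jenkins is stopped while the file is touched. The function names and the `.bak` suffix are hypothetical, and the log path passed to the second helper is whatever your installation uses; only the state file name and the warning text come from this thread.

```shell
# Sketch only: back up the stage state file before any manual edit.
# Assumes JENKINS_HOME points at the Jenkins home directory.
backup_stage_state() {
  state_file="$JENKINS_HOME/org.jenkinsci.plugins.workflow.support.steps.StageStep.xml"
  if [ ! -f "$state_file" ]; then
    echo "no stage state file at $state_file" >&2
    return 1
  fi
  # Keep a copy next to the original and print the backup path.
  cp -p "$state_file" "$state_file.bak" && echo "$state_file.bak"
}

# Check whether the cleanup warning mentioned above ever reached a log file.
# $1: path to a Jenkins log file (an assumption; pass your own location).
cleanup_was_logged() {
  grep -q "Cleaning up apparently deleted" "$1"
}
```

If `cleanup_was_logged` fails for every log you have, that would be consistent with the cleanup never running for the deleted build.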

            kishorerp kishorerp added a comment -

            Can we please have a timeline for when this issue will be fixed?
            We cannot expect end users, or anyone else for that matter, to kill a job by manually changing files on the Jenkins server.
            Context?
            https://groups.google.com/forum/#!topic/jenkinsci-dev/k4VLUAVfFjA

            jglick Jesse Glick added a comment -

            There are many bugs and it is impossible to give a timeline for all of them. If a particular bug is important to you, you can try to fix it yourself, or if you are a CloudBees customer you can request a fix via a support ticket.

            cobexer Ing. Christoph Obexer added a comment -

            Jesse Glick I hit the same issue by aborting a build (abort, then clicking the link at the end of the console). After the weekend I noticed that a new build had got stuck waiting on an aborted build. Searching for "Cleaning up apparently deleted" showed no results (in jenkins.err.log).

            Some context for the log below: build 259 is the most recent build, triggered after a restart (LTS patch update, should be unrelated), and is still waiting. Build 258 hung waiting on 256 for two days (I only noticed that after the fact because waiting builds are only shown on the job's page and not on /). The logs below document my cancelling of 258; I am not sure they are helpful.

            Mär 21, 2016 11:04:22 AM org.jenkinsci.plugins.workflow.support.steps.StageStepExecution cancel
            WARNING: cannot cancel dead CpsStepContext[223]:Owner[jobname/master/258:null] or CpsStepContext[223]:Owner[jobname/master/259:jobname/master #259]
            Mär 21, 2016 11:08:52 AM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6 onSuccess
            WARNING: Failed to abort CpsFlowExecution[Owner[jobname/master/258:jobname/master #258]]
            java.lang.UnsupportedOperationException
            at org.jenkinsci.plugins.workflow.support.steps.StageStepExecution.stop(StageStepExecution.java:66)
            at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:760)
            at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:755)
            at org.jenkinsci.plugins.workflow.support.concurrent.Futures$1.run(Futures.java:150)
            at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253)
            at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
            at com.google.common.util.concurrent.ExecutionList.add(ExecutionList.java:105)
            at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:155)
            at org.jenkinsci.plugins.workflow.support.concurrent.Futures.addCallback(Futures.java:160)
            at org.jenkinsci.plugins.workflow.support.concurrent.Futures.addCallback(Futures.java:90)
            at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.interrupt(CpsFlowExecution.java:755)
            at org.jenkinsci.plugins.workflow.job.WorkflowRun$2.interrupt(WorkflowRun.java:230)
            at hudson.model.Executor.interrupt(Executor.java:227)
            at hudson.model.Executor.interrupt(Executor.java:197)
            at hudson.model.Executor.interrupt(Executor.java:187)
            at hudson.model.Executor.interrupt(Executor.java:173)
            at hudson.model.Executor.doStop(Executor.java:867)
            at org.jenkinsci.plugins.workflow.job.WorkflowRun.doStop(WorkflowRun.java:629)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
            at java.lang.reflect.Method.invoke(Unknown Source)
            at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
            at org.kohsuke.stapler.interceptor.RequirePOST$Processor.invoke(RequirePOST.java:46)
            at org.kohsuke.stapler.Function$InterceptedFunction.invoke(Function.java:399)
            at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
            at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
            at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:121)
            at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
            at org.kohsuke.stapler.MetaClass$13.dispatch(MetaClass.java:411)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
            at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:249)
            at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
            at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:249)
            at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
            at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
            at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
            at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
            at com.smartcodeltd.jenkinsci.plugin.assetbundler.filters.LessCSS.doFilter(LessCSS.java:47)
            at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:129)
            at hudson.plugins.scm_sync_configuration.extensions.ScmSyncConfigurationFilter$1.call(ScmSyncConfigurationFilter.java:49)
            at hudson.plugins.scm_sync_configuration.extensions.ScmSyncConfigurationFilter$1.call(ScmSyncConfigurationFilter.java:44)
            at hudson.plugins.scm_sync_configuration.ScmSyncConfigurationDataProvider.provideRequestDuring(ScmSyncConfigurationDataProvider.java:106)
            at hudson.plugins.scm_sync_configuration.extensions.ScmSyncConfigurationFilter.doFilter(ScmSyncConfigurationFilter.java:44)
            at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:129)
            at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:59)
            at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:129)
            at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:123)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:49)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
            at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
            at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:171)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:49)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at org.kohsuke.stapler.DiagnosticThreadNameFilter.doFilter(DiagnosticThreadNameFilter.java:30)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
            at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
            at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
            at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
            at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
            at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
            at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
            at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
            at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
            at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
            at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
            at org.eclipse.jetty.server.Server.handle(Server.java:370)
            at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
            at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
            at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
            at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
            at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
            at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
            at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
            at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
            at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
            at java.lang.Thread.run(Unknown Source)

            Mär 21, 2016 11:09:09 AM org.jenkinsci.plugins.workflow.job.WorkflowRun finish
            INFO: jobname/master #258 completed: ABORTED

            amuniz Antonio Muñiz added a comment -

            Reproduced.

            Script:

            stage concurrency: 1, name: 'start'
            node {
             sh 'sleep 600'
            }
            
            1. Trigger a build
            2. Call build/1/doDelete
            3. Trigger a second build (#2)
            4. It prints: Waiting for builds 1 - which is wrong as #1 is not running

            And the following actions fixed the unstable state:

            1. Cancel #2 (the waiting build) using the red cross next to the build (not doDelete)
            2. It will refuse to stop, so go to the build log and click on "Click here to forcibly terminate running steps" (requires Pipeline plugin 1.11 or higher)
            3. At this point $JENKINS_HOME/org.jenkinsci.plugins.workflow.support.steps.StageStep.xml is cleared and any subsequent build will run normally
            ssbarnea Sorin Sbarnea added a comment -

            How can I kill these jobs? I have one such job that has rendered my entire Jenkins service useless: it replies very slowly to requests (only to some of them), even though it is not under any load and JVM memory usage is about 20%.

            Apr 25, 2016 8:51:46 PM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6 onSuccess
            WARNING: Failed to abort CpsFlowExecution[Owner[... #16]]
            java.lang.NullPointerException
            	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.stop(DurableTaskStep.java:140)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:760)
            	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6.onSuccess(CpsFlowExecution.java:755)
            

            I am unable to kill the job, the console log is empty, and restarting the server re-spawns the job! What should we do?

            ssbarnea Sorin Sbarnea added a comment -

            I raised the priority of this bug to blocker because it renders Jenkins completely useless, and a restart does not solve the problem.

            Under certain conditions, this bug blocks Jenkins completely, persists across system restarts, and the affected job may not even have a console. It seems impossible to get rid of the blocked jobs, and Jenkins does nothing about them.

            amuniz Antonio Muñiz added a comment - - edited

            Sorin Sbarnea I'm not sure if your issue is related to this one. At any rate, if you remove the affected build from the filesystem ($JENKINS_HOME/jobs/[job_name]/builds/[build_number]) and restart Jenkins, the build will no longer be resumed.

            ssbarnea Sorin Sbarnea added a comment -

            Sorry but this job doesn't even have a directory on the filesystem! I checked and this build is #16 and the latest directory is #15.

            amuniz Antonio Muñiz added a comment -

            Can you paste the full stack trace (in a gist preferably) and link it here?

            ssbarnea Sorin Sbarnea added a comment -

            Sure, here is the link: https://gist.github.com/ssbarnea/938dfce40b88cb5b1cb592a04e943621 (and a correction: it seems that jobs are still triggered).

            amuniz Antonio Muñiz added a comment -

            Is there a control file (with a name like .a3a3dew or something similar) in the job workspace? If so, remove it.
            Anyway, all this is just to stop the durable task. I don't know why your Jenkins instance is unresponsive (the durable task check should not cause that; it's a Java call every few seconds).

            ssbarnea Sorin Sbarnea added a comment -

            I was able to get rid of the immortal job by removing the `workspace@tmp/durable-*` directory. Now the question is how to avoid this in the future.

            amuniz Antonio Muñiz added a comment -

            how to avoid this in the future.

            Don't delete a running job. If you want to kill it, use the red cross and follow the actions described in a previous comment in this thread.

            ssbarnea Sorin Sbarnea added a comment -

            Antonio Muñiz please be aware that this job cannot be stopped using the red cross, as that action does nothing in its case (other than throwing the reported NPE in the logs). Also, this job doesn't even have a console created for it, so the workaround of stopping it via the console link doesn't work either.

            The only way to get rid of it was to shut down Jenkins, remove the directory, and start Jenkins again.
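As a rough sketch of the "remove the directory" step, the helper below lists the `durable-*` directories under a given `@tmp` directory and removes them only when asked. It assumes Jenkins has already been shut down, as described above. The function name, its arguments, and the `--delete` flag are hypothetical and not part of Jenkins.

```shell
# Sketch only: list (and optionally remove) durable-task control directories.
# $1: the job's "@tmp" directory (hypothetical argument, e.g. .../workspace@tmp)
# $2: pass "--delete" to actually remove; the default is a dry run.
clean_durable_dirs() {
  tmp_dir=$1
  mode=${2:-}
  for d in "$tmp_dir"/durable-*; do
    [ -d "$d" ] || continue   # glob did not match, or entry is not a directory
    if [ "$mode" = "--delete" ]; then
      rm -rf -- "$d" && echo "removed $d"
    else
      echo "would remove $d"
    fi
  done
}
```

Running it once without `--delete` shows what would be removed before anything is touched.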

            amuniz Antonio Muñiz added a comment -

            I guess you ended up with a zombie build (without a build directory in the filesystem) because you first tried to delete it and then (when that didn't work) tried to use the red cross. I don't know how you could have reached that inconsistent state otherwise.

            ssbarnea Sorin Sbarnea added a comment -

            Please have a look at https://issues.jenkins-ci.org/browse/JENKINS-34021 which describes the NPE issue. We have already encountered this twice in the last month, and I am sure that nobody made any filesystem changes before this bug was triggered.

            amuniz Antonio Muñiz added a comment -

            Ok. Thanks for reporting a separate issue.

            ddaumiller Dorian Daumiller added a comment -

            Had the same issue.
            Worked around it like this:

            • aborted the waiting build (forcibly)
            • renamed the blocking build's folder on the file system
            • saw that $JENKINS_HOME/org.jenkinsci.plugins.workflow.support.steps.StageStep.xml wasn't cleared because a new build had already started
            • aborted the new build, too
            • found the StageStep.xml file empty and was able to restart the job.
            tcole Tavin Cole added a comment - - edited

            Instead of renaming folders, deleting the StageStep.xml file also works (I did this with Jenkins shut down).

            jglick Jesse Glick added a comment -

            The NullPointerException has since been fixed IIRC.

            This issue is unlikely to be fixed, since the concurrency option of stage is slated for deprecation. Use lock instead; we will work on JENKINS-36479.

            abayer Andrew Bayer added a comment -

            Once JENKINS-26107 is released, stage concurrency will be deprecated. So this won't actually get fixed - instead, the recommendation will be to use lockable-resources, which is getting a fix for at least some of this scenario over at JENKINS-36479.

            recampbell Ryan Campbell added a comment -

            As per the discussion above, this issue will not be fixed. The concurrency option of the stage step has been deprecated. Instead, users are advised to use the lock step of the Lockable Resources plugin.
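
            For illustration only, a minimal sketch of what that replacement might look like in a scripted Pipeline. It assumes the Lockable Resources plugin is installed and that block-scoped stage syntax is available; the resource name 'imaging' and the echo body are hypothetical placeholders, not taken from the reporter's job.

```groovy
// Jenkinsfile fragment (scripted Pipeline), not a complete job definition.
// Assumption: the Lockable Resources plugin provides the `lock` step.
// 'imaging' is a hypothetical resource name used only for this example.
stage('Imaging') {
    // Only one build at a time can hold the 'imaging' lock;
    // other builds wait at this point until it is released.
    lock(resource: 'imaging') {
        echo 'Imaging steps would run here'
    }
}
```

            The intent is that serialization is handled by the lock rather than by the stage's concurrency setting. Whether a hard-killed build releases its lock cleanly is the subject of JENKINS-36479, so this sketch does not by itself guarantee that scenario is covered.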

            sag47 Sam Gleske added a comment -

            For those who encounter this issue and don't want to restart their Jenkins instance: this can be cleaned up via the script console.

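            import jenkins.model.Jenkins
            import org.jenkinsci.plugins.workflow.job.WorkflowRun
            import org.jenkinsci.plugins.workflow.support.steps.StageStepExecution
            
            // Full name of the job (including any folders) and the number of the stuck build
            jobByFullName = 'folder/job'
            jobBuildNumber = '3'
            
            // Look up the stuck build
            Jenkins j = Jenkins.instance
            WorkflowRun b = j.getItemByFullName(jobByFullName).getBuild(jobBuildNumber)
            // Hard kill the build, then release its hold on the stage so later builds can proceed
            b.doKill()
            StageStepExecution.exit(b)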
            
            jgrant216 Jeff G added a comment -

            Sam Gleske, thank you for that script.

            I am running the latest Pipeline plugins on the current LTS and had a pair of multi-branch pipeline jobs stuck between the master and slave assignment. Even after a restart, the jobs resumed from where they had stalled but still did not continue and still did not respond to abort requests (no force kill link showed up either). The script you provided allowed me to kill them, and the subsequent builds worked correctly.

            sag47 Sam Gleske added a comment -

            Jeff G, glad it helped. I have other kill-all-*.groovy scripts which make it even easier. Refer to https://github.com/samrocketman/jenkins-script-console-scripts


              People

              • Assignee:
                jglick Jesse Glick
                Reporter:
                anshuarya Anshu Arya
              • Votes:
                7
                Watchers:
                16
