Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-48824

Fast mode: declarative job persists after attempted kills. Jenkins restart shows com.thoughtworks.xstream.io.StreamException on bringup

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Component/s: workflow-job-plugin
    • Labels:
    • Environment:
    • Similar Issues:

      Description

      Summary:
      While doing beta testing of the Pipeline Storage Performance improvements, I'm using a Declarative pipeline job, which is set to run in PERFORMANCE_OPTIMIZED mode. This job is hanging, and I cannot stop it. It persists across Jenkins restarts and even full system restarts.

      Steps to recreate:
      1. Create a multibranch project which contains a number of different branches. One of the branches should contain the Declarative Jenkinsfile, which I'll provide the code for below. We'll call it DeclarativeParallels.

      2. Set the Global default for Speed/Durability to "None: Use pipeline default"

      3. Set the override for your multibranch project to Performance-optimized

      4. Run the DeclarativeParallels job.

      5. For reasons which are not clear to me, this Jenkins instance (which is a t2.large on AWS) went unresponsive. So, I bounced it from the AWS console.

      6. Once the system was back up, I noticed that DeclarativeParallels job attempted to resume. It was making no progress though, so I attempted to kill it, using the (X) icon next to it. This had no apparent effect.

      7. From the Script Console, I ran the following to attempt to kill every running job. This also had no effect:

      public int cancelRunning() {
              // Cancel running builds.
              def numCancels = 0;
              for (job in Hudson.instance.getAllItems(Job.class)) {
                  for (build in job.builds) {
                      if (!build.hasProperty('causes')) { continue; }
                      if (!build.isBuilding()) { continue; }
                      
                    out.println('Stopping ' + build.toString());
                              build.doStop();
                              out.println(build.toString() + ' stopped.');
                              numCancels++;
                              
                  }
              }
              return numCancels;
          }
      
      cancelRunning()
      

      8. I then started to tail -f /var/log/jenkins/jenkins.log, and restarted Jenkins gracefully via http://1.2.3.4:8080/restart. On bringup, I saw the following exception while Jenkins was coming up:

      INFO: Jenkins is fully up and running
      [01/05/18 15:53:59] SSH Launch of 113-agent on 172.18.40.113 completed in 3,323 ms
      Jan 05, 2018 3:54:01 PM org.jenkinsci.plugins.workflow.job.WorkflowRun onLoad
      WARNING: null
      com.thoughtworks.xstream.io.StreamException:  : input contained no data
      	at com.thoughtworks.xstream.io.xml.XppReader.pullNextEvent(XppReader.java:126)
      	at com.thoughtworks.xstream.io.xml.AbstractPullReader.readRealEvent(AbstractPullReader.java:148)
      	at com.thoughtworks.xstream.io.xml.AbstractPullReader.readEvent(AbstractPullReader.java:141)
      	at com.thoughtworks.xstream.io.xml.AbstractPullReader.move(AbstractPullReader.java:118)
      	at com.thoughtworks.xstream.io.xml.AbstractPullReader.moveDown(AbstractPullReader.java:103)
      	at com.thoughtworks.xstream.io.xml.XppReader.<init>(XppReader.java:63)
      	at com.thoughtworks.xstream.io.xml.AbstractXppDriver.createReader(AbstractXppDriver.java:54)
      	at com.thoughtworks.xstream.io.xml.AbstractXppDriver.createReader(AbstractXppDriver.java:65)
      	at com.thoughtworks.xstream.io.AbstractDriver.createReader(AbstractDriver.java:74)
      	at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1135)
      	at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1079)
      	at org.jenkinsci.plugins.workflow.support.storage.BulkFlowNodeStorage.getOrLoadNodes(BulkFlowNodeStorage.java:100)
      	at org.jenkinsci.plugins.workflow.support.storage.BulkFlowNodeStorage.getNode(BulkFlowNodeStorage.java:131)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$TimingFlowNodeStorage.getNode(CpsFlowExecution.java:1618)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.initializeStorage(CpsFlowExecution.java:627)
      	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:673)
      	at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:635)
      	at hudson.model.RunMap.retrieve(RunMap.java:225)
      	at hudson.model.RunMap.retrieve(RunMap.java:57)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:500)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:482)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:380)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:345)
      	at jenkins.model.lazy.AbstractLazyLoadRunMap.newestBuild(AbstractLazyLoadRunMap.java:275)
      	at jenkins.model.lazy.LazyBuildMixIn.getLastBuild(LazyBuildMixIn.java:245)
      	at org.jenkinsci.plugins.workflow.job.WorkflowJob.getLastBuild(WorkflowJob.java:240)
      	at org.jenkinsci.plugins.workflow.job.WorkflowJob.getLastBuild(WorkflowJob.java:105)
      	at hudson.model.Job.getBuildHealthReports(Job.java:1209)
      	at hudson.model.Job.getBuildHealth(Job.java:1202)
      	at com.cloudbees.hudson.plugins.folder.health.FolderHealthMetric.getHealthReport(FolderHealthMetric.java:48)
      	at com.cloudbees.hudson.plugins.folder.health.WorstChildHealthMetric$ReporterImpl.observe(WorstChildHealthMetric.java:86)
      	at com.cloudbees.hudson.plugins.folder.AbstractFolder.getBuildHealthReports(AbstractFolder.java:928)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.apache.commons.jexl.util.PropertyExecutor.execute(PropertyExecutor.java:125)
      	at org.apache.commons.jexl.util.introspection.UberspectImpl$VelGetterImpl.invoke(UberspectImpl.java:314)
      	at org.apache.commons.jexl.parser.ASTArrayAccess.evaluateExpr(ASTArrayAccess.java:185)
      	at org.apache.commons.jexl.parser.ASTIdentifier.execute(ASTIdentifier.java:75)
      	at org.apache.commons.jexl.parser.ASTReference.execute(ASTReference.java:83)
      	at org.apache.commons.jexl.parser.ASTReference.value(ASTReference.java:57)
      	at org.apache.commons.jexl.parser.ASTReferenceExpression.value(ASTReferenceExpression.java:51)
      	at org.apache.commons.jexl.ExpressionImpl.evaluate(ExpressionImpl.java:80)
      	at hudson.ExpressionFactory2$JexlExpression.evaluate(ExpressionFactory2.java:74)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$3.run(CoreTagLibrary.java:134)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
      	at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:120)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
      	at org.kohsuke.stapler.jelly.JellyViewScript.run(JellyViewScript.java:95)
      	at org.kohsuke.stapler.jelly.IncludeTag.doTag(IncludeTag.java:147)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.apache.commons.jelly.TagSupport.invokeBody(TagSupport.java:161)
      	at org.apache.commons.jelly.tags.core.ForEachTag.doTag(ForEachTag.java:150)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
      	at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:120)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.TagSupport.invokeBody(TagSupport.java:161)
      	at org.apache.commons.jelly.tags.core.ForEachTag.doTag(ForEachTag.java:150)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$1.run(CoreTagLibrary.java:98)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
      	at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:120)
      	at org.kohsuke.stapler.jelly.groovy.JellyBuilder.doInvokeMethod(JellyBuilder.java:276)
      	at org.kohsuke.stapler.jelly.groovy.Namespace$ProxyImpl.invoke(Namespace.java:92)
      	at com.sun.proxy.$Proxy68.projectView(Unknown Source)
      	at lib.JenkinsTagLib$projectView$1.call(Unknown Source)
      	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
      	at hudson.model.View.main.run(main.groovy:16)
      	at org.kohsuke.stapler.jelly.groovy.GroovierJellyScript.run(GroovierJellyScript.java:74)
      	at org.kohsuke.stapler.jelly.groovy.GroovierJellyScript.run(GroovierJellyScript.java:62)
      	at org.kohsuke.stapler.jelly.IncludeTag.doTag(IncludeTag.java:147)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.CallTagLibScript$1.run(CallTagLibScript.java:99)
      	at org.apache.commons.jelly.tags.define.InvokeBodyTag.doTag(InvokeBodyTag.java:91)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$1.run(CoreTagLibrary.java:98)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
      	at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:120)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.CallTagLibScript$1.run(CallTagLibScript.java:99)
      	at org.apache.commons.jelly.tags.define.InvokeBodyTag.doTag(InvokeBodyTag.java:91)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:99)
      	at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
      	at org.apache.commons.jelly.tags.core.CoreTagLibrary$2.run(CoreTagLibrary.java:105)
      	at org.kohsuke.stapler.jelly.CallTagLibScript.run(CallTagLibScript.java:120)
      	at org.kohsuke.stapler.jelly.CompressTag.doTag(CompressTag.java:44)
      	at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
      	at org.kohsuke.stapler.jelly.JellyViewScript.run(JellyViewScript.java:95)
      	at org.kohsuke.stapler.jelly.DefaultScriptInvoker.invokeScript(DefaultScriptInvoker.java:63)
      	at org.kohsuke.stapler.jelly.DefaultScriptInvoker.invokeScript(DefaultScriptInvoker.java:53)
      	at org.kohsuke.stapler.jelly.JellyClassTearOff.serveIndexJelly(JellyClassTearOff.java:112)
      	at org.kohsuke.stapler.jelly.JellyFacet.handleIndexRequest(JellyFacet.java:140)
      	at org.kohsuke.stapler.IndexViewDispatcher.dispatch(IndexViewDispatcher.java:30)
      	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:715)
      	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:845)
      	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:782)
      	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:845)
      	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
      	at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
      	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
      	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
      	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
      	at org.jenkinsci.plugins.ssegateway.Endpoint$SSEListenChannelFilter.doFilter(Endpoint.java:225)
      	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
      	at io.jenkins.blueocean.auth.jwt.impl.JwtAuthenticationFilter.doFilter(JwtAuthenticationFilter.java:61)
      	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
      	at com.cloudbees.jenkins.support.slowrequest.SlowRequestFilter.doFilter(SlowRequestFilter.java:37)
      	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
      	at io.jenkins.blueocean.ResourceCacheControl.doFilter(ResourceCacheControl.java:134)
      	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
      	at jenkins.metrics.impl.MetricsFilter.doFilter(MetricsFilter.java:125)
      	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
      	at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:157)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
      	at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:105)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
      	at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:135)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:93)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
      	at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
      	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
      	at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:90)
      	at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:171)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
      	at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:49)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
      	at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:82)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
      	at org.kohsuke.stapler.DiagnosticThreadNameFilter.doFilter(DiagnosticThreadNameFilter.java:30)
      	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
      	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
      	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
      	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
      	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
      	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
      	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
      	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
      	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
      	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
      	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
      	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
      	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
      	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
      	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
      	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
      	at org.eclipse.jetty.server.Server.handle(Server.java:564)
      	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
      	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
      	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
      	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
      	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
      	at org.eclipse.jetty.util.thread.Invocable.invokePreferred(Invocable.java:128)
      	at org.eclipse.jetty.util.thread.Invocable$InvocableExecutor.invoke(Invocable.java:222)
      	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:294)
      	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:199)
      	at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.EOFException: input contained no data
      	at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:3003)
      	at org.xmlpull.mxp1.MXParser.more(MXParser.java:3046)
      	at org.xmlpull.mxp1.MXParser.parseProlog(MXParser.java:1410)
      	at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1395)
      	at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
      	at com.thoughtworks.xstream.io.xml.XppReader.pullNextEvent(XppReader.java:109)
      	... 195 more
      

      There is no further output from the Jenkins log after this.

      9. The job doesn't show up as running when I look at Build Executor Status on the home page for Jenkins:

      10. But once I drill down into the branch whose job I'm talking about, it looks like it's still trying to run:

      11. Retrying the Script Console trick, it thinks the job has been killed:

      12. But, it hasn't been. That same script console thing can be attempted, and will return the same message each time:

      Stopping pipeline-perf-testcases/DeclarativeParallels #4
      pipeline-perf-testcases/DeclarativeParallels #4 stopped.
      Result: 1
      {code]
      
      *Pipeline Code:*
      This is residing in a multibranch project on a Bitbucket server, in the form of a Jenkinsfile.
      {code:none}
      pipeline {
      
          agent any
      
          environment {
              SOMETHING_TO_INHERIT  = "This has been inherited!"
              SOMETHING_TO_OVERRIDE = "This should be overriden, if you see it, that's wrong."
          }
      
          options {
              buildDiscarder(logRotator(numToKeepStr: '20')) 
          }
      
          stages {
              stage ('Parallel Wrapper') {  // start of parallel wrapper
                  parallel {
                      stage('parallel-1') {
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-2 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-2"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-3 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-3"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-4 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-4"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-5 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-5"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-6 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-6"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-7 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-7"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
      
                      stage('parallel-8 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL8"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-9 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-9"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                      stage('parallel-10 OVERRIDES') {
                          environment {
                              SOMETHING_TO_OVERRIDE = "YES --> PARALLEL-10"
                          }
                          steps {
                              echo "Let's check our environment variables"
                              echo SOMETHING_TO_INHERIT
                              echo SOMETHING_TO_OVERRIDE
                              sh 'for i in `seq 1 100`; do cat /dev/urandom | env LC_CTYPE=c tr -dc \\\'[:alpha:]\\\' | head -c 50000; done'
                          }
                      }
                  } // end of parallel
              } // end of wrapper stage
          } // end stages
      
          /* 
          This section is evaluated after the stages are all done.
          */
          post {
              always {
                  echo "ALWAYS --> Runs all the time."
                  // Delete our work directory to save on disk space.
                  deleteDir()
              }
              success {
                  echo "SUCCESS --> Whatever we did, it worked. Yay!"
              }
              failure {
                  echo "FAILURE --> Failed. Womp womp."
              }
          }
      
       }
      

        Attachments

          Activity

          Hide
          kshultz Karl Shultz added a comment - - edited

          I was able to successfully delete the job using the Jenkins CLI:

          ✔ ~/Documents/CloudBees/Work/Pipeline-Storage-Improvements/bitbucket-repos-for-testing/pipeline-perf-testcases [DeclarativeParallels|✔] 
          12:49 $ java -jar ~/bin/jenkins-cli.jar -auth creds:creds -s http://1.2.3.4:8080/ delete-builds pipeline-perf-testcases/DeclarativeParallels 4 
          Deleted 1 builds
          

          Finally gone:

          Show
          kshultz Karl Shultz added a comment - - edited I was able to successfully delete the job using the Jenkins CLI: ✔ ~/Documents/CloudBees/Work/Pipeline-Storage-Improvements/bitbucket-repos-for-testing/pipeline-perf-testcases [DeclarativeParallels|✔] 12:49 $ java -jar ~/bin/jenkins-cli.jar -auth creds:creds -s http://1.2.3.4:8080/ delete-builds pipeline-perf-testcases/DeclarativeParallels 4 Deleted 1 builds Finally gone:
          Hide
          svanoort Sam Van Oort added a comment -

          This actually points to a need to harden the whole CpsFlowExecution against FlowNode storage linked IOExceptions - basically disk issues can cause all sorts of weird states in the build loading.

          Not all of that is new, of course – probably some previous bizarrely hung pipelines were due to poor handling of IOExceptions here (or during load/storage initialization).   But it certainly does help harden this, and prevent against a more common error mode.  We've already had to harden once against related issues during durability work.

          Anyway, the error is reproduced in test (though it doesn't generate a hung build for me).  Should be enough to prove the testcase covers this issue though. 

          Show
          svanoort Sam Van Oort added a comment - This actually points to a need to harden the whole CpsFlowExecution against FlowNode storage linked IOExceptions - basically disk issues can cause all sorts of weird states in the build loading. Not all of that is new, of course – probably some previous bizarrely hung pipelines were due to poor handling of IOExceptions here (or during load/storage initialization).   But it certainly does help harden this, and prevent against a more common error mode.  We've already had to harden once against related issues during durability work. Anyway, the error is reproduced in test (though it doesn't generate a hung build for me).  Should be enough to prove the testcase covers this issue though. 
          Hide
          svanoort Sam Van Oort added a comment -

          So, after much gnashing of teeth this one should be solved by the workflow-support beta-2 and workflow-cps beta-2 – the build won't stick around in a weird hung state, it will be hard-failed because it can't read the flownodes (with original flownode data retained and a new folder used) – with test inclusion. 

          Also probably a whole bunch of other weird hangs with similar causes will be patched-over now. 

          Show
          svanoort Sam Van Oort added a comment - So, after much gnashing of teeth this one should be solved by the workflow-support beta-2 and workflow-cps beta-2 – the build won't stick around in a weird hung state, it will be hard-failed because it can't read the flownodes (with original flownode data retained and a new folder used) – with test inclusion.  Also probably a whole bunch of other weird hangs with similar causes will be patched-over now. 
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Sam Van Oort
          Path:
          pom.xml
          src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java
          src/test/java/org/jenkinsci/plugins/workflow/cps/FlowDurabilityTest.java
          http://jenkins-ci.org/commit/workflow-cps-plugin/b24dbd00312f4e37692564f657c6adaf24a42e24
          Log:
          Commit a giant wad of changes that makes CpsFlowExecution more bulletproof against flownode storage failures and solves JENKINS-48824

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Sam Van Oort Path: pom.xml src/main/java/org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.java src/test/java/org/jenkinsci/plugins/workflow/cps/FlowDurabilityTest.java http://jenkins-ci.org/commit/workflow-cps-plugin/b24dbd00312f4e37692564f657c6adaf24a42e24 Log: Commit a giant wad of changes that makes CpsFlowExecution more bulletproof against flownode storage failures and solves JENKINS-48824
          Hide
          kshultz Karl Shultz added a comment - - edited

          I can confirm this is fixed with beta-2. Same job, same instance, same everything, but running on beta-2 code:

          I started the big nasty parallel declarative job, paused it, and bounced the entire machine from the AWS console. Upon bringup, I saw:

          Resuming build at Thu Jan 18 21:22:12 UTC 2018 after Jenkins restart
          Still paused
          Resuming
          [parallel-7 OVERRIDES] AiwNQbRNgxPyDVenLUngLBZEKFC
          (snip...)
          

          Job completed successfully. Very nice!

          Alternate method of killing Jenkins mid-job:

          ubuntu@ip-172-18-40-95:~$ ps -ef | grep java
          jenkins   2306     1  0 21:35 ?        00:00:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/var/lib/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid -- /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080
          jenkins   2308  2306 41 21:35 ?        00:00:44 /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080
          ubuntu    2656  1656  0 21:36 pts/0    00:00:00 grep --color=auto java
          ##
          ## Right now the job is paused
          ##
          ubuntu@ip-172-18-40-95:~$ sudo kill 2306
          ubuntu@ip-172-18-40-95:~$ ps -ef | grep java
          ubuntu    2727  1656  0 21:36 pts/0    00:00:00 grep --color=auto java
          ##
          ## F5 from browser shows Jenkins is down. Now restart it.
          ## 
          ubuntu@ip-172-18-40-95:~$ sudo service jenkins restart
          

          Job completes successfully. Tail end of the console output does show a ??? at the end of the Waiting to resume line. Harmless?

          Pausing
          Resuming build at Thu Jan 18 21:37:25 UTC 2018 after Jenkins restart
          Waiting to resume part of bitbucket-pipeline-perf-testcases » DeclarativeParallelsSleep #9: ???
          Still paused
          Resuming
          [Pipeline] [parallel-1] }
          [Pipeline] [parallel-2 OVERRIDES] }
          [Pipeline] [parallel-3 OVERRIDES] }
          [Pipeline] [parallel-4 OVERRIDES] }
          [Pipeline] [parallel-5 OVERRIDES] }
          [Pipeline] [parallel-6 OVERRIDES] }
          [Pipeline] [parallel-7 OVERRIDES] }
          [Pipeline] [parallel-8 OVERRIDES] }
          [Pipeline] [parallel-9 OVERRIDES] }
          [Pipeline] [parallel-10 OVERRIDES] }
          [Pipeline] [parallel-1] // stage
          [Pipeline] [parallel-2 OVERRIDES] // withEnv
          [Pipeline] [parallel-3 OVERRIDES] // withEnv
          [Pipeline] [parallel-4 OVERRIDES] // withEnv
          [Pipeline] [parallel-5 OVERRIDES] // withEnv
          [Pipeline] [parallel-6 OVERRIDES] // withEnv
          [Pipeline] [parallel-7 OVERRIDES] // withEnv
          [Pipeline] [parallel-8 OVERRIDES] // withEnv
          [Pipeline] [parallel-9 OVERRIDES] // withEnv
          [Pipeline] [parallel-10 OVERRIDES] // withEnv
          [Pipeline] [parallel-1] }
          [Pipeline] [parallel-2 OVERRIDES] }
          [Pipeline] [parallel-3 OVERRIDES] }
          [Pipeline] [parallel-4 OVERRIDES] }
          [Pipeline] [parallel-5 OVERRIDES] }
          [Pipeline] [parallel-6 OVERRIDES] }
          [Pipeline] [parallel-7 OVERRIDES] }
          [Pipeline] [parallel-8 OVERRIDES] }
          [Pipeline] [parallel-9 OVERRIDES] }
          [Pipeline] [parallel-10 OVERRIDES] }
          [Pipeline] [parallel-2 OVERRIDES] // stage
          [Pipeline] [parallel-3 OVERRIDES] // stage
          [Pipeline] [parallel-4 OVERRIDES] // stage
          [Pipeline] [parallel-5 OVERRIDES] // stage
          [Pipeline] [parallel-6 OVERRIDES] // stage
          [Pipeline] [parallel-7 OVERRIDES] // stage
          [Pipeline] [parallel-8 OVERRIDES] // stage
          [Pipeline] [parallel-9 OVERRIDES] // stage
          [Pipeline] [parallel-10 OVERRIDES] // stage
          [Pipeline] [parallel-2 OVERRIDES] }
          [Pipeline] [parallel-3 OVERRIDES] }
          [Pipeline] [parallel-4 OVERRIDES] }
          [Pipeline] [parallel-5 OVERRIDES] }
          [Pipeline] [parallel-6 OVERRIDES] }
          [Pipeline] [parallel-7 OVERRIDES] }
          [Pipeline] [parallel-8 OVERRIDES] }
          [Pipeline] [parallel-9 OVERRIDES] }
          [Pipeline] [parallel-10 OVERRIDES] }
          [Pipeline] // parallel
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] stage
          [Pipeline] { (Declarative: Post Actions)
          [Pipeline] echo
          ALWAYS --> Runs all the time.
          [Pipeline] deleteDir
          [Pipeline] echo
          SUCCESS --> Whatever we did, it worked. Yay!
          [Pipeline] }
          [Pipeline] // stage
          [Pipeline] }
          [Pipeline] // withEnv
          [Pipeline] }
          [Pipeline] // withEnv
          [Pipeline] }
          [Pipeline] // node
          [Pipeline] End of Pipeline
          [Bitbucket] Notifying commit build result
          
          Show
          kshultz Karl Shultz added a comment - - edited I can confirm this is fixed with beta-2. Same job, same instance, same everything, but running on beta-2 code: I started the big nasty parallel declarative job, paused it, and bounced the entire machine from the AWS console. Upon bringup, I saw: Resuming build at Thu Jan 18 21:22:12 UTC 2018 after Jenkins restart Still paused Resuming [parallel-7 OVERRIDES] AiwNQbRNgxPyDVenLUngLBZEKFC (snip...) Job completed successfully. Very nice! Alternate method of killing Jenkins mid-job: ubuntu@ip-172-18-40-95:~$ ps -ef | grep java jenkins 2306 1 0 21:35 ? 00:00:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/var/lib/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid -- /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 jenkins 2308 2306 41 21:35 ? 00:00:44 /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 ubuntu 2656 1656 0 21:36 pts/0 00:00:00 grep --color=auto java ## ## Right now the job is paused ## ubuntu@ip-172-18-40-95:~$ sudo kill 2306 ubuntu@ip-172-18-40-95:~$ ps -ef | grep java ubuntu 2727 1656 0 21:36 pts/0 00:00:00 grep --color=auto java ## ## F5 from browser shows Jenkins is down. Now restart it. ## ubuntu@ip-172-18-40-95:~$ sudo service jenkins restart Job completes successfully. Tail end of the console output does show a ??? at the end of the Waiting to resume line. Harmless? Pausing Resuming build at Thu Jan 18 21:37:25 UTC 2018 after Jenkins restart Waiting to resume part of bitbucket-pipeline-perf-testcases » DeclarativeParallelsSleep #9: ??? Still paused Resuming [Pipeline] [parallel-1] } [Pipeline] [parallel-2 OVERRIDES] } [Pipeline] [parallel-3 OVERRIDES] } [Pipeline] [parallel-4 OVERRIDES] } [Pipeline] [parallel-5 OVERRIDES] } [Pipeline] [parallel-6 OVERRIDES] } [Pipeline] [parallel-7 OVERRIDES] } [Pipeline] [parallel-8 OVERRIDES] } [Pipeline] [parallel-9 OVERRIDES] } [Pipeline] [parallel-10 OVERRIDES] } [Pipeline] [parallel-1] // stage [Pipeline] [parallel-2 OVERRIDES] // withEnv [Pipeline] [parallel-3 OVERRIDES] // withEnv [Pipeline] [parallel-4 OVERRIDES] // withEnv [Pipeline] [parallel-5 OVERRIDES] // withEnv [Pipeline] [parallel-6 OVERRIDES] // withEnv [Pipeline] [parallel-7 OVERRIDES] // withEnv [Pipeline] [parallel-8 OVERRIDES] // withEnv [Pipeline] [parallel-9 OVERRIDES] // withEnv [Pipeline] [parallel-10 OVERRIDES] // withEnv [Pipeline] [parallel-1] } [Pipeline] [parallel-2 OVERRIDES] } [Pipeline] [parallel-3 OVERRIDES] } [Pipeline] [parallel-4 OVERRIDES] } [Pipeline] [parallel-5 OVERRIDES] } [Pipeline] [parallel-6 OVERRIDES] } [Pipeline] [parallel-7 OVERRIDES] } [Pipeline] [parallel-8 OVERRIDES] } [Pipeline] [parallel-9 OVERRIDES] } [Pipeline] [parallel-10 OVERRIDES] } [Pipeline] [parallel-2 OVERRIDES] // stage [Pipeline] [parallel-3 OVERRIDES] // stage [Pipeline] [parallel-4 OVERRIDES] // stage [Pipeline] [parallel-5 OVERRIDES] // stage [Pipeline] [parallel-6 OVERRIDES] // stage [Pipeline] [parallel-7 OVERRIDES] // stage [Pipeline] [parallel-8 OVERRIDES] // stage [Pipeline] [parallel-9 OVERRIDES] // stage [Pipeline] [parallel-10 OVERRIDES] // stage [Pipeline] [parallel-2 OVERRIDES] } [Pipeline] [parallel-3 OVERRIDES] } [Pipeline] [parallel-4 OVERRIDES] } [Pipeline] [parallel-5 OVERRIDES] } [Pipeline] [parallel-6 OVERRIDES] } [Pipeline] [parallel-7 OVERRIDES] } [Pipeline] [parallel-8 OVERRIDES] } [Pipeline] [parallel-9 OVERRIDES] } [Pipeline] [parallel-10 OVERRIDES] } [Pipeline] // parallel [Pipeline] } [Pipeline] // stage [Pipeline] stage [Pipeline] { (Declarative: Post Actions) [Pipeline] echo ALWAYS --> Runs all the time. [Pipeline] deleteDir [Pipeline] echo SUCCESS --> Whatever we did, it worked. Yay! [Pipeline] } [Pipeline] // stage [Pipeline] } [Pipeline] // withEnv [Pipeline] } [Pipeline] // withEnv [Pipeline] } [Pipeline] // node [Pipeline] End of Pipeline [Bitbucket] Notifying commit build result

            People

            • Assignee:
              svanoort Sam Van Oort
              Reporter:
              kshultz Karl Shultz
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: