Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-51215

Timeouts suspending pipeline executions (that are already suspended?)

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Minor Minor
    • workflow-cps-plugin
    • None
    • core and plugins fully up-to-date (e.g. jenkins core 2.120)

      As suggested by svanoort – based upon:

      1. https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=337057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-337057

        I would open a separate issue for Timeouts suspending executions - especially if you can come up with a consistent way to reproduce it. I saw it from time to time with Pipelines doing very complex processing (where we can't block the shutdown forever and shouldn't).

        My suspicion is that there's a subtle bug around the halt-at-shutdown logic, which may have been pre-existing but is visible now because the process is more closely monitored and logged now (also because we actually have some test coverage for it). Unfortunately

      2. https://issues.jenkins-ci.org/browse/JENKINS-34256?focusedCommentId=336885&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-336885

      Steps to re-produce:

      1. Start scripted pipeline, e.g.:
        #!/usr/bin/env groovy
        
        stage('stage') {
          timestamps {
            echo('In stage "stage"...')
            node {
              sh 'sleep 90'
            }
            echo('After sleeping...')
          }
        }
        
      2. Enter shutdown mode during "sh 'sleep ...'" step
      3. Restart Jenkins
      4. Now I am not sure if this is relevant or not, but: in my case there is an init.d groovy hook script that is setting Jenkins in shutdown mode right at the start-up of Jenkins...
        • Not sure if it is sufficient to put it manually in shutdown mode as well, but I could imagine!?
      5. After successful (first) Jenkins restart (i.e. Jenkins logs "Jenkins is fully up and running" and is in shutdown mode) trigger the second restart
        • Thread dump before: thread-dump-before.txt
        • Thread dump a few seconds after triggering restart ( thread-dump-after.txt ), the interesting threads:
          "Thread-1" #19 prio=5 os_prio=0 tid=0x00007f982804f800 nid=0x6b3c waiting on condition [0x00007f97bdde5000]
             java.lang.Thread.State: TIMED_WAITING (parking)
                  at sun.misc.Unsafe.park(Native Method)
                  - parking to wait for  <0x00000000fedec7e0> (a com.google.common.util.concurrent.AbstractFuture$Sync)
                  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
                  at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
                  at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
                  at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:258)
                  at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
                  at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.suspendAll(CpsFlowExecution.java:1555)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:498)
                  at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104)
                  at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175)
                  at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
                  at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
                  at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
                  at jenkins.model.Jenkins$18.execute(Jenkins.java:3338)
                  at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139)
                  at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:128)
                  - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor)
                  at jenkins.model.Jenkins$18.execute(Jenkins.java:3338)
                  at org.jvnet.hudson.reactor.Reactor$Node.runIfPossible(Reactor.java:139)
                  at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:276)
                  - locked <0x00000000e399c808> (a org.jvnet.hudson.reactor.Reactor)
                  at jenkins.model.Jenkins._cleanUpRunTerminators(Jenkins.java:3335)
                  at jenkins.model.Jenkins.cleanUp(Jenkins.java:3256)
                  at hudson.WebAppMain.contextDestroyed(WebAppMain.java:379)
                  at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:898)
                  at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:545)
                  at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:873)
                  at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:355)
                  at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1521)
                  at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1485)
                  at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:927)
                  at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:271)
                  at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:569)
                  at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
                  - locked <0x00000000865c0ed0> (a java.lang.Object)
                  at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:144)
                  at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:162)
                  at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:124)
                  at org.eclipse.jetty.server.Server.doStop(Server.java:489)
                  at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
                  - locked <0x00000000865b0c78> (a java.lang.Object)
                  at winstone.Launcher.shutdown(Launcher.java:307)
                  at winstone.ShutdownHook.run(ShutdownHook.java:25)
          
          "SIGTERM handler" #105 daemon prio=9 os_prio=0 tid=0x00007f9820003000 nid=0x6b38 in Object.wait() [0x00007f982ccbb000]
             java.lang.Thread.State: WAITING (on object monitor)
                  at java.lang.Object.wait(Native Method)
                  at java.lang.Thread.join(Thread.java:1252)
                  - locked <0x00000000863b7778> (a winstone.ShutdownHook)
                  at java.lang.Thread.join(Thread.java:1326)
                  at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
                  at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
                  at java.lang.Shutdown.runHooks(Shutdown.java:123)
                  at java.lang.Shutdown.sequence(Shutdown.java:167)
                  at java.lang.Shutdown.exit(Shutdown.java:212)
                  - locked <0x0000000086319db8> (a java.lang.Class for java.lang.Shutdown)
                  at java.lang.Terminator$1.handle(Terminator.java:52)
                  at sun.misc.Signal$1.run(Signal.java:212)
                  at java.lang.Thread.run(Thread.java:748)
          

      So my guess was "... maybe because of running into timeouts when suspending (the more or less still suspended pipelines; because first action in init.d hook scripts is setting Jenkins in shutdown mode)!?"

        1. thread-dump-after.txt
          44 kB
          Reinhold Füreder
        2. thread-dump-before.txt
          45 kB
          Reinhold Füreder

            Unassigned Unassigned
            reinholdfuereder Reinhold Füreder
            Votes:
            2 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated: