Jenkins / JENKINS-32986

hard killing a pipeline leaves the JVM CPS thread running.

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Component/s: workflow-cps-plugin
    • Labels:
      None
    • Environment:
      pipeline 1.13
      jenkins 1.642.1

      Description

      If a pipeline build will not die you can hard kill it; however, hard killing it leaves the JVM's CPS thread still running on the master.

      e.g. with the script

      def spin() {
          while (true) {}
      }
      
      def map = [:]
      map["spin_it"] = { spin() }
      parallel map
      

      You will need to hard kill it to stop it (on Windows at least), but inspecting the JVM threads afterwards shows that the CPS thread is still running in a tight loop.
      A hard kill should probably (if it is safe without causing deadlocks elsewhere) brutally kill the thread as well. After a while you may run out of handles or other native resources due to the thread usage, meaning you need to restart Jenkins to get it working again.
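
      For anyone wanting to check for a leftover thread like this, a minimal sketch using only the standard Thread API follows. It is not Jenkins code: the class name, the "demo-spinner" thread name and the stop flag are made up for the demo, and the name fragment to search for on a real master is an assumption you would have to fill in from an actual thread dump.

```java
import java.util.Map;

public class RunnableThreadDump {
    // Hypothetical stop flag so the demo thread can be ended; the loop in the report has no such exit.
    static volatile boolean keepSpinning = true;

    /** Returns the stack of the first live RUNNABLE thread whose name contains the fragment, or null. */
    static StackTraceElement[] findRunnable(String nameFragment) {
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.getName().contains(nameFragment) && t.getState() == Thread.State.RUNNABLE) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the spinning pipeline body: a tight loop on its own thread.
        Thread spinner = new Thread(() -> { while (keepSpinning) { } }, "demo-spinner");
        spinner.setDaemon(true);
        spinner.start();
        Thread.sleep(200);

        StackTraceElement[] stack = findRunnable("demo-spinner");
        System.out.println(stack == null ? "not running" : "still RUNNABLE, frames: " + stack.length);

        keepSpinning = false;   // let the demo thread finish
        spinner.join(1000);
    }
}
```

      Running the same check a few seconds apart and seeing the thread RUNNABLE at the same frame each time is a reasonable hint that it is stuck in a loop rather than making progress.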

            Activity

            teilo James Nord added a comment -

            Not sure it can be blocking something that is fixed but here goes.

            jglick Jesse Glick added a comment -

            Picking up some stuff from JENKINS-25623:

            • If the CPS VM is running native code, Thread.interrupt should be called. It should be given a limited grace period—say, a few seconds—to terminate; after that, resort to Thread.stop, making sure we are able to provide a fresh Thread for the pool so we can still run finally blocks or whatever.
            • We may also need some sort of per-build CPS VM CPU quota, distinct from timeout in that we do not care about wall clock time spent running a shell script on an agent; we just care about not overloading the master. Alternately, if a given build starts taking too much CPU time (measurable via System.nanoTime around runNextChunk), gradually begin delaying its chunk execution (i.e., CpsThreadGroup.scheduleRun may call schedule rather than submit) so that it does not hog the system, and also institute a hard time limit for individual chunks (such as slow native methods).
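
            The throttling idea in the second bullet could be sketched roughly as below. This is only an illustration under stated assumptions, not the plugin's code: ThrottledChunkRunner, its budget, and the 1 second cap are hypothetical, and System.nanoTime around a chunk measures elapsed time on the executor thread rather than true CPU time.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ThrottledChunkRunner {
    private final ScheduledExecutorService pool;
    private final long budgetNanos;   // time a build may use before it gets slowed down (assumed policy)
    private long spentNanos;          // accumulated time spent in chunks, guarded by "this"

    ThrottledChunkRunner(ScheduledExecutorService pool, long budgetNanos) {
        this.pool = pool;
        this.budgetNanos = budgetNanos;
    }

    synchronized void record(long nanos) {
        spentNanos += nanos;
    }

    /** Zero while under budget; afterwards grows with the overrun, capped at one second. */
    synchronized long nextDelayMillis() {
        long overrun = spentNanos - budgetNanos;
        if (overrun <= 0) {
            return 0;
        }
        return Math.min(1000, TimeUnit.NANOSECONDS.toMillis(overrun));
    }

    /** Queues one chunk: submit immediately while under budget, schedule with a delay once over it. */
    void scheduleRun(Runnable chunk) {
        Runnable timed = () -> {
            long start = System.nanoTime();
            try {
                chunk.run();
            } finally {
                record(System.nanoTime() - start);
            }
        };
        long delay = nextDelayMillis();
        if (delay == 0) {
            pool.submit(timed);
        } else {
            pool.schedule(timed, delay, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        ThrottledChunkRunner runner = new ThrottledChunkRunner(pool, TimeUnit.MILLISECONDS.toNanos(5));
        for (int i = 0; i < 3; i++) {
            // Each chunk busy-waits for about 10 ms to stand in for CPU-heavy script code.
            runner.scheduleRun(() -> {
                long end = System.nanoTime() + 10_000_000L;
                while (System.nanoTime() < end) { }
            });
            Thread.sleep(50);
            System.out.println("next delay ms: " + runner.nextDelayMillis());
        }
        pool.shutdown();
    }
}
```

            A delay only slows a busy build down; it does nothing for a single chunk that never returns, which is why the comment also asks for a hard time limit per chunk.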
            jglick Jesse Glick added a comment -

            Ran across a situation where a build started but did not print any output other than its causes and had to be hard-killed. Turned out its CPS VM thread was consuming 100% CPU indefinitely:

            	at org.jboss.marshalling.reflect.UnlockedHashMap.doPut(UnlockedHashMap.java:201)
            	at org.jboss.marshalling.reflect.UnlockedHashMap.putIfAbsent(UnlockedHashMap.java:300)
            	at org.jboss.marshalling.reflect.SerializableClassRegistry.lookup(SerializableClassRegistry.java:73)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:177)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:967)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
            	at org.jboss.marshalling.river.BlockMarshaller.doWriteObject(BlockMarshaller.java:65)
            	at org.jboss.marshalling.river.BlockMarshaller.writeObject(BlockMarshaller.java:56)
            	at org.jboss.marshalling.MarshallerObjectOutputStream.writeObjectOverride(MarshallerObjectOutputStream.java:50)
            	at org.jboss.marshalling.river.RiverObjectOutputStream.writeObjectOverride(RiverObjectOutputStream.java:179)
            	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:344)
            	at java.util.HashMap.internalWriteEntries(HashMap.java:1777)
            	at java.util.HashMap.writeObject(HashMap.java:1354)
            	at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            	at java.lang.reflect.Method.invoke(Method.java:498)
            	at org.jboss.marshalling.reflect.SerializableClass.callWriteObject(SerializableClass.java:271)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:976)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
            	at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
            	at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
            	at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
            	at org.jenkinsci.plugins.workflow.support.pickles.serialization.RiverWriter.writeObject(RiverWriter.java:132)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:465)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.saveProgram(CpsThreadGroup.java:444)
            	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:394)
            	at …
            

            Looking at the code

                        while (threshold < Integer.MAX_VALUE && newSize > threshold) {
                            if (sizeUpdater.compareAndSet(table, newSize, newSize | 0x80000000)) { // ← HERE
                                resize(table);
                                return nonexistent();
                            }
                        }
            

            I am guessing we hit an infinite loop somehow. Seems to be JBMAR-189 which I guess will go into 1.4.12.Final.
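
            As a generic illustration only (not a confirmed diagnosis of JBMAR-189 and not the jboss-marshalling code), here is one way a retry loop of this shape can fail to terminate: the expected value passed to compareAndSet is read once before the loop, so if the field has since changed, every attempt fails. The class and method names are hypothetical, and the attempts are bounded so the demo always returns.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StaleCasDemo {
    /** Retries a CAS against a value read before the loop; returns the attempt that succeeded, or -1. */
    static int casWithStaleExpected(AtomicInteger size, int staleExpected, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (size.compareAndSet(staleExpected, staleExpected | 0x80000000)) {
                return attempt;
            }
        }
        return -1; // an unbounded loop would keep spinning here
    }

    /** Same retry, but re-reads the current value on every iteration. */
    static int casWithFreshRead(AtomicInteger size, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int current = size.get();
            if (size.compareAndSet(current, current | 0x80000000)) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Pretend another thread already moved the size from 16 to 17.
        AtomicInteger size = new AtomicInteger(17);
        System.out.println(casWithStaleExpected(size, 16, 1000)); // prints -1
        System.out.println(casWithFreshRead(size, 1000));         // prints 1
    }
}
```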

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            http://jenkins-ci.org/commit/workflow-support-plugin/e05ab1249db1fa63bd5dcfcbd55c689cb63af36e
            Log:
            JENKINS-32986 Noting need for https://github.com/jboss-remoting/jboss-marshalling/pull/48.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/org/jenkinsci/plugins/workflow/support/concurrent/Timeout.java
            src/test/java/org/jenkinsci/plugins/workflow/support/concurrent/TimeoutTest.java
            http://jenkins-ci.org/commit/workflow-support-plugin/c810b3874134f60be670d1205b6673fde5003c14
            Log:
            JENKINS-32986 Introducing a general-purpose Timeout utility.
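
            For readers who have not opened the commit, the general shape of such a utility might look like the sketch below: a try-with-resources guard that interrupts the calling thread if the guarded block does not finish in time. This is an assumption-laden illustration, not the code from Timeout.java; InterruptAfter, limit and sleepWasInterrupted are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class InterruptAfter implements AutoCloseable {
    // Single daemon timer thread shared by all guards, so it never keeps the JVM alive.
    private static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "interrupt-after-timer");
                t.setDaemon(true);
                return t;
            });

    private final ScheduledFuture<?> task;

    private InterruptAfter(ScheduledFuture<?> task) {
        this.task = task;
    }

    /** Arranges for the calling thread to be interrupted unless close() is reached first. */
    static InterruptAfter limit(long time, TimeUnit unit) {
        Thread caller = Thread.currentThread();
        return new InterruptAfter(TIMER.schedule(caller::interrupt, time, unit));
    }

    /** Cancels the pending interrupt once the guarded block has finished. */
    @Override
    public void close() {
        task.cancel(false);
    }

    /** Sleeps under a limit; returns true if the sleep was cut short by the interrupt. */
    static boolean sleepWasInterrupted(long sleepMillis, long limitMillis) {
        try (InterruptAfter guard = limit(limitMillis, TimeUnit.MILLISECONDS)) {
            Thread.sleep(sleepMillis);
            return false;
        } catch (InterruptedException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sleepWasInterrupted(2000, 100)); // prints true
        System.out.println(sleepWasInterrupted(50, 2000));  // prints false
    }
}
```

            An interrupt only helps when the guarded code blocks on something interruptible or checks the interrupt flag. A tight loop such as the one in the description ignores it, so a guard like this covers hang-prone operations but not the original spin case.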

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/org/jenkinsci/plugins/workflow/support/concurrent/Timeout.java
            src/test/java/org/jenkinsci/plugins/workflow/support/concurrent/TimeoutTest.java
            http://jenkins-ci.org/commit/workflow-support-plugin/957d76a5538747f85db6f9ae33f076ee435f534b
            Log:
            Merge pull request #29 from jglick/Timeout-JENKINS-32986

            JENKINS-32986 Introducing a general-purpose Timeout utility

            Compare: https://github.com/jenkinsci/workflow-support-plugin/compare/eb031e04d6e3...957d76a55387

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecution.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsThread.java
            http://jenkins-ci.org/commit/workflow-cps-plugin/c0deed0a3b546ebcb59ea25681ed3ac8b13fe6bb
            Log:
            JENKINS-32986 Apply a timeout to some hang-prone operations in the CPS VM thread.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            pom.xml
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsBodyExecution.java
            src/main/java/org/jenkinsci/plugins/workflow/cps/CpsThread.java
            http://jenkins-ci.org/commit/workflow-cps-plugin/51c02d40783bdc2be4e825d29c4c28286aa8c1dc
            Log:
            Merge pull request #102 from jglick/Timeout-JENKINS-32986

            JENKINS-32986 Apply a timeout to some hang-prone operations in the CPS VM thread

            Compare: https://github.com/jenkinsci/workflow-cps-plugin/compare/8da4ed31126f...51c02d40783b

            jglick Jesse Glick added a comment -

            workflow-support PR 37 should fix the SerializableClassRegistry issue.


              People

              • Assignee:
                jglick Jesse Glick
                Reporter:
                teilo James Nord
              • Votes:
                4
                Watchers:
                14
