Jenkins / JENKINS-48685

Deadlock when running a Multijob with multiple slaves


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Component/s: multijob-plugin
    • Labels: None

      Description

      After upgrading from 2.73.3 to 2.89.2, our Jenkins instance has started to deadlock.

      We use the Multijob plugin to run any number of other jobs that extend a common template. When the Multijob kicks off, it will spin up as many AWS slaves as it needs to run all of the child jobs in parallel (Test-Suites in the stack trace). Every time we run one of these Multijob jobs, Jenkins locks up.

      Attached are the deadlock stack traces from a thread dump.

      Executor #4 for Big Box (r4.2xlarge) (i-05a4635a2e6e063cf) : executing Test-Suites/test-suite-1 #1165 is in deadlock with Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307
      
      Executor #4 for Big Box (r4.2xlarge) (i-05a4635a2e6e063cf) : executing Test-Suites/test-suite-1 #1165 - priority:5 - threadId:0x00007f8fe4118800 - nativeId:0x3455 - state:BLOCKED
      stackTrace:
      java.lang.Thread.State: BLOCKED (on object monitor)
      at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
      - waiting to lock <0x000000008cdab698> (a hudson.model.RunMap)
      at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
      at hudson.model.Run.fromExternalizableId(Run.java:2345)
      at hudson.model.Run$Replacer.readResolve(Run.java:1937)
      
      Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307 - priority:5 - threadId:0x00007f8ff868e000 - nativeId:0x32e9 - state:BLOCKED
      stackTrace:
      java.lang.Thread.State: BLOCKED (on object monitor)
      at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
      - waiting to lock <0x000000008d744a90> (a hudson.model.RunMap)
      at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
      at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
      at hudson.model.Run.fromExternalizableId(Run.java:2345)
      at hudson.model.Run$Replacer.readResolve(Run.java:1937)
      at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
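
      The two traces above show each executor blocked on a different hudson.model.RunMap monitor. As a minimal sketch of that pattern (not Jenkins code), the snippet below has two threads take two plain monitors in opposite order and then asks the JVM to report the deadlock through ThreadMXBean. The class name, the thread names and the two lock objects are hypothetical stand-ins for the RunMap instances in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class MonitorDeadlockSketch {
    // Hypothetical stand-ins for the two hudson.model.RunMap monitors in the dump.
    private static final Object runMapA = new Object();
    private static final Object runMapB = new Object();

    // Starts two daemon threads that lock the monitors in opposite order,
    // then polls the JVM until it reports threads deadlocked on monitors.
    public static int countDeadlockedThreads() throws InterruptedException {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startLocker("executor-4", runMapA, runMapB, bothHoldFirstLock);
        startLocker("executor-2", runMapB, runMapA, bothHoldFirstLock);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findMonitorDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    private static void startLocker(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirstLock) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirstLock.countDown();
                try {
                    // Wait until the other thread also holds its first monitor.
                    bothHoldFirstLock.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread already owns this monitor.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads let the JVM exit despite the deadlock
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

      A thread dump taken while this sketch is running should show both threads in state BLOCKED, each with a "waiting to lock" line for the monitor the other one holds, which is the same shape as the executor traces above.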

      We tried downgrading Jenkins, but we had already updated all of the other plugins, and after the downgrade most of them were no longer compatible.

            Activity

            oleg_nenashev Oleg Nenashev added a comment -

            Here is a root cause thread from the dump:

            Executor #2 for Big Box (r4.2xlarge) (i-057d9fdd7076c7c10) : executing Test-Suites/test-suite-2 #1307 - priority:5 - threadId:0x00007f8ff868e000 - nativeId:0x32e9 - state:BLOCKED
            stackTrace:
            java.lang.Thread.State: BLOCKED (on object monitor)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:369)
            - waiting to lock <0x000000008d744a90> (a hudson.model.RunMap)
            at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
            at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
            at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
            at hudson.model.Run.fromExternalizableId(Run.java:2345)
            at hudson.model.Run$Replacer.readResolve(Run.java:1937)
            at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at com.thoughtworks.xstream.converters.reflection.SerializationMethodInvoker.callReadResolve(SerializationMethodInvoker.java:66)
            at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:271)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
            at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
            at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:393)
            at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:331)
            at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:270)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
            at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
            at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71)
            at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85)
            at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80)
            at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
            at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
            at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:393)
            at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:331)
            at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:270)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
            at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
            at com.thoughtworks.xstream.core.TreeUnmarshaller.start(TreeUnmarshaller.java:134)
            at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.unmarshal(AbstractTreeMarshallingStrategy.java:32)
            at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1189)
            at hudson.util.XStream2.unmarshal(XStream2.java:114)
            at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1173)
            at hudson.XmlFile.unmarshal(XmlFile.java:167)
            at hudson.model.Run.reload(Run.java:336)
            at hudson.model.Run.<init>(Run.java:324)
            at hudson.model.AbstractBuild.<init>(AbstractBuild.java:173)
            at hudson.model.Build.<init>(Build.java:104)
            at com.tikal.jenkins.plugins.multijob.MultiJobBuild.<init>(MultiJobBuild.java:59)
            at sun.reflect.GeneratedConstructorAccessor501.newInstance(Unknown Source)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
            at jenkins.model.lazy.LazyBuildMixIn.loadBuild(LazyBuildMixIn.java:165)
            at jenkins.model.lazy.LazyBuildMixIn$1.create(LazyBuildMixIn.java:142)
            at hudson.model.RunMap.retrieve(RunMap.java:224)
            at hudson.model.RunMap.retrieve(RunMap.java:57)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:500)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:482)
            at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:380)
            - locked <0x000000008cdab698> (a hudson.model.RunMap)
            at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:231)
            at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:926)
            at hudson.model.AbstractProject.getBuildByNumber(AbstractProject.java:137)
            at hudson.model.Run.fromExternalizableId(Run.java:2345)
            at hudson.model.Run$Replacer.readResolve(Run.java:1937)
            at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown
            

            Needs investigation
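
            In the trace above, the thread holds one RunMap monitor ("locked <0x000000008cdab698>") while waiting for another ("waiting to lock <0x000000008d744a90>"). One way to confirm a cycle from a thread dump is to build a wait-for graph from those two kinds of lines. The following is a rough sketch, assuming thread blocks are separated by blank lines and start with a header line. The class name and the sample text are hypothetical, and the "locked" line for Executor #4 is an assumption, since its trace in the report is cut off before any "locked" entry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WaitForGraphSketch {
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-f]+)>");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-f]+)>");

    // Abbreviated sample. The "locked" line under Executor #4 is assumed,
    // because that part of its trace is not visible in the report.
    public static final String SAMPLE =
        "Executor #4\n"
        + "  - waiting to lock <0x000000008cdab698> (a hudson.model.RunMap)\n"
        + "  - locked <0x000000008d744a90> (a hudson.model.RunMap)\n"
        + "\n"
        + "Executor #2\n"
        + "  - waiting to lock <0x000000008d744a90> (a hudson.model.RunMap)\n"
        + "  - locked <0x000000008cdab698> (a hudson.model.RunMap)\n";

    // Returns the thread headers that form a wait-for cycle, or an empty set.
    public static Set<String> findCycle(String dump) {
        Map<String, String> waitsFor = new LinkedHashMap<>(); // thread -> monitor it waits for
        Map<String, String> owner = new HashMap<>();          // monitor -> thread holding it
        for (String block : dump.split("\\n\\s*\\n")) {
            String[] lines = block.trim().split("\\n");
            String thread = lines[0].trim();
            if (thread.isEmpty()) {
                continue;
            }
            for (String line : lines) {
                Matcher waiting = WAITING.matcher(line);
                if (waiting.find()) {
                    waitsFor.putIfAbsent(thread, waiting.group(1));
                }
                Matcher locked = LOCKED.matcher(line);
                if (locked.find()) {
                    owner.put(locked.group(1), thread);
                }
            }
        }
        // Follow thread -> awaited monitor -> owning thread until the walk ends or repeats.
        for (String start : waitsFor.keySet()) {
            Set<String> seen = new LinkedHashSet<>();
            String current = start;
            while (current != null && seen.add(current)) {
                String monitor = waitsFor.get(current);
                current = (monitor == null) ? null : owner.get(monitor);
            }
            if (start.equals(current)) {
                return seen; // the walk came back to where it started: a cycle
            }
        }
        return Collections.emptySet();
    }

    public static void main(String[] args) {
        System.out.println(findCycle(SAMPLE));
    }
}
```

            On a full dump, the same walk would need the complete trace of every thread, since a monitor whose owner is missing from the text simply ends the walk without reporting a cycle.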

            allan_burdajewicz Allan BURDAJEWICZ added a comment -

            Oleg Nenashev, Jesse Glick: could this be related to JENKINS-49328 and nested references? We see `hudson.model.Run$Replacer.readResolve` in the stack trace. I have also seen a case where `hudson.model.RunMap#retrieve` fails with a StackOverflowError, showing the same stack trace as the one here.


              People

              • Assignee: Unassigned
              • Reporter: ketchumm Mark Ketchum
              • Votes: 1
              • Watchers: 6
