Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-44564

Deadlock restarting jenkins

    XMLWordPrintable

    Details

    • Similar Issues:
    • Released As:
      Jenkins 2.163

      Description

      I'm getting errors uploading plugins to Jenkins and then restarting. This is pretty reliable but doesn't happen on every restart, I have seen it happen about 5 times in the last week, restarting about 20 times a day.

      I was able to kill -3 and get a thread-dump (full text below) – the relevant part seems to be a deadlock described here:

       

      Found one Java-level deadlock:
      =============================
      "Handling GET /static/5adaeac1/images/headshot.png from 172.30.0.86 : RequestHandlerThread[#35]":
      waiting to lock monitor 0x00007f104c0047f8 (object 0x0000000091715c10, a hudson.DescriptorExtensionList),
      which is held by "Handling GET /static/5adaeac1/images/headshot.png from 172.30.0.86 : RequestHandlerThread[#11]"
      "Handling GET /static/5adaeac1/images/headshot.png from 172.30.0.86 : RequestHandlerThread[#11]":
      waiting to lock monitor 0x00007f1070c0d018 (object 0x0000000099f9b518, a hudson.ExtensionList$Lock),
      which is held by "Initializing plugin copyartifact"
      "Initializing plugin copyartifact":
      waiting to lock monitor 0x00007f104c0047f8 (object 0x0000000091715c10, a hudson.DescriptorExtensionList),
      which is held by "Handling GET /static/5adaeac1/images/headshot.png from 172.30.0.86 : RequestHandlerThread[#11]"

      Our work around is to stop the service sudo service jenkins stop - the process will still be running so we kill -9 it, and then start jenkins again.

      Full thread dump – cannot include inline because it exceeds text limits:

      https://gist.github.com/dalvizu/02071f3c8baf30c71cb3df7d2a53ab64

       

        Attachments

          Issue Links

            Activity

            Hide
            dalvizu Dan Alvizu added a comment -

            I am running Jenkins 2.46.2 LTS

            Show
            dalvizu Dan Alvizu added a comment - I am running Jenkins 2.46.2 LTS
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Nice one. I have attached the root cause stacktraces.

            • copy-artifact initialization logic performs initialization of BuildSelectors, which seems to trigger the internal migration logic with save(). It blocks the ExtensionList
            • save() triggers jobConfigHistory, which loads users
            • User loading iterates through extensions and needs a lock of Descriptor Extension list
            • Common User loading operation (second stacktrace) locks DescriptorExtensionList and then ExtensionList$Lock => deadlock

            As a fix, I would expect Jenkins.getDescriptorByType() to also lock DescriptorExtensionList  before going forward. Or maybe not... In order to fix the issue right now, the best way is to prevent creation of users in .jobConfigHistory.PluginUtils.isUserExcluded() . It does not seem to make much sense

            {noformat}

            at hudson.ExtensionList.ensureLoaded(ExtensionList.java:302)
              - waiting to lock <0x0000000091715c10> (a hudson.DescriptorExtensionList)
               at hudson.ExtensionList.iterator(ExtensionList.java:157)
              at hudson.model.User.load(User.java:201)
             
            • locked <0x00000000917498a0> (a hudson.model.User)
              at hudson.model.User.<init>(User.java:155)
              at hudson.model.User.getOrCreate(User.java:464)
              at hudson.model.User.getById(User.java:535)
              at hudson.model.User.get(User.java:519)
              at hudson.model.User.current(User.java:503)
              at hudson.plugins.jobConfigHistory.PluginUtils.isUserExcluded(PluginUtils.java:136)
              at hudson.plugins.jobConfigHistory.JobConfigHistorySaveableListener.onChange(JobConfigHistorySaveableListener.java:52)
              at hudson.model.listeners.SaveableListener.fireOnChange(SaveableListener.java:81)
              at hudson.model.Descriptor.save(Descriptor.java:872)
             
            • locked <0x0000000091749a88> (a hudson.plugins.sonar.SonarPublisher$DescriptorImpl)
              at hudson.plugins.sonar.SonarPublisher$DescriptorImpl.<init>(SonarPublisher.java:420)
              at hudson.plugins.sonar.SonarPublisher$DescriptorImpl$$FastClassByGuice$$46a6cf4e.newInstance(<generated>)
              at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
              at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61)
              at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:105)
              at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
              at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:267)
              at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
              at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1103)
              at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
              at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:145)
              at hudson.ExtensionFinder$GuiceFinder$FaultTolerantScope$1.get(ExtensionFinder.java:424)
              at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
              at com.google.inject.internal.InjectorImpl$2$1.call(InjectorImpl.java:1016)
              at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
              at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1012)
              at hudson.ExtensionFinder$GuiceFinder._find(ExtensionFinder.java:386)
              at hudson.ExtensionFinder$GuiceFinder.find(ExtensionFinder.java:377)
              at hudson.ClassicPluginStrategy.findComponents(ClassicPluginStrategy.java:472)
              at hudson.ExtensionList.load(ExtensionList.java:365)
              at hudson.ExtensionList.ensureLoaded(ExtensionList.java:303)
              - locked <0x0000000099f9b518> (a hudson.ExtensionList$Lock)
               at hudson.ExtensionList.iterator(ExtensionList.java:157)
              at jenkins.model.Jenkins.getDescriptorByType(Jenkins.java:1545)
              at hudson.plugins.copyartifact.BuildSelectorParameter.initAliases(BuildSelectorParameter.java:152)
              at hudson.plugins.copyartifact.CopyArtifactPlugin.postInitialize(CopyArtifactPlugin.java:35)
              at hudson.PluginManager$2$1$2.run(PluginManager.java:543)

            {noformat}

             

            {noformat}

            "Handling GET /static/5adaeac1/images/headshot.png from 172.30.0.86 : RequestHandlerThread11" #60 daemon prio=5 os_prio=0 tid=0x00007f1024042800 nid=0x33bf waiting for monitor entry [0x00007f101a8d3000]
              java.lang.Thread.State: BLOCKED (on object monitor)
              at hudson.ExtensionList.ensureLoaded(ExtensionList.java:302)
              - waiting to lock <0x0000000099f9b518> (a hudson.ExtensionList$Lock)
               at hudson.ExtensionList.getComponents(ExtensionList.java:168)
              at hudson.DescriptorExtensionList.load(DescriptorExtensionList.java:191)
              at hudson.ExtensionList.ensureLoaded(ExtensionList.java:303)
              - locked <0x0000000091715c10> (a hudson.DescriptorExtensionList)
               at hudson.ExtensionList.iterator(ExtensionList.java:157)
              at hudson.model.User.load(User.java:201)
             
            • locked <0x0000000091715d08> (a hudson.model.User)
              at hudson.model.User.<init>(User.java:155)
              at hudson.model.User.getOrCreate(User.java:464)
              at hudson.model.User.getAll(User.java:558)
              at hudson.model.User$FullNameIdResolver.resolveCanonicalId(User.java:1053)
              at hudson.model.User.get(User.java:405)
              at hudson.model.User.get(User.java:374)
              at jenkins.model.Jenkins.getUser(Jenkins.java:2898)
              at hudson.plugins.greenballs.GreenBallFilter.mapImage(GreenBallFilter.java:64)
              at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:47)
              at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)

            {noformat}

            Show
            oleg_nenashev Oleg Nenashev added a comment - Nice one. I have attached the root cause stacktraces. copy-artifact initialization logic performs initialization of BuildSelectors, which seems to trigger the internal migration logic with save(). It blocks the ExtensionList save() triggers jobConfigHistory, which loads users User loading iterates through extensions and needs a lock of Descriptor Extension list Common User loading operation (second stacktrace) locks DescriptorExtensionList and then ExtensionList$Lock => deadlock As a fix, I would expect Jenkins.getDescriptorByType() to also lock DescriptorExtensionList  before going forward. Or maybe not... In order to fix the issue right now, the best way is to prevent creation of users in .jobConfigHistory.PluginUtils.isUserExcluded() . It does not seem to make much sense {noformat} at hudson.ExtensionList.ensureLoaded(ExtensionList.java:302)   - waiting to lock <0x0000000091715c10> (a hudson.DescriptorExtensionList)     at hudson.ExtensionList.iterator(ExtensionList.java:157)   at hudson.model.User.load(User.java:201)   locked <0x00000000917498a0> (a hudson.model.User)   at hudson.model.User.<init>(User.java:155)   at hudson.model.User.getOrCreate(User.java:464)   at hudson.model.User.getById(User.java:535)   at hudson.model.User.get(User.java:519)   at hudson.model.User.current(User.java:503)   at hudson.plugins.jobConfigHistory.PluginUtils.isUserExcluded(PluginUtils.java:136)   at hudson.plugins.jobConfigHistory.JobConfigHistorySaveableListener.onChange(JobConfigHistorySaveableListener.java:52)   at hudson.model.listeners.SaveableListener.fireOnChange(SaveableListener.java:81)   at hudson.model.Descriptor.save(Descriptor.java:872)   locked <0x0000000091749a88> (a hudson.plugins.sonar.SonarPublisher$DescriptorImpl)   at hudson.plugins.sonar.SonarPublisher$DescriptorImpl.<init>(SonarPublisher.java:420)   at hudson.plugins.sonar.SonarPublisher$DescriptorImpl$$FastClassByGuice$$46a6cf4e.newInstance(<generated>)   at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)   at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61)   at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:105)   at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)   at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:267)   at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)   at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1103)   at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)   at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:145)   at hudson.ExtensionFinder$GuiceFinder$FaultTolerantScope$1.get(ExtensionFinder.java:424)   at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)   at com.google.inject.internal.InjectorImpl$2$1.call(InjectorImpl.java:1016)   at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)   at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1012)   at hudson.ExtensionFinder$GuiceFinder._find(ExtensionFinder.java:386)   at hudson.ExtensionFinder$GuiceFinder.find(ExtensionFinder.java:377)   at hudson.ClassicPluginStrategy.findComponents(ClassicPluginStrategy.java:472)   at hudson.ExtensionList.load(ExtensionList.java:365)   at hudson.ExtensionList.ensureLoaded(ExtensionList.java:303)   - locked <0x0000000099f9b518> (a hudson.ExtensionList$Lock)     at hudson.ExtensionList.iterator(ExtensionList.java:157)   at jenkins.model.Jenkins.getDescriptorByType(Jenkins.java:1545)   at hudson.plugins.copyartifact.BuildSelectorParameter.initAliases(BuildSelectorParameter.java:152)   at hudson.plugins.copyartifact.CopyArtifactPlugin.postInitialize(CopyArtifactPlugin.java:35)   at hudson.PluginManager$2$1$2.run(PluginManager.java:543) {noformat}   {noformat} "Handling GET /static/5adaeac1/images/headshot.png from 172.30.0.86 : RequestHandlerThread 11 " #60 daemon prio=5 os_prio=0 tid=0x00007f1024042800 nid=0x33bf waiting for monitor entry [0x00007f101a8d3000]   java.lang.Thread.State: BLOCKED (on object monitor)   at hudson.ExtensionList.ensureLoaded(ExtensionList.java:302)   - waiting to lock <0x0000000099f9b518> (a hudson.ExtensionList$Lock)     at hudson.ExtensionList.getComponents(ExtensionList.java:168)   at hudson.DescriptorExtensionList.load(DescriptorExtensionList.java:191)   at hudson.ExtensionList.ensureLoaded(ExtensionList.java:303)   - locked <0x0000000091715c10> (a hudson.DescriptorExtensionList)     at hudson.ExtensionList.iterator(ExtensionList.java:157)   at hudson.model.User.load(User.java:201)   locked <0x0000000091715d08> (a hudson.model.User)   at hudson.model.User.<init>(User.java:155)   at hudson.model.User.getOrCreate(User.java:464)   at hudson.model.User.getAll(User.java:558)   at hudson.model.User$FullNameIdResolver.resolveCanonicalId(User.java:1053)   at hudson.model.User.get(User.java:405)   at hudson.model.User.get(User.java:374)   at jenkins.model.Jenkins.getUser(Jenkins.java:2898)   at hudson.plugins.greenballs.GreenBallFilter.mapImage(GreenBallFilter.java:64)   at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:47)   at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132) {noformat}
            Hide
            jochenafuerbacher Jochen A. Fürbacher added a comment -

            Dan Alvizu Which version of JobConfigHistory do you use?

            Show
            jochenafuerbacher Jochen A. Fürbacher added a comment - Dan Alvizu Which version of JobConfigHistory do you use?
            Hide
            dalvizu Dan Alvizu added a comment -

            Using version 2.15 of JobConfigHistory

            Show
            dalvizu Dan Alvizu added a comment - Using version 2.15 of JobConfigHistory
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            JENKINS-55361 should fix it in Jenkins 2.163.

            Show
            oleg_nenashev Oleg Nenashev added a comment - JENKINS-55361 should fix it in Jenkins 2.163.

              People

              • Assignee:
                stefanbrausch Stefan Brausch
                Reporter:
                dalvizu Dan Alvizu
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: