  1. Jenkins
  2. JENKINS-47430

SandboxResolvingClassLoader use of Guava cache can cause classloading bottleneck/deadlock


    Details

    • Released As:
      script-security 1.61

      Description

      Noted the following while investigating a system that was burning a lot of CPU running pipelines. Native thread IDs showing high CPU use in top were traced to Java threads whose stack traces were inside SandboxResolvingClassLoader. The system also exhibited very high classloading/parsing times for some pipelines.

      java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x000000075b9264f8> (a com.google.common.util.concurrent.AbstractFuture$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
        at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:132)
        at com.google.common.cache.LocalCache$LoadingValueReference.waitForValue(LocalCache.java:3586)
        at com.google.common.cache.LocalCache$Segment.waitForLoadingValue(LocalCache.java:2333)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2222)
        at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
        at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
        at com.google.common.cache.LocalCache$LocalManualCache.getUnchecked(LocalCache.java:4834)
        at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxResolvingClassLoader.loadClass(SandboxResolvingClassLoader.java:51)
        - locked <0x000000069c03be78> (a org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxResolvingClassLoader)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
        - locked <0x000000069fc00b48> (a org.jenkinsci.plugins.workflow.cps.CpsGroovyShell$TimingLoader)

      Native thread 16365 (= 0x3FED): 44% CPU, fetching from JAR with the sandbox resolving classloader
      SandboxResolvingClassLoader$2.compute(SandboxResolvingClassLoader.java:39)
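
      The decimal-to-hex step used above to match a native thread ID from top against a thread dump can be sketched as follows. This is only an illustration; the class and method names are hypothetical, and it assumes the thread dump prints the native ID in hex (as Java 8 HotSpot dumps do in their nid field).

```java
// Hypothetical helper, not part of Jenkins or the JDK tooling.
final class NativeThreadIds {
    private NativeThreadIds() {}

    /** Converts a decimal native thread ID (as shown by top) to a hex string such as "0x3fed". */
    static String toNid(long decimalTid) {
        return "0x" + Long.toHexString(decimalTid);
    }

    /** Converts a hex ID such as "0x3FED" back to decimal; Long.decode understands the 0x prefix. */
    static long fromNid(String nid) {
        return Long.decode(nid);
    }

    public static void main(String[] args) {
        System.out.println(toNid(16365));      // prints 0x3fed
        System.out.println(fromNid("0x3FED")); // prints 16365
    }
}
```

      Searching the thread dump for the resulting hex value then identifies the Java thread that corresponds to the busy native thread.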

      This is using a Guava LoadingCache rather than the much faster Caffeine cache, which can be a drop-in replacement.
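
      To illustrate the kind of class-lookup caching under discussion without depending on either library, here is a minimal sketch built only on java.util.concurrent. It is not the plugin's code and not the Guava or Caffeine API: the class name, the loads counter, and the choice to let racing threads load the same name twice instead of parking behind one loader are all assumptions made for the example. The cache is unbounded and holds strong references, which a real fix would need to reconsider.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: memoizes class lookups (hits and misses) per class name.
final class MemoizingClassResolver {
    private final ClassLoader delegate;
    private final ConcurrentMap<String, Optional<Class<?>>> cache = new ConcurrentHashMap<>();
    // Counts real lookups, only so the caching effect can be observed.
    final AtomicInteger loads = new AtomicInteger();

    MemoizingClassResolver(ClassLoader delegate) {
        this.delegate = delegate;
    }

    /** Returns the cached result if present; otherwise loads without holding any cache lock. */
    Optional<Class<?>> resolve(String name) {
        Optional<Class<?>> cached = cache.get(name);
        if (cached != null) {
            return cached;
        }
        Optional<Class<?>> loaded = lookup(name);
        // If another thread finished first, keep its result; nobody waits on another loader.
        Optional<Class<?>> raced = cache.putIfAbsent(name, loaded);
        return raced != null ? raced : loaded;
    }

    private Optional<Class<?>> lookup(String name) {
        loads.incrementAndGet();
        try {
            return Optional.<Class<?>>of(Class.forName(name, false, delegate));
        } catch (ClassNotFoundException e) {
            return Optional.empty(); // cache the miss too, so repeated failures stay cheap
        }
    }

    public static void main(String[] args) {
        MemoizingClassResolver r = new MemoizingClassResolver(ClassLoader.getSystemClassLoader());
        System.out.println(r.resolve("java.lang.String").isPresent());
        System.out.println(r.resolve("no.such.Missing").isPresent());
    }
}
```

      The trade-off in this sketch is possible duplicate work under contention in exchange for never blocking one thread on another thread's in-flight load, which is the waiting pattern visible in the stack trace above.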
       


          Activity

          dnusbaum Devin Nusbaum added a comment -

          PR is stalled and would need to be updated to resolve merge conflicts. Would probably need some additional testing at that point as well to understand the impact.

          dnusbaum Devin Nusbaum added a comment -

          Noting also that in some cases I have seen evidence of a bug in Guava (not just a performance issue): many threads are waiting to load a value from the cache, but no thread is actually loading one. This is described in this upstream issue.

          My best guess for the cause of the issue in the cases I have seen is that a StackOverflowError thrown by the loading thread was somehow swallowed by Guava. We should investigate to understand if that issue is reproducible and if it is a bug in the Pipeline-Groovy layer or in Guava itself.

          dnusbaum Devin Nusbaum added a comment -

          A fix for this issue was released in version 1.61 of the Script Security Plugin.


            People

            • Assignee:
              Unassigned
              Reporter:
              svanoort Sam Van Oort
            • Votes:
              1
              Watchers:
              3
