JENKINS-24050

All slaves disconnect and no new slaves can connect due to CancelledKeyException in org.jenkinsci.remoting

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core
    • Environment:
      Enterprise Linux 5.x master; Windows and Linux slaves of varying releases. Slaves are added and removed reasonably frequently, in a way similar to the EC2Plugin (although others have reported the issue with snapshot reverting and even with regular slaves).

      Description

      We have an issue where we get a CancelledKeyException, 100% of our slaves disconnect, and no new slaves can connect until a restart happens. The issue seems to happen randomly.

      See: https://issues.jenkins-ci.org/browse/JENKINS-22932?focusedCommentId=205983&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-205983#JENKINS-22932 and later for some more context.

      The full error message in the build is:
      FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
      at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      at hudson.remoting.Request.call(Request.java:174)
      at hudson.remoting.Channel.call(Channel.java:739)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:168)
      at com.sun.proxy.$Proxy83.join(Unknown Source)
      at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:956)
      at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:137)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:772)
      at hudson.model.Build$BuildExecution.build(Build.java:199)
      at hudson.model.Build$BuildExecution.doRun(Build.java:160)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:535)
      at hudson.model.Run.execute(Run.java:1732)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:234)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
      at hudson.remoting.Request.abort(Request.java:299)
      at hudson.remoting.Channel.terminate(Channel.java:802)
      at hudson.remoting.Channel$2.terminate(Channel.java:483)
      at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:72)
      at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:195)
      at org.jenkinsci.remoting.nio.NioChannelHub.abortAll(NioChannelHub.java:618)
      at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:592)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:744)
      Caused by: java.io.IOException: Failed to abort
      ... 9 more
      Caused by: java.nio.channels.CancelledKeyException
      at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
      at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
      at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
      at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:513)
      ... 6 more

            Activity

            kbrowder Kevin Browder created issue -
            kbrowder Kevin Browder added a comment -

            JENKINS-24050 was opened since it's actually a different issue than JENKINS-22932 (which probably should be reclosed)

            kbrowder Kevin Browder made changes -
            Field Original Value New Value
            Link This issue is related to JENKINS-22932 [ JENKINS-22932 ]
            jnoonan33 James Noonan added a comment -

            I was going to raise a second defect, but I think this is similar enough.

            When the problem occurs, the Slaves Console shows 'Connected'. However, the master shows them all disconnected. The only way to recover so far is to restart Jenkins.
            We are running the master on Windows Server 2012, on VMware. We are running about 70 slaves, a mix of OS X 10.9, Win7, and Linux SLED 11 on VMware. There are some other variants. We are running Jenkins 1.563.

            This issue has occurred three times for us. Two cases are independent; one occurred shortly after the first and the JVM was not restarted, so perhaps recovery between the 1st and 2nd time was not complete. We have not identified a trigger cause for this problem.

            The thread count starts to increase linearly once the problem occurs, but we believe that this is a symptom. In the JavaMelody Monitoring Plugin, the thread count reported for the machine appears to differ between two places. The graph showed 4000 (it was running but down for 30 hours). However, the thread count below showed 400. I believe that the first figure may be the JVM's count while the second is Jenkins'. In normal operation, we see about 200 threads. (However, we restarted, so I am not 100% sure that this is correct.)
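To check which figure corresponds to the JVM's own thread count, the count can be read directly from the JVM with the standard java.lang.management API. A minimal sketch follows; the class name ThreadCountProbe is made up for illustration, and the code only says something about a Jenkins master if it runs inside that master's JVM (this standalone version simply reports on itself).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of Jenkins or JavaMelody.
final class ThreadCountProbe {

    /** Current number of live threads (daemon and non-daemon) in this JVM. */
    static int liveThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    /** Peak live thread count since the JVM started or the peak was reset. */
    static int peakThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getPeakThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("live threads: " + liveThreads());
        System.out.println("peak threads: " + peakThreads());
    }
}
```

Comparing these numbers with the two JavaMelody figures while the problem is occurring would show which of the two tracks the JVM-wide count.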

            We see the following messages in the error log. The same exception occurs for each of our slaves within a short period of time.

            Jul 31, 2014 5:13:17 AM jenkins.slaves.JnlpSlaveAgentProtocol$Handler$1 onClosed
            WARNING: NioChannelHub keys=86 gen=1625477529: Computer.threadPoolForRemoting 58 for + XXXXXXXX terminated
            java.io.IOException: Failed to abort
            at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:184)
            at org.jenkinsci.remoting.nio.NioChannelHub.abortAll(NioChannelHub.java:599)
            at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:481)
            at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
            at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
            at java.util.concurrent.FutureTask.run(Unknown Source)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
            at java.lang.Thread.run(Unknown Source)
            Caused by: java.nio.channels.ClosedChannelException
            at sun.nio.ch.SocketChannelImpl.shutdownInput(Unknown Source)
            at sun.nio.ch.SocketAdaptor.shutdownInput(Unknown Source)
            at org.jenkinsci.remoting.nio.Closeables$1.close(Closeables.java:20)
            at org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport.closeR(NioChannelHub.java:289)
            at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport$1.call(NioChannelHub.java:226)
            at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport$1.call(NioChannelHub.java:224)
            at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:474)
            ... 6 more

            In the first case, we also saw ping timeouts occur at about the same time as the problem. These were not present in the other case. In the latest case, a single slave lost network connectivity and we saw this exception before the 'crash' happened. However, I believe this to be a coincidence: from time to time the exception occurs in the logs without all slaves losing connectivity.

            We see other exceptions in the logs. However, these seem to be related to us shutting down idle machines, or the Disk Usage Util plugin, and seem unrelated.

            Last week, we increased the load on our machine from about 40 slaves to 70, and also increased the number of jobs. Before this, we had not seen this problem.

            We are planning to upgrade to take in the (now reopened) fix for 22932.

            kbrowder Kevin Browder added a comment -

            OK, so I think the core issue is that line 513 of org.jenkinsci.remoting.nio.NioChannelHub.java is:
            if (key.isReadable()) {
            whereas I think it should be:
            if (key.isValid() && key.isReadable()) {
            I guess this would fix the issue, assuming that selectedKeys().iterator() is thread safe (I don't really know much about NIO). Actually, it probably makes sense just to add a catch to one of the handlers in the same method (I think the one at http://git.io/VtniaQ).

            Basically, my theory is that isReadable() throws a CancelledKeyException, which is caught by the RuntimeException handler (at http://git.io/l-5MhA). That handler kills the loop and attempts to abort everything, including the selector that is no longer valid (which gives the message in the description).
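The difference between the two checks can be reproduced outside Jenkins. The following is a minimal standalone sketch, not the remoting code itself: it registers a java.nio Pipe with a Selector, cancels the key, and compares the unguarded key.isReadable() call with the guarded key.isValid() && key.isReadable() form. The names CancelledKeyDemo, Result and probe are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical illustration; not taken from NioChannelHub.
final class CancelledKeyDemo {

    /** What happened when a cancelled key was checked both ways. */
    static final class Result {
        final boolean unguardedThrew;
        final boolean guardedValue;

        Result(boolean unguardedThrew, boolean guardedValue) {
            this.unguardedThrew = unguardedThrew;
            this.guardedValue = guardedValue;
        }
    }

    static Result probe() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                SelectionKey key = source.register(selector, SelectionKey.OP_READ);
                key.cancel(); // the key is invalid from this point on

                boolean unguardedThrew;
                try {
                    key.isReadable(); // unguarded form
                    unguardedThrew = false;
                } catch (CancelledKeyException e) {
                    unguardedThrew = true;
                }

                // Guarded form: isValid() short-circuits before isReadable() runs.
                boolean guardedValue = key.isValid() && key.isReadable();
                return new Result(unguardedThrew, guardedValue);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Result r = probe();
        System.out.println("unguarded check threw: " + r.unguardedThrew);
        System.out.println("guarded check value:   " + r.guardedValue);
    }
}
```

Note that the guard only narrows the window: if another thread can invalidate the key between isValid() and isReadable(), the exception is still possible, which is one argument for also catching it.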

            jglick Jesse Glick made changes -
            Assignee Kohsuke Kawaguchi [ kohsuke ]
            Component/s core [ 15593 ]
            Component/s slave-status [ 15981 ]
            kbrowder Kevin Browder made changes -
            Summary All slaves disconnect and no new slaves can connect CancelledKeyException in org.jenkinsci.remoting All slaves disconnect and no new slaves can connect due to CancelledKeyException in org.jenkinsci.remoting
            kbrowder Kevin Browder added a comment - - edited

            @James: I think the ClosedChannelException is actually closer to the JENKINS-22932 bug (if so, you should reopen it; I had reopened it before realizing I had a different root cause, and then closed it again). However, one could argue that the "selector" loop should catch all NIO errors and try again instead of its current behavior of killing the loop entirely, so the fix might end up being the same.

            Additionally I've implemented a patch that implements the key.isValid() check above:
            https://github.com/kbrowder/remoting/commit/d52cef17a789bac0d1478c561c6696a82eb9ab6a
            Additionally I've got another change that captures CancelledKeyExceptions:
            https://github.com/kbrowder/remoting/commit/1dc29075e26c382b593d189a3a04cd1ab859f7c5

            Actually, I think that with some minor modification you could extend this last approach to catch a number of potential pitfalls.

            jglick Jesse Glick added a comment -

            Assuming the purported fix in JENKINS-22932 did in fact correct at least some variants of the bug, it should be left closed; if this issue represents some other variants, then fine—a follow-up fix can close this one, and it can be backported separately if marked lts-candidate.

            jnoonan33 James Noonan added a comment -

            We updated to take in fix 22932 today.

            If the issue reoccurs for us, I'll raise a new defect.

            kbrowder Kevin Browder added a comment -

            I have filed a pull request: https://github.com/jenkinsci/remoting/pull/24
            Some might consider both avoiding the error and catching it a bit paranoid, but I'm not entirely sure about concurrency issues with NIO. Additionally, I still don't have a good test for this. I guess you'd need to cancel the key before calling key.isReadable(), but there's probably a very narrow window there.
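One way to hit that window deterministically in a test is to cancel the key in the same thread, after select() has returned but before the selected keys are iterated. The sketch below does that with a java.nio Pipe standing in for a slave connection. It is an assumption-laden illustration (StaleSelectedKeyDemo and staleKeyThrows are made-up names) and it does not explain how the key ends up cancelled in production.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical test helper; not part of the remoting test suite.
final class StaleSelectedKeyDemo {

    /** Returns true if a key taken from selectedKeys() threw CancelledKeyException. */
    static boolean staleKeyThrows() {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                SelectionKey registered = source.register(selector, SelectionKey.OP_READ);

                sink.write(ByteBuffer.wrap(new byte[] {42})); // make the source readable
                selector.select(5000); // the key is now in the selected-key set

                registered.cancel(); // cancel inside the window, before iteration

                // The cancelled key stays in the selected-key set until the next
                // selection operation, so the loop below still sees it.
                for (SelectionKey key : selector.selectedKeys()) {
                    try {
                        key.isReadable();
                    } catch (CancelledKeyException e) {
                        return true;
                    }
                }
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stale key threw: " + staleKeyThrows());
    }
}
```

According to the SelectionKey Javadoc, a key also becomes invalid when its channel or its selector is closed, so a test could close the channel instead of calling cancel(). Whether that is the path taken in the reported failures is not established here.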

            kbrowder Kevin Browder added a comment -

            If it wasn't clear (re-reading my last message, I guess it wasn't): yes, this represents a different variant of the issue initially presented in JENKINS-22932. Basically, a different exception gets thrown, which I think I've fixed in the pull request above. It's probably possible to refactor the fix for both issues so that other exceptions don't kill the main NIO/select loop in the future, but this specific issue can be fixed without that (it's easier for you guys to review this anyway).

            kbrowder Kevin Browder added a comment -

            To fill everyone in: I hear Kohsuke is on break, so a remoting release won't happen until he comes back (I don't know when that is). Is there any way to hack a new remoting jar into our production Jenkins (without building all of Jenkins)? Is this wise (we're not on the latest Jenkins, so are these things cross-compatible, and will updates continue to work)? Basically we're getting crashes 1-3 times a day, and I'm just wondering if there's anything else anyone would recommend so we could get back to normal faster.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            src/main/java/org/jenkinsci/remoting/nio/NioChannelHub.java
            http://jenkins-ci.org/commit/remoting/1083a97145b83f88d9eee0a920a9495e192cd480
            Log:
            [FIXED JENKINS-24050] don't let canceled keys kill the selector thread

            In looking at the proposed PR #24 (https://github.com/jenkinsci/remoting/pull/24), I feel bit
            uneasy to mask the problem like it does.

            The code in question is looping through selected keys and processing it one by one.

            The only code that calls key.cancel() is done from the selector thread that runs this loop.
            So I don't understand how it is possible that the key picked up from selected key set is
            already cancelled here. I wonder if something more is going on.

            Regardless, I agree that this shouldn't kill the selector thread, which breaks all the slaves
            in one go. This change flags and reports the problem, kill the connection related to that key,
            then continue to serve other connections.
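The approach described in this commit log (report the problem, drop only the connection tied to the bad key, keep serving the others) can be sketched as follows. This is an illustrative standalone program, not the actual NioChannelHub change: ResilientSelectorLoop, PassResult, processSelectedKeys and demo are hypothetical names, and pipes stand in for slave connections.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the NioChannelHub code.
final class ResilientSelectorLoop {

    /** Counters for one pass over the selected-key set. */
    static final class PassResult {
        final int served;
        final int dropped;

        PassResult(int served, int dropped) {
            this.served = served;
            this.dropped = dropped;
        }
    }

    /** Handles each selected key once; a cancelled key costs only its own connection. */
    static PassResult processSelectedKeys(Selector selector) {
        int served = 0;
        int dropped = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            try {
                if (key.isReadable()) {
                    served++; // real code would read from key.channel() here
                }
            } catch (CancelledKeyException e) {
                // Report the problem and close only the affected channel.
                System.err.println("Cancelled key, dropping " + key.channel());
                try {
                    key.channel().close();
                } catch (IOException ignored) {
                    // nothing more to do for this connection
                }
                dropped++;
            }
        }
        return new PassResult(served, dropped);
    }

    /** Two pipes stand in for two connections; one key is cancelled before the pass. */
    static PassResult demo() {
        try {
            Pipe healthy = Pipe.open();
            Pipe broken = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SourceChannel healthySource = healthy.source();
                 Pipe.SinkChannel healthySink = healthy.sink();
                 Pipe.SourceChannel brokenSource = broken.source();
                 Pipe.SinkChannel brokenSink = broken.sink()) {
                healthySource.configureBlocking(false);
                brokenSource.configureBlocking(false);
                healthySource.register(selector, SelectionKey.OP_READ);
                SelectionKey brokenKey = brokenSource.register(selector, SelectionKey.OP_READ);
                healthySink.write(ByteBuffer.wrap(new byte[] {1}));
                brokenSink.write(ByteBuffer.wrap(new byte[] {1}));

                // Wait (up to 5 s) until both keys are in the selected-key set.
                long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
                while (selector.selectedKeys().size() < 2 && System.nanoTime() < deadline) {
                    selector.select(100);
                }
                brokenKey.cancel(); // simulate the key going bad before it is inspected
                return processSelectedKeys(selector);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PassResult r = demo();
        System.out.println("served=" + r.served + " dropped=" + r.dropped);
    }
}
```

The design point is where the try/catch sits: inside the per-key loop rather than around it, so an exception from one key cannot end the pass for the remaining keys.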

            scm_issue_link SCM/JIRA link daemon made changes -
            Status Open [ 1 ] Resolved [ 5 ]
            Resolution Fixed [ 1 ]
            growflet Patricia Wright added a comment -

            After this change, the slaves no longer disconnect. Instead, the underlying issue causes the slaves to just stop doing whatever they were doing. Jobs running on those slaves hang forever and cannot be cancelled. The Jenkins server starts to spam this in the logs until the filesystem fills up:

            Sep 11, 2014 10:38:13 AM org.kohsuke.stapler.export.Property writeValue
            WARNING: null
            org.kohsuke.stapler.export.NotExportableException: class hudson.plugins.parameterizedtrigger.CapturedEnvironmentAction doesn't have @ExportedBean so cannot write hudson.model.Actionable.actions
            at org.kohsuke.stapler.export.Model.<init>(Model.java:73)
            at org.kohsuke.stapler.export.ModelBuilder.get(ModelBuilder.java:51)
            at org.kohsuke.stapler.export.Property.writeValue(Property.java:231)
            at org.kohsuke.stapler.export.Property.writeValue(Property.java:187)
            at org.kohsuke.stapler.export.Property.writeValue(Property.java:139)
            at org.kohsuke.stapler.export.Property.writeTo(Property.java:116)
            at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:190)
            at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
            at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
            at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
            at org.kohsuke.stapler.export.Model.writeNestedObjectTo(Model.java:185)
            at org.kohsuke.stapler.export.Model.writeTo(Model.java:157)
            at org.kohsuke.stapler.ResponseImpl.serveExposedBean(ResponseImpl.java:267)
            at hudson.model.Api.doPython(Api.java:216)
            at sun.reflect.GeneratedMethodAccessor387.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:622)
            at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:298)
            at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:161)
            at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:96)
            at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:120)
            at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
            at org.kohsuke.stapler.MetaClass$4.doDispatch(MetaClass.java:210)
            at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
            at org.kohsuke.stapler.MetaClass$12.dispatch(MetaClass.java:390)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
            at org.kohsuke.stapler.MetaClass$6.doDispatch(MetaClass.java:248)
            at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:53)
            at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:728)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:858)
            at org.kohsuke.stapler.Stapler.invoke(Stapler.java:631)
            at org.kohsuke.stapler.Stapler.service(Stapler.java:225)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
            at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:686)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1494)
            at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:96)
            at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:58)
            at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:99)
            at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:88)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:48)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
            at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:135)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:86)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
            at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
            at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
            at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
            at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
            at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
            at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
            at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
            at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
            at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
            at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
            at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
            at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
            at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
            at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
            at org.eclipse.jetty.server.Server.handle(Server.java:370)
            at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
            at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
            at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
            at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
            at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
            at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
            at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
            at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
            at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:701)

            danielbeck Daniel Beck added a comment -

            Patricia Wright: The stack traces and log size are a completely unrelated issue (note that this is about HTTP and the Python API); see JENKINS-24458 and the issues linked from there.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            changelog.html
            pom.xml
            http://jenkins-ci.org/commit/jenkins/8fc609fe0952b285d5b26a59fd5ff4c29704d33d
            Log:
            [JENKINS-23471 JENKINS-24050]

            Integrated the fix in remoting to Jenkins 1.580.

            dogfood dogfood added a comment -

            Integrated in jenkins_main_trunk #3685
            [JENKINS-23471 JENKINS-24050] (Revision 8fc609fe0952b285d5b26a59fd5ff4c29704d33d)

            Result = SUCCESS
            kohsuke : 8fc609fe0952b285d5b26a59fd5ff4c29704d33d
            Files :

            • changelog.html
            • pom.xml
            stephenconnolly Stephen Connolly made changes -
            Labels: CancelledKeyException JNLP remoting slaves → CancelledKeyException JNLP lts-candidate remoting slaves
            olivergondza Oliver Gondža made changes -
            Labels: CancelledKeyException JNLP lts-candidate remoting slaves → 1.580.1-fixed CancelledKeyException JNLP remoting slaves
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            changelog.html
            pom.xml
            http://jenkins-ci.org/commit/jenkins/9c82fc42eb08b89047c544aaa586291ad1485472
            Log:
            [JENKINS-23471 JENKINS-24050]

            Integrated the fix in remoting to Jenkins 1.580.

            (cherry picked from commit 8fc609fe0952b285d5b26a59fd5ff4c29704d33d)

            dogfood dogfood added a comment -

            Integrated in jenkins_main_trunk #4292
            [JENKINS-23471 JENKINS-24050] (Revision 9c82fc42eb08b89047c544aaa586291ad1485472)

            Result = UNSTABLE
            ogondza : 9c82fc42eb08b89047c544aaa586291ad1485472
            Files :

            • pom.xml
            • changelog.html
            rtyler R. Tyler Croy made changes -
            Workflow: JNJira [ 156921 ] → JNJira + In-Review [ 195547 ]
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            changelog.html
            pom.xml
            http://jenkins-ci.org/commit/jenkins/91c5551d4c7682d4adba28fe591fa7772eee62e0
            Log:
            [JENKINS-23471 JENKINS-24050]

            Integrated the fix in remoting to Jenkins 1.580.

            (cherry picked from commit 8fc609fe0952b285d5b26a59fd5ff4c29704d33d)


              People

              • Assignee:
                kohsuke Kohsuke Kawaguchi
                Reporter:
                kbrowder Kevin Browder
              • Votes:
                5
                Watchers:
                13

                Dates

                • Created:
                  Updated:
                  Resolved: