Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

    Details

    • Similar Issues:

      Description

      The issue appears on my custom build of the Jenkins core, but seems it could be reproduced on newest versions as well.

      We've experienced a network overloading, which has let to the exception in the PingThread on Jenkins master, which has closed the communication channel. However, the slave stills online and takes jobs, but any remote action fails (see logs above) => All scheduled builds fail with an error

      The issue affects ssh-slaves only:

      • Linux SSH slaves are "online", but all jobs on the fail with the error above
      • Windows services have reconnected automatically...
      • Windows JNLP slaves have reconnected as well

        Attachments

          Issue Links

            Activity

            Hide
            olamy Olivier Lamy added a comment -

            happened with recent version Jenkins ver. 1.475-SNAPSHOT stack trace:

            Started by timer
            Building remotely on ubuntu1 in workspace /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16
            hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1
            	at hudson.FilePath.act(FilePath.java:838)
            	at hudson.FilePath.act(FilePath.java:824)
            	at hudson.FilePath.mkdirs(FilePath.java:890)
            	at hudson.model.AbstractProject.checkout(AbstractProject.java:1243)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
            	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
            	at hudson.model.Run.execute(Run.java:1488)
            	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            	at hudson.model.ResourceController.execute(ResourceController.java:88)
            	at hudson.model.Executor.run(Executor.java:236)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            	at hudson.remoting.Channel.send(Channel.java:492)
            	at hudson.remoting.Request.call(Request.java:129)
            	at hudson.remoting.Channel.call(Channel.java:663)
            	at hudson.FilePath.act(FilePath.java:831)
            	... 10 more
            Caused by: java.io.IOException
            	at hudson.remoting.Channel.close(Channel.java:895)
            	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            	at hudson.remoting.PingThread.ping(PingThread.java:114)
            	at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
            	... 2 more
            Caused by: java.util.concurrent.TimeoutException
            	at hudson.remoting.Request$1.get(Request.java:275)
            	at hudson.remoting.Request$1.get(Request.java:210)
            	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
            	at hudson.remoting.PingThread.ping(PingThread.java:107)
            	... 1 more
            Retrying after 10 seconds
            hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1
            	at hudson.FilePath.act(FilePath.java:838)
            	at hudson.FilePath.act(FilePath.java:824)
            	at hudson.FilePath.mkdirs(FilePath.java:890)
            	at hudson.model.AbstractProject.checkout(AbstractProject.java:1243)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
            	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
            	at hudson.model.Run.execute(Run.java:1488)
            	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            	at hudson.model.ResourceController.execute(ResourceController.java:88)
            	at hudson.model.Executor.run(Executor.java:236)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            	at hudson.remoting.Channel.send(Channel.java:492)
            	at hudson.remoting.Request.call(Request.java:129)
            	at hudson.remoting.Channel.call(Channel.java:663)
            	at hudson.FilePath.act(FilePath.java:831)
            	... 10 more
            Caused by: java.io.IOException
            	at hudson.remoting.Channel.close(Channel.java:895)
            	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            	at hudson.remoting.PingThread.ping(PingThread.java:114)
            	at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
            	... 2 more
            Caused by: java.util.concurrent.TimeoutException
            	at hudson.remoting.Request$1.get(Request.java:275)
            	at hudson.remoting.Request$1.get(Request.java:210)
            	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
            	at hudson.remoting.PingThread.ping(PingThread.java:107)
            	... 1 more
            Retrying after 10 seconds
            hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1
            	at hudson.FilePath.act(FilePath.java:838)
            	at hudson.FilePath.act(FilePath.java:824)
            	at hudson.FilePath.mkdirs(FilePath.java:890)
            	at hudson.model.AbstractProject.checkout(AbstractProject.java:1243)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
            	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
            	at hudson.model.Run.execute(Run.java:1488)
            	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            	at hudson.model.ResourceController.execute(ResourceController.java:88)
            	at hudson.model.Executor.run(Executor.java:236)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            	at hudson.remoting.Channel.send(Channel.java:492)
            	at hudson.remoting.Request.call(Request.java:129)
            	at hudson.remoting.Channel.call(Channel.java:663)
            	at hudson.FilePath.act(FilePath.java:831)
            	... 10 more
            Caused by: java.io.IOException
            	at hudson.remoting.Channel.close(Channel.java:895)
            	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            	at hudson.remoting.PingThread.ping(PingThread.java:114)
            	at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
            	... 2 more
            Caused by: java.util.concurrent.TimeoutException
            	at hudson.remoting.Request$1.get(Request.java:275)
            	at hudson.remoting.Request$1.get(Request.java:210)
            	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
            	at hudson.remoting.PingThread.ping(PingThread.java:107)
            	... 1 more
            Recording test results
            ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to exception
            hudson.remoting.ChannelClosedException: channel is already closed
            	at hudson.remoting.Channel.send(Channel.java:492)
            	at hudson.remoting.Request.call(Request.java:129)
            	at hudson.remoting.Channel.call(Channel.java:663)
            	at hudson.EnvVars.getRemote(EnvVars.java:202)
            	at hudson.model.Computer.getEnvironment(Computer.java:843)
            	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
            	at hudson.model.Run.getEnvironment(Run.java:1938)
            	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:843)
            	at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:131)
            	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:717)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:692)
            	at hudson.model.Build$BuildExecution.post2(Build.java:183)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:639)
            	at hudson.model.Run.execute(Run.java:1513)
            	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            	at hudson.model.ResourceController.execute(ResourceController.java:88)
            	at hudson.model.Executor.run(Executor.java:236)
            Caused by: java.io.IOException
            	at hudson.remoting.Channel.close(Channel.java:895)
            	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            	at hudson.remoting.PingThread.ping(PingThread.java:114)
            	at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
            	... 2 more
            Caused by: java.util.concurrent.TimeoutException
            	at hudson.remoting.Request$1.get(Request.java:275)
            	at hudson.remoting.Request$1.get(Request.java:210)
            	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
            	at hudson.remoting.PingThread.ping(PingThread.java:107)
            	... 1 more
            Finished: FAILURE
            

            I don't understand this:

            Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
            
            Show
            olamy Olivier Lamy added a comment - happened with recent version Jenkins ver. 1.475-SNAPSHOT stack trace: Started by timer Building remotely on ubuntu1 in workspace /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1 at hudson.FilePath.act(FilePath.java:838) at hudson.FilePath.act(FilePath.java:824) at hudson.FilePath.mkdirs(FilePath.java:890) at hudson.model.AbstractProject.checkout(AbstractProject.java:1243) at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589) at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494) at hudson.model.Run.execute(Run.java:1488) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:236) Caused by: hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:492) at hudson.remoting.Request.call(Request.java:129) at hudson.remoting.Channel.call(Channel.java:663) at hudson.FilePath.act(FilePath.java:831) ... 10 more Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:895) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:114) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424 ... 2 more Caused by: java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:275) at hudson.remoting.Request$1.get(Request.java:210) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.remoting.PingThread.ping(PingThread.java:107) ... 1 more Retrying after 10 seconds hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1 at hudson.FilePath.act(FilePath.java:838) at hudson.FilePath.act(FilePath.java:824) at hudson.FilePath.mkdirs(FilePath.java:890) at hudson.model.AbstractProject.checkout(AbstractProject.java:1243) at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589) at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494) at hudson.model.Run.execute(Run.java:1488) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:236) Caused by: hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:492) at hudson.remoting.Request.call(Request.java:129) at hudson.remoting.Channel.call(Channel.java:663) at hudson.FilePath.act(FilePath.java:831) ... 10 more Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:895) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:114) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424 ... 2 more Caused by: java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:275) at hudson.remoting.Request$1.get(Request.java:210) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.remoting.PingThread.ping(PingThread.java:107) ... 1 more Retrying after 10 seconds hudson.util.IOException2: remote file operation failed: /home/jenkins/jenkins-slave/workspace/Qpid-Java-Java-Test-0.16 at hudson.remoting.Channel@22397ea8:ubuntu1 at hudson.FilePath.act(FilePath.java:838) at hudson.FilePath.act(FilePath.java:824) at hudson.FilePath.mkdirs(FilePath.java:890) at hudson.model.AbstractProject.checkout(AbstractProject.java:1243) at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589) at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494) at hudson.model.Run.execute(Run.java:1488) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:236) Caused by: hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:492) at hudson.remoting.Request.call(Request.java:129) at hudson.remoting.Channel.call(Channel.java:663) at hudson.FilePath.act(FilePath.java:831) ... 10 more Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:895) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:114) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424 ... 2 more Caused by: java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:275) at hudson.remoting.Request$1.get(Request.java:210) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.remoting.PingThread.ping(PingThread.java:107) ... 1 more Recording test results ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to exception hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:492) at hudson.remoting.Request.call(Request.java:129) at hudson.remoting.Channel.call(Channel.java:663) at hudson.EnvVars.getRemote(EnvVars.java:202) at hudson.model.Computer.getEnvironment(Computer.java:843) at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28) at hudson.model.Run.getEnvironment(Run.java:1938) at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:843) at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:131) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:717) at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:692) at hudson.model.Build$BuildExecution.post2(Build.java:183) at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:639) at hudson.model.Run.execute(Run.java:1513) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:236) Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:895) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:114) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424 ... 2 more Caused by: java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:275) at hudson.remoting.Request$1.get(Request.java:210) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.remoting.PingThread.ping(PingThread.java:107) ... 1 more Finished: FAILURE I don't understand this: Caused by: java.util.concurrent.TimeoutException: Ping started on 1341557279424 hasn't completed at 1341557519424
            Hide
            oldelvet Richard Mortimer added a comment -

            The message looks ok to me. There is 360 seconds difference between the two timestamps which would be consistent with a timeout.

            However I have a feeling that this is due to some additional changes that Koshuke made when he merged my remoting pull-3. See

            https://github.com/jenkinsci/remoting/commit/aceaf5c038e8179723f9a83be6f994f37e7cfa46

            Previously the TimeoutException (ping failed) message would only have been observed if a TimeoutException was received. Now the message will be logged when the loop ends. Maybe there is a real timeout/delay here that was previously hidden.

            One other note. The while(remaining>0) gets evaluated after a timeout exception but remaining will only get recalculated on the next loop so there will be an extra pass around the loop.

            Show
            oldelvet Richard Mortimer added a comment - The message looks ok to me. There is 360 seconds difference between the two timestamps which would be consistent with a timeout. However I have a feeling that this is due to some additional changes that Koshuke made when he merged my remoting pull-3. See https://github.com/jenkinsci/remoting/commit/aceaf5c038e8179723f9a83be6f994f37e7cfa46 Previously the TimeoutException (ping failed) message would only have been observed if a TimeoutException was received. Now the message will be logged when the loop ends. Maybe there is a real timeout/delay here that was previously hidden. One other note. The while(remaining>0) gets evaluated after a timeout exception but remaining will only get recalculated on the next loop so there will be an extra pass around the loop.
            Hide
            oldelvet Richard Mortimer added a comment -

            Ah I take some of that back. I didn't check the main jenkins build closely enough. Jenkins is using remoting 1.16 and this does not include the latest changes made by Kohsuke/me.

            The PingThread logic in 1.475-SNAPSHOT is unchanged since Sept/2011

            Show
            oldelvet Richard Mortimer added a comment - Ah I take some of that back. I didn't check the main jenkins build closely enough. Jenkins is using remoting 1.16 and this does not include the latest changes made by Kohsuke/me. The PingThread logic in 1.475-SNAPSHOT is unchanged since Sept/2011
            Hide
            oldelvet Richard Mortimer added a comment -

            Similar symptoms at JENKINS-14307

            Show
            oldelvet Richard Mortimer added a comment - Similar symptoms at JENKINS-14307
            Hide
            alikulin Alexander Likulin added a comment - - edited

            The same problems happens on 1.477

            hudson.util.IOException2: remote file operation failed: /home/slave/.jenkins/slave/workspace/4.0-SNAPSHOT-POSTGRESQL2 at hudson.remoting.Channel@7e053ab4:slave
            at hudson.FilePath.act(FilePath.java:838)
            at hudson.FilePath.act(FilePath.java:824)
            at hudson.FilePath.mkdirs(FilePath.java:890)
            at hudson.model.AbstractProject.checkout(AbstractProject.java:1254)
            at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589)
            at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
            at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494)
            at hudson.model.Run.execute(Run.java:1502)
            at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            at hudson.model.ResourceController.execute(ResourceController.java:88)
            at hudson.model.Executor.run(Executor.java:236)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            at hudson.remoting.Channel.send(Channel.java:492)
            at hudson.remoting.Request.call(Request.java:129)
            at hudson.remoting.Channel.call(Channel.java:663)
            at hudson.FilePath.act(FilePath.java:831)
            ... 10 more
            Caused by: java.io.IOException
            at hudson.remoting.Channel.close(Channel.java:895)
            at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            at hudson.remoting.PingThread.ping(PingThread.java:114)
            at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1346241264458 hasn't completed at 1346241504458
            ... 2 more
            Caused by: java.util.concurrent.TimeoutException
            at hudson.remoting.Request$1.get(Request.java:275)
            at hudson.remoting.Request$1.get(Request.java:210)
            at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
            at hudson.remoting.PingThread.ping(PingThread.java:107)
            ... 1 more

            Show
            alikulin Alexander Likulin added a comment - - edited The same problems happens on 1.477 hudson.util.IOException2: remote file operation failed: /home/slave/.jenkins/slave/workspace/4.0-SNAPSHOT-POSTGRESQL2 at hudson.remoting.Channel@7e053ab4:slave at hudson.FilePath.act(FilePath.java:838) at hudson.FilePath.act(FilePath.java:824) at hudson.FilePath.mkdirs(FilePath.java:890) at hudson.model.AbstractProject.checkout(AbstractProject.java:1254) at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:589) at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:494) at hudson.model.Run.execute(Run.java:1502) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:236) Caused by: hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:492) at hudson.remoting.Request.call(Request.java:129) at hudson.remoting.Channel.call(Channel.java:663) at hudson.FilePath.act(FilePath.java:831) ... 10 more Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:895) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:114) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1346241264458 hasn't completed at 1346241504458 ... 2 more Caused by: java.util.concurrent.TimeoutException at hudson.remoting.Request$1.get(Request.java:275) at hudson.remoting.Request$1.get(Request.java:210) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.remoting.PingThread.ping(PingThread.java:107) ... 1 more
            Hide
            jcarsique Julien Carsique added a comment - - edited

            Maybe not a Jenkins issue.

            Having the same issue, reproduced at every build for a given slave (but with nothing relevant in its logs), I tried to disconnect and reconnect the slave:

            [05/06/13 14:06:04] Launching slave agent
            $ ssh slavedns java -jar ~/bin/slave.jar
            <===[JENKINS REMOTING CAPACITY]===<===[JENKINS REMOTING CAPACITY]===>>channel started
            channel started
            Slave.jar version: 2.22
            This is a Unix slave
            Slave.jar version: 2.22
            This is a Unix slave
            Copied maven-agent.jar
            Copied maven3-agent.jar
            Copied maven3-interceptor.jar
            Copied maven-agent.jar
            Copied maven-interceptor.jar
            Copied maven2.1-interceptor.jar
            Copied plexus-classworld.jar
            Copied maven3-agent.jar
            Copied maven3-interceptor.jar
            Copied classworlds.jar
            Copied maven-interceptor.jar
            Copied maven2.1-interceptor.jar
            Copied plexus-classworld.jar
            Copied classworlds.jar
            Evacuated stdout
            Evacuated stdout
            ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
            (...)java.lang.IllegalStateException: Already connected
            	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:459)
            	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
            	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
            	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
            	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
            	at java.lang.Thread.run(Thread.java:662)
            Connection terminated
            channel stopped
            ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
            (...)java.lang.NullPointerException
            	at org.jenkinsci.modules.slave_installer.impl.ComputerListenerImpl.onOnline(ComputerListenerImpl.java:32)
            	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:471)
            	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
            	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
            	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
            	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
            	at java.lang.Thread.run(Thread.java:662)
            channel stopped
            Connection terminated

            Then the slaved successfully reconnected itself.

            It appeared there was looping thread consuming 100% CPU. Killing the process solved the issue.

            Strangely, some system and Java commands were not working (ps, cat, less, jstack, trace, ...) until it has been killed, whereas other commands worked (top, jps, renice, kill, ...). That could explain the weird Jenkins 4s timeout log (java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682): the system was partially frozen and with really unstable response times.

            Note the looping thread came from a previous job which badly stopped on timeout:

            03:00:16.868 Build timed out (after 180 minutes). Marking the build as aborted.
            03:00:16.873 Build was aborted
            03:00:16.874 Archiving artifacts
            03:00:16.874 ERROR: Failed to archive artifacts: **/log/*.log, tomcat*/nxserver/config/distribution.properties
            03:00:16.875 hudson.remoting.ChannelClosedException: channel is already closed
            03:00:16.876 	at hudson.remoting.Channel.send(Channel.java:494)
            03:00:16.876 	at hudson.remoting.Request.call(Request.java:129)
            03:00:16.876 	at hudson.remoting.Channel.call(Channel.java:672)
            03:00:16.876 	at hudson.EnvVars.getRemote(EnvVars.java:212)
            03:00:16.876 	at hudson.model.Computer.getEnvironment(Computer.java:882)
            03:00:16.876 	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
            03:00:16.876 	at hudson.model.Run.getEnvironment(Run.java:2028)
            03:00:16.876 	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:927)
            03:00:16.876 	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:115)
            03:00:16.876 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
            03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:798)
            03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:770)
            03:00:16.876 	at hudson.model.Build$BuildExecution.post2(Build.java:183)
            03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:720)
            03:00:16.876 	at hudson.model.Run.execute(Run.java:1600)
            03:00:16.876 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            03:00:16.876 	at hudson.model.ResourceController.execute(ResourceController.java:88)
            03:00:16.876 	at hudson.model.Executor.run(Executor.java:237)
            03:00:16.876 Caused by: java.io.IOException
            03:00:16.876 	at hudson.remoting.Channel.close(Channel.java:910)
            03:00:16.876 	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            03:00:16.876 	at hudson.remoting.PingThread.ping(PingThread.java:120)
            03:00:16.876 	at hudson.remoting.PingThread.run(PingThread.java:81)
            03:00:16.876 Caused by: java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682
            03:00:16.876 	... 2 more
            Show
            jcarsique Julien Carsique added a comment - - edited Maybe not a Jenkins issue. Having the same issue, reproduced at every build for a given slave (but with nothing relevant in its logs), I tried to disconnect and reconnect the slave: [05/06/13 14:06:04] Launching slave agent $ ssh slavedns java -jar ~/bin/slave.jar <===[JENKINS REMOTING CAPACITY]===<===[JENKINS REMOTING CAPACITY]===>>channel started channel started Slave.jar version: 2.22 This is a Unix slave Slave.jar version: 2.22 This is a Unix slave Copied maven-agent.jar Copied maven3-agent.jar Copied maven3-interceptor.jar Copied maven-agent.jar Copied maven-interceptor.jar Copied maven2.1-interceptor.jar Copied plexus-classworld.jar Copied maven3-agent.jar Copied maven3-interceptor.jar Copied classworlds.jar Copied maven-interceptor.jar Copied maven2.1-interceptor.jar Copied plexus-classworld.jar Copied classworlds.jar Evacuated stdout Evacuated stdout ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins (...)java.lang.IllegalStateException: Already connected at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:459) at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339) at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122) at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang. Thread .run( Thread .java:662) Connection terminated channel stopped ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins (...)java.lang.NullPointerException at org.jenkinsci.modules.slave_installer.impl.ComputerListenerImpl.onOnline(ComputerListenerImpl.java:32) at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:471) at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339) at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122) at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang. Thread .run( Thread .java:662) channel stopped Connection terminated Then the slaved successfully reconnected itself. It appeared there was looping thread consuming 100% CPU. Killing the process solved the issue. Strangely, some system and Java commands were not working (ps, cat, less, jstack, trace, ...) until it has been killed, whereas other commands worked (top, jps, renice, kill, ...). That could explain the weird Jenkins 4s timeout log ( java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682 ): the system was partially frozen and with really unstable response times. Note the looping thread came from a previous job which badly stopped on timeout: 03:00:16.868 Build timed out (after 180 minutes). Marking the build as aborted. 03:00:16.873 Build was aborted 03:00:16.874 Archiving artifacts 03:00:16.874 ERROR: Failed to archive artifacts: **/log /*.log, tomcat*/ nxserver/config/distribution.properties 03:00:16.875 hudson.remoting.ChannelClosedException: channel is already closed 03:00:16.876 at hudson.remoting.Channel.send(Channel.java:494) 03:00:16.876 at hudson.remoting.Request.call(Request.java:129) 03:00:16.876 at hudson.remoting.Channel.call(Channel.java:672) 03:00:16.876 at hudson.EnvVars.getRemote(EnvVars.java:212) 03:00:16.876 at hudson.model.Computer.getEnvironment(Computer.java:882) 03:00:16.876 at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28) 03:00:16.876 at hudson.model.Run.getEnvironment(Run.java:2028) 03:00:16.876 at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:927) 03:00:16.876 at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:115) 03:00:16.876 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19) 03:00:16.876 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:798) 03:00:16.876 at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:770) 03:00:16.876 at hudson.model.Build$BuildExecution.post2(Build.java:183) 03:00:16.876 at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:720) 03:00:16.876 at hudson.model.Run.execute(Run.java:1600) 03:00:16.876 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) 03:00:16.876 at hudson.model.ResourceController.execute(ResourceController.java:88) 03:00:16.876 at hudson.model.Executor.run(Executor.java:237) 03:00:16.876 Caused by: java.io.IOException 03:00:16.876 at hudson.remoting.Channel.close(Channel.java:910) 03:00:16.876 at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) 03:00:16.876 at hudson.remoting.PingThread.ping(PingThread.java:120) 03:00:16.876 at hudson.remoting.PingThread.run(PingThread.java:81) 03:00:16.876 Caused by: java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682 03:00:16.876 ... 2 more
            Hide
            oleg_nenashev Oleg Nenashev added a comment - - edited

            Ping Monitoring timeout should be configurable at least...
            Issue is being reproduced in 1.509.2

            Show
            oleg_nenashev Oleg Nenashev added a comment - - edited Ping Monitoring timeout should be configurable at least... Issue is being reproduced in 1.509.2
            Hide
            nickolayr Nickolay Rumyantsev added a comment -

            Having the similar issue. Slave is reconnected during the archiving of artifacts, when Jenkins is under load.
            Jenkins 1.543.

            Show
            nickolayr Nickolay Rumyantsev added a comment - Having the similar issue. Slave is reconnected during the archiving of artifacts, when Jenkins is under load. Jenkins 1.543.
            Hide
            kleini Marcus Klein added a comment -

            My jobs are randomly failing with version 1.550 with the following exception:

            FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
            hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
            at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
            at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
            at hudson.remoting.Request.call(Request.java:174)
            at hudson.remoting.Channel.call(Channel.java:722)
            at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
            at com.sun.proxy.$Proxy47.join(Unknown Source)
            at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
            at hudson.Launcher$ProcStarter.join(Launcher.java:360)
            at hudson.tasks.Ant.perform(Ant.java:217)
            at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
            at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:785)
            at hudson.model.Build$BuildExecution.build(Build.java:199)
            at hudson.model.Build$BuildExecution.doRun(Build.java:160)
            at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:566)
            at hudson.model.Run.execute(Run.java:1678)
            at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
            at hudson.model.ResourceController.execute(ResourceController.java:88)
            at hudson.model.Executor.run(Executor.java:231)
            Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
            at hudson.remoting.Request.abort(Request.java:299)
            at hudson.remoting.Channel.terminate(Channel.java:782)
            at hudson.remoting.Channel$CloseCommand.execute(Channel.java:927)
            at hudson.remoting.Channel$2.handle(Channel.java:461)
            at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
            Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
            ... 3 more
            Caused by: Command close created at
            at hudson.remoting.Command.<init>(Command.java:56)
            at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:921)
            at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:919)
            at hudson.remoting.Channel.close(Channel.java:1002)
            at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            at hudson.remoting.PingThread.ping(PingThread.java:120)
            at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
            ... 2 more

            I rolled back to version 1.546 because I think it did not happen with that version.

            Show
            kleini Marcus Klein added a comment - My jobs are randomly failing with version 1.550 with the following exception: FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137 at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41) at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34) at hudson.remoting.Request.call(Request.java:174) at hudson.remoting.Channel.call(Channel.java:722) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167) at com.sun.proxy.$Proxy47.join(Unknown Source) at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925) at hudson.Launcher$ProcStarter.join(Launcher.java:360) at hudson.tasks.Ant.perform(Ant.java:217) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:785) at hudson.model.Build$BuildExecution.build(Build.java:199) at hudson.model.Build$BuildExecution.doRun(Build.java:160) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:566) at hudson.model.Run.execute(Run.java:1678) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:231) Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137 at hudson.remoting.Request.abort(Request.java:299) at hudson.remoting.Channel.terminate(Channel.java:782) at hudson.remoting.Channel$CloseCommand.execute(Channel.java:927) at hudson.remoting.Channel$2.handle(Channel.java:461) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60) Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137 ... 3 more Caused by: Command close created at at hudson.remoting.Command.<init>(Command.java:56) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:921) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:919) at hudson.remoting.Channel.close(Channel.java:1002) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:120) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137 ... 2 more I rolled back to version 1.546 because I think it did not happen with that version.
            Hide
            oleg_nenashev Oleg Nenashev added a comment - - edited

            The issue appears on remoting-2.36
            Nodes still alive after the issue

            Show
            oleg_nenashev Oleg Nenashev added a comment - - edited The issue appears on remoting-2.36 Nodes still alive after the issue
            Hide
            oleg_nenashev Oleg Nenashev added a comment - - edited

            The issue appears on my custom build of the Jenkins core, but seems it could be reproduced on newest versions as well.

            We've experienced a network overloading, which has let to the exception in the PingThread on Jenkins master, which has closed the communication channel. However, the slave stills online and takes jobs, but any remote action fails (see the log above) => All scheduled builds fail with an error

            The issue affects ssh-slaves only:

            • Linux SSH slaves are "online", but all jobs on the fail with the error above
            • Windows services have reconnected automatically...
            • Windows JNLP slaves have reconnected as well

            I'm not sure if there's any Linux specifics in the issue.
            On my installation default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side

            Show
            oleg_nenashev Oleg Nenashev added a comment - - edited The issue appears on my custom build of the Jenkins core, but seems it could be reproduced on newest versions as well. We've experienced a network overloading, which has let to the exception in the PingThread on Jenkins master, which has closed the communication channel. However, the slave stills online and takes jobs, but any remote action fails (see the log above) => All scheduled builds fail with an error The issue affects ssh-slaves only: Linux SSH slaves are "online", but all jobs on the fail with the error above Windows services have reconnected automatically... Windows JNLP slaves have reconnected as well I'm not sure if there's any Linux specifics in the issue. On my installation default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Updated the issue's description.

            Show
            oleg_nenashev Oleg Nenashev added a comment - Updated the issue's description.
            Hide
            wbauer wbauer added a comment -

            Same issue here with Linux SSH slaves with Jenkins 1.564:

            14:33:30 Started by user wbauer
            14:33:30 [EnvInject] - Loading node environment variables.
            14:33:30 ERROR: SEVERE ERROR occurs
            14:33:30 org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
            14:33:30 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:75)
            14:33:30 at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
            14:33:30 at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
            14:33:30 at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:570)
            14:33:30 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:476)
            14:33:30 at hudson.model.Run.execute(Run.java:1706)
            14:33:30 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            14:33:30 at hudson.model.ResourceController.execute(ResourceController.java:88)
            14:33:30 at hudson.model.Executor.run(Executor.java:231)
            14:33:30 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            14:33:30 at hudson.remoting.Channel.send(Channel.java:541)
            14:33:30 at hudson.remoting.Request.call(Request.java:129)
            14:33:30 at hudson.remoting.Channel.call(Channel.java:739)
            14:33:30 at hudson.FilePath.act(FilePath.java:1009)
            14:33:30 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
            14:33:30 ... 8 more
            14:33:30 Caused by: java.io.IOException
            14:33:30 at hudson.remoting.Channel.close(Channel.java:1027)
            14:33:30 at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            14:33:30 at hudson.remoting.PingThread.ping(PingThread.java:120)
            14:33:30 at hudson.remoting.PingThread.run(PingThread.java:81)
            14:33:30 Caused by: java.util.concurrent.TimeoutException: Ping started on 1401187729473 hasn't completed at 1401187969473
            14:33:30 ... 2 more
            14:33:30 Notifying upstream projects of job completion
            14:33:30 [EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
            14:33:33 Finished: FAILURE

            Show
            wbauer wbauer added a comment - Same issue here with Linux SSH slaves with Jenkins 1.564: 14:33:30 Started by user wbauer 14:33:30 [EnvInject] - Loading node environment variables. 14:33:30 ERROR: SEVERE ERROR occurs 14:33:30 org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed 14:33:30 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:75) 14:33:30 at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81) 14:33:30 at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39) 14:33:30 at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:570) 14:33:30 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:476) 14:33:30 at hudson.model.Run.execute(Run.java:1706) 14:33:30 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) 14:33:30 at hudson.model.ResourceController.execute(ResourceController.java:88) 14:33:30 at hudson.model.Executor.run(Executor.java:231) 14:33:30 Caused by: hudson.remoting.ChannelClosedException: channel is already closed 14:33:30 at hudson.remoting.Channel.send(Channel.java:541) 14:33:30 at hudson.remoting.Request.call(Request.java:129) 14:33:30 at hudson.remoting.Channel.call(Channel.java:739) 14:33:30 at hudson.FilePath.act(FilePath.java:1009) 14:33:30 at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44) 14:33:30 ... 8 more 14:33:30 Caused by: java.io.IOException 14:33:30 at hudson.remoting.Channel.close(Channel.java:1027) 14:33:30 at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) 14:33:30 at hudson.remoting.PingThread.ping(PingThread.java:120) 14:33:30 at hudson.remoting.PingThread.run(PingThread.java:81) 14:33:30 Caused by: java.util.concurrent.TimeoutException: Ping started on 1401187729473 hasn't completed at 1401187969473 14:33:30 ... 2 more 14:33:30 Notifying upstream projects of job completion 14:33:30 [EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed 14:33:33 Finished: FAILURE
            Hide
            elyz10n Beho Alheit added a comment - - edited

            I have this bug popping up with Mac Mini OSX SSH slaves.

            Jenkins version is 1.564.

            Updated version to 1.574 and the issue is still present.

            It takes a reboot of both the OSX slave and the Jenkins instance itself to fix this issue.
            This only occurs with the nodes running OSX. After running for a couple of days and processing a arbitrary number of builds, the timeout takes place.

            Please let me know if there is more details I can submit.

            Show
            elyz10n Beho Alheit added a comment - - edited I have this bug popping up with Mac Mini OSX SSH slaves. Jenkins version is 1.564. Updated version to 1.574 and the issue is still present. It takes a reboot of both the OSX slave and the Jenkins instance itself to fix this issue. This only occurs with the nodes running OSX. After running for a couple of days and processing a arbitrary number of builds, the timeout takes place. Please let me know if there is more details I can submit.
            Hide
            abedwards Adam added a comment -

            I am seeing similar issues on EC2 linux.

            [EnvInject] - Loading node environment variables.
            FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
            hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
            	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
            	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
            	at hudson.remoting.Request.call(Request.java:174)
            	at hudson.remoting.Channel.call(Channel.java:739)
            	at hudson.FilePath.act(FilePath.java:1009)
            	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
            	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
            	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:581)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:487)
            	at hudson.model.Run.execute(Run.java:1706)
            	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            	at hudson.model.ResourceController.execute(ResourceController.java:88)
            	at hudson.model.Executor.run(Executor.java:232)
            Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
            	at hudson.remoting.Request.abort(Request.java:299)
            	at hudson.remoting.Channel.terminate(Channel.java:802)
            	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:951)
            	at hudson.remoting.Channel$2.handle(Channel.java:475)
            	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
            Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
            	... 3 more
            Caused by: Command close created at
            	at hudson.remoting.Command.<init>(Command.java:56)
            	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:945)
            	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:943)
            	at hudson.remoting.Channel.close(Channel.java:1026)
            	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
            	at hudson.remoting.PingThread.ping(PingThread.java:120)
            	at hudson.remoting.PingThread.run(PingThread.java:81)
            Caused by: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
            	... 2 more
            
            Show
            abedwards Adam added a comment - I am seeing similar issues on EC2 linux. [EnvInject] - Loading node environment variables. FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736 at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41) at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34) at hudson.remoting.Request.call(Request.java:174) at hudson.remoting.Channel.call(Channel.java:739) at hudson.FilePath.act(FilePath.java:1009) at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44) at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81) at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39) at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:581) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:487) at hudson.model.Run.execute(Run.java:1706) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:232) Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736 at hudson.remoting.Request.abort(Request.java:299) at hudson.remoting.Channel.terminate(Channel.java:802) at hudson.remoting.Channel$CloseCommand.execute(Channel.java:951) at hudson.remoting.Channel$2.handle(Channel.java:475) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60) Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736 ... 3 more Caused by: Command close created at at hudson.remoting.Command.<init>(Command.java:56) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:945) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:943) at hudson.remoting.Channel.close(Channel.java:1026) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110) at hudson.remoting.PingThread.ping(PingThread.java:120) at hudson.remoting.PingThread.run(PingThread.java:81) Caused by: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736 ... 2 more
            Hide
            trbaker Trevor Baker added a comment -

            This happens consistently when the jenkins master is running amazon linux kernel 3.10.x. If you downgrade the kernel to 3.4.x on the master it goes away. It does not seem to matter which kernel the slave runs.

            Every time I forget about this and "yum update" the kernel it takes some head scratching before I remember to lock the master's kernel at 3.4.x in /etc/grub.conf

            Show
            trbaker Trevor Baker added a comment - This happens consistently when the jenkins master is running amazon linux kernel 3.10.x. If you downgrade the kernel to 3.4.x on the master it goes away. It does not seem to matter which kernel the slave runs. Every time I forget about this and "yum update" the kernel it takes some head scratching before I remember to lock the master's kernel at 3.4.x in /etc/grub.conf
            Hide
            abedwards Adam added a comment -

            hey Trevor, any chance you have the update to 3.4.x kernel steps handy. I didn't see the kernel in the main repo.

            Anyone from the Jenkins team able to comment if they could help resolve this issue? I leave ssh sessions open overnight with no issues or drops, so I really can't understand why jenkins keeps having these ping timeouts. Could we get a patch which would disable the ping check entirely? I added "-Dhudson.remoting.Launcher.pingIntervalSec=0 -Dhudson.remoting.Launcher.pingTimeoutSec=0" to my startup on the master but it didn't seem to help.

            Would appreciate any suggestion to avoid rolling back the kernel.

            Trevor, It would be interesting to see the output before/after the rollback of "sysctl -a", wonder if there's a kernel param that changed which might have an impact.

            Show
            abedwards Adam added a comment - hey Trevor, any chance you have the update to 3.4.x kernel steps handy. I didn't see the kernel in the main repo. Anyone from the Jenkins team able to comment if they could help resolve this issue? I leave ssh sessions open overnight with no issues or drops, so I really can't understand why jenkins keeps having these ping timeouts. Could we get a patch which would disable the ping check entirely? I added "-Dhudson.remoting.Launcher.pingIntervalSec=0 -Dhudson.remoting.Launcher.pingTimeoutSec=0" to my startup on the master but it didn't seem to help. Would appreciate any suggestion to avoid rolling back the kernel. Trevor, It would be interesting to see the output before/after the rollback of "sysctl -a", wonder if there's a kernel param that changed which might have an impact.
            Hide
            gboucherie Guillaume Boucherie added a comment -

            Hi,

            First thanks you very much for this tips.
            And yes current Amazon Linux AMI (ami-892fe1fe) can't be downgraded to 3.4.x.
            So i install an Ubuntu 12.04 that come with a 3.2.x kernel (ami-00d12677) and all works fine.
            I also try with an ubuntu 14.04 (3.13.x kernel) and its not good.

            So I attach the result of "sysctl -a" for kernel 3.2 sysctl_3.2.txt and 3.13 sysctl_3.13.txt if it could help.

            Regards

            Show
            gboucherie Guillaume Boucherie added a comment - Hi, First thanks you very much for this tips. And yes current Amazon Linux AMI (ami-892fe1fe) can't be downgraded to 3.4.x. So i install an Ubuntu 12.04 that come with a 3.2.x kernel (ami-00d12677) and all works fine. I also try with an ubuntu 14.04 (3.13.x kernel) and its not good. So I attach the result of "sysctl -a" for kernel 3.2 sysctl_3.2.txt and 3.13 sysctl_3.13.txt if it could help. Regards
            Hide
            trbaker Trevor Baker added a comment -

            The instance I run jenkins on was a 2013.09 version that was upgraded to 2014.03.x via yum updates. I presume you could add the 2013.09 yum repo into /etc/yum.repos.d/ and see the 3.4.x kernels. From experience, 3.4.83-70.111 works with everything else from 2014.03.2 at the latest revisions.

            I've attached a kernel param dump.

            Show
            trbaker Trevor Baker added a comment - The instance I run jenkins on was a 2013.09 version that was upgraded to 2014.03.x via yum updates. I presume you could add the 2013.09 yum repo into /etc/yum.repos.d/ and see the 3.4.x kernels. From experience, 3.4.83-70.111 works with everything else from 2014.03.2 at the latest revisions. I've attached a kernel param dump.
            Hide
            dverbeek84 Danny Verbeek added a comment - - edited

            I have the same kind of issue.

            Working combination:
            Jenkins version: 1.588
            Remoting version: 2.47
            Master JVM: 1.7.0_65
            Slave JVM: 1.6.0_33/1.7.0_65)
            Linux kernel master: 3.13.0-37-generic
            Linux kernel slave: 3.2.0-69-virtual

            Not Working combination:
            Jenkins version: 1.588
            Remoting version: 2.47
            Master JVM: 1.7.0_65
            Slave JVM: 1.6.0_33/1.7.0_65(Tested both)
            Linux kernel master: 3.13.0-37-generic
            Linux kernel slave: 3.13.0-35-generic/3.13.0-39-generic(Tested both)

            The slave hangs on the POM parsing

            stack trace slave.jar:

            Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):
            
            "Attach Listener" daemon prio=10 tid=0x00007f7eb0001800 nid=0x645 runnable [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "Stream reader: maven process at Socket[addr=/127.0.0.1,port=46056,localport=55623]" prio=10 tid=0x00007f7ec004f800 nid=0x611 runnable [0x00007f7ea4825000]
               java.lang.Thread.State: RUNNABLE
            	at java.net.SocketInputStream.socketRead0(Native Method)
            	at java.net.SocketInputStream.read(SocketInputStream.java:152)
            	at java.net.SocketInputStream.read(SocketInputStream.java:122)
            	at java.io.FilterInputStream.read(FilterInputStream.java:133)
            	at java.io.FilterInputStream.read(FilterInputStream.java:107)
            	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)
            
            "java -XX:MaxPermSize=1024m -cp /tmp/maven3-agent.jar:/usr/share/maven/boot/plexus-classworlds-2.x.jar org.jvnet.hudson.maven3.agent.Maven3Main /usr/share/maven /tmp/slave.jar /tmp/maven3-interceptor.jar /tmp/maven3-interceptor-commons.jar 55623: stdout copier" prio=10 tid=0x00007f7ec0046800 nid=0x607 runnable [0x00007f7ea4623000]
               java.lang.Thread.State: RUNNABLE
            	at java.io.FileInputStream.readBytes(Native Method)
            	at java.io.FileInputStream.read(FileInputStream.java:272)
            	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
            	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
            	- locked <0x00000000bbbc43f0> (a java.lang.UNIXProcess$ProcessPipeInputStream)
            	at java.io.FilterInputStream.read(FilterInputStream.java:107)
            	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)
            
            "Proc.executor [#1]" daemon prio=10 tid=0x00007f7eb81e6000 nid=0x570 waiting on condition [0x00007f7ea4724000]
               java.lang.Thread.State: TIMED_WAITING (parking)
            	at sun.misc.Unsafe.park(Native Method)
            	- parking to wait for  <0x00000000cec5cef0> (a java.util.concurrent.SynchronousQueue$TransferStack)
            	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
            	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
            	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
            	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
            	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            	at java.lang.Thread.run(Thread.java:745)
            
            "process reaper" daemon prio=10 tid=0x00007f7eb81ec800 nid=0x56c runnable [0x00007f7ea485e000]
               java.lang.Thread.State: RUNNABLE
            	at java.lang.UNIXProcess.waitForProcessExit(Native Method)
            	at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
            	at java.lang.UNIXProcess$4.run(UNIXProcess.java:227)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            	at java.lang.Thread.run(Thread.java:745)
            
            "pool-1-thread-6" prio=10 tid=0x00007f7eac011800 nid=0x557 waiting on condition [0x00007f7ea5cba000]
               java.lang.Thread.State: TIMED_WAITING (parking)
            	at sun.misc.Unsafe.park(Native Method)
            	- parking to wait for  <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack)
            	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
            	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
            	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
            	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
            	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            	at java.lang.Thread.run(Thread.java:745)
            
            "pool-1-thread-4" prio=10 tid=0x00007f7eac00e800 nid=0x555 waiting on condition [0x00007f7ea5ebc000]
               java.lang.Thread.State: TIMED_WAITING (parking)
            	at sun.misc.Unsafe.park(Native Method)
            	- parking to wait for  <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack)
            	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
            	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
            	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
            	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
            	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            	at java.lang.Thread.run(Thread.java:745)
            
            "pool-1-thread-3" prio=10 tid=0x00007f7eac00d000 nid=0x554 waiting on condition [0x00007f7ec419b000]
               java.lang.Thread.State: TIMED_WAITING (parking)
            	at sun.misc.Unsafe.park(Native Method)
            	- parking to wait for  <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack)
            	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
            	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
            	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
            	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
            	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            	at java.lang.Thread.run(Thread.java:745)
            
            "Ping thread for channel hudson.remoting.Channel@b920c51:channel" daemon prio=10 tid=0x00007f7eb42cd800 nid=0x551 waiting on condition [0x00007f7ec5685000]
               java.lang.Thread.State: TIMED_WAITING (sleeping)
            	at java.lang.Thread.sleep(Native Method)
            	at hudson.remoting.PingThread.run(PingThread.java:91)
            
            "Ping thread for channel hudson.remoting.Channel@b920c51:channel" daemon prio=10 tid=0x00007f7ecc1a7800 nid=0x54f waiting on condition [0x00007f7ec6eaf000]
               java.lang.Thread.State: TIMED_WAITING (sleeping)
            	at java.lang.Thread.sleep(Native Method)
            	at hudson.remoting.PingThread.run(PingThread.java:91)
            
            "Channel reader thread: channel" prio=10 tid=0x00007f7ecc19c800 nid=0x54e runnable [0x00007f7ec6fb0000]
               java.lang.Thread.State: RUNNABLE
            	at java.io.FileInputStream.readBytes(Native Method)
            	at java.io.FileInputStream.read(FileInputStream.java:272)
            	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
            	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
            	- locked <0x00000000ce80e768> (a java.io.BufferedInputStream)
            	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
            	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
            	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
            	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
            	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
            	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
            
            "Service Thread" daemon prio=10 tid=0x00007f7ecc0cc800 nid=0x54c runnable [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "C2 CompilerThread1" daemon prio=10 tid=0x00007f7ecc0ca000 nid=0x54b waiting on condition [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "C2 CompilerThread0" daemon prio=10 tid=0x00007f7ecc0c8000 nid=0x54a waiting on condition [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "Signal Dispatcher" daemon prio=10 tid=0x00007f7ecc0c5800 nid=0x549 runnable [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "Finalizer" daemon prio=10 tid=0x00007f7ecc099000 nid=0x548 in Object.wait() [0x00007f7ec773f000]
               java.lang.Thread.State: WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x00000000ce80ea08> (a java.lang.ref.ReferenceQueue$Lock)
            	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
            	- locked <0x00000000ce80ea08> (a java.lang.ref.ReferenceQueue$Lock)
            	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
            	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
            
            "Reference Handler" daemon prio=10 tid=0x00007f7ecc097000 nid=0x547 in Object.wait() [0x00007f7ec7840000]
               java.lang.Thread.State: WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x00000000ce80eab0> (a java.lang.ref.Reference$Lock)
            	at java.lang.Object.wait(Object.java:503)
            	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
            	- locked <0x00000000ce80eab0> (a java.lang.ref.Reference$Lock)
            
            "main" prio=10 tid=0x00007f7ecc00a000 nid=0x545 in Object.wait() [0x00007f7ed24b4000]
               java.lang.Thread.State: TIMED_WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x00000000ce80c310> (a hudson.remoting.Channel)
            	at hudson.remoting.Channel.join(Channel.java:902)
            	- locked <0x00000000ce80c310> (a hudson.remoting.Channel)
            	at hudson.remoting.Launcher.main(Launcher.java:527)
            	at hudson.remoting.Launcher.runWithStdinStdout(Launcher.java:459)
            	at hudson.remoting.Launcher.run(Launcher.java:239)
            	at hudson.remoting.Launcher.main(Launcher.java:192)
            
            "VM Thread" prio=10 tid=0x00007f7ecc092800 nid=0x546 runnable
            
            "VM Periodic Task Thread" prio=10 tid=0x00007f7ecc0d7800 nid=0x54d waiting on condition
            
            JNI global references: 221
            
            

            Stack trace maven.jar:

             
            Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):
            
            "Attach Listener" daemon prio=10 tid=0x00007f6308046000 nid=0x650 runnable [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "pool-1-thread-1 for channel / waiting for hudson.remoting.Channel@2855e552:channel" prio=10 tid=0x00007f6304002000 nid=0x614 in Object.wait() [0x00007f62ff89a000]
               java.lang.Thread.State: TIMED_WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x0000000085272700> (a hudson.remoting.UserRequest)
            	at hudson.remoting.Request.call(Request.java:146)
            	- locked <0x0000000085272700> (a hudson.remoting.UserRequest)
            	at hudson.remoting.Channel.call(Channel.java:751)
            	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:173)
            	at com.sun.proxy.$Proxy5.setExecutedMojos(Unknown Source)
            	at hudson.maven.MavenBuildProxy$Filter.setExecutedMojos(MavenBuildProxy.java:267)
            	at hudson.maven.Maven3Builder$MavenExecutionListener.recordMojoEnded(Maven3Builder.java:630)
            	at hudson.maven.Maven3Builder$MavenExecutionListener.mojoSucceeded(Maven3Builder.java:615)
            	at org.apache.maven.lifecycle.internal.DefaultExecutionEventCatapult.fire(DefaultExecutionEventCatapult.java:87)
            	at org.apache.maven.lifecycle.internal.DefaultExecutionEventCatapult.fire(DefaultExecutionEventCatapult.java:42)
            	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:228)
            	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
            	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
            	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
            	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
            	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
            	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
            	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
            	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
            	at org.jvnet.hudson.maven3.launcher.Maven3Launcher.main(Maven3Launcher.java:117)
            	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            	at java.lang.reflect.Method.invoke(Method.java:606)
            	at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:330)
            	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:238)
            	at org.jvnet.hudson.maven3.agent.Maven3Main.launch(Maven3Main.java:178)
            	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            	at java.lang.reflect.Method.invoke(Method.java:606)
            	at hudson.maven.Maven3Builder.call(Maven3Builder.java:136)
            	at hudson.maven.Maven3Builder.call(Maven3Builder.java:71)
            	at hudson.remoting.UserRequest.perform(UserRequest.java:121)
            	at hudson.remoting.UserRequest.perform(UserRequest.java:49)
            	at hudson.remoting.Request$2.run(Request.java:324)
            	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            	at java.lang.Thread.run(Thread.java:745)
            
            "Channel reader thread: channel" prio=10 tid=0x00007f63201ad000 nid=0x613 runnable [0x00007f62ff99d000]
               java.lang.Thread.State: RUNNABLE
            	at java.net.SocketInputStream.socketRead0(Native Method)
            	at java.net.SocketInputStream.read(SocketInputStream.java:152)
            	at java.net.SocketInputStream.read(SocketInputStream.java:122)
            	at java.io.FilterInputStream.read(FilterInputStream.java:133)
            	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
            	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
            	- locked <0x0000000098dc0720> (a java.io.BufferedInputStream)
            	at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
            	at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
            	at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
            	at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33)
            	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
            	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
            
            "Service Thread" daemon prio=10 tid=0x00007f63200a5800 nid=0x60f runnable [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "C2 CompilerThread1" daemon prio=10 tid=0x00007f63200a3000 nid=0x60e waiting on condition [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "C2 CompilerThread0" daemon prio=10 tid=0x00007f63200a1000 nid=0x60d waiting on condition [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "Signal Dispatcher" daemon prio=10 tid=0x00007f632009e800 nid=0x60c runnable [0x0000000000000000]
               java.lang.Thread.State: RUNNABLE
            
            "Finalizer" daemon prio=10 tid=0x00007f6320072000 nid=0x60b in Object.wait() [0x00007f6324886000]
               java.lang.Thread.State: WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x0000000098dc09c0> (a java.lang.ref.ReferenceQueue$Lock)
            	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
            	- locked <0x0000000098dc09c0> (a java.lang.ref.ReferenceQueue$Lock)
            	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
            	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
            
            "Reference Handler" daemon prio=10 tid=0x00007f6320070000 nid=0x60a in Object.wait() [0x00007f6324987000]
               java.lang.Thread.State: WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x0000000098dc0a68> (a java.lang.ref.Reference$Lock)
            	at java.lang.Object.wait(Object.java:503)
            	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
            	- locked <0x0000000098dc0a68> (a java.lang.ref.Reference$Lock)
            
            "main" prio=10 tid=0x00007f632000a000 nid=0x608 in Object.wait() [0x00007f632895c000]
               java.lang.Thread.State: TIMED_WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x0000000098dbe408> (a hudson.remoting.Channel)
            	at hudson.remoting.Channel.join(Channel.java:902)
            	- locked <0x0000000098dbe408> (a hudson.remoting.Channel)
            	at hudson.remoting.Launcher.main(Launcher.java:527)
            	at hudson.remoting.Launcher.main(Launcher.java:501)
            	at hudson.remoting.Launcher.main(Launcher.java:493)
            	at hudson.remoting.Launcher.main(Launcher.java:489)
            	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            	at java.lang.reflect.Method.invoke(Method.java:606)
            	at org.jvnet.hudson.maven3.agent.Maven3Main.main(Maven3Main.java:136)
            	at org.jvnet.hudson.maven3.agent.Maven3Main.main(Maven3Main.java:64)
            
            "VM Thread" prio=10 tid=0x00007f632006b800 nid=0x609 runnable
            
            "VM Periodic Task Thread" prio=10 tid=0x00007f63200b1000 nid=0x610 waiting on condition
            
            JNI global references: 180
            
            
            Show
            dverbeek84 Danny Verbeek added a comment - - edited I have the same kind of issue. Working combination: Jenkins version: 1.588 Remoting version: 2.47 Master JVM: 1.7.0_65 Slave JVM: 1.6.0_33/1.7.0_65) Linux kernel master: 3.13.0-37-generic Linux kernel slave: 3.2.0-69-virtual Not Working combination: Jenkins version: 1.588 Remoting version: 2.47 Master JVM: 1.7.0_65 Slave JVM: 1.6.0_33/1.7.0_65(Tested both) Linux kernel master: 3.13.0-37-generic Linux kernel slave: 3.13.0-35-generic/3.13.0-39-generic(Tested both) The slave hangs on the POM parsing stack trace slave.jar: Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode): "Attach Listener" daemon prio=10 tid=0x00007f7eb0001800 nid=0x645 runnable [0x0000000000000000] java.lang. Thread .State: RUNNABLE "Stream reader: maven process at Socket[addr=/127.0.0.1,port=46056,localport=55623]" prio=10 tid=0x00007f7ec004f800 nid=0x611 runnable [0x00007f7ea4825000] java.lang. Thread .State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.FilterInputStream.read(FilterInputStream.java:133) at java.io.FilterInputStream.read(FilterInputStream.java:107) at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60) "java -XX:MaxPermSize=1024m -cp /tmp/maven3-agent.jar:/usr/share/maven/boot/plexus-classworlds-2.x.jar org.jvnet.hudson.maven3.agent.Maven3Main /usr/share/maven /tmp/slave.jar /tmp/maven3-interceptor.jar /tmp/maven3-interceptor-commons.jar 55623: stdout copier" prio=10 tid=0x00007f7ec0046800 nid=0x607 runnable [0x00007f7ea4623000] java.lang. Thread .State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:272) at java.io.BufferedInputStream.read1(BufferedInputStream.java:273) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0x00000000bbbc43f0> (a java.lang.UNIXProcess$ProcessPipeInputStream) at java.io.FilterInputStream.read(FilterInputStream.java:107) at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60) "Proc.executor [#1]" daemon prio=10 tid=0x00007f7eb81e6000 nid=0x570 waiting on condition [0x00007f7ea4724000] java.lang. Thread .State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000cec5cef0> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang. Thread .run( Thread .java:745) "process reaper" daemon prio=10 tid=0x00007f7eb81ec800 nid=0x56c runnable [0x00007f7ea485e000] java.lang. Thread .State: RUNNABLE at java.lang.UNIXProcess.waitForProcessExit(Native Method) at java.lang.UNIXProcess.access$500(UNIXProcess.java:54) at java.lang.UNIXProcess$4.run(UNIXProcess.java:227) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang. Thread .run( Thread .java:745) "pool-1-thread-6" prio=10 tid=0x00007f7eac011800 nid=0x557 waiting on condition [0x00007f7ea5cba000] java.lang. Thread .State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang. Thread .run( Thread .java:745) "pool-1-thread-4" prio=10 tid=0x00007f7eac00e800 nid=0x555 waiting on condition [0x00007f7ea5ebc000] java.lang. Thread .State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang. Thread .run( Thread .java:745) "pool-1-thread-3" prio=10 tid=0x00007f7eac00d000 nid=0x554 waiting on condition [0x00007f7ec419b000] java.lang. Thread .State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000ce80c268> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang. Thread .run( Thread .java:745) "Ping thread for channel hudson.remoting.Channel@b920c51:channel" daemon prio=10 tid=0x00007f7eb42cd800 nid=0x551 waiting on condition [0x00007f7ec5685000] java.lang. Thread .State: TIMED_WAITING (sleeping) at java.lang. Thread .sleep(Native Method) at hudson.remoting.PingThread.run(PingThread.java:91) "Ping thread for channel hudson.remoting.Channel@b920c51:channel" daemon prio=10 tid=0x00007f7ecc1a7800 nid=0x54f waiting on condition [0x00007f7ec6eaf000] java.lang. Thread .State: TIMED_WAITING (sleeping) at java.lang. Thread .sleep(Native Method) at hudson.remoting.PingThread.run(PingThread.java:91) "Channel reader thread: channel" prio=10 tid=0x00007f7ecc19c800 nid=0x54e runnable [0x00007f7ec6fb0000] java.lang. Thread .State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:272) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) - locked <0x00000000ce80e768> (a java.io.BufferedInputStream) at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82) at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72) at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103) at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48) "Service Thread " daemon prio=10 tid=0x00007f7ecc0cc800 nid=0x54c runnable [0x0000000000000000] java.lang. Thread .State: RUNNABLE "C2 CompilerThread1" daemon prio=10 tid=0x00007f7ecc0ca000 nid=0x54b waiting on condition [0x0000000000000000] java.lang. Thread .State: RUNNABLE "C2 CompilerThread0" daemon prio=10 tid=0x00007f7ecc0c8000 nid=0x54a waiting on condition [0x0000000000000000] java.lang. Thread .State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x00007f7ecc0c5800 nid=0x549 runnable [0x0000000000000000] java.lang. Thread .State: RUNNABLE "Finalizer" daemon prio=10 tid=0x00007f7ecc099000 nid=0x548 in Object .wait() [0x00007f7ec773f000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x00000000ce80ea08> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135) - locked <0x00000000ce80ea08> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209) "Reference Handler" daemon prio=10 tid=0x00007f7ecc097000 nid=0x547 in Object .wait() [0x00007f7ec7840000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x00000000ce80eab0> (a java.lang.ref.Reference$Lock) at java.lang. Object .wait( Object .java:503) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133) - locked <0x00000000ce80eab0> (a java.lang.ref.Reference$Lock) "main" prio=10 tid=0x00007f7ecc00a000 nid=0x545 in Object .wait() [0x00007f7ed24b4000] java.lang. Thread .State: TIMED_WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x00000000ce80c310> (a hudson.remoting.Channel) at hudson.remoting.Channel.join(Channel.java:902) - locked <0x00000000ce80c310> (a hudson.remoting.Channel) at hudson.remoting.Launcher.main(Launcher.java:527) at hudson.remoting.Launcher.runWithStdinStdout(Launcher.java:459) at hudson.remoting.Launcher.run(Launcher.java:239) at hudson.remoting.Launcher.main(Launcher.java:192) "VM Thread " prio=10 tid=0x00007f7ecc092800 nid=0x546 runnable "VM Periodic Task Thread " prio=10 tid=0x00007f7ecc0d7800 nid=0x54d waiting on condition JNI global references: 221 Stack trace maven.jar: Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode): "Attach Listener" daemon prio=10 tid=0x00007f6308046000 nid=0x650 runnable [0x0000000000000000] java.lang. Thread .State: RUNNABLE "pool-1-thread-1 for channel / waiting for hudson.remoting.Channel@2855e552:channel" prio=10 tid=0x00007f6304002000 nid=0x614 in Object .wait() [0x00007f62ff89a000] java.lang. Thread .State: TIMED_WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x0000000085272700> (a hudson.remoting.UserRequest) at hudson.remoting.Request.call(Request.java:146) - locked <0x0000000085272700> (a hudson.remoting.UserRequest) at hudson.remoting.Channel.call(Channel.java:751) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:173) at com.sun.proxy.$Proxy5.setExecutedMojos(Unknown Source) at hudson.maven.MavenBuildProxy$Filter.setExecutedMojos(MavenBuildProxy.java:267) at hudson.maven.Maven3Builder$MavenExecutionListener.recordMojoEnded(Maven3Builder.java:630) at hudson.maven.Maven3Builder$MavenExecutionListener.mojoSucceeded(Maven3Builder.java:615) at org.apache.maven.lifecycle.internal.DefaultExecutionEventCatapult.fire(DefaultExecutionEventCatapult.java:87) at org.apache.maven.lifecycle.internal.DefaultExecutionEventCatapult.fire(DefaultExecutionEventCatapult.java:42) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:228) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59) at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183) at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156) at org.jvnet.hudson.maven3.launcher.Maven3Launcher.main(Maven3Launcher.java:117) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:330) at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:238) at org.jvnet.hudson.maven3.agent.Maven3Main.launch(Maven3Main.java:178) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at hudson.maven.Maven3Builder.call(Maven3Builder.java:136) at hudson.maven.Maven3Builder.call(Maven3Builder.java:71) at hudson.remoting.UserRequest.perform(UserRequest.java:121) at hudson.remoting.UserRequest.perform(UserRequest.java:49) at hudson.remoting.Request$2.run(Request.java:324) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang. Thread .run( Thread .java:745) "Channel reader thread: channel" prio=10 tid=0x00007f63201ad000 nid=0x613 runnable [0x00007f62ff99d000] java.lang. Thread .State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.FilterInputStream.read(FilterInputStream.java:133) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) - locked <0x0000000098dc0720> (a java.io.BufferedInputStream) at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82) at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72) at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103) at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:33) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48) "Service Thread " daemon prio=10 tid=0x00007f63200a5800 nid=0x60f runnable [0x0000000000000000] java.lang. Thread .State: RUNNABLE "C2 CompilerThread1" daemon prio=10 tid=0x00007f63200a3000 nid=0x60e waiting on condition [0x0000000000000000] java.lang. Thread .State: RUNNABLE "C2 CompilerThread0" daemon prio=10 tid=0x00007f63200a1000 nid=0x60d waiting on condition [0x0000000000000000] java.lang. Thread .State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x00007f632009e800 nid=0x60c runnable [0x0000000000000000] java.lang. Thread .State: RUNNABLE "Finalizer" daemon prio=10 tid=0x00007f6320072000 nid=0x60b in Object .wait() [0x00007f6324886000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x0000000098dc09c0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135) - locked <0x0000000098dc09c0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209) "Reference Handler" daemon prio=10 tid=0x00007f6320070000 nid=0x60a in Object .wait() [0x00007f6324987000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x0000000098dc0a68> (a java.lang.ref.Reference$Lock) at java.lang. Object .wait( Object .java:503) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133) - locked <0x0000000098dc0a68> (a java.lang.ref.Reference$Lock) "main" prio=10 tid=0x00007f632000a000 nid=0x608 in Object .wait() [0x00007f632895c000] java.lang. Thread .State: TIMED_WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0x0000000098dbe408> (a hudson.remoting.Channel) at hudson.remoting.Channel.join(Channel.java:902) - locked <0x0000000098dbe408> (a hudson.remoting.Channel) at hudson.remoting.Launcher.main(Launcher.java:527) at hudson.remoting.Launcher.main(Launcher.java:501) at hudson.remoting.Launcher.main(Launcher.java:493) at hudson.remoting.Launcher.main(Launcher.java:489) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.jvnet.hudson.maven3.agent.Maven3Main.main(Maven3Main.java:136) at org.jvnet.hudson.maven3.agent.Maven3Main.main(Maven3Main.java:64) "VM Thread " prio=10 tid=0x00007f632006b800 nid=0x609 runnable "VM Periodic Task Thread " prio=10 tid=0x00007f63200b1000 nid=0x610 waiting on condition JNI global references: 180
            Hide
            poulhenriksen Poul Henriksen added a comment -

            I had the same issue.

            Downgrading the Jenkins master kernel from 3.14.19-17.43.amzn1.x86_64 to 3.4.62-53.42.amzn1.x86_64 solved it.

            Show
            poulhenriksen Poul Henriksen added a comment - I had the same issue. Downgrading the Jenkins master kernel from 3.14.19-17.43.amzn1.x86_64 to 3.4.62-53.42.amzn1.x86_64 solved it.
            Hide
            bertjan Bert Jan Schrijver added a comment - - edited

            Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out.
            We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since.
            Slaves are still running 3.14 kernel (3.14.27-25.47.amzn1.x86_64).
            I'll report back later.

            Show
            bertjan Bert Jan Schrijver added a comment - - edited Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out. We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since. Slaves are still running 3.14 kernel (3.14.27-25.47.amzn1.x86_64). I'll report back later.
            Hide
            intel352 Jonathan Langevin added a comment -

            I'm on Amazon EC2 w/ Ubuntu 14.04.1 LTS.

            I've downgraded the kernel (manually) to 3.4.

            wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
            wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105_3.4.105-0304105.201412012335_all.deb
            wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-image-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
            dpkg -i linux-*
            

            Once I had the 3.4 kernel installed, I had to make it the default kernel on boot, so I followed the instructions here: http://statusq.org/archives/2012/10/24/4584/

            I'm trying out some builds now to see how Jenkins behaves...

            Show
            intel352 Jonathan Langevin added a comment - I'm on Amazon EC2 w/ Ubuntu 14.04.1 LTS. I've downgraded the kernel (manually) to 3.4. wget http: //kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb wget http: //kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105_3.4.105-0304105.201412012335_all.deb wget http: //kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-image-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb dpkg -i linux-* Once I had the 3.4 kernel installed, I had to make it the default kernel on boot, so I followed the instructions here: http://statusq.org/archives/2012/10/24/4584/ I'm trying out some builds now to see how Jenkins behaves...
            Hide
            bertjan Bert Jan Schrijver added a comment -

            Downgrading the kernel on the master has definitely fixed it for us.
            Running for a week now without any trouble.

            Show
            bertjan Bert Jan Schrijver added a comment - Downgrading the kernel on the master has definitely fixed it for us. Running for a week now without any trouble.
            Hide
            seanabbott Sean Abbott added a comment -

            I have the same issue. kernel is 3.14. Part of the problem is that when the slave.jar file fails, it does NOT cause the agent to display as offline for the master, so the master keeps trying to send jobs:

            Expanded the channel window size to 4MB
            [05/11/15 18:02:23] [SSH] Starting slave process: cd "/var/lib/jenkins" && java -Dfile.encoding=UTF8 -jar slave.jar

            <===[JENKINS REMOTING CAPACITY]===>ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
            java.lang.IllegalStateException: Already connected
            at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:448)
            at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:366)
            at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:945)
            at hudson.plugins.sshslaves.SSHLauncher.access$400(SSHLauncher.java:133)
            at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:711)
            at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696)
            at java.util.concurrent.FutureTask.run(FutureTask.java:262)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:745)
            [05/11/15 18:02:24] Launch failed - cleaning up connection
            [05/11/15 18:02:24] [SSH] Connection closed.

            Even though the log reports the connection closed, jenkins still reports it as up.

            My jenkins master is on 1.596.2.

            Show
            seanabbott Sean Abbott added a comment - I have the same issue. kernel is 3.14. Part of the problem is that when the slave.jar file fails, it does NOT cause the agent to display as offline for the master, so the master keeps trying to send jobs: Expanded the channel window size to 4MB [05/11/15 18:02:23] [SSH] Starting slave process: cd "/var/lib/jenkins" && java -Dfile.encoding=UTF8 -jar slave.jar <=== [JENKINS REMOTING CAPACITY] ===>ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins. java.lang.IllegalStateException: Already connected at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:448) at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:366) at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:945) at hudson.plugins.sshslaves.SSHLauncher.access$400(SSHLauncher.java:133) at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:711) at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) [05/11/15 18:02:24] Launch failed - cleaning up connection [05/11/15 18:02:24] [SSH] Connection closed. Even though the log reports the connection closed, jenkins still reports it as up. My jenkins master is on 1.596.2.
            Hide
            seanabbott Sean Abbott added a comment -

            I was able to connect to the same slave from another jenkins master using the same kernel and jenkins version with no issues...

            Show
            seanabbott Sean Abbott added a comment - I was able to connect to the same slave from another jenkins master using the same kernel and jenkins version with no issues...
            Hide
            gboucherie Guillaume Boucherie added a comment -

            Hi,

            I just do a test on last AWS Linux machine (kernel : 3.14.35), with the same machine on both master and slave.
            And the problem gone ...
            Jenkins version used is the last stable : 1.609.1

            Regards

            Show
            gboucherie Guillaume Boucherie added a comment - Hi, I just do a test on last AWS Linux machine (kernel : 3.14.35), with the same machine on both master and slave. And the problem gone ... Jenkins version used is the last stable : 1.609.1 Regards
            Hide
            jglick Jesse Glick added a comment -

            Related to JENKINS-1948 perhaps?

            Show
            jglick Jesse Glick added a comment - Related to JENKINS-1948 perhaps?
            Hide
            srivadlamani Srikanth Vadlamani added a comment -

            In AWS, we are using Ubuntu 14.04.4 LTS. EC2-plugin version is 1.36. We are also seeing similar errors where the agent would disconnect from Jenkins randomly with the error below.

             

            ERROR: SEVERE ERROR occurs
            org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
            at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:79)
            at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:80)
            at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:42)
            at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:572)
            at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:492)
            at hudson.model.Run.execute(Run.java:1741)
            at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            at hudson.model.ResourceController.execute(ResourceController.java:98)
            at hudson.model.Executor.run(Executor.java:410)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            at hudson.remoting.Channel.send(Channel.java:578)
            at hudson.remoting.Request.call(Request.java:130)
            at hudson.remoting.Channel.call(Channel.java:780)
            at hudson.FilePath.act(FilePath.java:1102)
            at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:48)
            ... 8 more
            Caused by: java.io.IOException
            at hudson.remoting.Channel.close(Channel.java:1163)
            at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:121)
            at hudson.remoting.PingThread.ping(PingThread.java:130)
            at hudson.remoting.PingThread.run(PingThread.java:86)
            Caused by: java.util.concurrent.TimeoutException: Ping started at 1493347954228 hasn't completed by 1493348194229
            
            

             

            Show
            srivadlamani Srikanth Vadlamani added a comment - In AWS, we are using Ubuntu 14.04.4 LTS. EC2-plugin version is 1.36. We are also seeing similar errors where the agent would disconnect from Jenkins randomly with the error below.   ERROR: SEVERE ERROR occurs org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:79) at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:80) at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:42) at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:572) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:492) at hudson.model.Run.execute(Run.java:1741) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:410) Caused by: hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:578) at hudson.remoting.Request.call(Request.java:130) at hudson.remoting.Channel.call(Channel.java:780) at hudson.FilePath.act(FilePath.java:1102) at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:48) ... 8 more Caused by: java.io.IOException at hudson.remoting.Channel.close(Channel.java:1163) at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:121) at hudson.remoting.PingThread.ping(PingThread.java:130) at hudson.remoting.PingThread.run(PingThread.java:86) Caused by: java.util.concurrent.TimeoutException: Ping started at 1493347954228 hasn't completed by 1493348194229  
            Hide
            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            because there is not recent info here and seems similar to JENKINS-53810 I will close it.

            Show
            ifernandezcalvo Ivan Fernandez Calvo added a comment - because there is not recent info here and seems similar to JENKINS-53810 I will close it.

              People

              • Assignee:
                ifernandezcalvo Ivan Fernandez Calvo
                Reporter:
                olamy Olivier Lamy
              • Votes:
                33 Vote for this issue
                Watchers:
                51 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: