  1. Jenkins
  2. JENKINS-52710

ECS build nodes disconnect from Master


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Component/s: remoting
    • Labels: None
    • Environment: Jenkins version 2.121.1 running on Windows
      remoting version 3.17
      ECS build nodes running on CentOS 7

      We run Jenkins jobs remotely on ECS build nodes connected via JNLP using the remoting agent. The build nodes connect successfully every time, but after a job has been executing for some time the JNLP connection is terminated, causing the build job to fail with the following stack trace:

      java.nio.channels.ClosedChannelException
      at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
      at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
      at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
      at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
      at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
      at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:313)
      at hudson.remoting.Channel.close(Channel.java:1446)
      at hudson.remoting.Channel.close(Channel.java:1399)
      at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
      at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
      at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      at java.util.concurrent.FutureTask.run(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      at java.lang.Thread.run(Unknown Source)
      Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from 10.43.3.65/10.43.3.65:57573' is disconnected.
      at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:209)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:278)
      at com.sun.proxy.$Proxy74.isAlive(Unknown Source)
      at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137)
      at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129)
      at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
      at hudson.model.Build$BuildExecution.build(Build.java:206)
      at hudson.model.Build$BuildExecution.doRun(Build.java:163)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
      at hudson.model.Run.execute(Run.java:1727)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      at hudson.model.ResourceController.execute(ResourceController.java:97)
      at hudson.model.Executor.run(Executor.java:429)
      Build step 'Execute shell' marked build as failure

       

      In the Jenkins logs we see the following stack trace:

      java.nio.channels.ClosedChannelException
      at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:209)
      at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
      at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
      at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
      at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
      at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
      at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:314)
      at hudson.remoting.Channel.close(Channel.java:1450)
      at hudson.remoting.Channel.close(Channel.java:1403)
      at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
      at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
      at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
      at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      at java.util.concurrent.FutureTask.run(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      at java.lang.Thread.run(Unknown Source)

       

      The closest post I found was from a user whose build nodes ran on Windows and whose Jenkins instance was behind an AWS ELB. My build nodes run Linux, and Jenkins is not behind an ELB.
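
When correlating a failed build with the Jenkins log, it may help to pull the protocol and agent address out of the "Backing channel ... is disconnected" line in the build trace. The following is only a minimal sketch: the function name `parse_backing_channel` is hypothetical, and the pattern assumes the exact message wording and an IPv4 address as seen in this report; other remoting versions may format the message differently.

```python
import re

# Pattern based on the message wording in this report (assumption: other
# remoting versions may word it differently; IPv6 addresses are not handled).
BACKING_CHANNEL = re.compile(
    r"Backing channel '(?P<protocol>\S+) connection from "
    r"(?P<host>[^/']*)/(?P<ip>[^:']+):(?P<port>\d+)' is disconnected"
)


def parse_backing_channel(line):
    """Return protocol, host, ip and port from a 'Backing channel' line, or None."""
    match = BACKING_CHANNEL.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["port"] = int(info["port"])  # port arrives as text; convert for comparisons
    return info


# The line below is copied from the build trace in this report.
sample = (
    "Caused: java.io.IOException: Backing channel 'JNLP4-connect connection "
    "from 10.43.3.65/10.43.3.65:57573' is disconnected."
)
print(parse_backing_channel(sample))
```

The extracted address and port can then be searched for in the Jenkins log to find the entries that belong to the same agent connection.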

            jthompson Jeff Thompson
            ffraticelli Fernando Fraticelli