JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

      Description

      The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well.

      We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which in turn closed the communication channel. However, the slave still appears online and accepts jobs, but any remote action fails (see logs above), so all scheduled builds fail with an error.

      The issue affects ssh-slaves only:

      • Linux SSH slaves are "online", but all jobs on them fail with the error above
      • Windows services have reconnected automatically...
      • Windows JNLP slaves have reconnected as well

            Activity

            seanabbott Sean Abbott added a comment -

            I was able to connect to the same slave from another jenkins master using the same kernel and jenkins version with no issues...

            gboucherie Guillaume Boucherie added a comment -

            Hi,

            I just ran a test on the latest AWS Linux machine (kernel: 3.14.35), with the same machine for both master and slave.
            The problem is gone.
            The Jenkins version used is the latest stable: 1.609.1

            Regards

            jglick Jesse Glick added a comment -

            Related to JENKINS-1948 perhaps?

            srivadlamani Srikanth Vadlamani added a comment -

            In AWS, we are using Ubuntu 14.04.4 LTS. EC2-plugin version is 1.36. We are also seeing similar errors where the agent would disconnect from Jenkins randomly with the error below.

             

            ERROR: SEVERE ERROR occurs
            org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
            at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:79)
            at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:80)
            at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:42)
            at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:572)
            at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:492)
            at hudson.model.Run.execute(Run.java:1741)
            at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            at hudson.model.ResourceController.execute(ResourceController.java:98)
            at hudson.model.Executor.run(Executor.java:410)
            Caused by: hudson.remoting.ChannelClosedException: channel is already closed
            at hudson.remoting.Channel.send(Channel.java:578)
            at hudson.remoting.Request.call(Request.java:130)
            at hudson.remoting.Channel.call(Channel.java:780)
            at hudson.FilePath.act(FilePath.java:1102)
            at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:48)
            ... 8 more
            Caused by: java.io.IOException
            at hudson.remoting.Channel.close(Channel.java:1163)
            at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:121)
            at hudson.remoting.PingThread.ping(PingThread.java:130)
            at hudson.remoting.PingThread.run(PingThread.java:86)
            Caused by: java.util.concurrent.TimeoutException: Ping started at 1493347954228 hasn't completed by 1493348194229
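
            The two numbers in the final "Caused by" line look like epoch timestamps in milliseconds, so the gap between them shows how long the ping waited before the channel was closed. Below is a minimal sketch of that arithmetic, assuming the message format is exactly as quoted above; the function name `ping_timeout_window` is hypothetical and is not part of Jenkins.

```python
import re
from datetime import datetime, timezone

# Matches the PingThread timeout message quoted in the stack trace above.
PING_RE = re.compile(r"Ping started at (\d+) hasn't completed by (\d+)")


def ping_timeout_window(message):
    """Return (start time in UTC, elapsed milliseconds) from a ping timeout message.

    Assumes both numbers are epoch timestamps in milliseconds.
    """
    match = PING_RE.search(message)
    if match is None:
        raise ValueError("not a PingThread timeout message")
    start_ms, deadline_ms = (int(group) for group in match.groups())
    start = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
    return start, deadline_ms - start_ms


message = (
    "java.util.concurrent.TimeoutException: "
    "Ping started at 1493347954228 hasn't completed by 1493348194229"
)
start, elapsed_ms = ping_timeout_window(message)
print(start.isoformat(), elapsed_ms)
```

            For the message above the gap is 240001 ms, roughly four minutes. That would be consistent with a 240-second ping timeout on the master, although the thread itself does not confirm which timeout value was configured.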
            
            

             

            ifernandezcalvo Ivan Fernandez Calvo added a comment -

            Because there is no recent information here and this seems similar to JENKINS-53810, I will close it.


              People

              • Assignee:
                ifernandezcalvo Ivan Fernandez Calvo
                Reporter:
                olamy Olivier Lamy
              • Votes:
                33
                Watchers:
                51
