JENKINS-29825

Some Windows agents unable to reconnect after master restart

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Component/s: core, remoting
    • Labels:
      None
    • Environment:
      master: Solaris
      agents: Windows 7
      Jenkins LTS 1.596.2

      Description

      Hi,

      Sometimes, after a master restart, some of our Windows agents are unable to reconnect. On the agent side, the log file shows the following repeated endlessly after the restart:

      Aug 06, 2015 10:07:34 AM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
      java.lang.Exception: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
      	at hudson.remoting.Engine.onConnectionRejected(Engine.java:306)
      	at hudson.remoting.Engine.run(Engine.java:276)
      
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up slave: WIN7SLAVE1
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://ci-master/jenkins/, http://ci-master:5080/jenkins/]
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to ci-master:56749
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
      java.lang.Exception: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
      	at hudson.remoting.Engine.onConnectionRejected(Engine.java:306)
      	at hudson.remoting.Engine.run(Engine.java:276)
      
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up slave: WIN7SLAVE1
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://ci-master/jenkins/, http://ci-master:5080/jenkins/]
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to ci-master:56749
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
      java.lang.Exception: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
      	at hudson.remoting.Engine.onConnectionRejected(Engine.java:306)
      	at hudson.remoting.Engine.run(Engine.java:276)
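
A quick way to confirm that an agent is stuck in this loop is to count the rejection messages in its log. The following is a minimal sketch, assuming the log format shown above; the function name `count_rejections` and the embedded sample are illustrative and not part of Jenkins.

```python
import re
from collections import Counter

# Matches the SEVERE rejection line as it appears in the agent log above.
REJECTED = re.compile(
    r"SEVERE: The server rejected the connection: (\S+) is already connected"
)


def count_rejections(log_text):
    """Return how many times each agent name was rejected as already connected."""
    return Counter(m.group(1) for m in REJECTED.finditer(log_text))


# Shortened excerpt of the agent log from this report.
sample = """\
Aug 06, 2015 10:07:34 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Aug 06, 2015 10:07:35 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: The server rejected the connection: WIN7SLAVE1 is already connected to this master. Rejecting this connection.
"""

rejections = count_rejections(sample)
print(rejections)
```

A count that keeps growing for the same agent name after a master restart suggests the agent is in the reconnect loop described here rather than failing once.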
      

      In the agent log on the Jenkins side we see this:

      JNLP agent connected from /158.166.68.73
      <===[JENKINS REMOTING CAPACITY]===>Slave.jar version: 2.49
      This is a Windows slave
      Slave.jar version: 2.49
      This is a Windows slave
      Connection terminated
      Connection terminated
      ERROR: Failed to install restarter
      hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
      	at hudson.remoting.Request.abort(Request.java:295)
      	at hudson.remoting.Channel.terminate(Channel.java:814)
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1029)
      	at hudson.remoting.Channel$2.handle(Channel.java:483)
      	at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61)
      	at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:597)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:111)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:724)
      	at ......remote call to D02DI1419932AGR(Native Method)
      	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1356)
      	at hudson.remoting.Request.call(Request.java:171)
      	at hudson.remoting.Channel.call(Channel.java:751)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.install(JnlpSlaveRestarterInstaller.java:52)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.access$000(JnlpSlaveRestarterInstaller.java:33)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:39)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:36)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:724)
      Caused by: hudson.remoting.Channel$OrderlyShutdown
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1029)
      	at hudson.remoting.Channel$2.handle(Channel.java:483)
      	at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61)
      	at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:597)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:111)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	... 5 more
      Caused by: Command close created at
      	at hudson.remoting.Command.<init>(Command.java:56)
      	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1023)
      	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1021)
      	at hudson.remoting.Channel.close(Channel.java:1104)
      	at hudson.remoting.Channel.close(Channel.java:1087)
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1028)
      	at hudson.remoting.Channel$2.handle(Channel.java:483)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
      JNLP agent connected from /158.166.68.73
      <===[JENKINS REMOTING CAPACITY]===>Slave.jar version: 2.49
      This is a Windows slave
      Slave.jar version: 2.49
      This is a Windows slave
      Connection terminated
      Connection terminated
      ERROR: Failed to install restarter
      hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
      	at hudson.remoting.Request.abort(Request.java:295)
      	at hudson.remoting.Channel.terminate(Channel.java:814)
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1029)
      	at hudson.remoting.Channel$2.handle(Channel.java:483)
      	at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61)
      	at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:597)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:111)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:724)
      	at ......remote call to D02DI1419932AGR(Native Method)
      	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1356)
      	at hudson.remoting.Request.call(Request.java:171)
      	at hudson.remoting.Channel.call(Channel.java:751)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.install(JnlpSlaveRestarterInstaller.java:52)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.access$000(JnlpSlaveRestarterInstaller.java:33)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:39)
      	at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:36)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:724)
      Caused by: hudson.remoting.Channel$OrderlyShutdown
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1029)
      	at hudson.remoting.Channel$2.handle(Channel.java:483)
      	at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61)
      	at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:597)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:111)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	... 5 more
      Caused by: Command close created at
      	at hudson.remoting.Command.<init>(Command.java:56)
      	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1023)
      	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1021)
      	at hudson.remoting.Channel.close(Channel.java:1104)
      	at hudson.remoting.Channel.close(Channel.java:1087)
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1028)
      	at hudson.remoting.Channel$2.handle(Channel.java:483)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
      

      Restarting the Windows service on the agent fixes the problem, but having to do so after every master restart is a nuisance.

            Activity

            oleg_nenashev Oleg Nenashev added a comment -

            This is likely just a runaway process. It should have been fixed in Jenkins 2.50 by the introduction of the Runaway Process Killer. See JENKINS-39231.
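
To illustrate why a runaway process would produce the "already connected" rejection, here is a toy model, assuming the master tracks connected agents by name and refuses a second connection under a name that is still registered. The class `AgentRegistry`, its methods, and the process ids are hypothetical and do not reflect Jenkins' actual code.

```python
class AgentRegistry:
    """Toy stand-in for the master's view of connected agents, keyed by name."""

    def __init__(self):
        self._connected = {}  # agent name -> id of the process holding the channel

    def connect(self, name, process_id):
        # Reject a second connection under a name that is still registered.
        if name in self._connected:
            return f"{name} is already connected to this master. Rejecting this connection."
        self._connected[name] = process_id
        return "connected"

    def disconnect(self, name, process_id):
        # Only the process that holds the channel releases the name.
        if self._connected.get(name) == process_id:
            del self._connected[name]


registry = AgentRegistry()
first = registry.connect("WIN7SLAVE1", process_id=100)   # runaway process that never exits
second = registry.connect("WIN7SLAVE1", process_id=200)  # freshly started agent is rejected
registry.disconnect("WIN7SLAVE1", process_id=100)        # runaway process is killed
third = registry.connect("WIN7SLAVE1", process_id=200)   # retry now succeeds
print(first, second, third, sep="\n")
```

In this model the new agent keeps being rejected for as long as the old process holds the name, which would be consistent with restarting the Windows service clearing the problem.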


              People

              • Assignee:
                oleg_nenashev Oleg Nenashev
                Reporter:
                heymjo Jorg Heymans
              • Votes:
                1
                Watchers:
                3
