Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-49956

Kubernetes agent fails to reconnect on master restarts

    Details

    • Similar Issues:

      Description

      After restarting the master for plugin upgrades the agent fails to reconnect

      master logs

      Mar 06, 2018 4:50:23 PM jenkins.model.Jenkins$24 run
      SEVERE: Restarting VM as requested by xxx
      Mar 06, 2018 4:50:23 PM jenkins.model.Jenkins cleanUp
      INFO: Stopping Jenkins
      Mar 06, 2018 4:50:23 PM jenkins.model.Jenkins$19 onAttained
      INFO: Started termination
      Mar 06, 2018 4:50:23 PM jenkins.model.Jenkins$19 onAttained
      INFO: Completed termination
      Mar 06, 2018 4:50:23 PM jenkins.model.Jenkins _cleanUpDisconnectComputers
      INFO: Starting node disconnection
      Mar 06, 2018 4:50:23 PM jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
      WARNING: safe-restart thread for artifact-manager-s3-c7701-0rmp3 terminated
      java.nio.channels.ClosedChannelException
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
      	at hudson.remoting.Channel.close(Channel.java:1405)
      	at hudson.remoting.Channel.close(Channel.java:1358)
      	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:737)
      	at hudson.slaves.SlaveComputer.kill(SlaveComputer.java:704)
      	at hudson.model.AbstractCIBase.killComputer(AbstractCIBase.java:88)
      	at jenkins.model.Jenkins.access$2000(Jenkins.java:304)
      	at jenkins.model.Jenkins$20.run(Jenkins.java:3367)
      	at hudson.model.Queue._withLock(Queue.java:1373)
      	at hudson.model.Queue.withLock(Queue.java:1250)
      	at jenkins.model.Jenkins._cleanUpDisconnectComputers(Jenkins.java:3361)
      	at jenkins.model.Jenkins.cleanUp(Jenkins.java:3237)
      	at hudson.lifecycle.UnixLifecycle.restart(UnixLifecycle.java:73)
      	at jenkins.model.Jenkins$24.run(Jenkins.java:4218)
      
      Mar 06, 2018 4:50:24 PM jenkins.model.Jenkins _cleanUpShutdownPluginManager
      INFO: Stopping plugin manager
      Mar 06, 2018 4:50:24 PM jenkins.model.Jenkins _cleanUpPersistQueue
      INFO: Persisting build queue
      Mar 06, 2018 4:50:24 PM jenkins.model.Jenkins _cleanUpAwaitDisconnects
      INFO: Waiting for node disconnection completion
      Mar 06, 2018 4:50:24 PM jenkins.model.Jenkins cleanUp
      INFO: Jenkins stopped
      VM settings:
          Max. Heap Size: 4.00G
          Ergonomics Machine Class: server
          Using VM: OpenJDK 64-Bit Server VM
      
      Running from: /usr/share/jenkins/jenkins.war
      webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
      Mar 06, 2018 4:50:27 PM Main deleteWinstoneTempContents
      WARNING: Failed to delete the temporary Winstone file /tmp/winstone/jenkins.war
      Mar 06, 2018 4:50:27 PM org.eclipse.jetty.util.log.Log initialized
      INFO: Logging initialized @1153ms to org.eclipse.jetty.util.log.JavaUtilLog
      Mar 06, 2018 4:50:28 PM winstone.Logger logInternal
      INFO: Beginning extraction from war file
      Mar 06, 2018 4:50:28 PM org.eclipse.jetty.server.handler.ContextHandler setContextPath
      WARNING: Empty contextPath
      Mar 06, 2018 4:50:28 PM org.eclipse.jetty.server.Server doStart
      INFO: jetty-9.4.z-SNAPSHOT
      Mar 06, 2018 4:50:29 PM org.eclipse.jetty.webapp.StandardDescriptorProcessor visitServlet
      INFO: NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
      Mar 06, 2018 4:50:29 PM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
      INFO: DefaultSessionIdManager workerName=node0
      Mar 06, 2018 4:50:29 PM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
      INFO: No SessionScavenger set, using defaults
      Mar 06, 2018 4:50:29 PM org.eclipse.jetty.server.session.HouseKeeper startScavenging
      INFO: Scavenging every 660000ms
      Jenkins home directory: /var/jenkins_home found at: EnvVars.masterEnvVars.get("JENKINS_HOME")
      Mar 06, 2018 4:50:30 PM org.eclipse.jetty.server.handler.ContextHandler doStart
      INFO: Started w.@239a307b{/,file:///var/jenkins_home/war/,AVAILABLE}{/var/jenkins_home/war}
      Mar 06, 2018 4:50:30 PM org.eclipse.jetty.server.AbstractConnector doStart
      INFO: Started ServerConnector@30f842ca{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
      Mar 06, 2018 4:50:30 PM org.eclipse.jetty.server.Server doStart
      INFO: Started @4439ms
      Mar 06, 2018 4:50:30 PM winstone.Logger logInternal
      INFO: Winstone Servlet Engine v4.0 running: controlPort=disabled
      Mar 06, 2018 4:50:33 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Started initialization
      Mar 06, 2018 4:50:35 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Listed all plugins
      
      // here kubernetes killed jenkins pod because Liveness probe failed
      
      $ kubectl logs -f jenkins-0
      VM settings:
          Max. Heap Size: 4.00G
          Ergonomics Machine Class: server
          Using VM: OpenJDK 64-Bit Server VM
      
      Running from: /usr/share/jenkins/jenkins.war
      webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
      Mar 06, 2018 4:50:54 PM Main deleteWinstoneTempContents
      WARNING: Failed to delete the temporary Winstone file /tmp/winstone/jenkins.war
      Mar 06, 2018 4:50:54 PM org.eclipse.jetty.util.log.Log initialized
      INFO: Logging initialized @1119ms to org.eclipse.jetty.util.log.JavaUtilLog
      Mar 06, 2018 4:50:54 PM winstone.Logger logInternal
      INFO: Beginning extraction from war file
      Mar 06, 2018 4:50:54 PM org.eclipse.jetty.server.handler.ContextHandler setContextPath
      WARNING: Empty contextPath
      Mar 06, 2018 4:50:54 PM org.eclipse.jetty.server.Server doStart
      INFO: jetty-9.4.z-SNAPSHOT
      Mar 06, 2018 4:50:55 PM org.eclipse.jetty.webapp.StandardDescriptorProcessor visitServlet
      INFO: NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
      Mar 06, 2018 4:50:56 PM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
      INFO: DefaultSessionIdManager workerName=node0
      Mar 06, 2018 4:50:56 PM org.eclipse.jetty.server.session.DefaultSessionIdManager doStart
      INFO: No SessionScavenger set, using defaults
      Mar 06, 2018 4:50:56 PM org.eclipse.jetty.server.session.HouseKeeper startScavenging
      INFO: Scavenging every 600000ms
      Jenkins home directory: /var/jenkins_home found at: EnvVars.masterEnvVars.get("JENKINS_HOME")
      Mar 06, 2018 4:50:57 PM org.eclipse.jetty.server.handler.ContextHandler doStart
      INFO: Started w.@239a307b{/,file:///var/jenkins_home/war/,AVAILABLE}{/var/jenkins_home/war}
      Mar 06, 2018 4:50:57 PM org.eclipse.jetty.server.AbstractConnector doStart
      INFO: Started ServerConnector@30f842ca{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
      Mar 06, 2018 4:50:57 PM org.eclipse.jetty.server.Server doStart
      INFO: Started @4295ms
      Mar 06, 2018 4:50:57 PM winstone.Logger logInternal
      INFO: Winstone Servlet Engine v4.0 running: controlPort=disabled
      Mar 06, 2018 4:50:59 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Started initialization
      Mar 06, 2018 4:51:00 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Listed all plugins
      Mar 06, 2018 4:51:21 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Prepared all plugins
      Mar 06, 2018 4:51:21 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Started all plugins
      Mar 06, 2018 4:51:23 PM hudson.ExtensionFinder$GuiceFinder$FaultTolerantScope$1 error
      INFO: Failed to instantiate optional component hudson.plugins.build_timeout.operations.AbortAndRestartOperation$DescriptorImpl; skipping
      Mar 06, 2018 4:51:26 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Augmented all extensions
      Mar 06, 2018 4:51:27 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
      INFO: Agent has already been launched, activating: {}
      Mar 06, 2018 4:51:28 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Loaded all jobs
      Mar 06, 2018 4:51:28 PM jenkins.util.groovy.GroovyHookScript execute
      INFO: Executing /var/jenkins_home/init.groovy.d/tcp-slave-agent-port.groovy
      Mar 06, 2018 4:51:29 PM hudson.model.AsyncPeriodicWork$1 run
      INFO: Started Download metadata
      Mar 06, 2018 4:51:29 PM hudson.model.AsyncPeriodicWork$1 run
      INFO: Finished Download metadata. 91 ms
      Mar 06, 2018 4:51:30 PM org.springframework.context.support.AbstractApplicationContext prepareRefresh
      INFO: Refreshing org.springframework.web.context.support.StaticWebApplicationContext@5ef7d416: display name [Root WebApplicationContext]; startup date [Tue Mar 06 16:51:30 GMT 2018]; root of context hierarchy
      Mar 06, 2018 4:51:30 PM org.springframework.context.support.AbstractApplicationContext obtainFreshBeanFactory
      INFO: Bean factory for application context [org.springframework.web.context.support.StaticWebApplicationContext@5ef7d416]: org.springframework.beans.factory.support.DefaultListableBeanFactory@3f128517
      Mar 06, 2018 4:51:30 PM org.springframework.beans.factory.support.DefaultListableBeanFactory preInstantiateSingletons
      INFO: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@3f128517: defining beans [filter,legacy]; root of factory hierarchy
      Mar 06, 2018 4:51:30 PM jenkins.InitReactorRunner$1 onAttained
      INFO: Completed initialization
      Mar 06, 2018 4:51:34 PM hudson.WebAppMain$3 run
      INFO: Jenkins is fully up and running
      --> setting agent port for jnlp
      --> setting agent port for jnlp... done
      Mar 06, 2018 4:52:16 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
      INFO: Accepted JNLP4-connect connection #1 from /100.96.6.38:47260
      Mar 06, 2018 4:52:25 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
      INFO: Terminating Kubernetes instance for agent artifact-manager-s3-c7701-0rmp3
      Mar 06, 2018 4:52:25 PM jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
      WARNING: Computer.threadPoolForRemoting [#9] for artifact-manager-s3-c7701-0rmp3 terminated
      java.nio.channels.ClosedChannelException
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
      	at hudson.remoting.Channel.close(Channel.java:1405)
      	at hudson.remoting.Channel.close(Channel.java:1358)
      	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:737)
      	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
      	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:655)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      
      Mar 06, 2018 4:52:25 PM hudson.remoting.Request$2 run
      WARNING: Failed to send back a reply to the request hudson.remoting.Request$2@78147134
      hudson.remoting.ChannelClosedException: channel is already closed
      	at hudson.remoting.Channel.send(Channel.java:671)
      	at hudson.remoting.Request$2.run(Request.java:370)
      	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      	at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:19)
      	at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.nio.channels.ClosedChannelException
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
      	at hudson.remoting.Channel.close(Channel.java:1405)
      	at hudson.remoting.Channel.close(Channel.java:1358)
      	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:737)
      	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
      	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:655)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	... 4 more
      
      Mar 06, 2018 4:52:25 PM org.csanchez.jenkins.plugins.kubernetes.pipeline.PodTemplateStepExecution$PodTemplateCallback finished
      INFO: Removing pod template and deleting pod artifact-manager-s3-c7701 from cloud kubernetes
      Mar 06, 2018 4:52:27 PM okhttp3.internal.platform.Platform log
      INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
      Mar 06, 2018 4:52:27 PM okhttp3.internal.platform.Platform log
      INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
      Mar 06, 2018 4:52:27 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
      INFO: Terminated Kubernetes instance for agent jenkins/artifact-manager-s3-c7701-0rmp3
      Terminated Kubernetes instance for agent jenkins/artifact-manager-s3-c7701-0rmp3
      Mar 06, 2018 4:52:27 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
      INFO: Disconnected computer artifact-manager-s3-c7701-0rmp3
      Mar 06, 2018 4:52:27 PM org.csanchez.jenkins.plugins.kubernetes.pipeline.PodTemplateStepExecution$PodTemplateCallback finished
      WARNING: Failed to delete pod for agent jenkins/artifact-manager-s3-c7701: not found
      Mar 06, 2018 4:52:27 PM org.jenkinsci.plugins.workflow.job.WorkflowRun finish
      INFO: test #2 completed: FAILURE
      

      agent logs

      Mar 06, 2018 4:41:48 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Mar 06, 2018 4:50:23 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Mar 06, 2018 4:50:33 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Master isn't ready to talk to us on http://jenkins/tcpSlaveAgentListener/. Will try again: response code=503
      Mar 06, 2018 4:50:48 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      Mar 06, 2018 4:51:03 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      Mar 06, 2018 4:51:18 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      Mar 06, 2018 4:51:33 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      Mar 06, 2018 4:51:48 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      Mar 06, 2018 4:52:03 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
      INFO: Failed to connect to the master. Will try again: java.net.SocketTimeoutException connect timed out
      Mar 06, 2018 4:52:13 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1 onReconnect
      INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@2d4c4c79
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up agent: artifact-manager-s3-c7701-0rmp3
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Mar 06, 2018 4:52:15 PM hudson.remoting.Engine startEngine
      WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins/]
      Mar 06, 2018 4:52:15 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Agent discovery successful
        Agent address: jenkins
        Agent port:    50000
        Identity:      6b:d5:db:ec:8b:c9:46:b3:00:40:72:91:da:6f:6c:31
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins:50000
      Mar 06, 2018 4:52:15 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Mar 06, 2018 4:52:16 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Remote identity confirmed: 6b:d5:db:ec:8b:c9:46:b3:00:40:72:91:da:6f:6c:31
      Mar 06, 2018 4:52:17 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Mar 06, 2018 4:52:25 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Mar 06, 2018 4:52:25 PM hudson.remoting.Request$2 run
      WARNING: Failed to send back a reply to the request hudson.remoting.Request$2@4fbffa99
      hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@452b851a:JNLP4-connect connection to jenkins/100.67.180.57:50000": channel is already closed
      	at hudson.remoting.Channel.send(Channel.java:672)
      	at hudson.remoting.Request$2.run(Request.java:371)
      	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at hudson.remoting.Engine$1$1.run(Engine.java:94)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: hudson.remoting.Channel$OrderlyShutdown: Command close created at
      	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1226)
      	at hudson.remoting.Channel$1.handle(Channel.java:561)
      	at hudson.remoting.AbstractByteBufferCommandTransport.processCommand(AbstractByteBufferCommandTransport.java:202)
      	at hudson.remoting.AbstractByteBufferCommandTransport.receive(AbstractByteBufferCommandTransport.java:188)
      	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onRead(ChannelApplicationLayer.java:186)
      	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecv(ApplicationLayer.java:207)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:369)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
      	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
      	at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$2200(BIONetworkLayer.java:48)
      	at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:283)
      	... 4 more
      Caused by: Command close created at
      	at hudson.remoting.Command.<init>(Command.java:60)
      	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1219)
      	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1217)
      	at hudson.remoting.Channel.close(Channel.java:1391)
      	at hudson.remoting.Channel.close(Channel.java:1358)
      	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:737)
      	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
      	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:655)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	... 1 more
      

        Attachments

          Activity

          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          Carlos Sanchez

          1. does the Remoting process die after the error in the log? I'd guess so, it should be a fatal error
          2. Do the docker images you use have any failover configured outside the remoting executable?

          Show
          oleg_nenashev Oleg Nenashev added a comment - Carlos Sanchez 1. does the Remoting process die after the error in the log? I'd guess so, it should be a fatal error 2. Do the docker images you use have any failover configured outside the remoting executable?
          Hide
          csanchez Carlos Sanchez added a comment -

          1. yes, the docker agent dies after the last log Failed to send back a reply to the request
          2. No, it's the standard https://github.com/jenkinsci/docker-jnlp-slave image

          Show
          csanchez Carlos Sanchez added a comment - 1. yes, the docker agent dies after the last log Failed to send back a reply to the request 2. No, it's the standard https://github.com/jenkinsci/docker-jnlp-slave image
          Hide
          csanchez Carlos Sanchez added a comment -

          My guess is that it has to do with the double restart because there are tests that exercise durable pipeline at https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/RestartPipelineTest.java

          In my instance I increased the time for health checks to avoid the double restart

          Show
          csanchez Carlos Sanchez added a comment - My guess is that it has to do with the double restart because there are tests that exercise durable pipeline at https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/test/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/RestartPipelineTest.java In my instance I increased the time for health checks to avoid the double restart
          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          From Remoting perspective it is a default behavior for fatal errors. What is "fatal" is separate topic, e.g. see JENKINS-46515.

          Generally the advanced failover behavior should be implemented by users of slave.jar, e.g. see https://github.com/jenkinsci/windows-slave-installer-module/blob/master/src/main/resources/org/jenkinsci/modules/windows_slave_installer/jenkins-slave.xml#L52 . Feel free to submit feature requests to Remoting though.

          Show
          oleg_nenashev Oleg Nenashev added a comment - From Remoting perspective it is a default behavior for fatal errors. What is "fatal" is separate topic, e.g. see JENKINS-46515 . Generally the advanced failover behavior should be implemented by users of slave.jar, e.g. see https://github.com/jenkinsci/windows-slave-installer-module/blob/master/src/main/resources/org/jenkinsci/modules/windows_slave_installer/jenkins-slave.xml#L52 . Feel free to submit feature requests to Remoting though.
          Hide
          csanchez Carlos Sanchez added a comment -

          This seems to work after increasing the liveness checks making the master not restart twice

          Show
          csanchez Carlos Sanchez added a comment - This seems to work after increasing the liveness checks making the master not restart twice

            People

            • Assignee:
              csanchez Carlos Sanchez
              Reporter:
              csanchez Carlos Sanchez
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: