Type: Bug
Resolution: Unresolved
Priority: Major
A JNLP agent started without the '-noReconnect' option can abort completely while trying to reconnect to a master that has restarted.
In our architecture, there are several reverse proxy instances, fronted by a load balancer. Upon restart, a master can move to a different underlying machine, and the reverse proxy configuration is updated dynamically. However, there is a time window (a few seconds) during which some reverse proxies can route to the master while others cannot, because they have not yet processed the configuration update.
In the agent logs, the sequence looks like this:
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: fd843edf
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Sep 26, 2017 9:51:39 AM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /root/.jenkins/cache/jars
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://redactedurl/economic-influence/]
Sep 26, 2017 9:51:39 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, JNLP-connect, Ping, Diagnostic-Ping, JNLP2-connect, OperationsCenter2]
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: ec2-34-224-67-174.compute-1.amazonaws.com
  Agent port: 31364
  Identity: 49:9b:2d:d8:21:41:5d:c6:2b:94:4b:be:08:4f:d5:61
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to ec2-34-224-67-174.compute-1.amazonaws.com:31364
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Sep 26, 2017 9:51:39 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 49:9b:2d:d8:21:41:5d:c6:2b:94:4b:be:08:4f:d5:61
Sep 26, 2017 9:51:40 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Sep 26, 2017 9:51:49 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Sep 26, 2017 9:51:59 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFO: Master isnt ready to talk to us on {0}. Will retry again: response code={1}
Sep 26, 2017 9:52:09 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFO: Master isnt ready to talk to us on {0}. Will retry again: response code={1}
Sep 26, 2017 9:52:24 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFO: Master isnt ready to talk to us on {0}. Will retry again: response code={1}
Sep 26, 2017 9:52:34 AM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2$1 onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@b0a6197
Sep 26, 2017 9:52:36 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: fd843edf
Sep 26, 2017 9:52:36 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Sep 26, 2017 9:52:36 AM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /root/.jenkins/cache/jars
Sep 26, 2017 9:52:36 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://redactedurl/economic-influence/]
Sep 26, 2017 9:52:36 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: http://redactedurl/economic-influence/tcpSlaveAgentListener/ is invalid: 502 Bad Gateway
java.io.IOException: http://redactedurl/economic-influence/tcpSlaveAgentListener/ is invalid: 502 Bad Gateway
    at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:168)
    at hudson.remoting.Engine.innerRun(Engine.java:495)
    at hudson.remoting.Engine.run(Engine.java:447)
The problem is that once the health check has passed, a new connection is attempted, and if that attempt fails (here with a 502 from a reverse proxy that has not yet picked up the new configuration), the agent aborts completely instead of falling back to the retry loop.
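
To illustrate the behavior I would expect, here is a minimal, self-contained sketch in which the whole attempt (health check, endpoint resolution, connect) sits inside one retry loop, so a transient 502 after a successful health check is retried rather than fatal. This is not the remoting code: `ReconnectSketch`, `Attempt`, `retry` and `flakyEndpoint` are hypothetical names made up for this example, and the flaky endpoint only simulates a load balancer that sometimes hits a stale reverse proxy.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Jenkins remoting.
class ReconnectSketch {

    /** One full connection attempt: health check, endpoint resolution, and connect. */
    interface Attempt<T> {
        T run() throws IOException;
    }

    /** Runs the attempt up to maxAttempts times, sleeping delayMillis between failures. */
    static <T> T retry(Attempt<T> attempt, int maxAttempts, long delayMillis) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (IOException e) {
                // Treat the failure as transient and go back to waiting instead of aborting.
                last = e;
                if (i < maxAttempts) {
                    sleep(delayMillis);
                }
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }

    private static void sleep(long millis) throws IOException {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting to retry", ie);
        }
    }

    /** Simulated proxy pool: the first staleResponses requests fail with a 502. */
    static Attempt<String> flakyEndpoint(int staleResponses, AtomicInteger calls) {
        return () -> {
            if (calls.incrementAndGet() <= staleResponses) {
                throw new IOException("502 Bad Gateway");
            }
            return "connected";
        };
    }

    public static void main(String[] args) throws IOException {
        AtomicInteger calls = new AtomicInteger();
        String result = retry(flakyEndpoint(2, calls), 5, 10);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

With two simulated stale responses, `main` prints `connected after 3 attempts`. In a real fix the attempt limit and delay would presumably follow whatever the existing reconnect loop already uses; the values here are arbitrary.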