Jenkins / JENKINS-53431

Agent attempts reconnect after termination. Leaves Kube pod in failed state


    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Component/s: core, kubernetes-plugin
    • None

      About a month ago I noticed that I had many undeleted pods in Kubernetes. Jenkins did not report the jobs as failed, but the pods exited non-zero in Kubernetes. I'm not sure which version this started with, but it still happens on the latest.

       

      kubectl --namespace=jenkins get po -a
      NAME                        READY   STATUS    RESTARTS   AGE
      jenkins-75fd5fdbb-jkrmf     1/1     Running   0          53d
      jenkins-slave-jxk94-729sl   0/1     Error     0          3h
      jenkins-slave-jxk94-bq6kp   0/1     Error     0          25m
      jenkins-slave-jxk94-jrdgb   0/1     Error     0          35m
      jenkins-slave-jxk94-w0tdh   0/1     Error     0          2m
      

      This bug isn't critical since it's not failing my builds, but something is going wrong.
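To pick the affected pods out of a listing like the one above, a small shell helper along these lines might do. This is only a sketch: the function name `list_error_pods` is made up for illustration, and it assumes the default `kubectl get po` table layout, with the pod name in the first column and STATUS in the third.

```shell
#!/bin/sh
# Hypothetical helper: print the names of pods whose STATUS column is "Error".
# It reads `kubectl get po` table output on stdin, so it can be tried on
# pasted output without access to a cluster.
list_error_pods() {
  # NR > 1 skips the header row; $3 is the STATUS column, $1 the pod name.
  awk 'NR > 1 && $3 == "Error" { print $1 }'
}

# Possible use against a live cluster, with the command from this report:
#   kubectl --namespace=jenkins get po -a | list_error_pods
```

Reading from stdin rather than calling kubectl inside the function keeps the filter testable on saved output.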

       

      This is the log from the agent:

       

      Warning: JnlpProtocol3 is disabled by default, use JNLP_PROTOCOL_OPTS to alter the behavior
      Warning: SECRET is defined twice in command-line arguments and the environment variable
      Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
      Sep 05, 2018 3:57:44 PM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up agent: jenkins-slave-jxk94-6c6sd
      Sep 05, 2018 3:57:44 PM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Sep 05, 2018 3:57:44 PM hudson.remoting.Engine startEngine
      INFO: Using Remoting version: 3.25
      Sep 05, 2018 3:57:44 PM hudson.remoting.Engine startEngine
      WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
      Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins-discovery/]
      Sep 05, 2018 3:57:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Agent discovery successful
        Agent address: jenkins-discovery
        Agent port:    50000
        Identity:      96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
      Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins-discovery:50000
      Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Remote identity confirmed: 96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
      Sep 05, 2018 3:57:46 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Sep 05, 2018 3:57:48 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
      WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.envinject.EnvInjectComputerListener$2; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
      Sep 05, 2018 3:57:49 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
      WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.gitclient.Git$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
      Sep 05, 2018 3:57:51 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
      WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
      Sep 05, 2018 4:07:36 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      

      ^^ The agent terminates here because I aborted the build. The log then continues:

       

       

      Sep 05, 2018 4:07:46 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
      INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@581415ae
      Sep 05, 2018 4:07:49 PM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up agent: jenkins-slave-jxk94-6c6sd
      Sep 05, 2018 4:07:49 PM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Sep 05, 2018 4:07:49 PM hudson.remoting.Engine startEngine
      INFO: Using Remoting version: 3.25
      Sep 05, 2018 4:07:49 PM hudson.remoting.Engine startEngine
      WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
      Sep 05, 2018 4:07:49 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://jenkins-discovery/]
      Sep 05, 2018 4:07:49 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Agent discovery successful
        Agent address: jenkins-discovery
        Agent port:    50000
        Identity:      96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins-discovery:50000
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Remote identity confirmed: 96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
      Sep 05, 2018 4:07:50 PM org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer onRecv
      INFO: [JNLP4-connect connection to jenkins-discovery/10.55.243.214:50000] Local headers refused by remote: Unknown client name: jenkins-slave-jxk94-6c6sd
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Protocol JNLP4-connect encountered an unexpected exception
      java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: jenkins-slave-jxk94-6c6sd
      	at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
      	at hudson.remoting.Engine.innerRun(Engine.java:614)
      	at hudson.remoting.Engine.run(Engine.java:474)
      Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: jenkins-slave-jxk94-6c6sd
      	at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.newAbortCause(ConnectionHeadersFilterLayer.java:378)
      	at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.onRecvClosed(ConnectionHeadersFilterLayer.java:433)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
      	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:172)
      	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
      	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
      	at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1500(BIONetworkLayer.java:48)
      	at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:247)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
      	at java.lang.Thread.run(Thread.java:748)
      	Suppressed: java.nio.channels.ClosedChannelException
      		... 7 more
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to jenkins-discovery:50000
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Server reports protocol JNLP4-plaintext not supported, skipping
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Protocol JNLP3-connect is not enabled, skipping
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Server reports protocol JNLP2-connect not supported, skipping
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Server reports protocol JNLP-connect not supported, skipping
      Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener error
      SEVERE: The server rejected the connection: None of the protocols were accepted
      java.lang.Exception: The server rejected the connection: None of the protocols were accepted
      	at hudson.remoting.Engine.onConnectionRejected(Engine.java:675)
      	at hudson.remoting.Engine.innerRun(Engine.java:639)
      	at hudson.remoting.Engine.run(Engine.java:474)
      

      After being terminated, the agent attempts to reconnect but is refused with "Unknown client name". Is there a reason it attempts to reconnect at all, and then exits with an error when it cannot?

      If I launch a 4-job parallel pipeline and abort it, I end up with 5 pods in the Error state: one for the pipeline job itself and four for the sub-jobs. This can also happen when builds succeed.

      Is it normal for the agent to keep trying to reconnect after being terminated?
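To check whether a given pod failed with the sequence shown in the log above, a filter along these lines might help. It is a sketch only: the function name `reconnected_after_termination` is hypothetical, and the three patterns are copied from the log lines in this report, so other Remoting versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when an agent log on stdin shows, in order,
#   1. "INFO: Terminated"
#   2. "Restarting agent via" (the restarter kicking in)
#   3. "Unknown client name"  (the master refusing the reconnect)
# and exit 1 otherwise.
reconnected_after_termination() {
  awk '
    /INFO: Terminated/                   { terminated = 1 }
    terminated && /Restarting agent via/ { restarted = 1 }
    restarted && /Unknown client name/   { refused = 1 }
    END { exit (refused ? 0 : 1) }
  '
}

# Possible use, with a pod name taken from the listing in this report:
#   kubectl --namespace=jenkins logs jenkins-slave-jxk94-729sl \
#     | reconnected_after_termination && echo "reconnect after termination"
```

Requiring the lines in order avoids flagging a log where the refusal happened before any termination.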

       

       

       

       

        1. active.txt (3 kB)
        2. node_info (3 kB)
        3. pod_describe (4 kB)
        4. pod_log (7 kB)

            csanchez Carlos Sanchez
            mckaymatt Matt McKay
            Votes: 0
            Watchers: 2
