JENKINS-72997

Agent does not reconnect to controller after controller restart

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: core
    • Labels: None
    • Environment: Windows
      Jenkins version: Jenkins 2.440.2
      Java version: Amazon Corretto 21, Windows 64-bit

      Hi,

      We have a typical controller-agent setup in our environment. Both the controller and the agent run as Windows services under the same Active Directory user.

      When I start the controller first and then the agent, everything works as expected. But when I restart the controller (for example, for plugin updates), the agent does not come back.

      The agent talks to the controller over a fixed JNLP port. The agent log shows no errors, and the agent reports "Connected".

      On the controller side, we see the following in our logs:

       

      2024-04-11 09:15:10.445+0000 [id=150] INFO h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #2 from /147.54.218.112:56269
      2024-04-11 09:15:10.445+0000 [id=149] INFO h.TcpSlaveAgentListener$ConnectionHandler#run: Connection #1 from /147.54.218.112:56268 failed: null
      2024-04-11 09:15:11.513+0000 [id=153] WARNING j.u.ErrorLoggingExecutorService#lambda$wrap$0
      java.lang.NullPointerException: Cannot invoke "java.lang.Boolean.booleanValue()" because the return value of "java.util.Map.put(Object, Object)" is null
          at com.cloudbees.jenkins.support.impl.SlaveLaunchLogs$Jenkins72799Hack.preOnline(SlaveLaunchLogs.java:200)
          at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:720)
          at jenkins.slaves.DefaultJnlpSlaveReceiver.afterChannel(DefaultJnlpSlaveReceiver.java:176)
          at org.jenkinsci.remoting.engine.JnlpConnectionState.fire(JnlpConnectionState.java:337)
          at org.jenkinsci.remoting.engine.JnlpConnectionState.fireAfterChannel(JnlpConnectionState.java:428)
          at org.jenkinsci.remoting.engine.JnlpProtocol4Handler$Handler.lambda$onChannel$0(JnlpProtocol4Handler.java:336)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
          at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
          at java.base/java.lang.Thread.run(Thread.java:1583)
      2024-04-11 09:15:11.519+0000 [id=153] WARNING h.u.ExceptionCatchingThreadFactory#uncaughtException: Thread Computer.threadPoolForRemoting [#3] terminated unexpectedly
      java.lang.NullPointerException: Cannot invoke "java.lang.Boolean.booleanValue()" because the return value of "java.util.Map.put(Object, Object)" is null
          at com.cloudbees.jenkins.support.impl.SlaveLaunchLogs$Jenkins72799Hack.preOnline(SlaveLaunchLogs.java:200)
          at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:720)
          at jenkins.slaves.DefaultJnlpSlaveReceiver.afterChannel(DefaultJnlpSlaveReceiver.java:176)
          at org.jenkinsci.remoting.engine.JnlpConnectionState.fire(JnlpConnectionState.java:337)
          at org.jenkinsci.remoting.engine.JnlpConnectionState.fireAfterChannel(JnlpConnectionState.java:428)
          at org.jenkinsci.remoting.engine.JnlpProtocol4Handler$Handler.lambda$onChannel$0(JnlpProtocol4Handler.java:336)
          at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
          at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
          at java.base/java.lang.Thread.run(Thread.java:1583)
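
      For context on the exception message: this wording is what the JVM reports when the return value of Map.put is auto-unboxed to a primitive boolean and the key had no previous mapping, because put then returns null. The following is only a minimal sketch of that general Java pitfall, not the actual code at SlaveLaunchLogs.java:200; the class name, the map, and both method names are hypothetical and chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class MapPutUnboxingDemo {

    // Hypothetical stand-in for a per-agent flag map; NOT the plugin's real field.
    static final Map<String, Boolean> seen = new HashMap<>();

    // Unsafe: Map.put returns the PREVIOUS value, which is null for a new key.
    // Assigning that result to a primitive boolean auto-unboxes null and throws
    // a NullPointerException.
    static boolean unsafeMark(String agent) {
        boolean previous = seen.put(agent, Boolean.TRUE);
        return previous;
    }

    // Null-safe variant: compare against Boolean.TRUE instead of unboxing,
    // so a missing previous value is simply treated as false.
    static boolean safeMark(String agent) {
        return Boolean.TRUE.equals(seen.put(agent, Boolean.TRUE));
    }

    public static void main(String[] args) {
        try {
            unsafeMark("agent-a");
        } catch (NullPointerException e) {
            System.out.println("NPE on first put: " + e.getMessage());
        }
        System.out.println(safeMark("agent-b")); // false: no previous value
        System.out.println(safeMark("agent-b")); // true: previous value was TRUE
    }
}
```

      Under this assumption, the first registration of a key is exactly the case that fails, and the null-safe comparison avoids the exception without changing what is stored in the map.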
       
      

       

      This is the agent log:

       

      Apr 11, 2024 11:18:43 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
      INFO: Using D:\applications\www\jenkinsSlaveNG\remoting as a remoting work directory
      Apr 11, 2024 11:18:43 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
      INFO: Both error and output logs will be printed to D:\applications\www\jenkinsSlaveNG\remoting
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher createEngine
      INFO: Setting up agent: test-from-DINRj4fDGbJNmuj
      Apr 11, 2024 11:18:43 AM hudson.remoting.Engine startEngine
      INFO: Using Remoting version: 3206.vb_15dcf73f6a_9
      Apr 11, 2024 11:18:43 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
      INFO: Using D:\applications\www\jenkinsSlaveNG\remoting as a remoting work directory
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher$CuiListener status
      INFO: Locating server among [https://eb-conf-ci.siemens.net/jenkinsng/]
      Apr 11, 2024 11:18:43 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
      INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher$CuiListener status
      INFO: Agent discovery successful
        Agent address: eb-conf-ci.siemens.net
        Agent port: 49188
        Identity: 5f:2a:e9:52:1c:41:77:b9:4d:5c:cf:26:df:47:09:c6
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher$CuiListener status
      INFO: Handshaking
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher$CuiListener status
      INFO: Connecting to eb-conf-ci.siemens.net:49188
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher$CuiListener status
      INFO: Server reports protocol JNLP4-connect-proxy not supported, skipping
      Apr 11, 2024 11:18:43 AM hudson.remoting.Launcher$CuiListener status
      INFO: Trying protocol: JNLP4-connect
      Apr 11, 2024 11:18:43 AM org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run
      INFO: Waiting for ProtocolStack to start.
      Apr 11, 2024 11:18:44 AM hudson.remoting.Launcher$CuiListener status
      INFO: Remote identity confirmed: 5f:2a:e9:52:1c:41:77:b9:4d:5c:cf:26:df:47:09:c6
      Apr 11, 2024 11:18:44 AM hudson.remoting.Launcher$CuiListener status
      INFO: Connected
       
      

      If I restart the agent service after the controller is back up, the agent node comes online and works normally.

            Assignee: Unassigned
            Reporter: Thomas Wabner (waffel)
            Votes: 0
            Watchers: 1
