Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-59910

Nodes can't connect to master after Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread

    Details

    • Type: Bug
    • Status: Open (View Workflow)
    • Priority: Critical
    • Resolution: Unresolved
    • Component/s: core, kubernetes-plugin
    • Labels:
    • Environment:
      Docker image based on jenkins/jenkins:2.204.5-jdk11
      Both with and without Nginx 1.17.6 as reverse proxy
      Ubuntu 18.04
    • Similar Issues:

      Description

      Investigating a spike in builds queue size we've found out that TcpSlaveAgent listener thread was dead with the following logs:

      2019-10-23 09:02:17.236+0000 [id=200815]        SEVERE  h.TcpSlaveAgentListener$ConnectionHandler#lambda$new$0: Uncaught exception in TcpSlaveAgentListener ConnectionHandler Thread[TCP agent connection handler #1715 with /10.125.100.99:47700,5,main]
      java.lang.UnsupportedOperationException: Network layer is not supposed to call isSendOpen
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:730)
              at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
              at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.isSendOpen(SSLEngineFilterLayer.java:237)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
              at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
              at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.isSendOpen(ConnectionHeadersFilterLayer.java:514)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doSend(ProtocolStack.java:690)
              at org.jenkinsci.remoting.protocol.ApplicationLayer.write(ApplicationLayer.java:157)
              at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.start(ChannelApplicationLayer.java:230)
              at org.jenkinsci.remoting.protocol.ProtocolStack.init(ProtocolStack.java:201)
              at org.jenkinsci.remoting.protocol.ProtocolStack.access$700(ProtocolStack.java:106)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Builder.build(ProtocolStack.java:554)
              at org.jenkinsci.remoting.engine.JnlpProtocol4Handler.handle(JnlpProtocol4Handler.java:153)
              at jenkins.slaves.JnlpSlaveAgentProtocol4.handle(JnlpSlaveAgentProtocol4.java:203)
              at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271)
      2019-10-23 09:02:17.237+0000 [id=200815]        WARNING hudson.TcpSlaveAgentListener$1#run: Connection handler failed, restarting listener
      java.lang.UnsupportedOperationException: Network layer is not supposed to call isSendOpen
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:730)
              at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
              at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
              at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.isSendOpen(SSLEngineFilterLayer.java:237)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.isSendOpen(ProtocolStack.java:738)
              at org.jenkinsci.remoting.protocol.FilterLayer.isSendOpen(FilterLayer.java:340)
              at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.isSendOpen(ConnectionHeadersFilterLayer.java:514)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doSend(ProtocolStack.java:690)
              at org.jenkinsci.remoting.protocol.ApplicationLayer.write(ApplicationLayer.java:157)
              at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.start(ChannelApplicationLayer.java:230)
              at org.jenkinsci.remoting.protocol.ProtocolStack.init(ProtocolStack.java:201)
              at org.jenkinsci.remoting.protocol.ProtocolStack.access$700(ProtocolStack.java:106)
              at org.jenkinsci.remoting.protocol.ProtocolStack$Builder.build(ProtocolStack.java:554)
              at org.jenkinsci.remoting.engine.JnlpProtocol4Handler.handle(JnlpProtocol4Handler.java:153)
              at jenkins.slaves.JnlpSlaveAgentProtocol4.handle(JnlpSlaveAgentProtocol4.java:203)
              at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:271) 

      Followed by logs from nodes created by Jenkins Kubernetes Plugin:

      SEVERE: http://jenkins-master.example.com/ provided port:50000 is not reachable
      java.io.IOException: http://jenkins-master.example.com/ provided port:50000 is not reachable
              at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:287)
              at hudson.remoting.Engine.innerRun(Engine.java:523)
              at hudson.remoting.Engine.run(Engine.java:474)
       

      Changing JNLP port from 50000 to 50001 and back in Jenkins settings helped to restore connection and then nodes were able to connect to master again.

        Attachments

          Activity

          Hide
          oxygenxo Andrey Babushkin added a comment -

          It didn't occur after rollback to Jenkins 2.176.4

          Show
          oxygenxo Andrey Babushkin added a comment - It didn't occur after rollback to Jenkins 2.176.4
          Hide
          oxygenxo Andrey Babushkin added a comment -

          Nope it DID occur and still occur sometimes, we on Jenkins 2.204.5 now.

          Jesse Glick could you please give me some hints of how to debug this further to find the root cause?

          Show
          oxygenxo Andrey Babushkin added a comment - Nope it DID occur and still occur sometimes, we on Jenkins 2.204.5 now. Jesse Glick could you please give me some hints of how to debug this further to find the root cause?
          Hide
          jglick Jesse Glick added a comment -

          Sorry, I am not familiar with ProtocolStack really. You can try WebSocket mode to see if it behaves any better.

          Show
          jglick Jesse Glick added a comment - Sorry, I am not familiar with ProtocolStack really. You can try WebSocket mode to see if it behaves any better.
          Hide
          oxygenxo Andrey Babushkin added a comment -

          Thanks! I'll give it a shot.

          For future readers of this ticket Jesse referenced to WebSocket mode introduced in 2.217 https://www.jenkins.io/blog/2020/02/02/web-socket/

          Show
          oxygenxo Andrey Babushkin added a comment - Thanks! I'll give it a shot. For future readers of this ticket Jesse referenced to WebSocket mode introduced in 2.217  https://www.jenkins.io/blog/2020/02/02/web-socket/

            People

            • Assignee:
              Unassigned
              Reporter:
              oxygenxo Andrey Babushkin
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated: