Jenkins / JENKINS-41163

Launch retry not effective most of the time


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: ssh-slaves-plugin

      The connection retry feature is only activated when the launch fails with an IOException. Most of the intermittent issues (at least for provisioned nodes: No route to host, Connection refused, etc.) are signaled as IllegalStateException and are therefore not subject to retry.
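To illustrate the behaviour being asked for, here is a minimal sketch of a retry loop that treats IllegalStateException the same way as IOException. It is not the plugin's code: the class `LaunchRetrySketch`, the helpers `isRetryable` and `callWithRetry`, and their parameters are hypothetical names made up for this example.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch, not the ssh-slaves-plugin implementation.
final class LaunchRetrySketch {

    /** Assumption: both exception types signal a failure worth retrying. */
    static boolean isRetryable(Throwable t) {
        return t instanceof IOException || t instanceof IllegalStateException;
    }

    /** Calls the attempt up to maxAttempts times, waiting between retryable failures. */
    static <T> T callWithRetry(Callable<T> attempt, int maxAttempts, long waitMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                if (!isRetryable(e)) {
                    throw e; // anything else is reported immediately
                }
                last = e;
                if (i < maxAttempts) {
                    Thread.sleep(waitMillis);
                }
            }
        }
        throw last; // every attempt failed with a retryable error
    }
}
```

With a loop shaped like this, a launch that fails with "Connection is not established!" would be retried rather than reported on the first failure.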

      Also, the code seems to handle "Connection refused" only. I found these in my slave logs:

      Connection refused (Connection refused)
      Connection reset
      Connection timed out
      Connection timed out (Connection timed out)
      No route to host
      No route to host (Host unreachable)
      Premature connection close
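One possible way to recognise all of the messages above, rather than "Connection refused" only, is sketched below. The class `TransientFailureSketch` and the method `looksTransient` are hypothetical; matching on message prefixes is an assumption made for illustration, not how the plugin is known to classify failures.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: classify a failure by the messages seen in the logs.
final class TransientFailureSketch {

    // Lower-cased prefixes of the log messages listed above.
    private static final List<String> TRANSIENT_PREFIXES = Arrays.asList(
            "connection refused",
            "connection reset",
            "connection timed out",
            "no route to host",
            "premature connection close");

    /** Returns true if the throwable or any of its causes carries a known message. */
    static boolean looksTransient(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String msg = c.getMessage();
            if (msg == null) {
                continue;
            }
            String lower = msg.toLowerCase(Locale.ROOT);
            for (String prefix : TRANSIENT_PREFIXES) {
                if (lower.startsWith(prefix)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Prefix matching covers the variants with a parenthesised suffix, such as "No route to host (Host unreachable)", without listing each one separately.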
      

      Also, it seems that the problems are reported after the retry cycle. I am not familiar with the code base, but it feels strange to me that Connection.connect() completes and, right after that, an attempt to use the connection fails with No route to host. Note that the cause message is printed but not attached to the reported exception.

      [05/20/16 13:01:45] [SSH] Opening SSH connection to 192.168.1.36:22.
      No route to host
      ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
      java.lang.IllegalStateException: Connection is not established!
      at com.trilead.ssh2.Connection.getRemainingAuthMethods(Connection.java:1030)
      at com.cloudbees.jenkins.plugins.sshcredentials.impl.TrileadSSHPublicKeyAuthenticator.getRemainingAuthMethods(TrileadSSHPublicKeyAuthenticator.java:88)
      at com.cloudbees.jenkins.plugins.sshcredentials.impl.TrileadSSHPublicKeyAuthenticator.canAuthenticate(TrileadSSHPublicKeyAuthenticator.java:80)
      at com.cloudbees.jenkins.plugins.sshcredentials.SSHAuthenticator.newInstance(SSHAuthenticator.java:212)
      at com.cloudbees.jenkins.plugins.sshcredentials.SSHAuthenticator.newInstance(SSHAuthenticator.java:172)
      at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:1212)
      at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:711)
      at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:706)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      [05/20/16 13:01:48] Launch failed - cleaning up connection
      [05/20/16 13:01:48] [SSH] Connection closed.
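In the trace above, "No route to host" appears only as a separate printed line. A minimal sketch of attaching it as the cause of the reported exception could look like this; `CauseChainSketch` and `notEstablished` are hypothetical names, and the sketch assumes the underlying failure is available as an IOException at the point where the IllegalStateException is created.

```java
import java.io.IOException;

// Hypothetical sketch: keep the underlying network error in the exception chain.
final class CauseChainSketch {

    /** Wraps the connect failure so that it shows up as "Caused by:" in the stack trace. */
    static IllegalStateException notEstablished(IOException connectFailure) {
        return new IllegalStateException("Connection is not established!", connectFailure);
    }
}
```

With the cause attached, retry logic and log readers can both inspect `getCause()` instead of relying on a message printed separately.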
      

      [1] https://issues.jenkins-ci.org/browse/JENKINS-34100

            olivergondza Oliver Gondža