Jenkins / JENKINS-54679

SSHLauncher doesn't continue retrying to connect to remote executor

      SSHLauncher{host='10.50.10.252', port=22, credentialsId='aaf2ee5e-32bd-4675-9793-0570922f9c66', jvmOptions='', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=5, maxNumRetries=120, retryWaitTime=2, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.ManuallyTrustedKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
      [11/16/18 20:19:40] [SSH] Opening SSH connection to 10.50.10.252:22.
      Connection refused (Connection refused)
      SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 120 more retries left.
      Connection refused (Connection refused)
      SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 119 more retries left.
      Connection refused (Connection refused)
      SSH Connection failed with IOException: "Connection refused (Connection refused)", retrying in 2 seconds. There are 118 more retries left.
      ERROR: null
      java.util.concurrent.CancellationException
          at java.util.concurrent.FutureTask.report(FutureTask.java:121)
          at java.util.concurrent.FutureTask.get(FutureTask.java:192)
          at hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:904)
          at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:294)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      [11/16/18 20:19:45] Launch failed - cleaning up connection
      [11/16/18 20:19:45] [SSH] Connection closed.

       

      This happens whenever a new EC2 fleet instance is brought online. During this time cloud-init is still working its magic to install docker/openjdk and add the new Jenkins user (and its key). However, after the "Launch failed" error message there are no more retries and that slave is never contacted again, even though the slave comes online without issues if we manually press the button to reconnect.

       

      The log shows 118 retries still left, yet the launcher is completely dead in the water.
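
      One possible explanation, offered as a sketch rather than a diagnosis: the launch fails at 20:19:45, five seconds after the connection attempt started at 20:19:40, which matches launchTimeoutSeconds=5, and the CancellationException is thrown from FutureTask.get. If the whole retry loop runs as a single task under that overall timeout, the 120 x 2 s = 240 s retry budget can never be used. The Java example below reproduces that pattern with the standard ExecutorService.invokeAll timeout overload. The class and method names (RetryTimeoutSketch, retryBudgetMillis, attemptsBeforeCancel) are hypothetical and this is not the ssh-slaves plugin's code; the wait and timeout are scaled down from seconds to tens of milliseconds so it runs quickly.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a retry loop that is cut short by an overall launch timeout.
public class RetryTimeoutSketch {

    /** Time the retry loop would need if every attempt failed immediately. */
    public static long retryBudgetMillis(int maxNumRetries, long retryWaitMillis) {
        return (long) maxNumRetries * retryWaitMillis;
    }

    /**
     * Runs an always-failing connect loop as one task under an overall timeout
     * and returns how many attempts were made before the loop stopped.
     */
    public static int attemptsBeforeCancel(int maxNumRetries, long retryWaitMillis,
                                           long launchTimeoutMillis) {
        AtomicInteger attempts = new AtomicInteger();
        Callable<Boolean> connectLoop = () -> {
            for (int i = 0; i <= maxNumRetries; i++) {
                attempts.incrementAndGet();    // simulated "Connection refused"
                Thread.sleep(retryWaitMillis); // interrupted when the task is cancelled
            }
            return false;
        };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // invokeAll cancels every task that has not finished when the timeout expires.
            List<Future<Boolean>> results = executor.invokeAll(
                    Collections.singletonList(connectLoop),
                    launchTimeoutMillis, TimeUnit.MILLISECONDS);
            try {
                results.get(0).get();
            } catch (CancellationException e) {
                // Stopped by the timeout, not by running out of retries.
                System.out.println("Cancelled: " + e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        // Values from the report: maxNumRetries=120, retryWaitTime=2 s, launchTimeoutSeconds=5.
        System.out.println("Retry budget (ms): " + retryBudgetMillis(120, 2000L));
        // Same ratio, scaled down by 100 (20 ms wait, 50 ms timeout).
        System.out.println("Attempts made: " + attemptsBeforeCancel(120, 20L, 50L));
    }
}
```

      If this is indeed what happens, setting launchTimeoutSeconds above maxNumRetries x retryWaitTime (here, above 240 seconds) would be worth trying. That is an untested assumption, not a confirmed fix.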

      This used to work without issues on older versions of Jenkins; the failure only started recently.

       

      We are running Jenkins ver. 2.138.3 from the jenkinsci/blueocean docker image.

            terma Artem Stasiuk
            xistence Bert JW Regeer