  1. Jenkins
  2. JENKINS-59579

EC2 Plugin stops slave when build is running

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Not A Defect
    • Component/s: ec2-plugin, swarm-plugin
    • Labels:
      None
    • Environment:
      Jenkins 2.187
      Amazon EC2 1.44.1
      Swarm 3.13

      Description

      I have set up the connection between Jenkins and AWS via the Amazon EC2 plugin. Jenkins master cloud config: [image]

       

      The node connects via the Amazon EC2 plugin and then opens a second connection via the Swarm plugin, and the job ends up running on the connection made through Swarm. (This is because my jobs use TestComplete and FlaUI, and WinRM does not suit their requirements.)

       

      Jobs that take under 25 min run successfully; anything that runs longer than 25-26 min fails with the following:

       Slave log:

      12:49:46 java.io.IOException: Backing channel 'JNLP4-connect connection from 10.230.0.101/10.230.0.101:49724' is disconnected.
      12:49:46 	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
      12:49:46 	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
      12:49:46 	at com.sun.proxy.$Proxy89.isAlive(Unknown Source)
      12:49:46 	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1172)
      12:49:46 	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1164)
      12:49:46 	at hudson.Launcher$ProcStarter.join(Launcher.java:492)
      12:49:46 	at hudson.plugins.gradle.Gradle.performTask(Gradle.java:333)
      12:49:46 	at hudson.plugins.gradle.Gradle.perform(Gradle.java:225)
      12:49:46 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      12:49:46 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
      12:49:46 	at hudson.model.Build$BuildExecution.build(Build.java:206)
      12:49:46 	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
      12:49:46 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
      12:49:46 	at hudson.model.Run.execute(Run.java:1815)
      12:49:46 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
      12:49:46 	at hudson.model.ResourceController.execute(ResourceController.java:97)
      12:49:46 	at hudson.model.Executor.run(Executor.java:429)
      12:49:46 Caused by: java.nio.channels.ClosedChannelException
      12:49:46 	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
      12:49:46 	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
      12:49:46 	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
      12:49:46 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      12:49:46 	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
      12:49:46 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      12:49:46 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      12:49:46 	at java.lang.Thread.run(Thread.java:748)
      

      On the master's log I can see:

      Idle timeout of EC2 (Itiviti AWS) - Windows Jenkins node autoconnecting to deb-jenkins-prd using Swarm plugin (i-000908b57bb5d82a7) after 30 idle minutes, instance statusRUNNING
      Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout
      EC2 instance idle time expired: i-000908b57bb5d82a7
      Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
      Terminated EC2 instance (terminated): i-000908b57bb5d82a7
      Sep 30, 2019 8:40:46 AM INFO jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
      IOHub#1: Worker[channel:java.nio.channels.SocketChannel[connected local=/172.17.0.2:40440 remote=10.230.0.71/10.230.0.71:49735]] / Computer.threadPoolForRemoting [#85772] for ec2amaz-glc1084 terminated: java.nio.channels.ClosedChannelException
      Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute
      aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE
      Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
      Removed EC2 instance from jenkins master: i-000908b57bb5d82a7
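
      The ordering of events in the master log above can be checked mechanically. The following is a minimal sketch, assuming the log uses the two-line header/message layout shown in the excerpt; the function name `parse_events` and the regular expression are illustrative and not part of Jenkins or the EC2 plugin.

```python
import re
from datetime import datetime

# Excerpt copied from the master log in this report.
MASTER_LOG = """\
Sep 30, 2019 8:40:45 AM INFO hudson.plugins.ec2.EC2AbstractSlave idleTimeout
EC2 instance idle time expired: i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.plugins.ec2.EC2OndemandSlave terminate
Terminated EC2 instance (terminated): i-000908b57bb5d82a7
Sep 30, 2019 8:40:46 AM INFO hudson.model.Run execute
aws-ul-trader-extension-master-desk-uitests-listorders #22 main build action completed: FAILURE
"""

# Header line: timestamp, level, logger class, method (layout assumed from the excerpt).
HEADER = re.compile(
    r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) INFO (\S+) (\S+)$"
)


def parse_events(log_text):
    """Pair each header line with the message line that follows it."""
    events = []
    lines = log_text.splitlines()
    for header, message in zip(lines, lines[1:]):
        match = HEADER.match(header)
        if match:
            when = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
            events.append((when, match.group(2), match.group(3), message))
    return events


events = parse_events(MASTER_LOG)
idle = next(e for e in events if e[2] == "idleTimeout")
failed = next(e for e in events if e[3].endswith("FAILURE"))
gap_seconds = (failed[0] - idle[0]).total_seconds()
print(f"build failed {gap_seconds:.0f} s after idle timeout")
```

      On this excerpt the build failure is logged one second after the idle timeout fires, which suggests the failure follows from the instance termination rather than from the job itself.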
      

      After that period of time the slave is disconnected even though the build was running on it. Any help in tracking down the problem is much appreciated!
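
      One possible reading of the logs, offered as an assumption rather than a confirmed diagnosis: idleness is tracked per Jenkins node, so the node launched by the EC2 plugin looks idle while the build runs on the separate Swarm node hosted on the same instance. The sketch below models only that idea; `Node`, `should_terminate`, and the node names are hypothetical and do not reflect the plugin's actual classes or API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a Jenkins node; not the plugin's data model."""
    name: str
    instance_id: str
    busy_executors: int
    idle_minutes: int


def should_terminate(node, idle_limit_minutes):
    """Assumed rule: a node with no running builds is retired once it hits the idle limit."""
    return node.busy_executors == 0 and node.idle_minutes >= idle_limit_minutes


# Two Jenkins nodes backed by the same EC2 instance, as described in the report.
ec2_node = Node("ec2-node", "i-000908b57bb5d82a7", busy_executors=0, idle_minutes=30)
swarm_node = Node("swarm-node", "i-000908b57bb5d82a7", busy_executors=1, idle_minutes=0)

terminate = should_terminate(ec2_node, idle_limit_minutes=30)

# Terminating the instance also takes down any busy node that shares it.
collateral = [
    n.name
    for n in (swarm_node,)
    if terminate and n.instance_id == ec2_node.instance_id and n.busy_executors > 0
]
print(terminate, collateral)
```

      Under this assumption, the 30-minute idle limit shown in the master log applies to the EC2-launched node only, and the Swarm node's running build does not reset it.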

        Attachments

        1. Capture.PNG (56 kB)
        2. Capture2.PNG (44 kB)
        3. Capture3.PNG (53 kB)


            People

            • Assignee:
              thoulen FABRIZIO MANFREDI
              Reporter:
              gcimpoies George Cimpoies
            • Votes:
              0
              Watchers:
              1
