JENKINS-57795: Orphaned EC2 instances after Jenkins restart

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Component/s: ec2-plugin
    • Labels:
      None
    • Environment:
      Jenkins ver. 2.176.1
      ec2 plugin 1.43, 1.44, 1.45

      Description

      Sometimes after a Jenkins restart the plugin won't be able to spawn more agents.

      The plugin will just loop on this:

      SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker'}. Attempting to provision slave needed by excess workload of 1 units
      May 31, 2019 2:23:53 PM INFO hudson.plugins.ec2.EC2Cloud getNewOrExistingAvailableSlave
      SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker'}. Cannot provision - no capacity for instances: 0
      May 31, 2019 2:23:53 PM WARNING hudson.plugins.ec2.EC2Cloud provision
      Can't raise nodes for SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker'}
      

      If I go to the EC2 console and terminate the instance manually the plugin will spawn a new one and use it.

      It seems like there is some mismatch in the plugin logic. The part responsible for calculating the number of instances and checking the cap sees the EC2 instance. However the part responsible for picking up running EC2 instances doesn't seem to be able to find it.

      We use a single subnet, security group and VPC (I've seen some reports about this causing problems).

      We use the instanceCap = 1 setting as we are testing the plugin; this might make the problem more visible than with a higher cap.
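
One way to confirm that a controller is stuck in the loop described above is to count how often the "no capacity" message repeats per template. The following is a minimal Python sketch, not part of the plugin: the function names, the threshold of 3 and the repeated sample message are assumptions made for illustration; only the message format is taken from the log excerpt above.

```python
import re
from collections import Counter

# Pattern for the "no capacity" message quoted in the log excerpt above.
NO_CAPACITY = re.compile(
    r"SlaveTemplate\{ami='(?P<ami>[^']+)', labels='(?P<labels>[^']*)'\}\. "
    r"Cannot provision - no capacity for instances: (?P<free>-?\d+)"
)


def count_no_capacity(log_text):
    """Count 'no capacity' messages per (ami, labels) template."""
    hits = Counter()
    for line in log_text.splitlines():
        match = NO_CAPACITY.search(line)
        if match:
            hits[(match.group("ami"), match.group("labels"))] += 1
    return hits


def looping_templates(log_text, threshold=3):
    """Templates whose message repeats at least `threshold` times (threshold is arbitrary)."""
    counts = count_no_capacity(log_text)
    return sorted(key for key, n in counts.items() if n >= threshold)


# The message from the issue, repeated three times to imitate the loop.
MESSAGE = (
    "SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker'}. "
    "Cannot provision - no capacity for instances: 0"
)
SAMPLE_LOG = "\n".join([MESSAGE] * 3)

print(looping_templates(SAMPLE_LOG))
```

A template that shows up here while no matching agent is online in Jenkins would be a candidate for the orphaned-instance situation; the script itself cannot tell whether an agent is connected.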

          Activity

          jbochenski Jakub Bochenski created issue -
          jbochenski Jakub Bochenski made changes -
          Description: only the markup of the following sentence changed (/safeRestart moved from backticks to {{...}}); the rest matches the Description above.

          It seems the problems do not occur when I do a {{/safeRestart}} but they do if I use e.g. "restart Jenkins when no jobs are running" from the Update Center.
          jbochenski Jakub Bochenski added a comment -

          FABRIZIO MANFREDI it would be nice to at least get some pointers on how to debug this further or work around it

          jbochenski Jakub Bochenski added a comment -

          Raihaan Shouhell maybe you would care to respond?

          thoulen FABRIZIO MANFREDI added a comment -

          Can you tell me which version you are using?

          There is a bug in the calculation, but it should not affect your case.

          What is the configuration of your pool?

          Do you have more than one pool with the same description, AMI and tags?

          Can you try with 2?

          jbochenski Jakub Bochenski made changes -
          Description: the following sentence was removed; the rest matches the Description above.

          It seems the problems do not occur when I do a {{/safeRestart}} but they do if I use e.g. "restart Jenkins when no jobs are running" from the Update Center.
          jbochenski Jakub Bochenski added a comment - - edited

          This is happening at least since 1.43 and it just happened on 1.44

          I have only one EC2 cloud configured, but I also have an ECS cloud (they use separate agent labels).

          This is our cloud configuration done via groovy script:

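          // Imports this snippet relies on (assumed; they are not shown in the original comment).
          import com.amazonaws.services.ec2.model.InstanceType
          import hudson.model.Node
          import hudson.plugins.ec2.AmazonEC2Cloud
          import hudson.plugins.ec2.EC2Tag
          import hudson.plugins.ec2.SlaveTemplate
          import hudson.plugins.ec2.UnixData

          // `config` is the reporter's own settings map; its definition is not part of the issue.
          final cloud = new AmazonEC2Cloud(
                  'ec2',
                  false,
                  config.ec2_access_key,
                  config.ec2_region,
                  config.ec2_ssh_key,
                  config.ec2_instance_cap,
                  [
                          new SlaveTemplate(
                                  config.ec2_ami_id,
                                  '',
                                  null,
                                  config.ec2_security_groups,
                                  '/tmp',
                                  InstanceType.fromValue(config.ec2_instance_type),
                                  false,
                                  config.ec2_label,
                                  Node.Mode.NORMAL,
                                  "ec2 (${config.ec2_ami_id})",
                                  '',
                                  '/tmp',
                                  '',
                                  '1',
                                  config.ec2_remote_user,
                                  new UnixData(null, null, null, null),
                                  '',
                                  false,
                                  config.ec2_subnet_id,
                                  [
                                          Name: 'acme',
                                          Contact: 'acme@acme.com',
                                  ].collect { new EC2Tag(it.key, it.value) },
                                  '30',
                                  false,
                                  '',
                                  config.ec2_arn_role,
                                  true,
                                  false,
                                  false,
                                  '1800',
                                  false,
                                  '',
                                  false,
                                  false,
                                  false,
                                  false
                          )],
                  config.ec2_arn_role,
                  ''
          )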
          jbochenski Jakub Bochenski added a comment -

          > Can you try with 2?

          If I reproduce the issue with instance cap = 1, then increase the cap to 2 I will get a new agent spawned (but only 1)

          Now trying to reproduce this with 2 instances getting orphaned.

          I also tried setting instance cap on slave template to 2 (it was blank before) – doesn't seem to help

          jbochenski Jakub Bochenski added a comment -

          I'm now getting this situation with instance cap = 2. I have two matching instances on EC2, both are active.
          Plugin is looping with above message, with no agents available for the builds.

          Now when I terminated one of the instances an interesting thing happened. Jenkins was able to pick up the other instance and reconnect it

          SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}. Cannot provision - no capacity for instances: 0
          Jun 27, 2019 11:35:07 AM WARNING hudson.plugins.ec2.EC2Cloud provision
          Can't raise nodes for SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}
          Jun 27, 2019 11:35:16 AM INFO hudson.plugins.ec2.EC2Cloud provision
          SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}. Attempting to provision slave needed by excess workload of 1 units
          Jun 27, 2019 11:35:17 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}. Considering launching
          Jun 27, 2019 11:35:17 AM INFO hudson.plugins.ec2.SlaveTemplate setupRootDevice
          AMI had xvda
          Jun 27, 2019 11:35:17 AM INFO hudson.plugins.ec2.SlaveTemplate setupRootDevice
          {DeleteOnTermination: true,SnapshotId: snap-0b70f104d64ae4a48,VolumeSize: 8,VolumeType: gp2,Encrypted: false,}
          Jun 27, 2019 11:35:17 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}. Setting Instance Initiated Shutdown Behavior : ShutdownBehavior.Terminate
          Jun 27, 2019 11:35:17 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}. Looking for existing instances with describe-instance: {Filters: SNAP
          Jun 27, 2019 11:35:18 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          SlaveTemplate{ami='ami-0efbb291c6e8cc847', labels='docker docker-bakery'}. checkInstance: i-0e454aea630ccb88f. true - Instance is not connected to Jenkins

          jbochenski Jakub Bochenski added a comment -

          The above looks like there may be an "off by one" error, where the plugin won't attempt to reconnect instances if it's at the instance cap.

          jbochenski Jakub Bochenski added a comment -

          I double checked this.
          If I'm at cap = 1 with 1 orphaned instance and increase the cap to 2 then the plugin will spawn a new instance.
          If I'm at cap = 2 with 2 orphaned instances and terminate one of the instances manually then the plugin will reconnect the other instance
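
Both observations are consistent with a capacity gate that counts orphaned instances and is evaluated before any reconnect or launch is attempted. The Python sketch below models only that gate; it is a hypothesis written for illustration, not the plugin's code, and the function names are made up. What happens once the gate opens differed between the two cases (a new instance in the first, a reconnect in the second), so that part is deliberately left out.

```python
def free_capacity(cap, counted_instances):
    """Slots the cap check believes are free; orphaned instances are counted too."""
    return cap - counted_instances


def provisioning_path_reached(cap, counted_instances):
    """True when the cap check lets the plugin go on to reconnect or launch."""
    return free_capacity(cap, counted_instances) > 0


# (description, instance cap, instances counted in EC2) for the reported situations
OBSERVATIONS = [
    ("cap = 1, 1 orphaned instance", 1, 1),
    ("cap raised to 2, 1 orphaned instance", 2, 1),
    ("cap = 2, 2 orphaned instances", 2, 2),
    ("cap = 2, one orphan terminated manually", 2, 1),
]

for description, cap, counted in OBSERVATIONS:
    state = "proceeds" if provisioning_path_reached(cap, counted) else "no capacity"
    print(f"{description}: free = {free_capacity(cap, counted)} -> {state}")
```

In this model the stuck cases are exactly those where the free capacity is 0, which matches the "no capacity for instances: 0" message in the logs.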

          jbochenski Jakub Bochenski made changes -
          Environment: Jenkins ver. 2.176.1; ec2 plugin 1.43, 1.44
          jbochenski Jakub Bochenski added a comment - - edited

          FABRIZIO MANFREDI Raihaan Shouhell I know this is OSS and there is no SLA. Still, could you tell me if and when I can expect any assistance from you?

          thoulen FABRIZIO MANFREDI added a comment -

          I believe I found the problem; I am trying to get the fix into 1.44.2, which should be released in a couple of days.

          One more question: what do you mean by orphaned, in the stopped state or no longer in the Jenkins interface?

          Did you apply all the IAM role permissions requested on the ec2 plugin page?

          jbochenski Jakub Bochenski added a comment - - edited

          > One more question: what do you mean by orphaned, in the stopped state or no longer in the Jenkins interface?

          It's not available as an agent in Jenkins. It's still in the running state when I check the status in the AWS console.

          > Did you apply all the IAM role permissions requested on the ec2 plugin page?

          I believe I did. This is a random error; it doesn't happen every time. For example, the instances do get terminated after the idle timeout.

          jbochenski Jakub Bochenski added a comment -

          > I believe I found the problem; I am trying to get the fix into 1.44.2, which should be released in a couple of days.

          FABRIZIO MANFREDI it's been a month now and I can't see any new releases after 1.44.1. Any updates?

          thoulen FABRIZIO MANFREDI added a comment -

          Can you test 1.45?

          jbochenski Jakub Bochenski added a comment -

          FABRIZIO MANFREDI the same problem is happening with 1.45

          jbochenski Jakub Bochenski made changes -
          Environment (original value): Jenkins ver. 2.176.1; ec2 plugin 1.43, 1.44
          Environment (new value): Jenkins ver. 2.176.1; ec2 plugin 1.43, 1.44, 1.45
          raihaan Raihaan Shouhell added a comment -

          I'm not sure how to replicate this issue :/

          jbochenski Jakub Bochenski added a comment -

          Raihaan Shouhell I've provided the full configuration via a groovy script above. What else do you need?

          raihaan Raihaan Shouhell added a comment -

          I have made a cloud, set the instanceCap to 1 and restarted without running into the orphaning problem. Is there a way to reproduce this consistently from your end? Also what is the number of instances that get run in your setup?

          sirzic cedric lecoz added a comment - - edited

          Hi all,
          I upgraded our Jenkins last week; the ec2 plugin moved from 1.43 to 1.45.
          This issue had already been seen on 1.43, but rarely.
          On the new version I get at least one occurrence a day (the upgrade was core + plugins, everything was a couple of months old).

          Going through the logs to try to figure out what was happening before I found this ticket, I saw the following traces; I'm adding them here in case they help debug the problem.
          In all cases, my EC2 instance is started correctly, it's just that Jenkins doesn't see it.

          When it works:

          Sep 05, 2019 9:25:53 AM hudson.plugins.ec2.EC2Cloud provision
          INFO: SlaveTemplate{ami='ami-****', labels='build-yocto-persistent'}. Attempting provision finished, excess workload: -1
          Sep 05, 2019 9:25:53 AM hudson.plugins.ec2.EC2Cloud provision
          INFO: We have now 27 computers, waiting for 1 more
          Sep 05, 2019 9:25:53 AM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
          INFO: Started provisioning EC2 (ec2-project) - build-yocto-persistent from ec2-ec2-project with 2 executors. Remaining excess workload: -1
          INFO: SlaveTemplate{ami='ami-****', labels='build-yocto-persistent'} Node EC2 (ec2-project) - build-yocto-persistent (i-****) moved to RUNNING state in 5 seconds and is ready to be connected by Jenkins
          Sep 05, 2019 9:25:58 AM hudson.plugins.ec2.EC2Cloud log
          INFO: Launching instance: i-****
          Sep 05, 2019 9:25:58 AM hudson.plugins.ec2.EC2Cloud log
          Sep 05, 2019 9:25:58 AM hudson.plugins.ec2.EC2Cloud log
          INFO: Connecting to 10.1.0.234 on port 22, with timeout 10000.
          Sep 05, 2019 9:26:03 AM hudson.slaves.NodeProvisioner$2 run
          INFO: EC2 (ec2-project) - build-yocto-persistent provisioning successfully completed. We have now 27 computer(s)
          Sep 05, 2019 9:26:03 AM com.tsystems.sbs.LogFileFilterOutputStream <init>
          

          When it does not work:

          Sep 05, 2019 11:51:13 AM hudson.plugins.ec2.EC2Cloud provision
          INFO: SlaveTemplate{ami='ami-****', labels='build-yocto-persistent'}. Attempting provision finished, excess workload: -1
          Sep 05, 2019 11:51:13 AM hudson.plugins.ec2.EC2Cloud provision
          INFO: We have now 27 computers, waiting for 1 more
          Sep 05, 2019 11:51:13 AM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
          INFO: Started provisioning EC2 (ec2-project) - build-yocto-persistent from ec2-ec2-project with 2 executors. Remaining excess workload: -1
          Sep 05, 2019 11:51:13 AM hudson.plugins.ec2.EC2Cloud$1 call
          WARNING: SlaveTemplate{ami='ami-****', labels='build-yocto-persistent'}. Node stopped is neither pending, neither running, its {2}. Terminate provisioning
          Sep 05, 2019 11:51:14 AM hudson.plugins.repo.ChangeLog saveChangeLog
          INFO: No logs found
          

          In that case the "Node stopped is neither pending, neither running ..." trace appeared in less than a second, instead of the 5 seconds seen when there is no problem.

          Another observation comes from my CloudTrail logs.
          When it works, I can see the following calls to AWS:

            - 09:25:53 StartInstances, using the i-**** instance ID.
            - 09:25:53 DescribeInstances, using the i-**** instance ID, as seen in the following requestParameters:
              "requestParameters": {
                  "instancesSet": {
                      "items": [
                          {
                              "instanceId": "i-****"
                          }
                      ]
                  },
                  "filterSet": {}
              },
            - 09:25:55 CreateGrant (for decryption)
            - 09:25:58 DescribeInstances, using the i-**** instance ID, as seen in the requestParameters above.
            - ...
          

          when it does not work:

            - 11:51:13 StartInstances, using the i-**** instance ID.
            - 11:51:14 DescribeInstances, using the i-**** instance ID, as seen in the requestParameters above.
            - 11:51:15 CreateGrant (for decryption)
            - 11:51:19 DescribeInstances with empty parameters, as seen in the following requestParameters:
              "requestParameters": {
                  "instancesSet": {},
                  "filterSet": {}
              },
          

          I did a bit more testing: every time I reproduce the issue, there is no correct DescribeInstances call (one with an instanceId) after the first one.
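
The CloudTrail pattern above can be checked mechanically. The Python sketch below flags DescribeInstances events whose request carried no instance IDs. It assumes events in the usual CloudTrail record shape (`eventName`, `eventTime`, `requestParameters`, wrapped in a top-level `Records` list); the function name and the sample records are hypothetical, and the sample times are the clock times from the failing sequence rather than real ISO 8601 `eventTime` values.

```python
import json


def describe_calls_without_ids(records):
    """Return DescribeInstances events whose request named no instance IDs."""
    flagged = []
    for event in records:
        if event.get("eventName") != "DescribeInstances":
            continue
        params = event.get("requestParameters") or {}
        items = (params.get("instancesSet") or {}).get("items") or []
        if not items:
            flagged.append(event)
    return flagged


# Illustrative records modelled on the failing sequence reported above.
SAMPLE_TRAIL = json.loads("""
{"Records": [
  {"eventName": "StartInstances", "eventTime": "11:51:13",
   "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-****"}]}}},
  {"eventName": "DescribeInstances", "eventTime": "11:51:14",
   "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-****"}]},
                         "filterSet": {}}},
  {"eventName": "DescribeInstances", "eventTime": "11:51:19",
   "requestParameters": {"instancesSet": {}, "filterSet": {}}}
]}
""")

for event in describe_calls_without_ids(SAMPLE_TRAIL["Records"]):
    print(event["eventTime"], event["eventName"], "without instance IDs")
```

An unfiltered DescribeInstances call is legitimate in general, so a hit is only a pointer for manual review; it becomes interesting when it directly follows a StartInstances call for an instance that never attaches.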

           
          EDIT: Like Jakub in the next comment, I reproduce the issue on instances with cap = 1. Those instances have specifics such as a second drive, so the cap needs to be 1.
          On generic instances with cap > 1 I haven't seen the problem.
           

          All the best,

          Cedric

          Show
          jbochenski Jakub Bochenski added a comment -

          Is there a way to reproduce this consistently from your end? Also what is the number of instances that get run in your setup?

          It happens quite often after restart but I have no way to reproduce it 100%.

          Maybe it is a hint that the instance-counting logic sees the EC2 machine (since it says there is no capacity), but the logic that attaches the node can't connect to it for some reason?

          Also what is the number of instances that get run in your setup?

          I'm not sure if I understand. The instance cap is 1 so we have at most 1 instance.

          raihaan Raihaan Shouhell added a comment -

          From this log
          ```
          WARNING: SlaveTemplate{ami='ami-****', labels='build-yocto-persistent'}. Node stopped is neither pending, neither running, its {2}. Terminate provisioning
          ```
          It says that the node has been stopped. By the way, are you using on-demand slaves or spot instances?

          sirzic cedric lecoz made changes -
          Attachment jenkins_201909121030.log [ 48722 ]
          sirzic cedric lecoz added a comment -

          Hi Raihaan Shouhell,
          That's what it says, but the EC2 instance was alive: I could ssh in and work away; it's just that Jenkins was not aware of it. My instances are on-demand.
          I attached jenkins_201909121030.log, a log with a bit more data. Two different EC2 instances had the issue (or a similar one). The first one (aws-audit-ec2) had an EC2 instance that was running and then terminated an hour before (so it was still showing as terminated in my EC2 console). For the second one, the EC2 instance already existed and was just stopped. I tried to clean up the log as best I could, but I have too many jobs running on other EC2 instances, so it's noisy.
          BR,
          Cedric.

          raihaan Raihaan Shouhell added a comment -

          cedric lecoz Jakub Bochenski could someone test https://ci.jenkins.io/job/Plugins/job/ec2-plugin/job/PR-397/2/artifact/org/jenkins-ci/plugins/ec2/1.46-rc1050.a8a95e8dd7f5/ec2-1.46-rc1050.a8a95e8dd7f5.hpi and see if this issue still occurs? This build retries on missing instances instead of giving up immediately.
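
To illustrate the idea of retrying on a missing instance, here is a minimal sketch. It is not the plugin's actual code: the class `MissingInstanceRetry`, the method `lookupWithRetry`, and its parameters are hypothetical names, and the real EC2 lookup is replaced by a plain `Supplier` so that the example stays self-contained.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not taken from the ec2 plugin.
final class MissingInstanceRetry {

    private MissingInstanceRetry() {
    }

    /**
     * Calls the lookup up to maxAttempts times and returns the first
     * non-empty result. Sleeps delayMillis between attempts. Returns an
     * empty Optional if every attempt came back empty or the thread was
     * interrupted while waiting.
     */
    static <T> Optional<T> lookupWithRetry(Supplier<Optional<T>> lookup,
                                           int maxAttempts,
                                           long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = lookup.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

A caller would wrap whatever performs the describe call in the supplier and treat an empty result after the last attempt as "instance really is gone".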

          sirzic cedric lecoz added a comment -

          Hi Raihaan Shouhell,
          I updated to that new version this morning; the few tests I did were OK. I added a job that will start / destroy / start an EC2 instance using the plugin every 15 min, and asked the team to ping me if they see the problem happen. If we don't see the problem, I will try to update this ticket by next Thursday.
          BR,
          Cedric.

          sirzic cedric lecoz made changes -
          Attachment jenkins.temp_dsl.log [ 48747 ]
          sirzic cedric lecoz added a comment -

          Hi Raihaan Shouhell,

          I reproduced it twice this morning and attached one of the logs, jenkins.temp_dsl.log.
          Plugin manager still shows 1.46-rc1050.a8a95e8dd7f5 for the EC2 plugin.

          C.

          raihaan Raihaan Shouhell made changes -
          Attachment ec2.hpi [ 48761 ]
          raihaan Raihaan Shouhell added a comment - - edited

          [^ec2.hpi] cedric lecoz

          For your latest issue, the linked HPI should solve it. The issue you seem to be seeing occurs when starting from a stopped instance: due to the eventual consistency of the AWS APIs, the plugin occasionally sees a freshly started instance as stopped. I added a retry for newly started instances to deal with this.
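
As a rough sketch of that kind of retry (again not the plugin's real implementation), the check below re-reads the instance state while it still reports `stopped`. The class `StartedInstanceStateCheck`, the method `awaitNotStopped`, and the use of plain state strings are assumptions made for illustration; the state source is a `Supplier` standing in for a describe call.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not taken from the ec2 plugin.
final class StartedInstanceStateCheck {

    private StartedInstanceStateCheck() {
    }

    /**
     * Reads the state at most maxAttempts times, waiting delayMillis between
     * reads, and returns the first state that is not "stopped". If every read
     * reports "stopped" (or the thread is interrupted), the last state read
     * is returned so the caller can decide to give up.
     */
    static String awaitNotStopped(Supplier<String> describeState,
                                  int maxAttempts,
                                  long delayMillis) {
        String state = describeState.get();
        for (int attempt = 1; attempt < maxAttempts && "stopped".equals(state); attempt++) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report what we last saw.
                Thread.currentThread().interrupt();
                return state;
            }
            state = describeState.get();
        }
        return state;
    }
}
```

The point of the design is that a single stale `stopped` answer right after a start request is not treated as final; only a `stopped` answer that persists across all attempts is.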

          sirzic cedric lecoz added a comment -

          OK, thanks. I will try ASAP, but that may not be before the weekend; Jenkins is slightly too busy during the week.

          sirzic cedric lecoz added a comment - - edited

          Hi Raihaan Shouhell,
          Is the ec2.hpi plugin you attached here the same one that was built by https://github.com/jenkinsci/ec2-plugin/pull/398 ?
          It's easier to add to my CI environment (automated) when the plugin comes directly from ci.jenkins.io, and easier to track too.

          I am asking because it does not look like PR-398 includes what I tested from PR-397.

          Thanks,
          C/

          raihaan Raihaan Shouhell made changes -
          Attachment ec2.hpi [ 48761 ]
          raihaan Raihaan Shouhell added a comment -

          cedric lecoz Yes, it is. I attached it directly because CI was struggling to build it yesterday. I have removed the attachment.

          sirzic cedric lecoz made changes -
          sirzic cedric lecoz added a comment -

          Hi Raihaan Shouhell,
          Using the 1.46-rc1050.43f9773eed95 plugin, I reproduced the issue when starting a new EC2 instance after the previous one was terminated; see the attached log start_fresh_1.46-rc1050.43f9773eed95.txt. (This is the case I believed was fixed in PR-397.)

          The issue when starting from a stopped slave has not yet been reproduced.

          BR,
          Cedric.

          sirzic cedric lecoz added a comment -

          Hi,
          Status update: since last week I have not reproduced the issue when starting a stopped instance.
          I have reproduced the issue a dozen times when the previous instance was terminated.

          I just saw there was a new PR-399 build (1.46-rc1052.8c6d855421ac) associated with this ticket, so I pushed it to our Jenkins. I will keep you updated.
          C/

          sirzic cedric lecoz added a comment -

          Same problem (starting a new instance when the previous instance has been terminated), seen on PR-399.
          C/

          jbochenski Jakub Bochenski added a comment - - edited

          I'm still seeing issues in 1.45. The instance is in the running state, but the plugin doesn't seem to be able to connect to it.

          I can't test it on 1.46 because of JENKINS-59564

          Sep 30, 2019 9:32:03 AM INFO hudson.plugins.ec2.EC2Cloud provision
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Attempting to provision slave needed by excess workload of 1 units
          
          Sep 30, 2019 9:32:04 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Considering launching
          
          Sep 30, 2019 9:32:04 AM INFO hudson.plugins.ec2.SlaveTemplate setupRootDevice
          
          AMI had xvda
          
          Sep 30, 2019 9:32:04 AM INFO hudson.plugins.ec2.SlaveTemplate setupRootDevice
          
          {DeleteOnTermination: true,SnapshotId: snap-0f2d5ab1c6f918116,VolumeSize: 8,VolumeType: gp2,Encrypted: false,}
          
          Sep 30, 2019 9:32:04 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Setting Instance Initiated Shutdown Behavior : ShutdownBehavior.Terminate
          
          Sep 30, 2019 9:32:04 AM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Looking for existing instances with describe-instance: {Filters: [{Name: image-id,Values: [ami-02769bd03e603e42f]}, {Name: instance-type,Values: [t3.micro]}, {Name: key-name,Values: [j4a-bochja]}, {Name: subnet-id,Values: [subnet-0a167eb56a247e891]}, {Name: instance.group-id,Values: [sg-0a4f5e5ac5bb602e4]}, {Name: tag:Name,Values: [ew1-j4a-jenkins-slave-ec2]}, {Name: tag:DeploymentName,Values: [ew1-j4a]}, {Name: tag:CostCenter,Values: [31505]}, {Name: tag:DeploymentType,Values: [dev]}, {Name: tag:DeploymentGroup,Values: [ew1-j4a]}, {Name: tag:jenkins_server_url,Values: [https://acme.com/]}, {Name: tag:jenkins_slave_type,Values: [demand_ec2 (ami-02769bd03e603e42f)]}],InstanceIds: [],}
          
          Sep 30, 2019 9:32:05 AM INFO hudson.plugins.ec2.CloudHelper getInstance
          
          Unexpected number of reservations reported by EC2 for instance id 'i-0571c7e8c36a6b783', expected 1 result, found []. Instance seems to be dead.
          
          Sep 30, 2019 9:32:05 AM WARNING hudson.plugins.ec2.EC2Cloud provision
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Exception during provisioning
          com.amazonaws.AmazonClientException: Unexpected number of reservations reported by EC2 for instance id 'i-0571c7e8c36a6b783', expected 1 result, found []. Instance seems to be dead.
          	at hudson.plugins.ec2.CloudHelper.getInstance(CloudHelper.java:54)
          	at hudson.plugins.ec2.CloudHelper.getInstanceWithRetry(CloudHelper.java:25)
          	at hudson.plugins.ec2.EC2AbstractSlave.fetchLiveInstanceData(EC2AbstractSlave.java:566)
          	at hudson.plugins.ec2.EC2AbstractSlave.<init>(EC2AbstractSlave.java:165)
          	at hudson.plugins.ec2.EC2OndemandSlave.<init>(EC2OndemandSlave.java:56)
          	at hudson.plugins.ec2.SlaveTemplate.newOndemandSlave(SlaveTemplate.java:1104)
          	at hudson.plugins.ec2.SlaveTemplate.toSlaves(SlaveTemplate.java:773)
          	at hudson.plugins.ec2.SlaveTemplate.provisionOndemand(SlaveTemplate.java:745)
          	at hudson.plugins.ec2.SlaveTemplate.provisionOndemand(SlaveTemplate.java:585)
          	at hudson.plugins.ec2.SlaveTemplate.provision(SlaveTemplate.java:540)
          	at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:589)
          	at hudson.plugins.ec2.EC2Cloud.provision(EC2Cloud.java:615)
          	at hudson.slaves.NodeProvisioner$StandardStrategyImpl.apply(NodeProvisioner.java:715)
          	at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:320)
          	at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:62)
          	at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:807)
          	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:72)
          	at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          	at java.lang.Thread.run(Thread.java:748)
          
          Sep 30, 2019 9:32:13 AM INFO hudson.plugins.ec2.EC2Cloud provision
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Attempting to provision slave needed by excess workload of 1 units
          
          Sep 30, 2019 9:32:14 AM INFO hudson.plugins.ec2.EC2Cloud getNewOrExistingAvailableSlave
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Cannot provision - no capacity for instances: 0
          
          Sep 30, 2019 9:32:14 AM WARNING hudson.plugins.ec2.EC2Cloud provision
          
          Can't raise nodes for SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}
          
          Sep 30, 2019 9:32:23 AM INFO hudson.plugins.ec2.EC2Cloud provision
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Attempting to provision slave needed by excess workload of 1 units
          
          Sep 30, 2019 9:32:24 AM INFO hudson.plugins.ec2.EC2Cloud getNewOrExistingAvailableSlave
          
          SlaveTemplate{ami='ami-02769bd03e603e42f', labels='docker docker-bakery'}. Cannot provision - no capacity for instances: 0
          

            People

            • Assignee:
              thoulen FABRIZIO MANFREDI
              Reporter:
              jbochenski Jakub Bochenski
            • Votes:
              0
              Watchers:
              5
