  1. Jenkins
  2. JENKINS-55556

EC2 unable to retrieve private IP (+ other buggy behaviour)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Component/s: ec2-plugin
    • Labels:
      None
    • Environment:
      docker jenkins/jenkins:lts-slim
      ec2-plugin 1.42

      Description

      I must say any version > 1.39 seems ultra buggy for our use case (ec2 spot instances @ c4.xlarge):

       

      • when launching manually, the plugin cannot pick up the private IP and tries to connect to "null:22" endlessly
      • when launched "on-demand", the plugin will launch 10+ spot instances, none of which will work

       

      Downgrading to 1.39 makes the plugin work on the exact same setup. Curious why AWS is not stepping in to give this plugin some love...


          Activity

          thoulen FABRIZIO MANFREDI added a comment -

          Hi,

          For the first problem, can you share more details on your configuration? How are your AWS network and your nodes configured?

          Version 1.42 should not have changed the connection logic (the next version has some improvements).

          For the second problem, can you share any error message? Did you update the IAM roles?

          lifeofguenter Gunter Grodotzki added a comment -

          When downgrading from 1.42 to 1.39 everything works with the exact same setup (no changes, just downgrading by manually uploading the hpi).

          The second problem is a result of the first problem, but for some reason it would spin up 10+ instances almost instantly (could be that new setting "launch new instances right away"?) - the error was the same, that is unable to connect to "null:22"

          Setup:

          • VPC with private/public subnets - private subnets outgoing via natgw (https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/1.51.0)
          • jenkins nodes are launched from a custom AMI (debian + openjdk8/encrypted boot) with their own custom instance-profile
          • jenkins nodes are launched in private subnet with no public ip - a security group that explicitly only gives the jenkins-master access via port 22

           

          Hope this info helps, else let me know  

          thoulen FABRIZIO MANFREDI added a comment -

          A couple of questions :

          • did you update the IAM policy attached to the user/role used to launch new nodes? (
            ...
            "Effect": "Allow",
            "Action": [
            "iam:ListInstanceProfilesForRole",
            "iam:PassRole"
            ],)
          • Does the Jenkins master have a public IP?
          • Are the Jenkins master and slaves in the same VPC and subnet?
          • What is the configuration of the ec2-plugin (use public DNS, ...), and which options are enabled?

           

          As for the number of nodes started, the plugin is now much more "reactive" to the status of the queue.

           

          joshuaspence Joshua Spence added a comment - - edited

          Hitting this issue as well. I was able to work around it by not using spot instances.

          lifeofguenter Gunter Grodotzki added a comment -

          FABRIZIO MANFREDI

          • yes, those IAM actions are in place
          • jenkins master is with private IP but reachable over a public IP behind an ALB
          • jenkins master and nodes are in the same VPC, private subnets, but might be in different subnet-ids depending on launched AZ
          • public-dns is not enabled

           

          Joshua Spence might be correct, we are launching spot ec2 c4.xlarge - so might be an issue with spot instances

          shaun Shaun Lawrie added a comment - - edited

          I have the same symptoms: the private IPs of spot instances are not identified by the ec2-plugin, even though ours also have public IPs assigned in the configuration.

          I downgraded to 1.39 so it behaves in the meantime.

          herophuong Phuong Le added a comment -

          We are using spot instances too. Configuring the master to connect to slaves does not work with either the public IP or the private IP. We always get:

           

          Jan 21, 2019 6:41:58 AM hudson.plugins.ec2.EC2Cloud
          INFO: Failed to connect via ssh: There was a problem while connecting to null:22

           

          The plugin has indeed been unusable for the spot instance use case since 1.40, with a different error in each version.
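
          To gauge how often this failure occurs, the log can be searched for the "null:22" message quoted above. This is only a minimal sketch: it assumes the Jenkins log is available as a plain-text file, and the function name count_null_host_failures is hypothetical, not part of Jenkins or the ec2-plugin.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not part of the ec2-plugin):
# count log lines that report a connection attempt to "null:22".
# $1: path to a Jenkins log file (its location varies by installation).
count_null_host_failures() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so callers should not treat that as an error.
    grep -c 'problem while connecting to null:22' "$1"
}
```

          It would be called with the log path as its only argument, for example count_null_host_failures /path/to/jenkins.log (the path here is a placeholder).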

           

          vdczzz Victor Chavez added a comment -

          On Jenkins 2.150.2 EC2 plugin 1.42 we get this behavior intermittently. We oscillate between this error and the error in JENKINS-55639.

          With the null:22 error we additionally get the horrible side effect that the script continues to connect to the master node and runs the init script. Just by chance the init script I had in place wasn't destructive to the master. Now I have a little addition to my init script at the top:

          master=jenkins-master
          host=$(hostname -s)
          # Quote both sides and use POSIX "=" so the test also works under plain sh
          if [ "$host" = "$master" ]
          then
           echo "This is the master node! Exiting!!!"
           exit 1
          else
           echo "Server is a spot node, apparently. Better yet, it's NOT the master node. Continuing..."
          fi
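
          The same guard can be wrapped in functions so it is easier to reuse across init scripts. This is a sketch only: it assumes the master's short hostname is jenkins-master, as in the script above, and the names MASTER_HOST, is_master_host and guard_init_script are hypothetical.

```shell
#!/bin/sh
# Assumption: the master's short hostname; override via the environment if it differs.
MASTER_HOST="${MASTER_HOST:-jenkins-master}"

# Hypothetical helper: succeeds when the given short hostname is the master's.
is_master_host() {
    [ "$1" = "$MASTER_HOST" ]
}

# Hypothetical helper: returns 1 (with a message on stderr) when run on the master.
# $1: short hostname to check; defaults to this machine's hostname.
guard_init_script() {
    host="${1:-$(hostname -s)}"
    if is_master_host "$host"; then
        echo "Refusing to run init script on master ($host)" >&2
        return 1
    fi
    echo "Host $host is not the master; continuing"
}
```

          Placing guard_init_script || exit 1 at the top of an init script would stop it before any further commands run on the master.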
          thoulen FABRIZIO MANFREDI added a comment -

          Version 1.43 will add an option to specify how to connect to the slave (private or public IP).

          eric_knecht Eric Knecht added a comment -

          Is there an estimate of when 1.43 will be released?

          thoulen FABRIZIO MANFREDI added a comment -

          1.43 has been released


            People

            • Assignee:
              thoulen FABRIZIO MANFREDI
              Reporter:
              lifeofguenter Gunter Grodotzki
            • Votes:
              4
              Watchers:
              13
