  1. Jenkins
  2. JENKINS-55483

ec2-plugin Wrong order of workers termination

    Details

    • Released As:
      1.45

      Description

      Dear colleagues, we found an issue when using the ec2-plugin. The problem appears when an AWS spot instance (Jenkins slave) is being terminated because of the "Idle termination timeout".
      The EC2 plugin first tries to cancel and terminate the AWS spot worker (slave), and only after that removes the node from Jenkins.

      The AWS instance cancel and termination process takes a while, and during this time Jenkins can try to build a new job on this, let's say, "available" node. The job fails because the node is already in the terminating state within AWS.

      A better way to handle node termination would be to put the node offline first, and only then cancel and remove it from AWS:

      1. "put the node offline or disconnect" (I don't know the exact method)
      2. ec2.cancelSpotInstanceRequests(...)
      3. ec2.terminateInstances(...)
      4. Jenkins.getInstance().removeNode(...)
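
      The ordering above can be sketched in a self-contained way. This is only an illustration of the proposed sequence, not the plugin's actual code: the `JenkinsSide` and `Ec2Side` interfaces, their method names, and all IDs are hypothetical stand-ins. In a real plugin, step 1 might map to something like `Computer.setAcceptingTasks(false)` or `Computer.setTemporarilyOffline(...)`, but that choice is an assumption and is not confirmed by this issue.

```java
import java.util.ArrayList;
import java.util.List;

class TerminationOrderSketch {

    /** Hypothetical stand-in for the Jenkins-side operations (not the real Jenkins API). */
    interface JenkinsSide {
        void takeOffline(String nodeName); // step 1: stop scheduling builds on the node
        void removeNode(String nodeName);  // step 4: drop the node from Jenkins
    }

    /** Hypothetical stand-in for the two EC2 calls named in the issue. */
    interface Ec2Side {
        void cancelSpotInstanceRequest(String spotRequestId); // step 2
        void terminateInstance(String instanceId);            // step 3
    }

    /** Runs the four steps in the order proposed in the description. */
    static void terminateIdleSpotAgent(JenkinsSide jenkins, Ec2Side ec2,
                                       String nodeName, String spotRequestId, String instanceId) {
        jenkins.takeOffline(nodeName);                // no new job can land on the node from here on
        ec2.cancelSpotInstanceRequest(spotRequestId); // slow AWS calls happen while the node is offline
        ec2.terminateInstance(instanceId);
        jenkins.removeNode(nodeName);
    }

    /** Records the call order using in-memory fakes; all names and IDs are made up. */
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        JenkinsSide jenkins = new JenkinsSide() {
            public void takeOffline(String n) { calls.add("offline:" + n); }
            public void removeNode(String n) { calls.add("removeNode:" + n); }
        };
        Ec2Side ec2 = new Ec2Side() {
            public void cancelSpotInstanceRequest(String id) { calls.add("cancelSpot:" + id); }
            public void terminateInstance(String id) { calls.add("terminate:" + id); }
        };
        terminateIdleSpotAgent(jenkins, ec2, "spot-agent-1", "sir-example", "i-example");
        return calls;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

      The point of the sketch is only that the offline step comes before any AWS call, so the window in which Jenkins still sees the node as available is closed before the slow cancel and terminate requests start.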

      Please see attached job.log file where you can see end of failed job.

        Attachments

          Activity

          thoulen FABRIZIO MANFREDI added a comment -

          Hi, are you sure that the spot instance has not been retired by AWS?

          I have to check the spot instance code, because it is a bit different from the on-demand code, but it should not be possible to assign any new jobs to a node that has reached the idle timeout.

          polanjir Jiri Polansky added a comment (edited)

          The spot instance was terminated by the ec2 plugin (Event name: CancelSpotInstanceRequests). The node was still available for a moment between the spot instance termination and the node's removal from Jenkins. Therefore a new job was assigned to it.


            People

            • Assignee:
              thoulen FABRIZIO MANFREDI
              Reporter:
              polanjir Jiri Polansky
            • Votes:
              1
              Watchers:
              3
