JENKINS-56514

Triggering many (100+) concurrent builds causes some builds to not get parameters set

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Labels:
      None
    • Environment:

      Description

      Some job runs are not passed the parameters they should receive, and they fail.

      This seems to happen more often when 100 or more job runs are triggered concurrently by a single trigger, and sometimes with as few as 50.

      The failed jobs appear to get an executor on a worker and start running, but they fail because the parameters they should have received are missing, so the processing that expects valid parameter values fails.

      To test this, I set up two Pipeline (Groovy) jobs. The content of both jobs is attached.

       Parent job

      • runs on a worker provisioned by the GCE plugin
      • is triggered with some parameters, including one that tells it how many instances of the child job to trigger
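
      For readers without access to the attachments, here is a rough sketch of what a parent job of this shape could look like as a scripted Pipeline. It is not the attached parent-job-example.groovy. The job name `child-job`, the label `gce-worker`, and the parameter names `CHILD_COUNT`, `GCS_BUCKET`, `GCS_OBJECT` and `RUN_INDEX` are all hypothetical.

```groovy
// Hypothetical parent job (scripted Pipeline); a sketch, not the attached script.
// Assumed names: job 'child-job', label 'gce-worker', and all parameter names.
properties([
    parameters([
        string(name: 'CHILD_COUNT', defaultValue: '100', description: 'Number of child runs to trigger'),
        string(name: 'GCS_BUCKET', defaultValue: '', description: 'Bucket holding the artifact'),
        string(name: 'GCS_OBJECT', defaultValue: '', description: 'Object path of the artifact')
    ])
])

node('gce-worker') {
    int count = params.CHILD_COUNT.toInteger()
    def branches = [:]

    for (int i = 0; i < count; i++) {
        int index = i  // copy the loop variable so each closure keeps its own value
        branches["child-${index}"] = {
            // Trigger one child run and wait for it to finish.
            build job: 'child-job',
                  parameters: [
                      string(name: 'GCS_BUCKET', value: params.GCS_BUCKET),
                      string(name: 'GCS_OBJECT', value: params.GCS_OBJECT),
                      // A per-run value keeps the queued requests distinct from each other.
                      string(name: 'RUN_INDEX', value: "${index}")
                  ]
        }
    }

    // Start all child builds concurrently.
    parallel branches
}
```

      With `CHILD_COUNT` set to 100 this would queue 100 child builds at roughly the same time, which is the situation in which the missing parameters were observed.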

      Child job

      • is triggered multiple times by the parent job and passed some parameters
      • each run of this job requires its own worker 
      • does something reasonably simple
           - verifies that the worker it is running on is ready by checking for a file (this just confirms that any necessary build caches, such as Gradle, npm and pip, are on the worker)
           - using the parameters passed in by the parent job, it tries to download a file from GCS
           - then it sleeps for 15 mins, just to hold the worker
      • When this job fails, it is because the parameters it should have received are missing, so it cannot download the artifact from GCS
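
      The child job steps listed above could be sketched roughly as follows. Again, this is not the attached child-job-example.groovy. The label `gce-worker`, the marker file path, and the parameter names `GCS_BUCKET` and `GCS_OBJECT` are hypothetical, and the first stage is an added diagnostic that is not part of the original description.

```groovy
// Hypothetical child job (scripted Pipeline); a sketch, not the attached script.
// Assumed names: label 'gce-worker', marker file path, GCS_BUCKET, GCS_OBJECT.
node('gce-worker') {
    stage('Check parameters') {
        // Added diagnostic: fail early and say which values arrived empty.
        if (!params.GCS_BUCKET || !params.GCS_OBJECT) {
            error "Missing parameters: GCS_BUCKET='${params.GCS_BUCKET}', GCS_OBJECT='${params.GCS_OBJECT}'"
        }
    }

    stage('Verify worker') {
        // The marker file stands in for "build caches are present on this worker".
        int status = sh(script: 'test -f /var/cache/build/.ready', returnStatus: true)
        if (status != 0) {
            error 'Worker is not ready: marker file not found'
        }
    }

    stage('Download artifact') {
        // Pass the values through the environment instead of interpolating them into the script.
        withEnv(["BUCKET=${params.GCS_BUCKET}", "OBJECT=${params.GCS_OBJECT}"]) {
            sh 'gsutil cp "gs://$BUCKET/$OBJECT" .'
        }
    }

    stage('Hold worker') {
        // Keep the executor busy so that each run needs its own worker.
        sleep time: 15, unit: 'MINUTES'
    }
}
```

      A run affected by this issue would stop in the first stage and log the empty values, instead of failing later in the download step.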

       

      I have attached screenshots of what the parameters page looks like on a successful run and on an unsuccessful run.

        Attachments

          Activity

          arash arash m created issue -
          arash arash m made changes -
          Field Original Value New Value
          Summary
            Original: Triggering many (50+) concurrent builds causes some builds to not get parameters set
            New: Triggering many (100+) concurrent builds causes some builds to not get parameters set
          Environment Jenkins is running in Docker on an Ubuntu 18.04 host in GCP

          Jenkins ver. 2.150.2
          Google Compute Engine plugin 2.0.0 (latest available)

          Using an instance template to spin up n1-standard-4 VMs with SSDs

          The Jenkins master process is started with the following JVM options for faster response when workers are needed (still experimenting with the right values):

          JAVA_OPTS="-Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85"
          Attachment successful-job-run.png [ 46360 ]
          Attachment unsuccessful-job-run.png [ 46361 ]
          Attachment child-job-example.groovy [ 46362 ]
          Attachment parent-job-example.groovy [ 46363 ]
          arash arash m made changes -
          Assignee Evan Brown [ evanbrown ] Karam Sivia [ karamsivia ]
          arash arash m made changes -
          Priority Minor [ 4 ] Major [ 3 ]
          zombiemoose Rachel Yen made changes -
          Assignee Karam Sivia [ karamsivia ] Rachel Yen [ zombiemoose ]
          craigbarber Craig Barber made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]

            People

            • Assignee:
              zombiemoose Rachel Yen
              Reporter:
              arash arash m
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved: