JENKINS-73010

new agent pods won't start (on some jobs, for unknown reason)


    • Type: Bug
    • Resolution: Not A Defect
    • Priority: Critical
    • kubernetes-plugin
    • None

      Some jobs are not starting new agents. Something seems to be wrong with the job name or the label.

      I can see this error in the Jenkins master log:

      SEVERE	hudson.triggers.SafeTimerTask#run: Timer task hudson.slaves.NodeProvisioner$NodeProvisionerInvoker@1828da12 failed
      java.lang.IllegalArgumentException
      	at hudson.slaves.NodeProvisioner$PlannedNode.<init>(NodeProvisioner.java:102)
      	at com.cloudbees.jenkins.plugins.amazonecs.ECSCloud.provision(ECSCloud.java:292)
      	at com.cloudbees.jenkins.plugins.amazonecs.ECSProvisioningStrategy.apply(ECSProvisioningStrategy.java:66)
      	at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:325)
      	at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:823)
      	at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92)
      	at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:67)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      	at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
      	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
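
A note for readers of the trace above: the exception is thrown from the `PlannedNode` constructor, and the frame that calls it is `com.cloudbees.jenkins.plugins.amazonecs.ECSCloud.provision`. The following is a minimal Java sketch, assuming the constructor rejects a null display name, a null future, or an executor count below 1 with a message-less `IllegalArgumentException`. That check is an assumption about Jenkins core and has not been confirmed against version 2.440.1. `PlannedNodeCheckSketch`, `PlannedNodeSketch`, and `isRejected` are hypothetical names used only for illustration; they are not Jenkins APIs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

public class PlannedNodeCheckSketch {

    /** Hypothetical stand-in for hudson.slaves.NodeProvisioner.PlannedNode. */
    static final class PlannedNodeSketch {
        final String displayName;
        final Future<?> future;
        final int numExecutors;

        PlannedNodeSketch(String displayName, Future<?> future, int numExecutors) {
            // Assumed validation: a missing name, a missing future, or fewer
            // than one executor is rejected without an exception message.
            if (displayName == null || future == null || numExecutors < 1) {
                throw new IllegalArgumentException();
            }
            this.displayName = displayName;
            this.future = future;
            this.numExecutors = numExecutors;
        }
    }

    /** Returns true when the sketched constructor rejects the arguments. */
    static boolean isRejected(String displayName, Future<?> future, int numExecutors) {
        try {
            new PlannedNodeSketch(displayName, future, numExecutors);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Future<String> pending = new CompletableFuture<>();
        System.out.println(isRejected("some-job-agent", pending, 1)); // accepted
        System.out.println(isRejected(null, pending, 1));             // null name
        System.out.println(isRejected("some-job-agent", null, 1));    // null future
        System.out.println(isRejected("some-job-agent", pending, 0)); // no executors
    }
}
```

If the assumption holds, the caller passed at least one of these three invalid values. The trace alone does not say which one.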
      

      How we are using the template and config:

      pipeline {
          agent {
              kubernetes {
                  cloud 'eks-....'
                  idleMinutes 90
                  defaultContainer 'jenkins-slave'
                  yaml """....
      """
      

      Some notes:

      • Agents are not starting at all. We can't see them in the cluster; the only output is this message in the job: ('some-job-1528-h16kf-7f4mq-kg9gz' is offline)
      • When renaming the job from `some-job` to `asome-job`, the job starts right away (renamed from the Jenkins UI, same pipeline code, rebuild).
      • After downgrading the Kubernetes plugin to a much older version from a few months ago, it started to work.
      • I could not see any errors other than the one above.
      • We are using Jenkins version 2.440.1.
      • We also tested with another EKS cluster, and the issue is the same.

      If anyone needs more information, please let me know.

            Assignee: Unassigned
            Reporter: Adam Delarosa (nimitack)