Jenkins / JENKINS-61033

Provisioning thread hangs

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Component: docker-plugin
    • Labels: None
    • Environment:
      * Jenkins 2.190.3
      * docker-plugin:1.1.9

      Symptom

      • At some point, the docker plugin stops provisioning new Docker agents.

      Evidence

      • The log shows the docker plugin trying to provision new containers and then hanging:

       

      2020-02-05 23:55:49.244+0000 [id=67]    INFO    c.n.j.plugins.docker.DockerCloud#provision: Asked to provision 2 slave(s) for: null
      2020-02-05 23:55:49.467+0000 [id=67]    INFO    c.n.j.plugins.docker.DockerCloud#canAddProvisionedSlave: Not Provisioning '***/jenkins-agent:2.190.3.2'. Template instance limit of '8' reached on cloud '***'
      2020-02-05 23:55:49.467+0000 [id=67]    INFO    c.n.j.plugins.docker.DockerCloud#provision: Asked to provision 2 slave(s) for: null
      2020-02-05 23:55:49.583+0000 [id=67]    INFO    c.n.j.plugins.docker.DockerCloud#canAddProvisionedSlave: Not Provisioning '***/jenkins-agent:2.190.3.2'. Template instance limit of '8' reached on cloud '***'
      2020-02-05 23:55:59.244+0000 [id=71]    INFO    c.n.j.plugins.docker.DockerCloud#provision: Asked to provision 2 slave(s) for: null
      // NO MORE ENTRIES
      

       

      • After this last call, the docker plugin logs no further provisioning attempts. The thread dump shows that the thread running the DockerCloud code (id=71, the same thread that logged the last entry above) is blocked waiting for the response to a list-containers call:

       

          "jenkins.util.Timer [#6]" id=71 (0x47) state=WAITING cpu=95%
          - waiting on <0x7891a9a1> (a java.util.concurrent.CountDownLatch$Sync)
          - locked <0x7891a9a1> (a java.util.concurrent.CountDownLatch$Sync)
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
          at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
          at com.github.dockerjava.core.async.ResultCallbackTemplate.awaitCompletion(ResultCallbackTemplate.java:92)
          at com.github.dockerjava.netty.InvocationBuilder$ResponseCallback.awaitResult(InvocationBuilder.java:60)
          at com.github.dockerjava.netty.InvocationBuilder.get(InvocationBuilder.java:189)
          at io.jenkins.docker.client.ListContainersCmdExec.execute(ListContainersCmdExec.java:60)
          at io.jenkins.docker.client.ListContainersCmdExec.execute(ListContainersCmdExec.java:24)
          at com.github.dockerjava.netty.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21)
          at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:35)
          at com.nirima.jenkins.plugins.docker.DockerCloud.countContainersInDocker(DockerCloud.java:614)
          at com.nirima.jenkins.plugins.docker.DockerCloud.canAddProvisionedSlave(DockerCloud.java:632)
          at com.nirima.jenkins.plugins.docker.DockerCloud.provision(DockerCloud.java:352)
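
The same check can be done programmatically instead of reading a full dump by hand. A rough sketch using only the JDK's ThreadMXBean; StuckThreadFinder and findWaitingIn are hypothetical names, and the demo thread only simulates a call that never returns:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class StuckThreadFinder {

    /**
     * Returns the names of all WAITING threads whose stack contains at least
     * one frame from a class whose name starts with the given prefix.
     */
    public static List<String> findWaitingIn(String classPrefix) {
        List<String> names = new ArrayList<>();
        // false, false: skip locked monitor/synchronizer details, keep stack traces.
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            if (info.getThreadState() != Thread.State.WAITING) {
                continue;
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getClassName().startsWith(classPrefix)) {
                    names.add(info.getThreadName());
                    break;
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Simulated stuck call: nobody ever counts this latch down.
        CountDownLatch never = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                never.await();
            } catch (InterruptedException ignored) {
                // demo thread, nothing to clean up
            }
        }, "demo-parked");
        t.setDaemon(true);
        t.start();
        // Wait until the demo thread is actually parked.
        while (t.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        System.out.println(findWaitingIn("java.util.concurrent.CountDownLatch"));
    }
}
```

Assuming such code runs inside the Jenkins JVM, a prefix of "com.nirima.jenkins.plugins.docker." would match the frames of thread id=71 in the dump above.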
      

       

      Hypothesis

      • The wait appears to happen in this code, and as far as I can tell there is no timeout. If the call hangs, the thread hangs with it and the plugin stops provisioning.
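
To illustrate the hypothesis: the dump goes through CountDownLatch.await() and LockSupport.park, the untimed variants, so nothing wakes the thread if no response ever arrives. A minimal sketch of a bounded wait, using only JDK classes; BoundedAwait and awaitOrFail are hypothetical names, not part of docker-plugin or docker-java, and the 200 ms timeout is arbitrary:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedAwait {

    /**
     * Waits for the latch, but gives up after the given timeout instead of
     * parking indefinitely the way a bare latch.await() does.
     */
    public static void awaitOrFail(CountDownLatch latch, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        // await(timeout, unit) returns false if the count did not reach zero in time.
        if (!latch.await(timeout, unit)) {
            throw new TimeoutException("No response within " + timeout + " " + unit);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulates a daemon that never answers: nobody calls countDown().
        CountDownLatch neverCompleted = new CountDownLatch(1);
        try {
            awaitOrFail(neverCompleted, 200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller gets control back and can log, retry, or skip this cycle.
            System.out.println("Timed out: " + e.getMessage());
        }
    }
}
```

With a bounded wait, a hung call would cost one provisioning cycle rather than blocking the timer thread for good. Whether and where the plugin could apply this is not something the trace alone establishes.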

            Assignee: Unassigned
            Reporter: Pierre Beitz (pierrebtz)