JENKINS-50567

Docker cloud can get blocked by idleMinutes set to 0

      Description

      When the plugin is configured with idleMinutes == 0, agents get deleted sooner than they are launched, which makes the setup practically useless, as reported in JENKINS-47953. On top of that, it seems to cause provisioning for the serviced labels to get stuck almost completely. Here is what I observed:

      • Agents were created and started launching, yet disappeared instantly.
      • The plugin logged "No such container"[1] 3 times in a row, roughly once for every botched node.
      • Both stopped happening after changing idleMinutes from 0 to 10.
      • I presume JENKINS-47953 kicked in, deleting the container while provisioning was in progress.
      • Eventually, and presumably because of this, all provisioning stopped with multiple pending launches that never complete[2][3]: they are neither done nor cancelled, and yet they have no running thread in the stack trace. This keeps plannedCapacity > demand, so nothing else is provisioned.
        • I admit I do not quite understand how the futures got into such a state, but this too stopped occurring right after fixing idleMinutes and cancelling the dangling futures.
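
      To illustrate why dangling futures block further provisioning, here is a minimal sketch in Java. It is a simplified model and not the actual NodeProvisioner code: the class and method names are hypothetical, and it assumes one executor per agent. The numbers (36 pending launches, queue length 14) are taken from the output in [3].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

/** Hypothetical, simplified model of the stall; not the real NodeProvisioner logic. */
final class StalledProvisioningSketch {

    /** Counts capacity promised by launches whose future has not finished yet. */
    static int plannedCapacity(List<? extends Future<?>> pendingLaunches) {
        int planned = 0;
        for (Future<?> launch : pendingLaunches) {
            // isDone() also returns true for cancelled futures
            if (!launch.isDone()) {
                planned++; // assumption: each agent contributes one executor
            }
        }
        return planned;
    }

    /** New agents are requested only for demand not already covered by planned capacity. */
    static int agentsToProvision(int queueLength, int plannedCapacity) {
        return Math.max(0, queueLength - plannedCapacity);
    }

    public static void main(String[] args) {
        // 36 launches that never complete, mirroring the reported state
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < 36; i++) {
            pending.add(new CompletableFuture<>());
        }
        System.out.println(agentsToProvision(14, plannedCapacity(pending)));

        // Cancelling the dangling futures frees the planned capacity again
        pending.forEach(f -> f.cancel(true));
        System.out.println(agentsToProvision(14, plannedCapacity(pending)));
    }
}
```

      In this model, nothing is provisioned while the 36 futures stay incomplete, and the full queue of 14 becomes eligible once they are cancelled, which matches the observed effect of cancelling the dangling futures.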

       

      Having said that, I suggest making a timeout of 0 unsupported and falling back to a sane default even when 0 is configured explicitly (this can happen either during migration or manually, by a user not quite aware of these surprising consequences). An alternative would be to ensure that an agent is only disposed of after it has been launched/used.
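
      Both suggestions could look roughly like the following Java sketch. All names here are hypothetical and do not come from the plugin, and the fallback of 10 minutes is an assumption borrowed from the value that worked in my setup.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper illustrating the two suggestions; not the plugin's actual code. */
final class IdleTimeoutSketch {

    /** Assumed fallback; 10 is simply the value that worked in the report. */
    static final int DEFAULT_IDLE_MINUTES = 10;

    /** Suggestion 1: treat 0 (or a negative value) as unsupported and use the default. */
    static int effectiveIdleMinutes(int configuredIdleMinutes) {
        return configuredIdleMinutes < 1 ? DEFAULT_IDLE_MINUTES : configuredIdleMinutes;
    }

    /** Suggestion 2: never dispose of an agent that has not finished launching. */
    static boolean mayDispose(boolean launched, long idleMillis, int configuredIdleMinutes) {
        long limitMillis = TimeUnit.MINUTES.toMillis(effectiveIdleMinutes(configuredIdleMinutes));
        return launched && idleMillis >= limitMillis;
    }

    public static void main(String[] args) {
        System.out.println(effectiveIdleMinutes(0));  // falls back to the default
        System.out.println(effectiveIdleMinutes(30)); // explicit positive value is kept
        // An agent still launching is never disposed of, even with idleMinutes == 0
        System.out.println(mayDispose(false, TimeUnit.MINUTES.toMillis(60), 0));
    }
}
```

      The sanitizing happens on read rather than on save, so a 0 that arrives through migration of an old configuration is covered as well.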

      [1]

      Apr 04, 2018 3:10:56 AM com.github.dockerjava.core.async.ResultCallbackTemplate onError
      SEVERE: Error during callback
      com.github.dockerjava.api.exception.NotFoundException: {"message":"No such container: ffb2753cdd6ec73ed30477adf26ced6824e4cac434b3f740395e7e255efd7a13"}
      
      	at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:103)
      	at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:33)
      	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
      	at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
      	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
      	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
      	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
      	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
      	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
      	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
      	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
      	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
      	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
      	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
      	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
      	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
      	at java.lang.Thread.run(Thread.java:748)
      

      [2]

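      import jenkins.model.Jenkins

      // Label served by the Docker cloud; every configured cloud is expected to provision it
      def label = Jenkins.instance.getLabel('jslave-idm-docker')
      Jenkins.instance.clouds.each {
        assert it.canProvision(label)
      }
      // Read the private pendingLaunches field directly (Groovy .@ field access)
      def np = label.nodeProvisioner
      def pl = np.@pendingLaunches.get()
      println "Pending launches: " + pl.size()
      pl.each {
        println "\t${it.displayName} done=${it.future.isDone()} canceled=${it.future.isCancelled()} #=${System.identityHashCode(it.future)}"
        // Uncomment to cancel the dangling futures:
        //it.future.cancel(true)
      }
      println np.provisioningState
      return null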
      

      [3]

      Pending launches: 36
        Image of XXX done=false canceled=false #=423560755
        Image of XXX done=false canceled=false #=51689623
        Image of XXX done=false canceled=false #=1405078103
        Image of XXX done=false canceled=false #=1081078172
        Image of XXX done=false canceled=false #=1322747457
        Image of XXX done=false canceled=false #=1747017484
        Image of XXX done=false canceled=false #=1164037421
        Image of XXX done=false canceled=false #=2059216652
        Image of XXX done=false canceled=false #=145380244
        Image of XXX done=false canceled=false #=1779545266
        Image of XXX done=false canceled=false #=567415928
        Image of XXX done=false canceled=false #=843366921
        Image of XXX done=false canceled=false #=1747456294
        Image of XXX done=false canceled=false #=259759285
        Image of XXX done=false canceled=false #=1044068598
        Image of XXX done=false canceled=false #=1371528757
        Image of XXX done=false canceled=false #=1126642027
        Image of XXX done=false canceled=false #=1390627009
        Image of XXX done=false canceled=false #=2105699038
        Image of XXX done=false canceled=false #=1857421890
        Image of XXX done=false canceled=false #=341891734
        Image of XXX done=false canceled=false #=1544367515
        Image of XXX done=false canceled=false #=842491998
        Image of XXX done=false canceled=false #=1825425480
        Image of XXX done=false canceled=false #=2129333037
        Image of XXX done=false canceled=false #=1270845598
        Image of XXX done=false canceled=false #=1120105519
        Image of XXX done=false canceled=false #=1087273911
        Image of XXX done=false canceled=false #=1717220064
        Image of XXX done=false canceled=false #=1373042133
        Image of XXX done=false canceled=false #=149787084
        Image of XXX done=false canceled=false #=994918565
        Image of XXX done=false canceled=false #=145959070
        Image of XXX done=false canceled=false #=1848450048
        Image of XXX done=false canceled=false #=170558456
        Image of XXX done=false canceled=false #=1538499184
      StrategyState{label=jslave-idm-docker, snapshot=LoadStatisticsSnapshot{definedExecutors=0, onlineExecutors=0, connectingExecutors=0, busyExecutors=0, idleExecutors=0, availableExecutors=0, queueLength=14}, plannedCapacitySnapshot=36, additionalPlannedCapacity=0}
      
      

          Activity

          pjdarton added a comment:

          https://github.com/jenkinsci/docker-plugin/pull/623 ensured that idleMinutes can't be zero.

          That was fixed in 1.1.4.

            People

            • Assignee:
              ndeloof Nicolas De Loof
              Reporter:
              olivergondza Oliver Gondža