  1. Jenkins
  2. JENKINS-24831

Executor gets assigned multiple times

      Description

      We have multiple slave nodes on Windows, each with the number of executors set to 1.

      Our workflow is as follows:
      Job 1 requests an executor with a specific label and calls Job 2 (waiting for its completion).
      Job 2 requests another executor with the same label and calls Job 3 (waiting for its completion).

      Very rarely, a run gets the same node (with the number of executors set to 1) assigned twice (we always assign 2 executors to a run).

          Activity

          ociuhandu Octavian Ciuhandu added a comment -

          Job 2 calls another job (Job 3) that actually uses both parameters.
          Job 1 config.xml: http://paste.openstack.org/show/114710/
          Job 2 config.xml: http://paste.openstack.org/show/114711/

          By "gets assigned 2 times" I mean that when I check the 2 parameters (that are set in Job 1 and Job 2) they have the same value, e.g.:
          hyperv01: node1.domain.tld
          hyperv02: node1.domain.tld

          Since each job states that the parameter hyperv01/hyperv02 takes value $NODE_NAME and there's only one executor per slave node, this should be impossible.
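
A guard along the following lines could make Job 3 fail fast when both parameters point at the same node. It is only a sketch: it assumes the hyperv01 and hyperv02 parameters reach Job 3 as environment variables under those exact names, and the function names `find_shared_nodes` and `check` are hypothetical, not part of Jenkins or of the job configs linked above.

```python
import os

# Parameter names are taken from the comment above; everything else is illustrative.
NODE_PARAMS = ("hyperv01", "hyperv02")


def find_shared_nodes(params):
    """Map each node name to the parameters holding it, keeping only nodes used more than once."""
    by_node = {}
    for name, node in params.items():
        by_node.setdefault(node, []).append(name)
    return {node: names for node, names in by_node.items() if len(names) > 1}


def check(env=os.environ):
    """Return 0 if all node parameters differ, 1 if any node is shared."""
    # Assumption: build parameters are visible as environment variables.
    params = {name: env[name] for name in NODE_PARAMS if name in env}
    shared = find_shared_nodes(params)
    for node, names in shared.items():
        print(f"{node} was assigned to {', '.join(names)}")
    return 1 if shared else 0
```

Running `sys.exit(check())` as the first build step of Job 3 would stop the run before it uses a doubly assigned node. This only detects the symptom; it does not explain why the executor was handed out twice.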

          ociuhandu Octavian Ciuhandu added a comment -

          As a note, this happens only occasionally: we have seen it about 5 times in the last week or so, and we have not been able to identify any possible reason for it.

          kshcherban Konstantin Shcherban added a comment - - edited

          I have exactly the same issue with the Amazon EC2 Container Service plugin. Jenkins creates a new ECS task with a labeled slave that has only 1 executor and starts a build there; then another job that uses the same label comes in. For some reason, Jenkins assigns the new build to that same single executor. Once one of the builds finishes, the slave is killed and the second build fails because of it. MatrixJob creates the second build on the executor:

          FATAL: command execution failed
          java.nio.channels.ClosedChannelException
          	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
          	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
          	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
          	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
          	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
          	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
          	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
          	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
          	at hudson.remoting.Channel.close(Channel.java:1295)
          	at hudson.remoting.Channel.close(Channel.java:1263)
          	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:708)
          	at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:96)
          	at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:626)
          	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          	at java.lang.Thread.run(Thread.java:748)
          Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from ip-172-29-16-239.eu-west-1.compute.internal/172.29.16.239:9050' is disconnected.
          	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
          	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
          	at com.sun.proxy.$Proxy100.isAlive(Unknown Source)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
          	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
          	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
          	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
          	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:735)
          	at hudson.model.Build$BuildExecution.build(Build.java:206)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:490)
          	at hudson.model.Run.execute(Run.java:1735)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          	at hudson.model.ResourceController.execute(ResourceController.java:97)
          	at hudson.model.Executor.run(Executor.java:405)
          Import campaign adwords_id= '239459797'
          Build step 'Execute shell' marked build as failure
          FATAL: null
          java.lang.NullPointerException
          	at org.jenkinsci.plugins.credentialsbinding.impl.UnbindableDir.tempDir(UnbindableDir.java:67)
          	at org.jenkinsci.plugins.credentialsbinding.impl.UnbindableDir.secretsDir(UnbindableDir.java:62)
          	at org.jenkinsci.plugins.credentialsbinding.impl.UnbindableDir.access$000(UnbindableDir.java:23)
          	at org.jenkinsci.plugins.credentialsbinding.impl.UnbindableDir$UnbinderImpl.unbind(UnbindableDir.java:84)
          	at org.jenkinsci.plugins.credentialsbinding.impl.SecretBuildWrapper$1.tearDown(SecretBuildWrapper.java:113)
          	at hudson.model.Build$BuildExecution.doRun(Build.java:174)
          	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:490)
          	at hudson.model.Run.execute(Run.java:1735)
          	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          	at hudson.model.ResourceController.execute(ResourceController.java:97)
          	at hudson.model.Executor.run(Executor.java:405)
          Finished: FAILURE

           

          Please see screenshot below.

           

          Jenkins version: 2.60.1
          Jnlp slave: https://hub.docker.com/r/jenkinsci/jnlp-slave/ latest
          AWS ECS plugin: 1.11

          ktomioka Katsuya Tomioka added a comment -

          I'm seeing the same issue with the nomad plugin 0.4, Jenkins 2.121.2. If two builds are started close to each other (while waiting for executors to be provisioned), both jobs are assigned to the same executor.
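
To find out how often this happens, one option is to scan build records for builds that overlap in time on the same single-executor node. The sketch below assumes the records have already been collected by some other means; the `BuildRecord` type, its fields, and `find_overlaps` are hypothetical and do not correspond to any Jenkins API.

```python
from collections import defaultdict
from typing import NamedTuple


class BuildRecord(NamedTuple):
    """Hypothetical record of one build; times are epoch seconds."""
    job: str
    node: str
    start: float
    end: float


def find_overlaps(records):
    """Return (earlier, later) pairs of builds that ran concurrently on one node.

    Each build is compared against the previously started build on the same
    node that finishes last, so every overlapping build is reported once.
    """
    by_node = defaultdict(list)
    for record in records:
        by_node[record.node].append(record)

    overlaps = []
    for builds in by_node.values():
        builds.sort(key=lambda r: r.start)
        latest = builds[0]
        for current in builds[1:]:
            if current.start < latest.end:
                overlaps.append((latest, current))
            if current.end > latest.end:
                latest = current
    return overlaps
```

On nodes configured with a single executor, any pair returned here would be an instance of the double assignment described in this issue.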

          staceyf Stacey Fletcher added a comment -

          Was the root cause ever figured out here?  I am also seeing the same thing in our Jenkins.


            People

            • Assignee:
              Unassigned
              Reporter:
              ociuhandu Octavian Ciuhandu
            • Votes:
              0
              Watchers:
              5
