Jenkins / JENKINS-40825

"Pipe not connected" errors when running multiple builds simultaneously


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Component/s: kubernetes-plugin
    • Labels:
      None
    • Environment:
      Jenkins 2.60
      Kubernetes plugin 0.12
      Kubernetes 1.5.1 on GKE
      Kubernetes 1.7.0 on AWS

      Description

      Hi there,

      We have Jenkins running in Kubernetes with the Kubernetes plugin, and have been experiencing `java.io.IOException: Pipe not connected` errors when running multiple builds simultaneously. This seems to consistently happen when we run 8 or more builds (on the same pipeline). About 50% of the builds will succeed, and the other 50% will fail with the `Pipe not connected` exception. Most of the time it will fail at stage 1, but sometimes at stage 2.

      We're using the following pipeline:

      podTemplate(label: 'mypod', containers: [
        containerTemplate(name: 'debian', image: 'debian', ttyEnabled: true, command: 'cat'),
        containerTemplate(name: 'ubuntu', image: 'ubuntu', ttyEnabled: true, command: 'cat')
      ]) {
        node('mypod') {
          container('debian') {
            stage('stage 1') {
              sh 'echo hello'
              sh 'sleep 30'
              sh 'echo world'
            }
      
            stage('stage 2') {
              sh 'echo hello'
              sh 'sleep 30'
              sh 'echo world'
            }
          }
        }
      }
      

      And this is the log of one such failed build:

      [Pipeline] podTemplate
      [Pipeline] {
      [Pipeline] node
      Still waiting to schedule task
      Waiting for next available executor on mypod
      Running on kubernetes-a0e59102b59b48ad99693ca32b94ab38-11a5bcd7df12e4 in /home/jenkins/workspace/kubernetes-test-3
      [Pipeline] {
      [Pipeline] container
      [Pipeline] {
      [Pipeline] stage
      [Pipeline] { (stage 1)
      [Pipeline] sh
      [kubernetes-test-3] Running shell script
      Executing shell script inside container [debian] of pod [kubernetes-a0e59102b59b48ad99693ca32b94ab38-11a5bcd7df12e4]
      Executing command: sh -c echo $$ > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/pid'; jsc=durable-7534cabf595ac7f32ca72b4db83e0af1; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/script.sh' > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/jenkins-result.txt' 
      # cd /home/jenkins/workspace/kubernetes-test-3
      sh -c echo $$ > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/pid'; jsc=durable-7534cabf595ac7f32ca72b4db83e0af1; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/script.sh' > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-f201019b/jenkins-result.txt' 
      exit
      # # + echo hello
      hello
      [Pipeline] sh
      [kubernetes-test-3] Running shell script
      Executing shell script inside container [debian] of pod [kubernetes-a0e59102b59b48ad99693ca32b94ab38-11a5bcd7df12e4]
      Executing command: sh -c echo $$ > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/pid'; jsc=durable-7534cabf595ac7f32ca72b4db83e0af1; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/script.sh' > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/jenkins-result.txt' 
      # cd /home/jenkins/workspace/kubernetes-test-3
      sh -c echo $$ > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/pid'; jsc=durable-7534cabf595ac7f32ca72b4db83e0af1; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/script.sh' > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/kubernetes-test-3@tmp/durable-0eb192c0/jenkins-result.txt' 
      exit
      # + sleep 30
      # [Pipeline] sh
      [kubernetes-test-3] Running shell script
      Executing shell script inside container [debian] of pod [kubernetes-a0e59102b59b48ad99693ca32b94ab38-11a5bcd7df12e4]
      [Pipeline] }
      [Pipeline] // stage
      [Pipeline] }
      [Pipeline] // container
      [Pipeline] }
      [Pipeline] // node
      [Pipeline] }
      [Pipeline] // podTemplate
      [Pipeline] End of Pipeline
      java.io.IOException: Pipe not connected
      	at java.io.PipedOutputStream.write(PipedOutputStream.java:140)
      	at java.io.OutputStream.write(OutputStream.java:75)
      	at org.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator$1.launch(ContainerExecDecorator.java:125)
      	at hudson.Launcher$ProcStarter.start(Launcher.java:384)
      	at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:147)
      	at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:61)
      	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:158)
      	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:184)
      	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:126)
      	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:108)
      	at groovy.lang.GroovyObject$invokeMethod.call(Unknown Source)
      	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:151)
      	at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:21)
      	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:115)
      	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:149)
      	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:146)
      	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:123)
      	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:123)
      	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:123)
      	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:123)
      	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:16)
      	at WorkflowScript.run(WorkflowScript:10)
      	at ___cps.transform___(Native Method)
      	at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
      	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
      	at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
      	at sun.reflect.GeneratedMethodAccessor521.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
      	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
      	at com.cloudbees.groovy.cps.Next.step(Next.java:58)
      	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:154)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:33)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:30)
      	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
      	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:30)
      	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:163)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:324)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:78)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:236)
      	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:224)
      	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:63)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Finished: FAILURE
      

      Something seems to be going wrong around https://github.com/jenkinsci/kubernetes-plugin/blob/master/src/main/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/ContainerExecDecorator.java#L125.
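
      For context, the message at the top of the stack trace is what `java.io.PipedOutputStream` reports when `write` is called on a stream that has not been connected to a `PipedInputStream`. The following is a minimal standalone sketch of that behaviour only; the class and method names are hypothetical, and it does not involve the plugin or explain why the pipe is unconnected there.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the kubernetes-plugin.
public class PipeNotConnectedDemo {

    /** Writes a short command to the pipe; returns the exception message, or null on success. */
    static String tryWrite(PipedOutputStream out) {
        try {
            out.write("echo hello\n".getBytes(StandardCharsets.UTF_8));
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // No PipedInputStream attached: the write is rejected.
        PipedOutputStream unconnected = new PipedOutputStream();
        System.out.println(tryWrite(unconnected)); // prints "Pipe not connected"

        // Connected before the first write: the bytes fit in the pipe's buffer.
        PipedOutputStream connected = new PipedOutputStream();
        PipedInputStream sink = new PipedInputStream(connected);
        System.out.println(tryWrite(connected)); // prints "null"
        sink.close();
    }
}
```

      The point of the sketch is that the exception says nothing about the remote process: it only means the write happened before (or without) the pipe being connected.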

            Activity

            0x89 Martin Sander added a comment -

            I did some quite extensive testing yesterday, and I was able to get rid of the resource leak (I think).

            Pull request here: https://github.com/jenkinsci/kubernetes-plugin/pull/180. I recommend also viewing it with whitespace changes ignored.

            I don't expect you to merge it like that, but would be happy to get feedback.

            Unfortunately, it does not completely get rid of the "pipe not connected" errors, but

            • it seems to fix the resource leak
            • the "pipe not connected" error seems to fail the build much less often
            • most of the time the error now seems to come from org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep, which
              • runs ps every ten seconds or so to check whether the process is still alive
              • prints only a single error to the build log, even if that check fails multiple times (set the logger for that class to FINE to see all failures)
              • luckily does not fail the build if one of those checks fails
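
            The behaviour described in the list above (a periodic check whose own failures are tolerated, reported loudly once, and otherwise only logged at FINE) can be sketched roughly as follows. This is an illustration under assumptions, not the DurableTaskStep source: the class `TolerantLivenessCheck`, its `poll` method, and the use of `java.util.logging` levels in place of the build log are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not the DurableTaskStep implementation.
public class TolerantLivenessCheck {
    private static final Logger LOGGER =
            Logger.getLogger(TolerantLivenessCheck.class.getName());

    private boolean reported; // true once the first failure has been reported
    private int failures;     // total number of failed probes

    /**
     * Runs one probe. If the probe itself throws, the failure is recorded and
     * the process is still assumed to be alive, so the build is not failed.
     */
    boolean poll(Callable<Boolean> probe) {
        try {
            return probe.call();
        } catch (Exception e) {
            failures++;
            // First failure is visible by default; repeats need FINE enabled.
            Level level = reported ? Level.FINE : Level.WARNING;
            reported = true;
            LOGGER.log(level, "liveness probe failed, retrying on next poll: {0}",
                    e.getMessage());
            return true;
        }
    }

    int failures() {
        return failures;
    }

    public static void main(String[] args) {
        TolerantLivenessCheck check = new TolerantLivenessCheck();
        // A probe that always fails, standing in for a broken exec channel.
        Callable<Boolean> broken = () -> {
            throw new java.io.IOException("Pipe not connected");
        };
        System.out.println(check.poll(broken)); // prints "true"
        System.out.println(check.poll(broken)); // prints "true"
        System.out.println(check.failures());   // prints "2"
    }
}
```

            With a design like this, a single reported error can hide many failed checks, which matches the advice above to raise the logger to FINE when investigating.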
            iocanel Ioannis Canellos added a comment -

            Martin Sander Your assumption (that the decorator is called multiple times) is valid and is aligned with what I've seen so far. 

            That is what https://github.com/jenkinsci/kubernetes-plugin/pull/177 was aiming for (to close() the listeners opened by the liveness checks).

            But it seems that this is affecting us in more ways and I feel you are on the right track.  Let me review your pull request and I'll get back to you.

            0x89 Martin Sander added a comment - edited

            Ioannis Canellos:

            I might be on the right track, but I think I didn't go far enough.

            It is actually not only the Decorator that is reused: even the Launcher is used more than once, i.e. launch is called more than once.
            I will verify this and probably issue another pull request from a different branch tomorrow.
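
            Why repeated calls to launch matter can be shown with a generic sketch. It assumes a launcher that keeps one pipe for its whole lifetime and cleans it up after the first call, and compares it with one that allocates a pipe per call. All names here (`LaunchReuseSketch`, `SharedPipeLauncher`, `PerLaunchPipeLauncher`) are hypothetical, the code is not taken from the plugin or from either pull request, and the exact exception the shared variant raises need not be the one in this report.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of per-instance vs. per-launch stream ownership.
public class LaunchReuseSketch {

    interface Launcher {
        void launch(String command) throws IOException;
    }

    /** Keeps one pipe per launcher instance and closes it after a launch. */
    static class SharedPipeLauncher implements Launcher {
        private final PipedOutputStream stdin = new PipedOutputStream();
        private PipedInputStream sink;

        @Override
        public void launch(String command) throws IOException {
            if (sink == null) {
                sink = new PipedInputStream(stdin); // connected on first use only
            }
            stdin.write(command.getBytes(StandardCharsets.UTF_8));
            stdin.close(); // cleanup after one launch breaks every later launch
        }
    }

    /** Allocates and releases the pipe inside each launch call. */
    static class PerLaunchPipeLauncher implements Launcher {
        @Override
        public void launch(String command) throws IOException {
            try (PipedOutputStream stdin = new PipedOutputStream();
                 PipedInputStream sink = new PipedInputStream(stdin)) {
                stdin.write(command.getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    /** Calls launch twice; returns true only if both calls succeed. */
    static boolean secondLaunchSucceeds(Launcher launcher) {
        try {
            launcher.launch("echo hello\n");
            launcher.launch("sleep 30\n");
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondLaunchSucceeds(new SharedPipeLauncher()));    // prints "false"
        System.out.println(secondLaunchSucceeds(new PerLaunchPipeLauncher())); // prints "true"
    }
}
```

            The sketch only illustrates the general hazard: state that is safe when launch runs once per object becomes a source of stream errors as soon as the same object is launched again.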

            0x89 Martin Sander added a comment -

            New pull request: https://github.com/jenkinsci/kubernetes-plugin/pull/182.

            jredl Jesse Redl added a comment -

            Thanks for the fix, we've re-enabled our multi-container workflows within the Jenkins Kubernetes plugin after upgrading to the most recent release!


              People

              • Assignee:
                csanchez Carlos Sanchez
                Reporter:
                soud Steven Oud
              • Votes:
                20
                Watchers:
                33
