Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-5977

Dead lock condition due to pipe clogging

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core
    • Labels:
      None

      Description

      The way the remoting is written, a certain race condition can result in a dead lock.

      The root cause of this is that Channel.ReaderThread performs blocking write operation to OutputStream when it's executing the ProxyOutputStream.Chunk command. This synchronous behavior was necessary to ensure the proper in-order data arrival guarantee.

      Now, the situation is, the master initiates a FilePath.copyRecursiveTo from the master to a slave. This creates a piped stream pair over Channel, where the sender on the master are directly sending data over Channel, which gets to FastPipedOutputStream on the slave, then the forwarded Callable on the slave is reading from the corresponding FastPipedInputStream.

      The callable executing on the slave needs to load a class from the master, which results in this callable making an callback to the master to fetch a class file. But while this class file arrives as a Command, the master can send enough byte sequence to FastPipedOutputStream to the point that the buffer fills up and the write operation blocks.

      At this point, the class file is stuck in the network waiting to be read by the command reader thread, but that won't be able to do that until there'll be some space in the buffer, and there won't be any space in the buffer until the class file is retrieved, hence the dead lock.

      1. 34838_hang_on_archiving.txt
        18 kB
        glundh
      2. 34838_hang_on_emma_publisher.txt
        11 kB
        glundh
      3. deadlockpatch.txt
        3 kB
        glundh
      4. HUDSON-5977.stack
        17 kB
        olamy

        Issue Links

          Activity

          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          This is the thread executing Callable of FilePath.copyRecursiveTo.

          Thread t@15: (state = BLOCKED)
           - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
           - hudson.remoting.Request.call(hudson.remoting.Channel) @bci=120, line=122 (Interpreted frame)
           - hudson.remoting.RemoteInvocationHandler.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=135, line=160 (Interpreted frame)
           - hudson.remoting.$Proxy5.fetch2(java.lang.String) @bci=16 (Interpreted frame)
           - hudson.remoting.RemoteClassLoader.findClass(java.lang.String) @bci=41, line=121 (Interpreted frame)
           - java.lang.ClassLoader.loadClass(java.lang.String, boolean) @bci=43, line=307 (Interpreted frame)
           - java.lang.ClassLoader.loadClass(java.lang.String) @bci=3, line=252 (Interpreted frame)
           - java.lang.ClassLoader.loadClassInternal(java.lang.String) @bci=2, line=320 (Interpreted frame)
           - hudson.FilePath$32.invoke(java.io.File, hudson.remoting.VirtualChannel) @bci=33, line=1403 (Interpreted frame)
           - hudson.FilePath$32.invoke(java.io.File, hudson.remoting.VirtualChannel) @bci=3, line=1400 (Interpreted frame)
           - hudson.FilePath$FileCallableWrapper.call() @bci=21, line=1962 (Interpreted frame)
           - hudson.remoting.UserRequest.perform(hudson.remoting.Channel) @bci=123, line=104 (Interpreted frame)
           - hudson.remoting.UserRequest.perform(hudson.remoting.Channel) @bci=2, line=48 (Interpreted frame)
           - hudson.remoting.Request$2.run() @bci=8, line=270 (Interpreted frame)
           - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=441 (Interpreted frame)
           - java.util.concurrent.FutureTask$Sync.innerRun() @bci=30, line=303 (Interpreted frame)
           - java.util.concurrent.FutureTask.run() @bci=4, line=138 (Interpreted frame)
           - java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable) @bci=59, line=886 (Interpreted frame)
           - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=28, line=908 (Interpreted frame)
           - java.lang.Thread.run() @bci=11, line=619 (Interpreted frame)
          

          This is the channel reader thread blocking on FastPipedOutputStream:

          Thread t@13: (state = BLOCKED)
           - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
           - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
           - hudson.remoting.FastPipedOutputStream.write(byte[], int, int) @bci=106, line=136 (Interpreted frame)
           - hudson.remoting.FastPipedOutputStream.write(byte[]) @bci=5, line=115 (Interpreted frame)
           - hudson.remoting.ProxyOutputStream$Chunk.execute(hudson.remoting.Channel) @bci=17, line=164 (Interpreted frame)
           - hudson.remoting.Channel$ReaderThread.run() @bci=119, line=867 (Interpreted frame)
          
          Show
          kohsuke Kohsuke Kawaguchi added a comment - This is the thread executing Callable of FilePath.copyRecursiveTo. Thread t@15: (state = BLOCKED) - java.lang.Object.wait(long) @bci=0 (Interpreted frame) - hudson.remoting.Request.call(hudson.remoting.Channel) @bci=120, line=122 (Interpreted frame) - hudson.remoting.RemoteInvocationHandler.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=135, line=160 (Interpreted frame) - hudson.remoting.$Proxy5.fetch2(java.lang.String) @bci=16 (Interpreted frame) - hudson.remoting.RemoteClassLoader.findClass(java.lang.String) @bci=41, line=121 (Interpreted frame) - java.lang.ClassLoader.loadClass(java.lang.String, boolean) @bci=43, line=307 (Interpreted frame) - java.lang.ClassLoader.loadClass(java.lang.String) @bci=3, line=252 (Interpreted frame) - java.lang.ClassLoader.loadClassInternal(java.lang.String) @bci=2, line=320 (Interpreted frame) - hudson.FilePath$32.invoke(java.io.File, hudson.remoting.VirtualChannel) @bci=33, line=1403 (Interpreted frame) - hudson.FilePath$32.invoke(java.io.File, hudson.remoting.VirtualChannel) @bci=3, line=1400 (Interpreted frame) - hudson.FilePath$FileCallableWrapper.call() @bci=21, line=1962 (Interpreted frame) - hudson.remoting.UserRequest.perform(hudson.remoting.Channel) @bci=123, line=104 (Interpreted frame) - hudson.remoting.UserRequest.perform(hudson.remoting.Channel) @bci=2, line=48 (Interpreted frame) - hudson.remoting.Request$2.run() @bci=8, line=270 (Interpreted frame) - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=441 (Interpreted frame) - java.util.concurrent.FutureTask$Sync.innerRun() @bci=30, line=303 (Interpreted frame) - java.util.concurrent.FutureTask.run() @bci=4, line=138 (Interpreted frame) - java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable) @bci=59, line=886 (Interpreted frame) - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=28, line=908 (Interpreted frame) - java.lang.Thread.run() @bci=11, line=619 (Interpreted frame) This is the channel reader thread blocking on FastPipedOutputStream: Thread t@13: (state = BLOCKED) - java.lang.Object.wait(long) @bci=0 (Interpreted frame) - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame) - hudson.remoting.FastPipedOutputStream.write(byte[], int, int) @bci=106, line=136 (Interpreted frame) - hudson.remoting.FastPipedOutputStream.write(byte[]) @bci=5, line=115 (Interpreted frame) - hudson.remoting.ProxyOutputStream$Chunk.execute(hudson.remoting.Channel) @bci=17, line=164 (Interpreted frame) - hudson.remoting.Channel$ReaderThread.run() @bci=119, line=867 (Interpreted frame)
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          The reverse chronological sequence of the log showing what the master has sent. Looking at the slave stack trace, it hasn't read a single byte yet, but the master has sent in the 370K data in "Pipe.Chunk(6,371226)", and the class file image afterward, which is never read.

          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7900)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7988)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7908)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7804)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7740)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7996)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7888)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7856)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7964)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7984)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Response[retVal=hudson.remoting.RemoteClassLoader$ClassFile@5f471136,exception=null]
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Received RPCRequest(2,fetch2)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7768)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,8192)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,7928)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send Pipe.Chunk(6,371226)
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Received hudson.remoting.Pipe$ConnectCommand@47d7e1fb
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send UserRequest:hudson.FilePath$FileCallableWrapper@40af3623
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Received Response[retVal=hudson.remoting.UserResponse@145e62e7,exception=null]
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Send UserRequest:hudson.EnvVars$GetEnvVars@652dc559
          
          Mar 17, 2010 6:13:02 PM hudson.remoting.Channel
          FINE: Received Response[retVal=hudson.remoting.UserResponse@762c3f0,exception=null]
          
          Mar 17, 2010 6:13:01 PM hudson.remoting.Channel
          FINE: Send UserRequest:hudson.EnvVars$GetEnvVars@490606c0
          
          Mar 17, 2010 6:13:01 PM hudson.remoting.Channel
          FINE: Received Response[retVal=hudson.remoting.UserResponse@3fd15873,exception=null]
          
          Mar 17, 2010 6:13:01 PM hudson.remoting.Channel
          FINE: Send UserRequest:hudson.FilePath$FileCallableWrapper@26c76ec2
          
          Show
          kohsuke Kohsuke Kawaguchi added a comment - The reverse chronological sequence of the log showing what the master has sent. Looking at the slave stack trace, it hasn't read a single byte yet, but the master has sent in the 370K data in "Pipe.Chunk(6,371226)", and the class file image afterward, which is never read. Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7900) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7988) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7908) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7804) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7740) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7996) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7888) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7856) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7964) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7984) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Response[retVal=hudson.remoting.RemoteClassLoader$ClassFile@5f471136,exception=null] Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Received RPCRequest(2,fetch2) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7768) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,8192) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,7928) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send Pipe.Chunk(6,371226) Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Received hudson.remoting.Pipe$ConnectCommand@47d7e1fb Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send UserRequest:hudson.FilePath$FileCallableWrapper@40af3623 Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Received Response[retVal=hudson.remoting.UserResponse@145e62e7,exception=null] Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Send UserRequest:hudson.EnvVars$GetEnvVars@652dc559 Mar 17, 2010 6:13:02 PM hudson.remoting.Channel FINE: Received Response[retVal=hudson.remoting.UserResponse@762c3f0,exception=null] Mar 17, 2010 6:13:01 PM hudson.remoting.Channel FINE: Send UserRequest:hudson.EnvVars$GetEnvVars@490606c0 Mar 17, 2010 6:13:01 PM hudson.remoting.Channel FINE: Received Response[retVal=hudson.remoting.UserResponse@3fd15873,exception=null] Mar 17, 2010 6:13:01 PM hudson.remoting.Channel FINE: Send UserRequest:hudson.FilePath$FileCallableWrapper@26c76ec2
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: : mindless
          Path:
          trunk/hudson/plugins/copyartifact/src/main/java/hudson/plugins/copyartifact/CopyArtifact.java
          trunk/hudson/plugins/copyartifact/src/main/webapp/JENKINS-5977/deleteme.tmp
          http://jenkins-ci.org/commit/28825
          Log:
          [copyartifact] [FIXED JENKINS-5934] Add workaround for JENKINS-5977
          so slaves won't hang on copy of larger artifacts.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in hudson User: : mindless Path: trunk/hudson/plugins/copyartifact/src/main/java/hudson/plugins/copyartifact/CopyArtifact.java trunk/hudson/plugins/copyartifact/src/main/webapp/ JENKINS-5977 /deleteme.tmp http://jenkins-ci.org/commit/28825 Log: [copyartifact] [FIXED JENKINS-5934] Add workaround for JENKINS-5977 so slaves won't hang on copy of larger artifacts.
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          Another possibly related pipe clogging problem, this time the reader thread on the master is clogging:

          This is the thread on a slave that's sending stream. This is FilePath.read()

          "pool-1-thread-415" prio=6 tid=0x031df800 nid=0x604 runnable [0x02e6f000]
             java.lang.Thread.State: RUNNABLE
          	at java.net.SocketOutputStream.socketWrite0(Native Method)
          	at java.net.SocketOutputStream.socketWrite(Unknown Source)
          	at java.net.SocketOutputStream.write(Unknown Source)
          	at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
          	at java.io.BufferedOutputStream.write(Unknown Source)
          	- locked <0x22fe3c60> (a java.io.BufferedOutputStream)
          	at java.io.ObjectOutputStream$BlockDataOutputStream.drain(Unknown Source)
          	at java.io.ObjectOutputStream$BlockDataOutputStream.write(Unknown Source)
          	at java.io.ObjectOutputStream.writeArray(Unknown Source)
          	at java.io.ObjectOutputStream.writeObject0(Unknown Source)
          	at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
          	at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
          	at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
          	at java.io.ObjectOutputStream.writeObject0(Unknown Source)
          	at java.io.ObjectOutputStream.writeObject(Unknown Source)
          	at hudson.remoting.Channel.send(Channel.java:417)
          	- locked <0x22fda8d0> (a hudson.remoting.Channel)
          	at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:112)
          	- locked <0x2303de50> (a hudson.remoting.ProxyOutputStream)
          	at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:96)
          	at hudson.Util.copyStream(Util.java:406)
          	at hudson.FilePath$24.call(FilePath.java:1202)
          	at hudson.FilePath$24.call(FilePath.java:1197)
          	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
          	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          	at hudson.remoting.Request$2.run(Request.java:270)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          	at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
          	at java.util.concurrent.FutureTask.run(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          	at hudson.remoting.Engine$1$1.run(Engine.java:58)
          	at java.lang.Thread.run(Unknown Source)
          

          This is the thread on the master hanging because of this:

          "Channel reader thread: hb-slave-trunk3" Id=53 TIMED_WAITING on [B@18c8d4d
          	at java.lang.Object.wait(Native Method)
          	-  waiting on [B@18c8d4d
          	at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:147)
          	at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:122)
          	at hudson.remoting.ProxyOutputStream$Chunk.execute(ProxyOutputStream.java:164)
          	at hudson.remoting.Channel$ReaderThread.run(Channel.java:867)
          
          Show
          kohsuke Kohsuke Kawaguchi added a comment - Another possibly related pipe clogging problem, this time the reader thread on the master is clogging: This is the thread on a slave that's sending stream. This is FilePath.read() "pool-1-thread-415" prio=6 tid=0x031df800 nid=0x604 runnable [0x02e6f000] java.lang.Thread.State: RUNNABLE at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(Unknown Source) at java.net.SocketOutputStream.write(Unknown Source) at java.io.BufferedOutputStream.flushBuffer(Unknown Source) at java.io.BufferedOutputStream.write(Unknown Source) - locked <0x22fe3c60> (a java.io.BufferedOutputStream) at java.io.ObjectOutputStream$BlockDataOutputStream.drain(Unknown Source) at java.io.ObjectOutputStream$BlockDataOutputStream.write(Unknown Source) at java.io.ObjectOutputStream.writeArray(Unknown Source) at java.io.ObjectOutputStream.writeObject0(Unknown Source) at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source) at java.io.ObjectOutputStream.writeSerialData(Unknown Source) at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source) at java.io.ObjectOutputStream.writeObject0(Unknown Source) at java.io.ObjectOutputStream.writeObject(Unknown Source) at hudson.remoting.Channel.send(Channel.java:417) - locked <0x22fda8d0> (a hudson.remoting.Channel) at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:112) - locked <0x2303de50> (a hudson.remoting.ProxyOutputStream) at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:96) at hudson.Util.copyStream(Util.java:406) at hudson.FilePath$24.call(FilePath.java:1202) at hudson.FilePath$24.call(FilePath.java:1197) at hudson.remoting.UserRequest.perform(UserRequest.java:114) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:270) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at hudson.remoting.Engine$1$1.run(Engine.java:58) at java.lang.Thread.run(Unknown Source) This is the thread on the master hanging because of this: "Channel reader thread: hb-slave-trunk3" Id=53 TIMED_WAITING on [B@18c8d4d at java.lang.Object.wait(Native Method) - waiting on [B@18c8d4d at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:147) at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:122) at hudson.remoting.ProxyOutputStream$Chunk.execute(ProxyOutputStream.java:164) at hudson.remoting.Channel$ReaderThread.run(Channel.java:867)
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          Details for my previous comment.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - Details for my previous comment.
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          Full log for my previous comment is at http://issues.jenkins-ci.org/secure/attachment/19251/log.zip

          Show
          kohsuke Kohsuke Kawaguchi added a comment - Full log for my previous comment is at http://issues.jenkins-ci.org/secure/attachment/19251/log.zip
          Show
          glundh glundh added a comment - Proof of concept fix: http://hudson.361315.n4.nabble.com/Discussion-regarding-JENKINS-5977-Dead-lock-condition-due-to-pipe-clogging-And-more-td2322497.html Please review.
          Hide
          glundh glundh added a comment -

          Proof-of-concept fix.

          Show
          glundh glundh added a comment - Proof-of-concept fix.
          Hide
          kohsuke Kohsuke Kawaguchi added a comment - - edited

          I'm testing a patch similar to what's proposed by GLundh, except it does throttling to avoid unbounded growth of the queue. Plan to integrate it to the trunk soon, but I'm bit cautious as the remoting affects many things.

          If there are other people willing to test the modification, I'll post the modified binary.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - - edited I'm testing a patch similar to what's proposed by GLundh, except it does throttling to avoid unbounded growth of the queue. Plan to integrate it to the trunk soon, but I'm bit cautious as the remoting affects many things. If there are other people willing to test the modification, I'll post the modified binary.
          Hide
          glundh glundh added a comment -

          Kohsuke: Thanks for your comment. I'd be happy to take it for a spin on our test-server.

          And I would also really like to see the code (if that's possible). Would you mind branching your fix on the SVN or github?

          Br
          Gustaf - Sony Ericsson

          Show
          glundh glundh added a comment - Kohsuke: Thanks for your comment. I'd be happy to take it for a spin on our test-server. And I would also really like to see the code (if that's possible). Would you mind branching your fix on the SVN or github? Br Gustaf - Sony Ericsson
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: : kohsuke
          Path:
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/Capability.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/Pipe.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/PipeWindow.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/ProxyOutputStream.java
          trunk/hudson/main/remoting/src/main/java/hudson/remoting/SynchronousExecutorService.java
          trunk/hudson/main/remoting/src/test/java/hudson/remoting/ChannelRunner.java
          trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java
          trunk/hudson/main/remoting/src/test/java/hudson/remoting/RmiTestBase.java
          trunk/hudson/main/remoting/src/test/java/hudson/remoting/WithRunner.java
          http://jenkins-ci.org/commit/34838
          Log:
          [FIXED JENKINS-5977] implemented a windowing mechanism to throttle the pipe usage, and the pipe operation no longer interferes with the main channel reader thread.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in hudson User: : kohsuke Path: trunk/hudson/main/remoting/src/main/java/hudson/remoting/Capability.java trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java trunk/hudson/main/remoting/src/main/java/hudson/remoting/Pipe.java trunk/hudson/main/remoting/src/main/java/hudson/remoting/PipeWindow.java trunk/hudson/main/remoting/src/main/java/hudson/remoting/ProxyOutputStream.java trunk/hudson/main/remoting/src/main/java/hudson/remoting/SynchronousExecutorService.java trunk/hudson/main/remoting/src/test/java/hudson/remoting/ChannelRunner.java trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java trunk/hudson/main/remoting/src/test/java/hudson/remoting/RmiTestBase.java trunk/hudson/main/remoting/src/test/java/hudson/remoting/WithRunner.java http://jenkins-ci.org/commit/34838 Log: [FIXED JENKINS-5977] implemented a windowing mechanism to throttle the pipe usage, and the pipe operation no longer interferes with the main channel reader thread.
          Hide
          dogfood dogfood added a comment -

          Integrated in hudson_main_trunk #263
          [FIXED JENKINS-5977] implemented a windowing mechanism to throttle the pipe usage, and the pipe operation no longer interferes with the main channel reader thread.

          kohsuke :
          Files :

          • /trunk/hudson/main/remoting/src/test/java/hudson/remoting/ChannelRunner.java
          • /trunk/hudson/main/remoting/src/test/java/hudson/remoting/RmiTestBase.java
          • /trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java
          • /trunk/hudson/main/remoting/src/test/java/hudson/remoting/WithRunner.java
          • /trunk/hudson/main/remoting/src/main/java/hudson/remoting/PipeWindow.java
          • /trunk/hudson/main/remoting/src/main/java/hudson/remoting/Pipe.java
          • /trunk/hudson/main/remoting/src/main/java/hudson/remoting/ProxyOutputStream.java
          • /trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java
          • /trunk/hudson/main/remoting/src/main/java/hudson/remoting/SynchronousExecutorService.java
          • /trunk/hudson/main/remoting/src/main/java/hudson/remoting/Capability.java
          Show
          dogfood dogfood added a comment - Integrated in hudson_main_trunk #263 [FIXED JENKINS-5977] implemented a windowing mechanism to throttle the pipe usage, and the pipe operation no longer interferes with the main channel reader thread. kohsuke : Files : /trunk/hudson/main/remoting/src/test/java/hudson/remoting/ChannelRunner.java /trunk/hudson/main/remoting/src/test/java/hudson/remoting/RmiTestBase.java /trunk/hudson/main/remoting/src/main/java/hudson/remoting/Channel.java /trunk/hudson/main/remoting/src/test/java/hudson/remoting/WithRunner.java /trunk/hudson/main/remoting/src/main/java/hudson/remoting/PipeWindow.java /trunk/hudson/main/remoting/src/main/java/hudson/remoting/Pipe.java /trunk/hudson/main/remoting/src/main/java/hudson/remoting/ProxyOutputStream.java /trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java /trunk/hudson/main/remoting/src/main/java/hudson/remoting/SynchronousExecutorService.java /trunk/hudson/main/remoting/src/main/java/hudson/remoting/Capability.java
          Hide
          glundh glundh added a comment -

          Reopening.

          Issue seems to have worsened after the new patch.

          For instance "Archiving Artifacts" hangs far more often, but we do also see hangs during other phases. Sometimes in the middle of a build.

          I'm attaching two files containing thread dumps. The thread dumps was collected during two different occasions where jobs/slaves has hanged:

          During Archiving of artifacts
          During execution of the Emma publisher

          Each file contains two thread-dumps: From both the master and the offending slave.

          Br
          Gustaf

          Show
          glundh glundh added a comment - Reopening. Issue seems to have worsened after the new patch. For instance "Archiving Artifacts" hangs far more often, but we do also see hangs during other phases. Sometimes in the middle of a build. I'm attaching two files containing thread dumps. The thread dumps was collected during two different occasions where jobs/slaves has hanged: During Archiving of artifacts During execution of the Emma publisher Each file contains two thread-dumps: From both the master and the offending slave. Br Gustaf
          Hide
          glundh glundh added a comment - - edited

          1. 34838_hang_on_archiving.txt (18 kB)
          2. 34838_hang_on_emma_publisher.txt (11 kB)

          Show
          glundh glundh added a comment - - edited 1. 34838_hang_on_archiving.txt (18 kB) 2. 34838_hang_on_emma_publisher.txt (11 kB)
          Hide
          olamy olamy added a comment -

          here a new jstack.
          This issue made 1.378 unusable.

          Show
          olamy olamy added a comment - here a new jstack. This issue made 1.378 unusable.
          Hide
          mhaller1979 mhaller1979 added a comment -

          Same here, reverted back to 1.377 solved the problem of hanging maven processes on all our slaves.
          I'm unable to provide stack traces, as jstack was unable to attach to the maven processes on the slaves (even with -F)

          strace did not reveal much information either, the slave processes continously logged the following:

          16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1 <unfinished ...>
          16060 <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
          16060 futex(0x4016b028, 0x81 /* FUTEX_??? */, 1) = 0
          16060 futex(0x7f6804e315e4, 0x80 /* FUTEX_??? */, 1 <unfinished ...>
          16067 <... futex resumed> )             = -1 ETIMEDOUT (Connection timed out)
          16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out)
          16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out)
          16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out)
          16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out)
          16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out)
          16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          
          Show
          mhaller1979 mhaller1979 added a comment - Same here, reverted back to 1.377 solved the problem of hanging maven processes on all our slaves. I'm unable to provide stack traces, as jstack was unable to attach to the maven processes on the slaves (even with -F) strace did not reveal much information either, the slave processes continously logged the following: 16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1 <unfinished ...> 16060 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 16060 futex(0x4016b028, 0x81 /* FUTEX_??? */, 1) = 0 16060 futex(0x7f6804e315e4, 0x80 /* FUTEX_??? */, 1 <unfinished ...> 16067 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0 16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out) 16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0 16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out) 16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0 16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out) 16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0 16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out) 16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0 16067 futex(0x7f680d4c1a44, 0x80 /* FUTEX_??? */, 1) = -1 ETIMEDOUT (Connection timed out) 16067 futex(0x401a3428, 0x81 /* FUTEX_??? */, 1) = 0
          Hide
          robertredd robertredd added a comment -

          I had the same problem here with a timeout occurring on the Maven process, the archive artifacts process, and the publishing cobertura step in different cases on all nodes both Windows and Linux.

          Show
          robertredd robertredd added a comment - I had the same problem here with a timeout occurring on the Maven process, the archive artifacts process, and the publishing cobertura step in different cases on all nodes both Windows and Linux.
          Hide
          pgweiss pgweiss added a comment -

          FWIW, I've reverted to trunk svn 34836, i.e. right before Kohsuke's patch, and I've applied GLundh's deadlockpatch.txt. For the past day, all my jobs with really big artifacts, that previously were hanging, are now operating flawlessly.

          Show
          pgweiss pgweiss added a comment - FWIW, I've reverted to trunk svn 34836, i.e. right before Kohsuke's patch, and I've applied GLundh's deadlockpatch.txt. For the past day, all my jobs with really big artifacts, that previously were hanging, are now operating flawlessly.
          Hide
          skybird Florian Rosenauer added a comment - - edited

          It seems it has side-effects on CVS on our slaves too (Master: Linux, 2 slaves with Linux). After the CVS process is finshed, nothing happens at the Hudson Console output, it hangs. This happens every three or four jobs.
          Reverting back from 1.378 to 1.377 resolves it.

          Show
          skybird Florian Rosenauer added a comment - - edited It seems it has side-effects on CVS on our slaves too (Master: Linux, 2 slaves with Linux). After the CVS process is finshed, nothing happens at the Hudson Console output, it hangs. This happens every three or four jobs. Reverting back from 1.378 to 1.377 resolves it.
          Hide
          gorrus gorrus added a comment -

          Experiencing build timeouts on different slaves with 1.378, everything is back to normal with rollback to 1.377. Please fix.

          Show
          gorrus gorrus added a comment - Experiencing build timeouts on different slaves with 1.378, everything is back to normal with rollback to 1.377. Please fix.
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: : kohsuke
          Path:
          branches/rc/remoting/src/main/java/hudson/remoting/Channel.java
          branches/rc/remoting/src/main/java/hudson/remoting/PipeWindow.java
          branches/rc/remoting/src/main/java/hudson/remoting/ProxyOutputStream.java
          branches/rc/remoting/src/test/java/hudson/remoting/PipeTest.java
          trunk/www/changelog.html
          http://jenkins-ci.org/commit/35460
          Log:
          [FIXED JENKINS-5977 JENKINS-7572] the incorrect use of WeakHashMap resulted in two instances of PipeWindows for the same OID.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in hudson User: : kohsuke Path: branches/rc/remoting/src/main/java/hudson/remoting/Channel.java branches/rc/remoting/src/main/java/hudson/remoting/PipeWindow.java branches/rc/remoting/src/main/java/hudson/remoting/ProxyOutputStream.java branches/rc/remoting/src/test/java/hudson/remoting/PipeTest.java trunk/www/changelog.html http://jenkins-ci.org/commit/35460 Log: [FIXED JENKINS-5977 JENKINS-7572] the incorrect use of WeakHashMap resulted in two instances of PipeWindows for the same OID.
          Hide
          fpavageau fpavageau added a comment -

          I tried 2 builds with 1.379, which didn't get stuck, however I'm getting lots of output both in Tomcat's logs and in the build logs:

          Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real increase
          INFO: increase(11,151)->129768
          Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real increase
          INFO: increase(11,1304)->131072
          Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real decrease
          INFO: decrease(11,1339)->129733
          Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real increase
          INFO: increase(11,1339)->131072
          

          I guess I'm staying with 1.377 for the time being, even though I really like the new maven logs starting with 1.378.

          Show
          fpavageau fpavageau added a comment - I tried 2 builds with 1.379, which didn't get stuck, however I'm getting lots of output both in Tomcat's logs and in the build logs: Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real increase INFO: increase(11,151)->129768 Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real increase INFO: increase(11,1304)->131072 Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real decrease INFO: decrease(11,1339)->129733 Oct 2, 2010 11:10:54 PM hudson.remoting.PipeWindow$Real increase INFO: increase(11,1339)->131072 I guess I'm staying with 1.377 for the time being, even though I really like the new maven logs starting with 1.378.
          Hide
          esmalling Eric Smalling added a comment -

          I too am seeing the increase/decrease logs in 1.377. Too bad too since I need 1.379 to fix JENKINS-7546

          Show
          esmalling Eric Smalling added a comment - I too am seeing the increase/decrease logs in 1.377. Too bad too since I need 1.379 to fix JENKINS-7546
          Hide
          esmalling Eric Smalling added a comment -

          Unable to test JENKINS-7546 due to this issue in 1.379

          Show
          esmalling Eric Smalling added a comment - Unable to test JENKINS-7546 due to this issue in 1.379
          Hide
          swpalmer swpalmer added a comment - - edited

          I'm seeing the following exceptions after a successful build with Hudson 1.379. They seem to be related to the problems with the pipes to the slave nodes.

          FATAL: command execution failed
          hudson.util.IOException2: Failed to join the process
          at hudson.Proc$RemoteProc.join(Proc.java:355)
          at hudson.Launcher$ProcStarter.join(Launcher.java:280)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
          at hudson.model.Build$RunnerImpl.build(Build.java:174)
          at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
          at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
          at hudson.model.Run.run(Run.java:1280)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:137)
          Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          at hudson.remoting.Request$1.get(Request.java:218)
          at hudson.remoting.Request$1.get(Request.java:172)
          at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
          at hudson.Proc$RemoteProc.join(Proc.java:347)
          ... 12 more
          Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
          at hudson.remoting.Request.abort(Request.java:257)
          at hudson.remoting.Channel.terminate(Channel.java:681)
          at hudson.remoting.Channel$ReaderThread.run(Channel.java:973)
          Caused by: java.net.SocketException: Connection reset
          at java.net.SocketInputStream.read(SocketInputStream.java:168)
          at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
          at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
          at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
          at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
          at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
          at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
          FATAL: Unable to delete script file C:\DOCUME~1\Hudson\LOCALS~1\Temp\hudson8908395142501843194.bat
          hudson.util.IOException2: remote file operation failed: C:\DOCUME~1\Hudson\LOCALS~1\Temp\hudson8908395142501843194.bat at hudson.remoting.Channel@15509e1:DCM-IG-01
          at hudson.FilePath.act(FilePath.java:749)
          at hudson.FilePath.act(FilePath.java:735)
          at hudson.FilePath.delete(FilePath.java:990)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
          at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
          at hudson.model.Build$RunnerImpl.build(Build.java:174)
          at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
          at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
          at hudson.model.Run.run(Run.java:1280)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:137)
          Caused by: hudson.remoting.ChannelClosedException: channel is already closed
          at hudson.remoting.Channel.send(Channel.java:467)
          at hudson.remoting.Request.call(Request.java:105)
          at hudson.remoting.Channel.call(Channel.java:630)
          at hudson.FilePath.act(FilePath.java:742)
          ... 13 more
          Caused by: java.net.SocketException: Connection reset
          at java.net.SocketInputStream.read(SocketInputStream.java:168)
          at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
          at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
          at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
          at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
          at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
          at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
          FATAL: channel is already closed
          hudson.remoting.ChannelClosedException: channel is already closed
          at hudson.remoting.Channel.send(Channel.java:467)
          at hudson.remoting.Request.call(Request.java:105)
          at hudson.remoting.Channel.call(Channel.java:630)
          at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
          at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
          at hudson.model.Run.run(Run.java:1280)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:137)
          Caused by: java.net.SocketException: Connection reset
          at java.net.SocketInputStream.read(SocketInputStream.java:168)
          at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
          at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
          at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
          at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
          at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
          at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
          Page generated: Oct 5, 2010 9:11:10 AMHudson ver. 1.379

          Show
          swpalmer swpalmer added a comment - - edited I'm seeing the following exceptions after a successful build with Hudson 1.379. They seem to be related to the problems with the pipes to the slave nodes. FATAL: command execution failed hudson.util.IOException2: Failed to join the process at hudson.Proc$RemoteProc.join(Proc.java:355) at hudson.Launcher$ProcStarter.join(Launcher.java:280) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19) at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601) at hudson.model.Build$RunnerImpl.build(Build.java:174) at hudson.model.Build$RunnerImpl.doRun(Build.java:138) at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416) at hudson.model.Run.run(Run.java:1280) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:137) Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset at hudson.remoting.Request$1.get(Request.java:218) at hudson.remoting.Request$1.get(Request.java:172) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55) at hudson.Proc$RemoteProc.join(Proc.java:347) ... 12 more Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset at hudson.remoting.Request.abort(Request.java:257) at hudson.remoting.Channel.terminate(Channel.java:681) at hudson.remoting.Channel$ReaderThread.run(Channel.java:973) Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:168) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249) at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542) at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) at hudson.remoting.Channel$ReaderThread.run(Channel.java:948) FATAL: Unable to delete script file C:\DOCUME~1\Hudson\LOCALS~1\Temp\hudson8908395142501843194.bat hudson.util.IOException2: remote file operation failed: C:\DOCUME~1\Hudson\LOCALS~1\Temp\hudson8908395142501843194.bat at hudson.remoting.Channel@15509e1:DCM-IG-01 at hudson.FilePath.act(FilePath.java:749) at hudson.FilePath.act(FilePath.java:735) at hudson.FilePath.delete(FilePath.java:990) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19) at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601) at hudson.model.Build$RunnerImpl.build(Build.java:174) at hudson.model.Build$RunnerImpl.doRun(Build.java:138) at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416) at hudson.model.Run.run(Run.java:1280) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:137) Caused by: hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:467) at hudson.remoting.Request.call(Request.java:105) at hudson.remoting.Channel.call(Channel.java:630) at hudson.FilePath.act(FilePath.java:742) ... 13 more Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:168) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249) at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542) at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) at hudson.remoting.Channel$ReaderThread.run(Channel.java:948) FATAL: channel is already closed hudson.remoting.ChannelClosedException: channel is already closed at hudson.remoting.Channel.send(Channel.java:467) at hudson.remoting.Request.call(Request.java:105) at hudson.remoting.Channel.call(Channel.java:630) at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744) at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421) at hudson.model.Run.run(Run.java:1280) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:137) Caused by: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:168) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249) at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542) at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) at hudson.remoting.Channel$ReaderThread.run(Channel.java:948) Page generated: Oct 5, 2010 9:11:10 AMHudson ver. 1.379
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          Logging issue is covered in JENKINS-7662 and is already fixed. The connection drop problem is not a deadlock problem, so please file a separate issue for that. Thanks!

          Show
          kohsuke Kohsuke Kawaguchi added a comment - Logging issue is covered in JENKINS-7662 and is already fixed. The connection drop problem is not a deadlock problem, so please file a separate issue for that. Thanks!
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: alanharder
          Path:
          src/main/java/hudson/plugins/copyartifact/FilePathCopyMethod.java
          http://jenkins-ci.org/commit/copyartifact-plugin/8b2ceaaeb1b58f6ab075e5cb260691c36e25fca4
          Log:
          Remove workaround code for JENKINS-5977 now that this issue is fixed
          in the minimum core version required by this plugin.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: alanharder Path: src/main/java/hudson/plugins/copyartifact/FilePathCopyMethod.java http://jenkins-ci.org/commit/copyartifact-plugin/8b2ceaaeb1b58f6ab075e5cb260691c36e25fca4 Log: Remove workaround code for JENKINS-5977 now that this issue is fixed in the minimum core version required by this plugin.

            People

            • Assignee:
              kohsuke Kohsuke Kawaguchi
              Reporter:
              kohsuke Kohsuke Kawaguchi
            • Votes:
              27 Vote for this issue
              Watchers:
              26 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: