Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-7809

Remote Launcher randomly returns no data.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core
    • Labels:
      None
    • Environment:
      Hudson 1.378-1.383

      Description

      Since 1.378, the remoting functions of hudson have become unstable. When using a Launcher to execute commands remotely on a slave, sometimes it fails to return any data. The following stack trace is found in the hudson log when this occurs.

      Oct 15, 2010 10:12:17 AM hudson.remoting.ProxyOutputStream$Chunk$1 run
      WARNING: Failed to write to stream
      java.io.IOException: Pipe closed
              at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:244)
              at java.io.PipedInputStream.receive(PipedInputStream.java:210)
              at java.io.PipedOutputStream.write(PipedOutputStream.java:132)
              at java.io.OutputStream.write(OutputStream.java:58)
              at hudson.util.DelegatingOutputStream.write(DelegatingOutputStream.java:51)
              at hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:185)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
              at java.util.concurrent.FutureTask.run(FutureTask.java:138)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:619)
      

      This is currently causing issues with the perforce plugin (JENKINS-7664), and I've been able to reproduce it using the hudson_main_trunk#331 build on ci.hudson-labs.org as well.

        Issue Links

          Activity

          rpetti Rob Petti created issue -
          rpetti Rob Petti made changes -
          Field Original Value New Value
          Link This issue is blocking JENKINS-7664 [ JENKINS-7664 ]
          rpetti Rob Petti made changes -
          Priority Major [ 3 ] Blocker [ 1 ]
          Hide
          torbent torbent added a comment - - edited

          Oh, so that's what's causing all those warnings?
          I was just about to submit an issue on them
          I'm the submitter of JENKINS-7664 so I can, at least anecdotally, confirm the connection.

          We're still running 1.379, by the way, so it's not limited to 1.380

          Show
          torbent torbent added a comment - - edited Oh, so that's what's causing all those warnings? I was just about to submit an issue on them I'm the submitter of JENKINS-7664 so I can, at least anecdotally, confirm the connection. We're still running 1.379, by the way, so it's not limited to 1.380
          Hide
          rpetti Rob Petti added a comment -

          Updated the description to make it more clear which versions are affected. I've also tested with 1.381, and the problem still exists.

          Show
          rpetti Rob Petti added a comment - Updated the description to make it more clear which versions are affected. I've also tested with 1.381, and the problem still exists.
          rpetti Rob Petti made changes -
          Environment Hudson 1.380 Hudson 1.378-1.381
          Hide
          torbent torbent added a comment -

          I've seen these (or like) messages appear in the log around times where the Cobertura plugin fails. It transfers the XML files from the slave(s) to the master for processing, but somehow one (or more) files end up too short and can no longer be parsed as valid xml.
          The error message explicitly mentions a filename on the master, not the slave. When inspected, the file is indeed incomplete.

          (Just to state that it's not only affecting the perforce plugin)

          Show
          torbent torbent added a comment - I've seen these (or like) messages appear in the log around times where the Cobertura plugin fails. It transfers the XML files from the slave(s) to the master for processing, but somehow one (or more) files end up too short and can no longer be parsed as valid xml. The error message explicitly mentions a filename on the master, not the slave. When inspected, the file is indeed incomplete. (Just to state that it's not only affecting the perforce plugin)
          Hide
          torbent torbent added a comment -

          I believe these are related and mimic my observations on Cobertura.

          Show
          torbent torbent added a comment - I believe these are related and mimic my observations on Cobertura.
          torbent torbent made changes -
          Link This issue is related to JENKINS-7836 [ JENKINS-7836 ]
          torbent torbent made changes -
          Link This issue is related to JENKINS-7871 [ JENKINS-7871 ]
          Hide
          rpetti Rob Petti added a comment -

          Still seeing this behavior in 1.383.

          Show
          rpetti Rob Petti added a comment - Still seeing this behavior in 1.383.
          rpetti Rob Petti made changes -
          Environment Hudson 1.378-1.381 Hudson 1.378-1.383
          Hide
          lkishalmi lkishalmi added a comment -

          This issue seems to be caused by fixing JENKINS-5977.
          However JENKINS-5977 solved our overall Hudson master/slave communications made our Hudson grid stable we are experiencing this bug as well.

          Show
          lkishalmi lkishalmi added a comment - This issue seems to be caused by fixing JENKINS-5977 . However JENKINS-5977 solved our overall Hudson master/slave communications made our Hudson grid stable we are experiencing this bug as well.
          lkishalmi lkishalmi made changes -
          Link This issue is related to JENKINS-5977 [ JENKINS-5977 ]
          ickersep ickersep made changes -
          Link This issue is blocking JENKINS-8064 [ JENKINS-8064 ]
          Hide
          rpetti Rob Petti added a comment -

          We've gotten around our issue by flushing the pipe before closing it, since the new threading model for remote channels seems to close it immediately even when there's still stuff in the pipe. This problem is still affecting other plugins, it seems, so I'll leave the issue open.

          Show
          rpetti Rob Petti added a comment - We've gotten around our issue by flushing the pipe before closing it, since the new threading model for remote channels seems to close it immediately even when there's still stuff in the pipe. This problem is still affecting other plugins, it seems, so I'll leave the issue open.
          Hide
          kohsuke Kohsuke Kawaguchi added a comment - - edited

          @rpetti
          I checked the code and the close doesn't happen immediately — it gets queued in the same way as the rest of the I/O operations. I believe your flush fix refers to JENKINS-7664, but the reporter responded that that didn't work.

          So I doubt if this is caused by close() failing to report data buffered earlier.

          Still looking into the cause.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - - edited @rpetti I checked the code and the close doesn't happen immediately — it gets queued in the same way as the rest of the I/O operations. I believe your flush fix refers to JENKINS-7664 , but the reporter responded that that didn't work. So I doubt if this is caused by close() failing to report data buffered earlier. Still looking into the cause.
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: : kohsuke
          Path:
          trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java
          http://jenkins-ci.org/commit/37070
          Log:
          added a test in an attempt to reproduce / pin-point JENKINS-7809 but this was not it.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in hudson User: : kohsuke Path: trunk/hudson/main/remoting/src/test/java/hudson/remoting/PipeTest.java http://jenkins-ci.org/commit/37070 Log: added a test in an attempt to reproduce / pin-point JENKINS-7809 but this was not it.
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          Is anyone seeing this outside the perforce plugin?

          I notice that the perforce plugin does something bit different with stream remoting.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - Is anyone seeing this outside the perforce plugin? I notice that the perforce plugin does something bit different with stream remoting.
          Hide
          fpavageau fpavageau added a comment -

          Yes, it's happening with the Cobertura plugin as well, where the coverage.xml files end up truncated on master. Cf. JENKINS-7836.

          Show
          fpavageau fpavageau added a comment - Yes, it's happening with the Cobertura plugin as well, where the coverage.xml files end up truncated on master. Cf. JENKINS-7836 .
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          JENKINS-7664 is updated with my proposed change.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - JENKINS-7664 is updated with my proposed change.
          Hide
          hquiroz hquiroz added a comment -

          It's happening with Hudson 1.387 as well. We have trouble with the Cobertura report when the build was remote.

          Show
          hquiroz hquiroz added a comment - It's happening with Hudson 1.387 as well. We have trouble with the Cobertura report when the build was remote.
          mindless Alan Harder made changes -
          Link This issue is related to JENKINS-7745 [ JENKINS-7745 ]
          rpetti Rob Petti made changes -
          Link This issue is blocking JENKINS-8245 [ JENKINS-8245 ]
          Hide
          hbjastad hbjastad added a comment -

          It's happening with Hudson 1.388 as well. We have trouble with various parts of the build when the build is remote:

          • The Cobertura report
          • site:deploy
          • Maven site archiving

          Had to revert back to 1.377

          In case it is of any help, here is a stack trace:

          java.lang.NegativeArraySizeException
          	at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175)
          	at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123)
          	at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103)
          	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
          	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
          	at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:161)
          	at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:118)
          	at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:72)
          	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
          	at org.apache.tools.tar.TarBuffer.writeBlock(TarBuffer.java:410)
          	at org.apache.tools.tar.TarBuffer.writeRecord(TarBuffer.java:351)
          	at hudson.org.apache.tools.tar.TarOutputStream.writeEOFRecord(TarOutputStream.java:356)
          	at hudson.org.apache.tools.tar.TarOutputStream.finish(TarOutputStream.java:137)
          	at hudson.org.apache.tools.tar.TarOutputStream.close(TarOutputStream.java:149)
          	at hudson.util.io.TarArchiver.close(TarArchiver.java:114)
          	at hudson.FilePath.writeToTar(FilePath.java:1545)
          	at hudson.FilePath.copyRecursiveTo(FilePath.java:1470)
          	at hudson.FilePath.copyRecursiveTo(FilePath.java:1399)
          	at hudson.maven.reporters.MavenSiteArchiver.postExecute(MavenSiteArchiver.java:81)
          	at hudson.maven.MavenModuleSetBuild$Builder.postExecute(MavenModuleSetBuild.java:822)
          	at hudson.maven.MavenBuilder$Adapter.postExecute(MavenBuilder.java:300)
          	at hudson.maven.agent.PluginManagerInterceptor$1MojoIntercepterImpl.callPost(PluginManagerInterceptor.java:170)
          	at hudson.maven.agent.PluginManagerInterceptor.executeMojo(PluginManagerInterceptor.java:183)
          	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:559)
          	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:513)
          	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:483)
          	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:331)
          	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:292)
          	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
          	at org.apache.maven.lifecycle.LifecycleExecutorInterceptor.execute(LifecycleExecutorInterceptor.java:65)
          	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
          	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
          	at org.apache.maven.cli.MavenCli.main(MavenCli.java:301)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
          	at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
          	at hudson.maven.agent.Main.launch(Main.java:165)
          	at hudson.maven.MavenBuilder.call(MavenBuilder.java:165)
          	at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:744)
          	at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:688)
          	at hudson.remoting.UserRequest.perform(UserRequest.java:114)
          	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
          	at hudson.remoting.Request$2.run(Request.java:270)
          	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
          	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          	at java.lang.Thread.run(Thread.java:662)
          
          Show
          hbjastad hbjastad added a comment - It's happening with Hudson 1.388 as well. We have trouble with various parts of the build when the build is remote: The Cobertura report site:deploy Maven site archiving Had to revert back to 1.377 In case it is of any help, here is a stack trace: java.lang.NegativeArraySizeException at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175) at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123) at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109) at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:161) at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:118) at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:72) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) at org.apache.tools.tar.TarBuffer.writeBlock(TarBuffer.java:410) at org.apache.tools.tar.TarBuffer.writeRecord(TarBuffer.java:351) at hudson.org.apache.tools.tar.TarOutputStream.writeEOFRecord(TarOutputStream.java:356) at hudson.org.apache.tools.tar.TarOutputStream.finish(TarOutputStream.java:137) at hudson.org.apache.tools.tar.TarOutputStream.close(TarOutputStream.java:149) at hudson.util.io.TarArchiver.close(TarArchiver.java:114) at hudson.FilePath.writeToTar(FilePath.java:1545) at hudson.FilePath.copyRecursiveTo(FilePath.java:1470) at hudson.FilePath.copyRecursiveTo(FilePath.java:1399) at hudson.maven.reporters.MavenSiteArchiver.postExecute(MavenSiteArchiver.java:81) at hudson.maven.MavenModuleSetBuild$Builder.postExecute(MavenModuleSetBuild.java:822) at hudson.maven.MavenBuilder$Adapter.postExecute(MavenBuilder.java:300) at hudson.maven.agent.PluginManagerInterceptor$1MojoIntercepterImpl.callPost(PluginManagerInterceptor.java:170) at hudson.maven.agent.PluginManagerInterceptor.executeMojo(PluginManagerInterceptor.java:183) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:559) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:513) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:483) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:331) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:292) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142) at org.apache.maven.lifecycle.LifecycleExecutorInterceptor.execute(LifecycleExecutorInterceptor.java:65) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129) at org.apache.maven.cli.MavenCli.main(MavenCli.java:301) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315) at org.codehaus.classworlds.Launcher.launch(Launcher.java:255) at hudson.maven.agent.Main.launch(Main.java:165) at hudson.maven.MavenBuilder.call(MavenBuilder.java:165) at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:744) at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:688) at hudson.remoting.UserRequest.perform(UserRequest.java:114) at hudson.remoting.UserRequest.perform(UserRequest.java:48) at hudson.remoting.Request$2.run(Request.java:270) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)
          hudsonfsc hudsonfsc made changes -
          Assignee hudsonfsc [ hudsonfsc ]
          hudsonfsc hudsonfsc made changes -
          Assignee hudsonfsc [ hudsonfsc ] kohsuke [ kohsuke ]
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          I wonder if this was a different manifestation of JENKINS-7745, which we fixed in 1.397. I encourage people to upgrade to 1.397 and report their results.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - I wonder if this was a different manifestation of JENKINS-7745 , which we fixed in 1.397. I encourage people to upgrade to 1.397 and report their results.
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          remoting/src/main/java/hudson/remoting/PipeWindow.java
          http://jenkins-ci.org/commit/core/7b40e2e091db073df18a27d27e2019a290fa3bee
          Log:
          being more defensive, in seeing JENKINS-7809 and the following stack trace:

          java.lang.NegativeArraySizeException
          at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175)
          at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123)
          at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103)

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: remoting/src/main/java/hudson/remoting/PipeWindow.java http://jenkins-ci.org/commit/core/7b40e2e091db073df18a27d27e2019a290fa3bee Log: being more defensive, in seeing JENKINS-7809 and the following stack trace: java.lang.NegativeArraySizeException at hudson.remoting.ProxyOutputStream$Chunk.<init>(ProxyOutputStream.java:175) at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:123) at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:103)
          Hide
          rpetti Rob Petti added a comment -

          I was able to reproduce JENKINS-7664 with that fix, so that doesn't seem to be the cause.

          In the perforce plugin, we are joining the RemoteProc created by the RemoteLauncher, and assuming that all the data has made it through the channel at that point. Unfortunately this doesn't seem to be the case, and occasionally you can run into a race condition where the process is joined, but there are still Chunks on their way through the Channel. Given that we close the stream right after the join, this generates exceptions and missing/truncated data.

          IMO, joining a RemoteProc should ensure that ALL data generated by that process has successfully made it through the channel before terminating. In the meantime, I've reimplemented RemoteLauncher in the perforce plugin to close the pipes on the slave end when it's LocalLauncher is finished. This will ensure that all the data makes it through (since the EOF is the last thing to be sent).

          What I can't figure out is why this doesn't seem to be affecting things like build task logs... We're closing the stream as fast as possible, but perhaps the build listener normally stays open for longer periods?

          Show
          rpetti Rob Petti added a comment - I was able to reproduce JENKINS-7664 with that fix, so that doesn't seem to be the cause. In the perforce plugin, we are joining the RemoteProc created by the RemoteLauncher, and assuming that all the data has made it through the channel at that point. Unfortunately this doesn't seem to be the case, and occasionally you can run into a race condition where the process is joined, but there are still Chunks on their way through the Channel. Given that we close the stream right after the join, this generates exceptions and missing/truncated data. IMO, joining a RemoteProc should ensure that ALL data generated by that process has successfully made it through the channel before terminating. In the meantime, I've reimplemented RemoteLauncher in the perforce plugin to close the pipes on the slave end when it's LocalLauncher is finished. This will ensure that all the data makes it through (since the EOF is the last thing to be sent). What I can't figure out is why this doesn't seem to be affecting things like build task logs... We're closing the stream as fast as possible, but perhaps the build listener normally stays open for longer periods?
          Hide
          kohsuke Kohsuke Kawaguchi added a comment -

          OK, I think I understand what's going on here.

          In Jenkins, the process I/O goes the opposite direction — we provide OutputStream that the process writes its stdout to, as opposed to drain the stream via read. And it doesn't close the stream that it writes to when the process has closed its stdout to indicate EOF. So the plugin like perforce that needs to read from stdout gets no EOF signal in the stream.

          The only way out in the current design is to close the output stream after joining the process, which is what you do in HudsonPipedOutputStream. But this only works if the data has been indeed written to the output stream before the join successfully returns.

          This is the case for local execution, as Proc.join() internally waits for the stream pumping threads to complete, but for remote execution, this isn't a guarantee enough — it merely means that the last bits of data has left the remote end to start its journey, but it still involves additional steps before the data actually gets written to locally exported OutputStream, hence the race condition.

          So the short term fix is to ensure that RemoteProc.join() does make sure that all the data arrived locally before returning. The long term proper fix is to allow code to read directly from stdout/stderr as InputStream, thereby eliminating this pseudo-EOF business.

          Show
          kohsuke Kohsuke Kawaguchi added a comment - OK, I think I understand what's going on here. In Jenkins, the process I/O goes the opposite direction — we provide OutputStream that the process writes its stdout to, as opposed to drain the stream via read. And it doesn't close the stream that it writes to when the process has closed its stdout to indicate EOF. So the plugin like perforce that needs to read from stdout gets no EOF signal in the stream. The only way out in the current design is to close the output stream after joining the process, which is what you do in HudsonPipedOutputStream. But this only works if the data has been indeed written to the output stream before the join successfully returns. This is the case for local execution, as Proc.join() internally waits for the stream pumping threads to complete, but for remote execution, this isn't a guarantee enough — it merely means that the last bits of data has left the remote end to start its journey, but it still involves additional steps before the data actually gets written to locally exported OutputStream, hence the race condition. So the short term fix is to ensure that RemoteProc.join() does make sure that all the data arrived locally before returning. The long term proper fix is to allow code to read directly from stdout/stderr as InputStream, thereby eliminating this pseudo-EOF business.
          Hide
          rpetti Rob Petti added a comment -

          No longer a blocking issue, as a work-around has been found.

          Show
          rpetti Rob Petti added a comment - No longer a blocking issue, as a work-around has been found.
          rpetti Rob Petti made changes -
          Priority Blocker [ 1 ] Major [ 3 ]
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          core/src/main/java/hudson/Launcher.java
          core/src/main/java/hudson/Proc.java
          remoting/src/main/java/hudson/remoting/Channel.java
          test/src/test/java/hudson/ProcTest.java
          http://jenkins-ci.org/commit/core/3a3690b28135287ed78d6a966a5632a16277f05b
          Log:
          [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: changelog.html core/src/main/java/hudson/Launcher.java core/src/main/java/hudson/Proc.java remoting/src/main/java/hudson/remoting/Channel.java test/src/test/java/hudson/ProcTest.java http://jenkins-ci.org/commit/core/3a3690b28135287ed78d6a966a5632a16277f05b Log: [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process.
          scm_issue_link SCM/JIRA link daemon made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          dogfood dogfood added a comment -

          Integrated in jenkins_main_trunk #514
          [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process.

          Kohsuke Kawaguchi :
          Files :

          • test/src/test/java/hudson/ProcTest.java
          • core/src/main/java/hudson/Proc.java
          • changelog.html
          • core/src/main/java/hudson/Launcher.java
          • remoting/src/main/java/hudson/remoting/Channel.java
          Show
          dogfood dogfood added a comment - Integrated in jenkins_main_trunk #514 [FIXED JENKINS-7809] fixed a race condition in obtaining the tail of the output from remote process. Kohsuke Kawaguchi : Files : test/src/test/java/hudson/ProcTest.java core/src/main/java/hudson/Proc.java changelog.html core/src/main/java/hudson/Launcher.java remoting/src/main/java/hudson/remoting/Channel.java
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          changelog.html
          core/src/main/java/hudson/Launcher.java
          core/src/main/java/hudson/Proc.java
          core/src/main/java/hudson/os/windows/WindowsRemoteLauncher.java
          core/src/test/java/hudson/LauncherTest.java
          remoting/src/main/java/hudson/remoting/RemoteInputStream.java
          test/src/main/java/org/jvnet/hudson/test/FakeLauncher.java
          test/src/test/java/hudson/ProcTest.java
          http://jenkins-ci.org/commit/core/96dd84b5d0d37e1765c3615bcba048eb62e38ad6
          Log:
          [FIXED JENKINS-7809] expose child process stdout as InputStream.

          I added a mode where one can launch a child process without the built-in pumping support, making it trivial to read until EOF in the distributed environments.

          This concludes the fix.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: changelog.html core/src/main/java/hudson/Launcher.java core/src/main/java/hudson/Proc.java core/src/main/java/hudson/os/windows/WindowsRemoteLauncher.java core/src/test/java/hudson/LauncherTest.java remoting/src/main/java/hudson/remoting/RemoteInputStream.java test/src/main/java/org/jvnet/hudson/test/FakeLauncher.java test/src/test/java/hudson/ProcTest.java http://jenkins-ci.org/commit/core/96dd84b5d0d37e1765c3615bcba048eb62e38ad6 Log: [FIXED JENKINS-7809] expose child process stdout as InputStream. I added a mode where one can launch a child process without the built-in pumping support, making it trivial to read until EOF in the distributed environments. This concludes the fix.
          Hide
          dogfood dogfood added a comment -
          Show
          dogfood dogfood added a comment - Integrated in jenkins_main_trunk #517
          vlatombe Vincent Latombe made changes -
          Link This issue is blocking JENKINS-7800 [ JENKINS-7800 ]

            People

            • Assignee:
              kohsuke Kohsuke Kawaguchi
              Reporter:
              rpetti Rob Petti
            • Votes:
              37 Vote for this issue
              Watchers:
              38 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: