JENKINS-23043

Build hangs while copying artifacts to slave

      Description

      After upgrading from 1.553 to 1.563, builds on the slave hang for around 7-8 minutes when copying artifacts.
      Master and slave are running JDK 1.7.0_55.
      The slave is started via JNLP.

      Possibly related thread on master:
      "Executor #0 for DE-DD-0414 : executing srm_dev_unit_tests #2159" prio=6 tid=0x3afae400 nid=0x13e4 in Object.wait() [0x3db5f000]
         java.lang.Thread.State: WAITING (on object monitor)
          at java.lang.Object.wait(Native Method)
          at java.lang.Object.wait(Object.java:503)
          at org.jenkinsci.remoting.nio.FifoBuffer.write(FifoBuffer.java:336)
          - locked <0x16b61528> (a org.jenkinsci.remoting.nio.FifoBuffer)
          at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.writeBlock(NioChannelHub.java:196)
          at hudson.remoting.AbstractByteArrayCommandTransport.write(AbstractByteArrayCommandTransport.java:83)
          at hudson.remoting.Channel.send(Channel.java:545)
          - locked <0x16b43950> (a hudson.remoting.Channel)
          at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:163)
          - locked <0x16f1eb48> (a hudson.remoting.ProxyOutputStream)
          at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:109)
          at hudson.remoting.RemoteOutputStream.write(RemoteOutputStream.java:110)
          at java.security.DigestOutputStream.write(Unknown Source)
          at hudson.remoting.RemoteOutputStream.write(RemoteOutputStream.java:110)
          at hudson.Util.copyStream(Util.java:448)
          at hudson.FilePath$37.invoke(FilePath.java:1827)
          at hudson.FilePath$37.invoke(FilePath.java:1821)
          at hudson.FilePath.act(FilePath.java:920)
          at hudson.FilePath.act(FilePath.java:893)
          at hudson.FilePath.copyTo(FilePath.java:1821)
          at hudson.plugins.copyartifact.FingerprintingCopyMethod.copyOne(FingerprintingCopyMethod.java:79)
          at hudson.plugins.copyartifact.FingerprintingCopyMethod.copyAll(FingerprintingCopyMethod.java:68)
          at hudson.plugins.copyartifact.CopyArtifact.perform(CopyArtifact.java:368)
          at hudson.plugins.copyartifact.CopyArtifact.perform(CopyArtifact.java:306)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:745)
          at hudson.model.Build$BuildExecution.build(Build.java:198)
          at hudson.model.Build$BuildExecution.doRun(Build.java:159)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
          at hudson.model.Run.execute(Run.java:1706)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:231)

            Activity

            ffissore Federico Fissore added a comment -

            Have you found any workaround?

            svensteiniger Sven Steiniger added a comment -

            No, we didn't.
            Execution time of unit-tests jumped from 1 minute to 10..18 minutes.

            kfritsche Karl Fritsche added a comment -

            Did you try the newest version of Jenkins (1.566)?
            From the changelog it seems there were some problems with the remote client.

            svensteiniger Sven Steiniger added a comment -

            Yes, I tried it, but the effect is the same.
            Jenkins master and slave are running 1.7.0_55-b13 (master: 32-bit client VM, slave: 64-bit server VM).
            I also tried starting the slave with a 32-bit JVM, but the wait still occurs.

            Stack dumps are the same as in the initial bug report.

            stephenconnolly Stephen Connolly added a comment -

            I believe this is related to JENKINS-25218 on the basis of the following portion of the stack trace:

            "Executor #0 for DE-DD-0414 : executing srm_dev_unit_tests #2159" prio=6 tid=0x3afae400 nid=0x13e4 in Object.wait() [0x3db5f000]
               java.lang.Thread.State: WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	at java.lang.Object.wait(Object.java:503)
            	at org.jenkinsci.remoting.nio.FifoBuffer.write(FifoBuffer.java:336)
            	- locked <0x16b61528> (a org.jenkinsci.remoting.nio.FifoBuffer)
            	at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.writeBlock(NioChannelHub.java:196)
            	at hudson.remoting.AbstractByteArrayCommandTransport.write(AbstractByteArrayCommandTransport.java:83)
            	at hudson.remoting.Channel.send(Channel.java:545)
            	- locked <0x16b43950> (a hudson.remoting.Channel)
            	at hudson.remoting.ProxyOutputStream._write(ProxyOutputStream.java:163)
            	- locked <0x16f1eb48> (a hudson.remoting.ProxyOutputStream)
            	at hudson.remoting.ProxyOutputStream.write(ProxyOutputStream.java:109)
            	at hudson.remoting.RemoteOutputStream.write(RemoteOutputStream.java:110)
            	at java.security.DigestOutputStream.write(Unknown Source)
            	at hudson.remoting.RemoteOutputStream.write(RemoteOutputStream.java:110)
            	at hudson.Util.copyStream(Util.java:448)
            	at hudson.FilePath$37.invoke(FilePath.java:1827)
            	at hudson.FilePath$37.invoke(FilePath.java:1821)
            	at hudson.FilePath.act(FilePath.java:920)
            	at hudson.FilePath.act(FilePath.java:893)
            	at hudson.FilePath.copyTo(FilePath.java:1821)
            	at hudson.plugins.copyartifact.FingerprintingCopyMethod.copyOne(FingerprintingCopyMethod.java:79)
            	at hudson.plugins.copyartifact.FingerprintingCopyMethod.copyAll(FingerprintingCopyMethod.java:68)
            	at hudson.plugins.copyartifact.CopyArtifact.perform(CopyArtifact.java:368)
            	at hudson.plugins.copyartifact.CopyArtifact.perform(CopyArtifact.java:306)
            	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:745)
            	at hudson.model.Build$BuildExecution.build(Build.java:198)
            	at hudson.model.Build$BuildExecution.doRun(Build.java:159)
            	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
            	at hudson.model.Run.execute(Run.java:1706)
            	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
            	at hudson.model.ResourceController.execute(ResourceController.java:88)
            	at hudson.model.Executor.run(Executor.java:231)
            

            While the NioChannelHub is not blocked:

            "NioChannelHub keys=1 gen=1280943: Computer.threadPoolForRemoting [#2]" daemon prio=6 tid=0x3abda400 nid=0x12f0 runnable [0x39f6f000]
               java.lang.Thread.State: RUNNABLE
            	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
            	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(Unknown Source)
            	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(Unknown Source)
            	at sun.nio.ch.WindowsSelectorImpl.doSelect(Unknown Source)
            	at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
            	- locked <0x13e58190> (a sun.nio.ch.Util$2)
            	- locked <0x13e581a0> (a java.util.Collections$UnmodifiableSet)
            	- locked <0x13e58118> (a sun.nio.ch.WindowsSelectorImpl)
            	at sun.nio.ch.SelectorImpl.select(Unknown Source)
            	at sun.nio.ch.SelectorImpl.select(Unknown Source)
            	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:478)
            	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
            	at java.util.concurrent.FutureTask.run(Unknown Source)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
            	at java.lang.Thread.run(Unknown Source)
            

            I find it suspicious that the writing thread is stuck at the same point as the livelock in JENKINS-25218.

            It would be good if the original reporter could confirm whether the issue is still present and whether it is bypassed by disabling the NioHub transport.
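
For readers unfamiliar with this pattern, here is a minimal sketch of why a writer thread can sit in `Object.wait()` inside a buffer's `write` method, as the executor thread does in the dump above. It assumes a bounded buffer whose writer waits for free space until a reader drains it; `BoundedFifoSketch` is a hypothetical class written for illustration and is not the real `org.jenkinsci.remoting.nio.FifoBuffer`.

```java
import java.util.ArrayDeque;

/** Illustrative bounded byte FIFO; NOT the remoting library's FifoBuffer. */
final class BoundedFifoSketch {
    private final ArrayDeque<Byte> queue = new ArrayDeque<>();
    private final int capacity;

    BoundedFifoSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Blocks while the buffer is full; the caller then shows up as WAITING (on object monitor). */
    synchronized void write(byte b) throws InterruptedException {
        while (queue.size() >= capacity) {
            wait(); // released only when a reader calls notifyAll()
        }
        queue.addLast(b);
    }

    /** Removes one byte and wakes blocked writers; returns -1 if the buffer is empty. */
    synchronized int read() {
        Byte b = queue.pollFirst();
        if (b == null) {
            return -1;
        }
        notifyAll();
        return b & 0xFF;
    }

    synchronized int size() {
        return queue.size();
    }

    public static void main(String[] args) throws Exception {
        BoundedFifoSketch fifo = new BoundedFifoSketch(1);
        fifo.write((byte) 1); // buffer is now full

        Thread writer = new Thread(() -> {
            try {
                fifo.write((byte) 2); // blocks until the main thread reads
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "writer");
        writer.start();

        // Poll until the writer has parked itself in wait().
        while (writer.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        System.out.println("writer state: " + writer.getState());

        fifo.read();   // drain one byte, which wakes the writer
        writer.join();
        System.out.println("writer finished, size=" + fifo.size());
    }
}
```

In this sketch the writer makes progress only when something drains the buffer and notifies it. If the draining side never does so, the writer stays in `wait()` indefinitely, which would look like the executor thread in the dump.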

            svensteiniger Sven Steiniger added a comment -

            Yes, the issue still exists in the current version (1.617).

            Nevertheless, thank you for the tip to disable NioHub. After some searching, it works!
            We had gotten used to the build start delays over the last year, and now execution of the unit tests has dropped from 7..16 minutes to 1 minute!

            As advice for other users: in jenkins.xml, find the line with <arguments>.
            There you simply add -Djenkins.slaves.NioChannelSelector.disabled=true.
            That's it.
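
A `-D` argument sets a JVM system property, and on the `java` command line such options need to appear before `-jar` to take effect. The sketch below shows how a boolean flag like this one is typically read in Java. The class `NioFlagCheck` is hypothetical, and whether Jenkins reads the property in exactly this way is an assumption; only the property name comes from the comment above.

```java
/** Hypothetical helper that reports whether the workaround flag is set in the current JVM. */
final class NioFlagCheck {
    // Property name taken from the workaround described in this issue.
    static final String PROPERTY = "jenkins.slaves.NioChannelSelector.disabled";

    /** True only if the system property is present and equals "true" (case-insensitive). */
    static boolean nioDisabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println(PROPERTY + " = " + nioDisabled());
    }
}
```

This only inspects the JVM it runs in, so it does not prove the Jenkins master picked the flag up. It does illustrate that a value other than `true`, or a misspelled property name, leaves the flag off.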

            stephenconnolly Stephen Connolly added a comment -

            Confirmed as a duplicate

            stephenconnolly Stephen Connolly added a comment -

            Duplicate of JENKINS-25218


              People

              • Assignee: Unassigned
              • Reporter: svensteiniger Sven Steiniger
              • Votes: 3
              • Watchers: 5
