JENKINS-32950

Jenkins slave resets connection during or just after artifacts download.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component/s: core, remoting
    • Environment: Windows 2008 R2 64bit (master) + Virtual Machine Windows 2008 R2 64bit (slave)

      In Jenkins I have several build jobs with artifact dependencies. The first project builds fine on both Linux and Windows, but the second one (which requires artifacts from the first project) fails during artifact download.

      Slave log from slave perspective:

      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main createEngine
      INFO: Setting up slave: Windows2008R2_64bit
      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener <init>
      INFO: Jenkins agent is running in headless mode.
      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://10.102.22.50:8080/]
      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to 10.102.22.50:50226
      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP2-connect
      Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
      Feb 15, 2016 5:13:54 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
      SEVERE: I/O error in channel channel
      java.net.SocketException: Connection reset
              at java.net.SocketInputStream.read(Unknown Source)
              at java.net.SocketInputStream.read(Unknown Source)
              at java.io.BufferedInputStream.fill(Unknown Source)
              at java.io.BufferedInputStream.read1(Unknown Source)
              at java.io.BufferedInputStream.read(Unknown Source)
              at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:90)
              at hudson.remoting.ChunkedInputStream.read(ChunkedInputStream.java:46)
              at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:97)
              at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
              at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      
      Feb 15, 2016 5:13:54 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Terminated
      Feb 15, 2016 5:14:04 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Locating server among [http://10.102.22.50:8080/]
      Feb 15, 2016 5:14:04 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Handshaking
      Feb 15, 2016 5:14:04 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connecting to 10.102.22.50:50226
      Feb 15, 2016 5:14:04 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Trying protocol: JNLP2-connect
      Feb 15, 2016 5:14:04 AM hudson.remoting.jnlp.Main$CuiListener status
      INFO: Connected
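
The timestamps in the slave log above can be compared mechanically. The sketch below is only an illustration: the function names are made up for this example, and the embedded sample is a shortened copy of the log lines above. It measures the time between each "Connected" message and the following I/O error, and between the error and the next reconnect.

```python
from datetime import datetime

# Timestamp layout of the header lines in the slave log above.
STAMP_FORMAT = "%b %d, %Y %I:%M:%S %p"

# Shortened copy of the slave log from this report (stack trace omitted).
SAMPLE_LOG = """\
Feb 15, 2016 5:12:54 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Feb 15, 2016 5:13:54 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.net.SocketException: Connection reset
Feb 15, 2016 5:13:54 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Feb 15, 2016 5:14:04 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
"""


def parse_stamp(line):
    """Return the datetime at the start of a header line, or None."""
    parts = line.split()
    if len(parts) < 5:
        return None
    try:
        return datetime.strptime(" ".join(parts[:5]), STAMP_FORMAT)
    except ValueError:
        return None


def channel_events(log_text):
    """Collect (name, timestamp) pairs for connects and I/O errors."""
    events = []
    stamp = None
    for line in log_text.splitlines():
        parsed = parse_stamp(line)
        if parsed is not None:
            stamp = parsed  # remember the header; the message follows it
            continue
        message = line.strip()
        if stamp is None:
            continue
        if message == "INFO: Connected":
            events.append(("connected", stamp))
        elif message.startswith("SEVERE: I/O error"):
            events.append(("reset", stamp))
    return events


def gaps(events):
    """Return (previous event, next event, seconds between them)."""
    result = []
    for (prev_name, prev_time), (name, time) in zip(events, events[1:]):
        result.append((prev_name, name, (time - prev_time).total_seconds()))
    return result


if __name__ == "__main__":
    for prev_name, name, seconds in gaps(channel_events(SAMPLE_LOG)):
        print("%s -> %s: %.0f s" % (prev_name, name, seconds))
```

For this excerpt it reports 60 seconds between "Connected" and the I/O error, and 10 seconds until the reconnect. A single cycle is not enough to tell whether the 60 seconds is a fixed timeout or a coincidence; running the same comparison over a longer log would show whether the interval repeats.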
      

      Slave log from master perspective:

      JNLP agent connected from /10.102.22.50
      <===[JENKINS REMOTING CAPACITY]===>   Slave.jar version: 2.53.2
      This is a Windows slave
      Slave successfully connected and online
      ERROR: Connection terminated
      java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@2ce66ffa[name=Windows2008R2_64bit]
      	at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:208)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:628)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      	at java.util.concurrent.FutureTask.run(Unknown Source)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      	at java.lang.Thread.run(Unknown Source)
      Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
      	at sun.nio.ch.SocketDispatcher.read0(Native Method)
      	at sun.nio.ch.SocketDispatcher.read(Unknown Source)
      	at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
      	at sun.nio.ch.IOUtil.read(Unknown Source)
      	at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
      	at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:136)
      	at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:306)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:561)
      	... 6 more
      

      Log from jenkins job:

      Building remotely on Windows2008R2_64bit (Win64e) in workspace C:\jenkins\workspace\##\buildNode\Win64e
       > C:\Program Files\Git\bin\git.exe rev-parse --is-inside-work-tree # timeout=10
      Fetching changes from the remote Git repository
       > C:\Program Files\Git\bin\git.exe config remote.origin.url #### # timeout=10
      Fetching upstream changes from ######
       > C:\Program Files\Git\bin\git.exe --version # timeout=10
      using GIT_SSH to set credentials 
       > C:\Program Files\Git\bin\git.exe -c core.askpass=true fetch --tags --progress ssh://####### +refs/heads/*:refs/remotes/origin/*
      Checking out Revision ##(refs/remotes/origin/master)
       > C:\Program Files\Git\bin\git.exe config core.sparsecheckout # timeout=10
       > C:\Program Files\Git\bin\git.exe checkout -f ##
       > C:\Program Files\Git\bin\git.exe rev-list ### timeout=10
      Run condition [Execution node ] enabling prebuild for step [Execute shell]
      Run condition [Execution node ] enabling prebuild for step [Execute Windows batch command]
      Slave went offline during the build
      ERROR: Connection was broken: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@41241c12[name=Windows2008R2_64bit]
      	at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:208)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:628)
      	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
      	at java.util.concurrent.FutureTask.run(Unknown Source)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      	at java.lang.Thread.run(Unknown Source)
      Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
      	at sun.nio.ch.SocketDispatcher.read0(Native Method)
      	at sun.nio.ch.SocketDispatcher.read(Unknown Source)
      	at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
      	at sun.nio.ch.IOUtil.read(Unknown Source)
      	at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
      	at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:136)
      	at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:306)
      	at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:561)
      	... 6 more
      
      Build step 'Copy artifacts from another project' marked build as failure
      ERROR: Step 'Scan for compiler warnings' failed: no workspace for ##/buildNode=Win64e #57
      ERROR: Step 'Archive the artifacts' failed: no workspace for ##/buildNode=Win64e #57
      Finished: FAILURE
      

      Note that the slave reconnects, but the build stays frozen and has to be killed in the Jenkins UI. This happens in about 19 out of 20 cases (only very rarely does the build finish without problems).
      The problem happens only on the Windows slave. It does not happen on any of the Linux slaves.
      I tried:

      • Different Java versions and bitness (1.7 32-bit Java, 1.8 64-bit Java) on the slave machine.
      • Setting hudson.diyChunking to false.
      • Increasing the Java Xmx and Xms values.

      Nothing helped. Is there any way to debug the slave? The logs are not helpful at all.
      One clue is that Jenkins itself was upgraded from 1.3xx to the recent 1.647 build (it is not a clean installation).

      I checked on a different Windows machine (Windows 2012) and everything seems to work fine there. Could this be a Hyper-V issue? I'll run more tests.
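
One possible way to test the Hyper-V suspicion outside Jenkins is to push a large payload over a plain TCP connection between the two machines and see whether it is also reset. The following is a minimal sketch under assumptions, not a Jenkins tool: the function names, chunk size, payload size and timeout are all hypothetical, and the demo at the bottom only runs over loopback to show that the pieces work together.

```python
import socket
import threading


def serve_once(listener, payload_size, chunk_size=64 * 1024):
    """Accept one connection and stream payload_size zero bytes to it."""
    conn, _ = listener.accept()
    with conn:
        remaining = payload_size
        chunk = b"\0" * chunk_size
        while remaining > 0:
            n = min(remaining, chunk_size)
            conn.sendall(chunk[:n])
            remaining -= n


def receive_all(host, port, expected, timeout=30.0):
    """Connect, read until EOF or an error, and report what happened."""
    received = 0
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            while True:
                data = sock.recv(64 * 1024)
                if not data:
                    break  # sender closed the connection normally
                received += len(data)
    except OSError as exc:  # covers connection resets and timeouts
        return received, "error: %s" % exc
    status = "complete" if received == expected else "short read"
    return received, status


def loopback_demo(payload_size=5 * 1024 * 1024):
    """Run sender and receiver in one process over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener, payload_size))
    server.start()
    result = receive_all("127.0.0.1", port, payload_size)
    server.join()
    listener.close()
    return result


if __name__ == "__main__":
    print(loopback_demo())
```

For a real check, serve_once would run on the master and receive_all on the Windows 2008 R2 slave, with a payload roughly the size of the copied artifacts. If the plain transfer is reset as well, the problem is probably below Jenkins (network stack or virtualization); if it completes reliably, remoting remains the more likely suspect.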

            Assignee: Unassigned
            Reporter: Kamil Bednarczyk (321kami)
            Votes: 0
            Watchers: 4