JENKINS-12235

FATAL, Unable to delete script file, IOException2, remote file operation failed, unexpected termination of channel

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Component/s: core, remoting
    • Labels:
      None
      Description

      Below is the stacktrace.

      It happened when I ran two jobs on a master. After running for a while, both jobs crashed with this exception.
      I think this might be caused by a brief flap in network connectivity, but I didn't notice any disconnection.
      Another cause may be the heavy load on Jenkins:

        PID USER     PR  NI  VIRT   RES  SHR S %CPU %MEM     TIME+ COMMAND
      25942 hudson   15   0 6902m  5.8g 5720 S  0.3 74.3 401:22.30 java

      Does Jenkins run its own garbage collector at some specified time?
      We have to restart every few days because it gets slower and slower until it hangs.

      FATAL: Unable to delete script file /tmp/hudson8303731085225956739.sh
      hudson.util.IOException2: remote file operation failed: /tmp/hudson8303731085225956739.sh at hudson.remoting.Channel@30e472f4:build@autom-1
      at hudson.FilePath.act(FilePath.java:781)
      at hudson.FilePath.act(FilePath.java:767)
      at hudson.FilePath.delete(FilePath.java:1022)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:695)
      at hudson.model.Build$RunnerImpl.build(Build.java:178)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:461)
      at hudson.model.Run.run(Run.java:1404)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:230)
      Caused by: hudson.remoting.ChannelClosedException: channel is already closed
      at hudson.remoting.Channel.send(Channel.java:499)
      at hudson.remoting.Request.call(Request.java:110)
      at hudson.remoting.Channel.call(Channel.java:681)
      at hudson.FilePath.act(FilePath.java:774)
      ... 13 more
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1115)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)
      FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.call(Request.java:149)
      at hudson.remoting.Channel.call(Channel.java:681)
      at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
      at $Proxy29.join(Unknown Source)
      at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:859)
      at hudson.Launcher$ProcStarter.join(Launcher.java:345)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:695)
      at hudson.model.Build$RunnerImpl.build(Build.java:178)
      at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:461)
      at hudson.model.Run.run(Run.java:1404)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:230)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Request.abort(Request.java:273)
      at hudson.remoting.Channel.terminate(Channel.java:732)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1139)
      Caused by: java.io.IOException: Unexpected termination of the channel
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1115)
      Caused by: java.io.EOFException
      at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
      at hudson.remoting.Channel$ReaderThread.run(Channel.java:1109)

            Activity

            rb2k Marc Seeger added a comment -

            I just witnessed it live on a slave today.
            Some findings:

            1. Once the slave started failing, subsequent (different) jobs failed too. (I tested 3 jobs, and all of them failed with the same error.)
            2. Just disconnecting and reconnecting the slave made it work again.

            guyr Guy Rozendorn added a comment -

            We had some issues in our lab, which forced us to re-install all of our slaves (84 and counting).
            We are still experiencing this issue.

            It seems that after this happens, the slave remains connected to Jenkins. However, I can't tell what happens if you try to run another job on it, because we revert the slave VM from a snapshot after every run (whether it is successful or not).

            dannystaple Danny Staple added a comment -

            Ok, I've found something on this today. If you have very "chatty" jobs on the slaves that output a lot of console data, try redirecting that output to a file. Chatty jobs aren't necessarily the root cause, but they make the failure more likely.

            If a job is running but quiet, you can unplug a slave's network cable for a few seconds, plug it back in, and things will pretty much continue as before. However, a slave running a chatty job will die with an I/O error almost immediately.

            If you can redirect output to a file, you may see a big reduction in these failures.
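
A minimal sketch of the redirect idea described above, assuming the build step can be wrapped in a small Python script. The function name `run_quietly`, the log path, and the `tail_lines` parameter are hypothetical and not part of Jenkins. The script sends the command's full output to a file and echoes only the last few lines to the build console:

```python
import subprocess
import sys
from collections import deque


def run_quietly(cmd, log_path, tail_lines=20):
    """Run cmd with stdout and stderr written to log_path instead of the console.

    Hypothetical helper: only the last tail_lines lines of the log are echoed,
    so little console data travels over the master/slave channel.
    """
    with open(log_path, "wb") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)

    # Echo just the end of the log so the console still shows how the job ended.
    with open(log_path, "r", errors="replace") as log:
        tail = deque(log, maxlen=tail_lines)
    sys.stdout.writelines(tail)

    return result.returncode


if __name__ == "__main__":
    # Example: python run_quietly.py build.log make all
    sys.exit(run_quietly(sys.argv[2:], sys.argv[1]))
```

The log file stays on the slave, so it would need to be archived or inspected there if the full output is required.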

            guyr Guy Rozendorn added a comment -

            After updating all our jobs to emit output every 10 seconds, this occurs less frequently, but it still happens a few times a week.
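
The comment does not say how the jobs were changed, so the following is only one possible way to get periodic output, sketched under assumptions. The wrapper name `run_with_heartbeat`, the log path, and the message text are hypothetical. It runs the command with its output sent to a file and prints a short status line at a fixed interval until the command exits:

```python
import subprocess
import sys
import time


def run_with_heartbeat(cmd, log_path, interval=10.0):
    """Run cmd with its output sent to log_path, printing a line every interval seconds.

    Hypothetical helper: the periodic line keeps the console from staying
    silent for long stretches while the real output goes to the file.
    """
    start = time.monotonic()
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        while True:
            try:
                # Returns as soon as the process exits, or raises after `interval`.
                returncode = proc.wait(timeout=interval)
                break
            except subprocess.TimeoutExpired:
                elapsed = int(time.monotonic() - start)
                print(f"still running after {elapsed}s", flush=True)
    return returncode


if __name__ == "__main__":
    # Example: python run_with_heartbeat.py build.log make all
    sys.exit(run_with_heartbeat(sys.argv[2:], sys.argv[1]))
```

Whether a steady trickle of output actually helps in a given setup is not established by this thread; the reporter only observed that failures became less frequent.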

            jglick Jesse Glick added a comment -

            Essentially a duplicate of JENKINS-1948.


              People

              • Assignee: Unassigned
              • Reporter: dumghen Ghenadie Dumitru
              • Votes: 38
              • Watchers: 47
