Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-39289

Better diagnostics on "Backing channel is disconnected"

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Reopened (View Workflow)
    • Priority: Minor
    • Resolution: Unresolved
    • Component/s: remoting
    • Labels:
      None
    • Similar Issues:

      Description

      Often, the code that handles lost connection and report that problem is different from code that uses proxies that go over this channel. Example stack trace:

      15:44:26 java.io.IOException: Backing channel is disconnected.
      15:44:26        at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:184)
      15:44:26        at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:249)
      15:44:26        at com.sun.proxy.$Proxy104.isAlive(Unknown Source)
      15:44:26        at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:996)
      15:44:26        at hudson.maven.ProcessCache$MavenProcess.call(ProcessCache.java:166)
      15:44:26        at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:853)
      15:44:26        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
      15:44:26        at hudson.model.Run.execute(Run.java:1738)
      15:44:26        at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:531)
      15:44:26        at hudson.model.ResourceController.execute(ResourceController.java:98)
      15:44:26        at hudson.model.Executor.run(Executor.java:410)
      

      The error diagnostics is easier if the cause of the channel loss is linked to this exception.

        Attachments

          Activity

          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Channel.java
          src/main/java/hudson/remoting/RemoteInvocationHandler.java
          http://jenkins-ci.org/commit/remoting/89a1774918d1e2adba41ae7ae4ba34f095dc2676
          Log:
          JENKINS-39289 When a proxy fails, report what caused the channel to go down

          Today, this requires out of bound knowledge about which connection the
          proxy was representing, then use other means to figure out why it has
          failed.

          This exception chaining shortens that step and makes it easy to find the
          cause as to why the channel was shut down.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: src/main/java/hudson/remoting/Channel.java src/main/java/hudson/remoting/RemoteInvocationHandler.java http://jenkins-ci.org/commit/remoting/89a1774918d1e2adba41ae7ae4ba34f095dc2676 Log: JENKINS-39289 When a proxy fails, report what caused the channel to go down Today, this requires out of bound knowledge about which connection the proxy was representing, then use other means to figure out why it has failed. This exception chaining shortens that step and makes it easy to find the cause as to why the channel was shut down.
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Kohsuke Kawaguchi
          Path:
          src/main/java/hudson/remoting/Channel.java
          src/main/java/hudson/remoting/RemoteInvocationHandler.java
          http://jenkins-ci.org/commit/remoting/0d8a2af7cffa6aa5a6f2675e810b286a984af04e
          Log:
          JENKINS-39289 When a proxy fails, report what caused the channel to go down (#128)

          Today, this requires out of bound knowledge about which connection the
          proxy was representing, then use other means to figure out why it has
          failed.

          This exception chaining shortens that step and makes it easy to find the
          cause as to why the channel was shut down.

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Kohsuke Kawaguchi Path: src/main/java/hudson/remoting/Channel.java src/main/java/hudson/remoting/RemoteInvocationHandler.java http://jenkins-ci.org/commit/remoting/0d8a2af7cffa6aa5a6f2675e810b286a984af04e Log: JENKINS-39289 When a proxy fails, report what caused the channel to go down (#128) Today, this requires out of bound knowledge about which connection the proxy was representing, then use other means to figure out why it has failed. This exception chaining shortens that step and makes it easy to find the cause as to why the channel was shut down.
          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          The fix has been integrated towards remoting-3.1

          Show
          oleg_nenashev Oleg Nenashev added a comment - The fix has been integrated towards remoting-3.1
          Hide
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Oleg Nenashev
          Path:
          pom.xml
          http://jenkins-ci.org/commit/jenkins/7a948d399585d201c4132597aed5723a495acf69
          Log:
          Update remoting to 2.31 in the Jenkins core. (#2628)

          The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements.

          Bugfixes:

          Improvements:

          Show
          scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Oleg Nenashev Path: pom.xml http://jenkins-ci.org/commit/jenkins/7a948d399585d201c4132597aed5723a495acf69 Log: Update remoting to 2.31 in the Jenkins core. (#2628) The change introduces one serious bugfix ( JENKINS-39596 ) and a bunch of various diagnostics improvements. Bugfixes: JENKINS-39596 ( https://issues.jenkins-ci.org/browse/JENKINS-39596 ) - Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`. It was causing connection failures of Jenkins JNLP agents when using Java Web Start. ( PR #131 ( https://github.com/jenkinsci/remoting/pull/131 )) JENKINS-39617 ( https://issues.jenkins-ci.org/browse/JENKINS-39617 ) - `hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed. ( PR #131 ( https://github.com/jenkinsci/remoting/pull/131 )) Improvements: JENKINS-39150 ( https://issues.jenkins-ci.org/browse/JENKINS-39150 ) - Add logic for dumping diagnostics across all the channels. ( PR #122 ( https://github.com/jenkinsci/remoting/pull/122 ), PR #125 ( https://github.com/jenkinsci/remoting/pull/125 )) JENKINS-39543 ( https://issues.jenkins-ci.org/browse/JENKINS-39543 ) - Improve the caller/callee correlation diagnostics in thread dumps. ( PR #119 ( https://github.com/jenkinsci/remoting/pull/119 )) JENKINS-39290 ( https://issues.jenkins-ci.org/browse/JENKINS-39290 ) - Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes). ( PR #123 ( https://github.com/jenkinsci/remoting/pull/123 )) JENKINS-38692 ( https://issues.jenkins-ci.org/browse/JENKINS-38692 ) - Add extra logging to help diagnosing `IOHub` Thread spikes. ( PR #116 ( https://github.com/jenkinsci/remoting/pull/116 )) JENKINS-39289 ( https://issues.jenkins-ci.org/browse/JENKINS-39289 ) - When a proxy fails, report what caused the channel to go down. ( PR #128 ( https://github.com/jenkinsci/remoting/pull/128 ))
          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          The fix has been integrated towards jenkins-2.31

          Show
          oleg_nenashev Oleg Nenashev added a comment - The fix has been integrated towards jenkins-2.31
          Hide
          megafon Jan Duris added a comment - - edited

          Sometimes (randomly), my tests are failing with this error. Tests runs on docker nodes. Jenkins (or docker plugin) cancel docker node where tests are executed too early. Jenkins job can't  finish because executor is terminated.

          Jenkins version: 2.100 | Docker Plugin version: 1.1.2

           
          FATAL: command execution failed
          Command close created at
          at hudson.remoting.Command.<init>(Command.java:62)
          at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1220)
          at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1218)
          at hudson.remoting.Channel.close(Channel.java:1391)
          at hudson.remoting.Channel.close(Channel.java:1358)
          at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1225)
          Caused: hudson.remoting.Channel$OrderlyShutdown
          at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1226)
          at hudson.remoting.Channel$1.handle(Channel.java:561)
          at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:87)
          Caused: java.io.IOException: Backing channel 'docker-4e1dcb53088a' is disconnected.
          at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:212)
          at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281)
          at com.sun.proxy.$Proxy87.isAlive(Unknown Source)
          at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137)
          at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129)
          at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
          at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
          at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
          at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
          at hudson.model.Build$BuildExecution.build(Build.java:206)
          at hudson.model.Build$BuildExecution.doRun(Build.java:163)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
          at hudson.model.Run.execute(Run.java:1727)
          at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
          at hudson.model.ResourceController.execute(ResourceController.java:97)
          at hudson.model.Executor.run(Executor.java:429)

          Show
          megafon Jan Duris added a comment - - edited Sometimes (randomly), my tests are failing with this error. Tests runs on docker nodes. Jenkins (or docker plugin) cancel docker node where tests are executed too early. Jenkins job can't  finish because executor is terminated. Jenkins version: 2.100 | Docker Plugin version: 1.1.2   FATAL: command execution failed Command close created at at hudson.remoting.Command.<init>(Command.java:62) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1220) at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1218) at hudson.remoting.Channel.close(Channel.java:1391) at hudson.remoting.Channel.close(Channel.java:1358) at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1225) Caused: hudson.remoting.Channel$OrderlyShutdown at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1226) at hudson.remoting.Channel$1.handle(Channel.java:561) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:87) Caused: java.io.IOException: Backing channel 'docker-4e1dcb53088a' is disconnected. at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:212) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281) at com.sun.proxy.$Proxy87.isAlive(Unknown Source) at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1137) at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1129) at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744) at hudson.model.Build$BuildExecution.build(Build.java:206) at hudson.model.Build$BuildExecution.doRun(Build.java:163) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) at hudson.model.Run.execute(Run.java:1727) at hudson.matrix.MatrixRun.run(MatrixRun.java:146) at hudson.model.ResourceController.execute(ResourceController.java:97) at hudson.model.Executor.run(Executor.java:429)
          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          Jan Duris why have you reopened it? Could you please clarify the status?

          Show
          oleg_nenashev Oleg Nenashev added a comment - Jan Duris why have you reopened it? Could you please clarify the status?
          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          Unfortunately I have no capacity to work on Remoting in medium term, so I will unassign it and let others to take it. If somebody is interested to submit a pull request, I will be happy to help to get it reviewed and released.

          Show
          oleg_nenashev Oleg Nenashev added a comment - Unfortunately I have no capacity to work on Remoting in medium term, so I will unassign it and let others to take it. If somebody is interested to submit a pull request, I will be happy to help to get it reviewed and released.

            People

            • Assignee:
              Unassigned
              Reporter:
              kohsuke Kohsuke Kawaguchi
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated: