Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-19619

Collecting findbugs analysis results occasionally causes ssh slave to go offline causing job to abort

    XMLWordPrintable

    Details

    • Similar Issues:

      Description

      Collecting findbugs analysis results occasionally causes ssh slave to go offline causing job to abort

      Some of our legacy builds have a large number of Findbugs warnings.
      Parsing these results in the post build action sometimes causes the slave to go offline, and jenkins quickly brings the slave back online but it causes the job to have failed.

      I notice that findbugs outputs a lot of information to the slave.log on the jenkins master
      By the time I check, the slave.log has rotated to slave.log.1 but a new slave.log has not been created.

      From tailing slave.log on Jenkins master:

      Sep 17, 2013 11:52:39 AM hudson.plugins.findbugs.parser.FindBugsParser findSourceFile
      WARNING: Can't resolve absolute file name for file CallbackInterceptorConfigurer.java, dir list = [/tr/j/jh/workspace/tws_trunk_nightly_build/com.aepona.tws.build/build/test-reports/findbugs.xml/src/main/java, /tr/j/jh/workspace/tws_trunk_nightly_build/com.aepona.tws.build/build/test-reports/findbugs.xml/src/test/java, /tr/j/jh/workspace/tws_trunk_nightly_build/com.aepona.tws.build/build/test-reports/findbugs.xml/src]
      tail: `slave.log' has become inaccessible: No such file or directory
      

      list of slave logs: notice how slave.log no longer exists
      slave.log has rolled over to slave.log.1

      $ ll -h
      total 764K
      -rw-r--r-- 1 rcbuild_user cs_sl025 461K Sep 17 11:52 slave.log.1
      -rw-r--r-- 1 rcbuild_user cs_sl025 3.1K Sep 11 10:28 slave.log.10
      -rw-r--r-- 1 rcbuild_user cs_sl025 2.6K Sep 17 08:42 slave.log.2
      -rw-r--r-- 1 rcbuild_user cs_sl025 2.6K Sep 17 08:03 slave.log.3
      -rw-r--r-- 1 rcbuild_user cs_sl025 2.8K Sep 17 07:53 slave.log.4
      -rw-r--r-- 1 rcbuild_user cs_sl025 267K Sep 13 10:33 slave.log.5
      -rw-r--r-- 1 rcbuild_user cs_sl025 2.6K Sep 12 10:03 slave.log.6
      -rw-r--r-- 1 rcbuild_user cs_sl025 2.6K Sep 12 09:37 slave.log.7
      

      From Jenkins log:

      Sep 17, 2013 11:52:38 AM hudson.model.Run execute
      INFO: tws_trunk_nightly_build #306 main build action completed: SUCCESS
      Sep 17, 2013 11:52:39 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
      SEVERE: I/O error in channel Neshi
      java.io.IOException: Unexpected termination of the channel
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
      Caused by: java.io.EOFException
              at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
              at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
              at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
              at hudson.remoting.Command.readFrom(Command.java:92)
              at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:71)
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      
      Sep 17, 2013 11:52:39 AM hudson.model.AbstractBuild$AbstractBuildExecution performAllBuildSteps
      WARNING: Publisher hudson.plugins.findbugs.FindBugsPublisher aborted due to exception
      hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
              at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
              at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
              at hudson.remoting.Request.call(Request.java:174)
              at hudson.remoting.Channel.call(Channel.java:714)
              at hudson.FilePath.act(FilePath.java:898)
              at hudson.FilePath.act(FilePath.java:882)
              at hudson.plugins.findbugs.FindBugsPublisher.perform(FindBugsPublisher.java:161)
              at hudson.plugins.analysis.core.HealthAwarePublisher.perform(HealthAwarePublisher.java:144)
              at hudson.plugins.analysis.core.HealthAwareRecorder.perform(HealthAwareRecorder.java:333)
              at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
              at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:782)
              at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:754)
              at hudson.model.Build$BuildExecution.post2(Build.java:183)
              at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:707)
              at hudson.model.Run.execute(Run.java:1629)
              at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
              at hudson.model.ResourceController.execute(ResourceController.java:88)
              at hudson.model.Executor.run(Executor.java:246)
      Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
              at hudson.remoting.Request.abort(Request.java:299)
              at hudson.remoting.Channel.terminate(Channel.java:774)
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
      Caused by: java.io.IOException: Unexpected termination of the channel
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
      Caused by: java.io.EOFException
              at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
              at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
              at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
              at hudson.remoting.Command.readFrom(Command.java:92)
              at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:71)
              at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      

      Slave Reconnect from jenkins.log

      Sep 17, 2013 11:53:09 AM hudson.slaves.SlaveComputer tryReconnect
      INFO: Attempting to reconnect xxx
      

      Log from build:

      11:52:38 BUILD SUCCESSFUL
      11:52:38 Total time: 8 seconds
      11:52:38 [FINDBUGS] Collecting findbugs analysis files...
      11:52:39 ERROR: Publisher hudson.plugins.findbugs.FindBugsPublisher aborted due to exception
      11:52:39 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      11:52:39 	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
      11:52:39 	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
      11:52:39 	at hudson.remoting.Request.call(Request.java:174)
      11:52:39 	at hudson.remoting.Channel.call(Channel.java:714)
      11:52:39 	at hudson.FilePath.act(FilePath.java:898)
      11:52:39 	at hudson.FilePath.act(FilePath.java:882)
      11:52:39 	at hudson.plugins.findbugs.FindBugsPublisher.perform(FindBugsPublisher.java:161)
      11:52:39 	at hudson.plugins.analysis.core.HealthAwarePublisher.perform(HealthAwarePublisher.java:144)
      11:52:39 	at hudson.plugins.analysis.core.HealthAwareRecorder.perform(HealthAwareRecorder.java:333)
      11:52:39 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
      11:52:39 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:782)
      11:52:39 	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:754)
      11:52:39 	at hudson.model.Build$BuildExecution.post2(Build.java:183)
      11:52:39 	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:707)
      11:52:39 	at hudson.model.Run.execute(Run.java:1629)
      11:52:39 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      11:52:39 	at hudson.model.ResourceController.execute(ResourceController.java:88)
      11:52:39 	at hudson.model.Executor.run(Executor.java:246)
      11:52:39 Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
      11:52:39 	at hudson.remoting.Request.abort(Request.java:299)
      11:52:39 	at hudson.remoting.Channel.terminate(Channel.java:774)
      11:52:39 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
      11:52:39 Caused by: java.io.IOException: Unexpected termination of the channel
      11:52:39 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
      11:52:39 Caused by: java.io.EOFException
      11:52:39 	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
      11:52:39 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
      11:52:39 	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      11:52:39 	at hudson.remoting.Command.readFrom(Command.java:92)
      11:52:39 	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:71)
      11:52:39 	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
      11:52:39 [PMD] Skipping publisher since build result is FAILURE
      

        Attachments

          Issue Links

            Activity

            Hide
            stephenconnolly Stephen Connolly added a comment -

            Fixed in Jenkins core 1.536

            Show
            stephenconnolly Stephen Connolly added a comment - Fixed in Jenkins core 1.536
            Hide
            dogfood dogfood added a comment -

            Integrated in jenkins_main_trunk #2938
            [FIXED JENKINS-18836][FIXED JENKINS-18879][FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix (Revision bb265c5e95b0fe39128720b903914236962db41b)

            Result = UNSTABLE
            Stephen Connolly : bb265c5e95b0fe39128720b903914236962db41b
            Files :

            • changelog.html
            • core/pom.xml
            Show
            dogfood dogfood added a comment - Integrated in jenkins_main_trunk #2938 [FIXED JENKINS-18836] [FIXED JENKINS-18879] [FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix (Revision bb265c5e95b0fe39128720b903914236962db41b) Result = UNSTABLE Stephen Connolly : bb265c5e95b0fe39128720b903914236962db41b Files : changelog.html core/pom.xml
            Hide
            drulli Ulli Hafner added a comment -

            Thanks for fixing that Stephen!

            Show
            drulli Ulli Hafner added a comment - Thanks for fixing that Stephen!
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Stephen Connolly
            Path:
            core/pom.xml
            http://jenkins-ci.org/commit/jenkins/1bb06ada301496ebed6d212188d1b7c9d006317b
            Log:
            [FIXED JENKINS-18836][FIXED JENKINS-18879][FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix

            (cherry picked from commit bb265c5e95b0fe39128720b903914236962db41b)

            Conflicts:
            changelog.html

            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Stephen Connolly Path: core/pom.xml http://jenkins-ci.org/commit/jenkins/1bb06ada301496ebed6d212188d1b7c9d006317b Log: [FIXED JENKINS-18836] [FIXED JENKINS-18879] [FIXED JENKINS-19619] Upgrade trilead-ssh to version with the fix (cherry picked from commit bb265c5e95b0fe39128720b903914236962db41b) Conflicts: changelog.html
            Hide
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Stephen Connolly
            Path:
            src/com/trilead/ssh2/channel/Channel.java
            http://jenkins-ci.org/commit/trilead-ssh2/5811ddd7ae15670a4f9ad345352613b3f2f2db97
            Log:
            JENKINS-22938 SSH slave connections die after the slave outputs 4MB of stderr, usually during findbugs analysis

            The fix for JENKINS-18836, JENKINS-18879, JENKINS-19619 was incorrect in its analysis.

            • There is no call to getChannelData() on the new code path, so thus you cannot have two calls of freeupWindow()
            • The problem with the original call to freeupWindow() is that it is on the receiver thread. You should not mix the responsibilities. Blocking the receiver thread to send a message will negatively impact performance and connection stability.
            • The correct solution is to push the freeupWindow onto the async queue thus the ACK gets sent and the purity of the receiving thread can be maintained.
            Show
            scm_issue_link SCM/JIRA link daemon added a comment - Code changed in jenkins User: Stephen Connolly Path: src/com/trilead/ssh2/channel/Channel.java http://jenkins-ci.org/commit/trilead-ssh2/5811ddd7ae15670a4f9ad345352613b3f2f2db97 Log: JENKINS-22938 SSH slave connections die after the slave outputs 4MB of stderr, usually during findbugs analysis The fix for JENKINS-18836 , JENKINS-18879 , JENKINS-19619 was incorrect in its analysis. There is no call to getChannelData() on the new code path, so thus you cannot have two calls of freeupWindow() The problem with the original call to freeupWindow() is that it is on the receiver thread. You should not mix the responsibilities. Blocking the receiver thread to send a message will negatively impact performance and connection stability. The correct solution is to push the freeupWindow onto the async queue thus the ACK gets sent and the purity of the receiving thread can be maintained.

              People

              • Assignee:
                drulli Ulli Hafner
                Reporter:
                gcummings Geoff Cummings
              • Votes:
                1 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: