Jenkins / JENKINS-2183

Job cannot be killed when it gets stuck during sending something by scp plugin

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Component/s: scp-plugin
    • Labels:
      None
    • Environment:
      Platform: All, OS: All

      Description

      • We had a job that got stuck while sending results with the SCP plugin.
      • The console output showed "connecting to ourserver.ourdomain". The original
        problem was on ourserver.ourdomain: the machine did not accept ssh connections
        for some reason and had to be restarted. However, the SCP plugin did not
        recover from this.
      • Moreover, the whole job could not be terminated. We tried to kill it with the
        "x" button and also tried to disconnect the slave machine on which the build
        got stuck. Neither helped, and we had to restart the master Hudson instance.

        Attachments

          Activity

          kohsuke Kohsuke Kawaguchi added a comment -

          Due to the synchronous nature of Java I/O, interrupting the I/O activity is very
          difficult.

          When it happens next time, please go to http://server/hudson/threadDump to
          obtain the thread dump of the system, so that we can know where the thread is stuck.
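
          To illustrate the point about blocking I/O, here is a minimal sketch using
          plain java.net sockets. It is not code from Hudson or the SCP plugin, and the
          class and method names are made up for the example. A thread blocked in a
          socket read is first interrupted, then the socket is closed from another
          thread. Socket.close() is documented to make a blocked I/O call on that socket
          throw a SocketException; whether the interrupt alone has any effect may depend
          on the JVM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not part of Hudson or the SCP plugin.
class BlockedReadDemo {

    /**
     * Blocks a thread in a socket read against a peer that never sends data,
     * then tries Thread.interrupt() followed by Socket.close().
     * Returns a short description of what happened.
     */
    static String unblockReader() throws Exception {
        try (ServerSocket silentServer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(silentServer.getInetAddress(),
                                       silentServer.getLocalPort());
            Socket peer = silentServer.accept(); // accepted, but never writes
            AtomicReference<String> outcome = new AtomicReference<>("still blocked");

            Thread reader = new Thread(() -> {
                try {
                    InputStream in = client.getInputStream();
                    int b = in.read(); // blocks: the peer stays silent
                    outcome.set("read returned " + b);
                } catch (IOException e) {
                    outcome.set("released by " + e.getClass().getSimpleName());
                }
            });
            reader.start();
            Thread.sleep(300);          // give the reader time to block

            reader.interrupt();         // first attempt: interrupt the thread
            reader.join(500);
            boolean aliveAfterInterrupt = reader.isAlive();

            client.close();             // second attempt: close the socket
            reader.join(2000);
            peer.close();

            return "aliveAfterInterrupt=" + aliveAfterInterrupt + ", " + outcome.get();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(unblockReader());
    }
}
```

          If the interrupt is ignored, as the comment above suggests, the reader should
          still be alive after it and only end once the socket is closed. For a stuck
          build this would mean that something holding a reference to the socket has to
          close it.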

          musilt2 musilt2 added a comment -

          Well, it happened again, the same as before; attaching a thread dump. Some
          machines (see e.g. Executor #0 for Win-Vista-1-stable) are stuck in the SCP
          plugin, or in the jsch library. The machine they are connecting to is dead; I'm
          unable to log in there manually by ssh. Couldn't this be solved by some sort of
          timeout when creating the session? (Just guessing from the thread dump, I
          haven't looked into the code yet.)

          musilt2 musilt2 added a comment -

          Created an attachment (id=353)
          threaddump

          musilt2 musilt2 added a comment -

          Created an attachment (id=354)
          threaddump2

          musilt2 musilt2 added a comment -

          Attached thread dump #2 (not sure if it differs). The system is in the state
          after an unsuccessful attempt to kill the stuck job on the Linux-Ubuntu-1-Stable
          machine, just in case it is helpful.

          musilt2 musilt2 added a comment -

          I've done a few experiments on this, and I'm curious why the thread that is
          (I guess) establishing the connection gets stuck and does not end with a
          timeout:

          "Executor #0 for Sol-10-2-stable" Id=51 RUNNABLE (in native)
          at java.net.SocketInputStream.socketRead0(Native Method)
          at java.net.SocketInputStream.read(SocketInputStream.java:129)
          at java.net.SocketInputStream.read(SocketInputStream.java:182)
          at com.jcraft.jsch.IO.getByte(Unknown Source)
          at com.jcraft.jsch.Session.connect(Unknown Source)
          at com.jcraft.jsch.Session.connect(Unknown Source)
          at be.certipost.hudson.plugin.SCPSite.createSession(SCPSite.java:117)
          at be.certipost.hudson.plugin.SCPRepositoryPublisher.perform(SCPRepositoryPublisher.java:86)

          When I was experimenting with this on my local Hudson instance, the connector
          thread looked slightly different:

          "Executor #1 for master" Id=20 RUNNABLE (in native)
          at java.net.PlainSocketImpl.socketConnect(Native Method)
          at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
          - locked java.net.SocksSocketImpl@836869
          at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
          at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
          at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381)
          at java.net.Socket.connect(Socket.java:537)
          at java.net.Socket.connect(Socket.java:487)
          at java.net.Socket.<init>(Socket.java:384)
          at java.net.Socket.<init>(Socket.java:198)
          at com.jcraft.jsch.Util.createSocket(Unknown Source)
          at com.jcraft.jsch.Session.connect(Unknown Source)
          at com.jcraft.jsch.Session.connect(Unknown Source)
          at be.certipost.hudson.plugin.SCPSite.createSession(SCPSite.java:117)
          at be.certipost.hudson.plugin.SCPRepositoryPublisher.perform(SCPRepositoryPublisher.java:91)

          Moreover, it ended with a timeout exception:

          com.jcraft.jsch.JSchException: java.net.ConnectException: Connection timed out
          at com.jcraft.jsch.Util.createSocket(Unknown Source)
          at com.jcraft.jsch.Session.connect(Unknown Source)

          which is correct.
          So the problem seems to be that, under some circumstances, the JSch library
          does not recognize that it cannot connect and does not report this with an
          appropriate exception.
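
          The difference between the two traces (one blocked in socketConnect, the
          other in socketRead0 after the connection was already accepted) can be
          reproduced with plain java.net sockets. The sketch below is only an
          illustration with made-up names, not the plugin's or JSch's code: a server
          accepts the connection but never sends anything, and the client read only
          ends because a read timeout (SO_TIMEOUT) was set. JSch also exposes
          Session.connect(int) and Session.setTimeout(int); whether they cover this
          code path in the version bundled with the plugin is not verified here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of JSch or the SCP plugin.
class HandshakeTimeoutDemo {

    /**
     * Connects to a local server that accepts the connection but never sends
     * a byte (similar to a host that is up at TCP level but whose sshd hangs).
     * Returns true if the read ended with a SocketTimeoutException.
     */
    static boolean readTimesOut(int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        try (ServerSocket silentServer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {

            // The connect timeout bounds only the TCP connection attempt.
            client.connect(
                new InetSocketAddress(silentServer.getInetAddress(),
                                      silentServer.getLocalPort()),
                connectTimeoutMs);

            // The read timeout bounds each blocking read on the socket.
            // With the default of 0, the read below would wait forever.
            client.setSoTimeout(readTimeoutMs);

            try (Socket peer = silentServer.accept()) {
                client.getInputStream().read(); // the peer never writes
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(1000, 200));
    }
}
```

          A connect timeout alone would not help in the first trace, because the TCP
          connection had already been established and the thread was waiting for data.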


            People

            • Assignee:
              Unassigned
              Reporter:
              musilt2 musilt2
            • Votes:
              1
              Watchers:
              1
