JENKINS-10778

sendEmulatorCommand should not fire and forget commands

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Labels:
      None
    • Environment:
      Android emulator/jenkins slave running on Windows 7. Using Android SDK11. Jenkins master running on Ubuntu 10.04.
      Running jenkins 1.426 and android emulator plugin 1.16-SNAPSHOT
      Description

      I have been seeing issues similar to JENKINS-10639, where the Android emulator occasionally does not shut down. I have instrumented the android-emulator plugin to try to get some information on what is going on. This investigation is still ongoing, but as part of it I noticed that sendEmulatorCommand may not always be doing the best it can to get the message through.

      1 - Each close in the finally block should be wrapped in its own exception handler to minimise the chance of leaked/unclosed streams.

      2 - PrintWriter does not auto-flush on writes. Potentially print could be used, but I just added a flush.

      3 - I have had bad experiences in the past with closing sockets on Windows before the receiving process has processed the in-flight data. The close seems to be able to overtake the data in the stream and appear as a closed exception before the in-flight data is read. To minimise the risk of this, I added a "quit" command after the desired command and then read the responses from the emulator until it gives end of stream.
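
      The three points above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the class name EmulatorCommandSketch, the method sendCommand and the helper closeQuietly are hypothetical, and the sketch assumes the emulator console listens on a loopback TCP port and closes the connection after receiving "quit".

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the plugin.
public class EmulatorCommandSketch {

    /** Sends one command followed by "quit" and returns every line read until end of stream. */
    public static List<String> sendCommand(int port, String command) throws IOException {
        Socket socket = null;
        PrintWriter out = null;
        BufferedReader in = null;
        List<String> responses = new ArrayList<String>();
        try {
            socket = new Socket(InetAddress.getLoopbackAddress(), port);
            out = new PrintWriter(socket.getOutputStream());
            in = new BufferedReader(new InputStreamReader(socket.getInputStream()));

            // Point 2: write() only fills the PrintWriter's buffer, so flush explicitly.
            out.write(command + "\r\n");
            // Point 3: ask the remote side to end the session itself.
            out.write("quit\r\n");
            out.flush();

            // Point 3: drain responses until the remote side closes the stream,
            // so our close cannot overtake data that is still in flight.
            String line;
            while ((line = in.readLine()) != null) {
                responses.add(line);
            }
        } finally {
            // Point 1: each close has its own handler, so one failure
            // does not prevent the remaining resources from being closed.
            closeQuietly(out);
            closeQuietly(in);
            closeQuietly(socket);
        }
        return responses;
    }

    private static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException e) {
            // Ignored on purpose: nothing useful can be done at this point.
        }
    }
}
```

      Note that the read loop in this sketch has no timeout, so it blocks for as long as the remote side neither answers nor closes the connection.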

      In a few hours of testing I have not seen a failure to shut down the emulator, although I have seen a few situations where the emulator took a long time to respond to the sendEmulatorCommand requests.

      The only concern that I have over the changes is that waiting for the emulator to act on the quit command could end up blocking the build if the emulator is really stuck. But then again, in this instance the first call to readLine would likely stick anyway, so I do not think it makes things any worse than before.

      The above is available as patches at
      https://github.com/oldelvet/android-emulator-plugin/tree/110818-emulator-shutdown
      I'll submit a pull request in due course.

          Activity

          orrc Christopher Orr added a comment -

          Thanks for the pull request at https://github.com/jenkinsci/android-emulator-plugin/pull/3
          I commented directly on the pull request about my experiences with/without the patch.

          It seems you were right to be concerned about the while loop really blocking the build.
          In my case, I'm really seeing the emulator process deadlock after the "kill" command is sent, so the first readLine executes correctly, but the second call in the loop blocks forever.

          What has your experience been so far? I don't have any Windows slaves; is there a noticeable difference for you when using this patch?

          oldelvet Richard Mortimer added a comment -

          I have not had any hangs on either Linux or Windows slaves with the patches applied. However my Windows slave has not had a massive amount of use due to me taking some annual leave.

          For reference the Windows box is running Android SDK 11 due to other known issues with Android SDK 12 on Windows.

          oldelvet Richard Mortimer added a comment -

          I have managed to recreate a hang in readLine on Windows with SDK 11. I cancelled the build via the Jenkins web interface, which correctly interrupted the wait. The next build then kicked off automatically and used the same emulator with no further action on my part.

          I will look to improve the patch by making the readLine call time out without relying on user interaction or a build timeout plugin.

          "pool-1-thread-11" Id=32 Group=main RUNNABLE (in native)
              at java.net.SocketInputStream.socketRead0(Native Method)
              at java.net.SocketInputStream.read(Unknown Source)
              at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
              at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
              at sun.nio.cs.StreamDecoder.read(Unknown Source)
              - locked java.io.InputStreamReader@194ba47
              at java.io.InputStreamReader.read(Unknown Source)
              at java.io.BufferedReader.fill(Unknown Source)
              at java.io.BufferedReader.readLine(Unknown Source)
              - locked java.io.InputStreamReader@194ba47
              at java.io.BufferedReader.readLine(Unknown Source)
              at hudson.plugins.android_emulator.AndroidEmulator$2.call(AndroidEmulator.java:713)
              at hudson.plugins.android_emulator.AndroidEmulator$2.call(AndroidEmulator.java:701)
              at hudson.remoting.UserRequest.perform(UserRequest.java:118)
              at hudson.remoting.UserRequest.perform(UserRequest.java:48)
              at hudson.remoting.Request$2.run(Request.java:287)
              at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
              at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
              at java.util.concurrent.FutureTask.run(Unknown Source)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
              at hudson.remoting.Engine$1$1.run(Engine.java:60)
              at java.lang.Thread.run(Unknown Source)

              Number of locked synchronizers = 1
              - java.util.concurrent.locks.ReentrantLock$NonfairSync@ecc0f8
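
          The thread in this dump is blocked in a socket read. One way to bound such a wait is a per-read timeout set with Socket.setSoTimeout, sketched below. This is only an illustration under stated assumptions, not necessarily the approach that was merged into the plugin: the class TimedResponseReader and the method readUntilQuietOrClosed are hypothetical names, and the timeout applies to each individual read rather than to the whole exchange.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the plugin.
public class TimedResponseReader {

    /**
     * Reads lines until end of stream, or until a single read stalls for
     * longer than timeoutMs. Returns the lines received up to that point.
     */
    public static List<String> readUntilQuietOrClosed(Socket socket, int timeoutMs)
            throws IOException {
        // After this call, a blocked read throws SocketTimeoutException
        // instead of waiting indefinitely.
        socket.setSoTimeout(timeoutMs);
        BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        List<String> lines = new ArrayList<String>();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (SocketTimeoutException e) {
            // The remote side went quiet without closing; give up waiting
            // rather than block the caller forever.
        }
        return lines;
    }
}
```

          With a timeout in place, a stuck emulator costs at most the timeout per stalled read instead of blocking until the build is cancelled.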
          orrc Christopher Orr added a comment -

          The shutdown-fixes branch has been merged.
          Thanks again for the investigation, code, ideas and testing.

          orrc Christopher Orr added a comment -

          Version 1.18 has now been released, which includes this update.


            People

            • Assignee: oldelvet Richard Mortimer
            • Reporter: oldelvet Richard Mortimer