  1. Jenkins
  2. JENKINS-40842

swarm 2.2 SEGV w/ java-1.8.0-openjdk-1.8.0.111-0.b15.el6_8.x86_64

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Defect
    • Component/s: swarm-plugin
    • Labels:
      None
    • Environment:
      CentOS release 6.8 (Final)
      java-1.8.0-openjdk-1.8.0.111-0.b15.el6_8.x86_64

      Description

      I updated two CentOS 6 and four CentOS 7 swarm slaves to Java 1.8.0.111-0.b15 yesterday morning. Overnight, both of the CentOS 6 slaves died with a SEGV.

      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      #  SIGSEGV (0xb) at pc=0x00007fce95b91ee0, pid=7403, tid=0x00007fce9dfe7700
      #
      # JRE version: OpenJDK Runtime Environment (8.0_111-b15) (build 1.8.0_111-b15)
      # Java VM: OpenJDK 64-Bit Server VM (25.111-b15 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # C  0x00007fce95b91ee0
      #
      # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
      #
      # An error report file with more information is saved as:
      # /home/jenkins-slave/hs_err_pid7403.log
      #
      # If you would like to submit a bug report, please visit:
      #   http://bugreport.java.com/bugreport/crash.jsp
      #
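
      When several slaves crash, it can help to pull the key fields out of each banner like the one above so they can be compared. The following is a minimal sketch, not part of the original report: the function name `parse_hs_err_header` and the returned dictionary keys are hypothetical, and the frame-type letters follow the legend that HotSpot prints in its own error logs (C = native code, j = interpreted, J = compiled Java, V/v = VM code).

      ```python
      import re

      # Frame-type letters as listed in the legend of HotSpot error logs.
      FRAME_TYPES = {
          "C": "native",
          "j": "interpreted Java",
          "J": "compiled Java",
          "V": "VM",
          "v": "VM stub",
      }


      def parse_hs_err_header(text):
          """Extract signal, pc, pid and problematic frame from a fatal-error banner.

          Hypothetical helper: fields that are not found are simply left out.
          """
          info = {}
          sig = re.search(
              r"#\s+(SIG\w+) \((0x[0-9a-f]+)\) at pc=(0x[0-9a-f]+), pid=(\d+)", text
          )
          if sig:
              info["signal"] = sig.group(1)
              info["pc"] = sig.group(3)
              info["pid"] = int(sig.group(4))
          frame = re.search(r"# Problematic frame:\s*\n\s*#\s+(\w)\s+(.+)", text)
          if frame:
              info["frame_type"] = FRAME_TYPES.get(frame.group(1), "unknown")
              info["frame"] = frame.group(2).strip()
          return info
      ```

      A frame type of "native" with only a raw address, as in this report, indicates that the crash happened in native code rather than in Java bytecode; the full hs_err file is needed to see which library was involved.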
      

          Activity

          jhoblitt Joshua Hoblitt added a comment -

          I am continuing to see occasional slave segvs on el6 after updating java to `java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64`:

          #
          # A fatal error has been detected by the Java Runtime Environment:
          #
          #  SIGSEGV (0xb) at pc=0x00007f838915bee0, pid=1559, tid=0x00007f8391396700
          #
          # JRE version: OpenJDK Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
          # Java VM: OpenJDK 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 compressed oops)
          # Problematic frame:
          # C  0x00007f838915bee0
          #
          # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
          #
          # An error report file with more information is saved as:
          # /tmp/hs_err_pid1559.log
          #
          # If you would like to submit a bug report, please visit:
          #   http://bugreport.java.com/bugreport/crash.jsp
          #
          
          psufoxman Ryan Fox added a comment -

          I am seeing similar SIGSEGVs on Oracle JDK 1.8.0_112-b15.

          spencermalone Spencer Malone added a comment - - edited

          We're also seeing this, and I feel the ticket priority should be raised until a workaround is available. It's bad enough that we're rewriting the puppet-jenkins module to support SSH slaves instead of using this plugin, because the unreliability is causing regular job failures. With 4 slave workers, 1-2 were going down per day. We switched away from the swarm plugin and haven't had a single node go down in about a week.

           

          It's also unclear if newer versions do or don't have this problem, but it's hard to update to 3.3 with so much of the changelog seemingly missing. Does 3.x's changelog combine all the changes of the prior failed releases?

          oleg_nenashev Oleg Nenashev added a comment -

          KK does not maintain this plugin anymore. Moving to Unassigned to set expectations accordingly.

          basil Basil Crow added a comment -

          The crash was in com.kenai.jffi.PageManager, and the latest version of the Swarm client doesn't even have that library or that class in the JAR, so it is safe to say this is no longer a bug. Please ensure you are running the latest LTS release of Jenkins, the latest release of the Swarm plugin, and the latest release of the Swarm client.
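
          One way to confirm that a given Swarm client JAR no longer bundles the class named above is to list the JAR's entries. This is a minimal sketch under stated assumptions: the helper name `jar_contains_class` and the file name `swarm-client.jar` are hypothetical, and it only inspects top-level entries of the archive, so a class packed inside a nested JAR would not be found.

          ```python
          import zipfile


          def jar_contains_class(jar_path, class_name):
              """Return True if the JAR has an entry for the fully qualified class name."""
              # A class a.b.C is stored in a JAR as the entry a/b/C.class.
              entry = class_name.replace(".", "/") + ".class"
              with zipfile.ZipFile(jar_path) as jar:
                  return entry in jar.namelist()


          if __name__ == "__main__":
              # Hypothetical path; point this at the client JAR actually deployed.
              print(jar_contains_class("swarm-client.jar", "com.kenai.jffi.PageManager"))
          ```

          If this prints False for the JAR in use, the library identified as the crash site is not on that client's classpath via the JAR itself.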


            People

            • Assignee:
              Unassigned
              Reporter:
              jhoblitt Joshua Hoblitt
            • Votes:
              2
              Watchers:
              6
