Jenkins / JENKINS-36535

"java.lang.OutOfMemoryError: GC overhead limit exceeded" in slaves

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Component: core
    • Environment: Jenkins 2.11 on Linux (RHEL7) master and Linux SSH slaves (RHEL7 and RHEL5) with Java >= 1.7.

      Today at ~10:50 CEST, all of my slaves went down, and it is impossible to relaunch them. All of them suddenly show the following error in the log:

      [07/08/16 15:01:58] [SSH] Starting slave process: cd "/localworkspaces/coverity/jenkins" && /opt/java1.7_x86_64/bin/java -Xmx2g -Xms2g -verbose:gc -Xloggc:/tmp/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+HeapDumpOnOutOfMemoryError -jar slave.jar
      <===[JENKINS REMOTING CAPACITY]===>channel started
      hudson.util.IOException2: Slave JVM has not reported exit code. Is it still running?
      	at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:984)
      	at hudson.plugins.sshslaves.SSHLauncher.access$400(SSHLauncher.java:137)
      	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:725)
      	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:706)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: Remote call on socvm458 failed
      	at hudson.remoting.Channel.call(Channel.java:789)
      	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:516)
      	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:389)
      	at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:976)
      	... 7 more
      Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
      [07/08/16 15:02:01] Launch failed - cleaning up connection
      [07/08/16 15:02:01] [SSH] Connection closed.
      ERROR: Connection terminated
      java.io.IOException: Unexpected termination of the channel
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
      Caused by: java.io.EOFException
      	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2353)
      	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2822)
      	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804)
      	at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301)
      	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
      	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
      	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
      

      I have already tried increasing the memory size and passing debugging options to the JVM, as described here: https://cloudbees.zendesk.com/hc/en-us/articles/204529970-Java-Heap-Out-of-Memory-Exception

      ... but to no effect. All my slaves are down, and I can't get them running again.
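
      One way to confirm that the -Xmx/-Xms values on the launch command line were actually applied is to query the JVM from inside. The following is a minimal sketch only: the class name HeapCheck is hypothetical and is not part of Jenkins or slave.jar, and it assumes it is started with the same java binary and options as the slave process. The reported maximum can be somewhat lower than the -Xmx value, because some collectors exclude part of the heap from this figure.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Jenkins: prints heap limits and GC time
// for the JVM it runs in. Compatible with Java 7.
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in MiB (driven by -Xmx). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Accumulated GC time in milliseconds, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMb());
        System.out.println("GC time (ms):   " + totalGcMillis());
    }
}
```

      Running it as, for example, `/opt/java1.7_x86_64/bin/java -Xmx2g -Xms2g HeapCheck` on the slave host would show whether the options reach the JVM at all.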

      This is the gc.log:

      Heap
       PSYoungGen      total 611648K, used 41943K [0x00000007d5560000, 0x0000000800000000, 0x0000000800000000)
        eden space 524288K, 8% used [0x00000007d5560000,0x00000007d7e55da8,0x00000007f5560000)
        from space 87360K, 0% used [0x00000007faab0000,0x00000007faab0000,0x0000000800000000)
        to   space 87360K, 0% used [0x00000007f5560000,0x00000007f5560000,0x00000007faab0000)
       ParOldGen       total 1398144K, used 0K [0x0000000780000000, 0x00000007d5560000, 0x00000007d5560000)
        object space 1398144K, 0% used [0x0000000780000000,0x0000000780000000,0x00000007d5560000)
       PSPermGen       total 21248K, used 6741K [0x000000077ae00000, 0x000000077c2c0000, 0x0000000780000000)
        object space 21248K, 31% used [0x000000077ae00000,0x000000077b4957e0,0x000000077c2c0000)
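
      To read the heap summary above at a glance: the young and old generations together total 611648K + 1398144K = 2009792K, of which 41943K is in use, roughly 2%. The sketch below shows how such per-generation figures could be extracted from the summary lines. The class name GcHeapSummary and the regular expression are illustrative assumptions based only on the lines shown here, not a general gc.log parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts "total" and "used" sizes from the
// per-generation lines of a -XX:+PrintHeapAtGC style heap summary.
public class GcHeapSummary {

    // Matches lines such as "PSYoungGen      total 611648K, used 41943K".
    // The "eden/from/to/object space" lines have no "total" and are skipped.
    private static final Pattern GEN =
            Pattern.compile("(\\w+)\\s+total (\\d+)K, used (\\d+)K");

    /** Returns {totalKb, usedKb} for a generation line, or null if the line does not match. */
    static long[] parse(String line) {
        Matcher m = GEN.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(2)), Long.parseLong(m.group(3)) };
    }

    public static void main(String[] args) {
        String[] lines = {
            " PSYoungGen      total 611648K, used 41943K",
            " ParOldGen       total 1398144K, used 0K",
            " PSPermGen       total 21248K, used 6741K"
        };
        for (String line : lines) {
            long[] kb = parse(line);
            String name = line.trim().split("\\s+")[0];
            // Percentage of the generation currently in use.
            System.out.printf("%s: %.1f%% used%n", name, 100.0 * kb[1] / kb[0]);
        }
    }
}
```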
      

            Assignee: Unassigned
            Reporter: Olaf Lenz (olenz)