  1. Jenkins
  2. JENKINS-38278

Slave initialization w/Kubernetes Plugin: Unable to start native thread OOM exception


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Component: kubernetes-plugin
    • None
    • Jenkins 2.7.3 (official docker image jenkins:2.7.3) - Docker host has 16GB of memory and runs nothing else

      Slave (official docker image jenkinsci/jnlp-slave:latest) - Kubernetes hosts all have 48GB of memory

      Kubernetes plugin 0.8

      I'm running into an issue where the slaves dynamically spun up by the Kubernetes plugin randomly cause an "unable to create new native thread" exception during the slave.jar connection phase. As far as I can tell, the exception is thrown on the Jenkins master, not the slave. Both the master and the slave run in the official Docker containers (jenkins:2.7.3 and jenkinsci/jnlp-slave:latest, respectively).

      The first thing I did was check the process limits for the "jenkins" user inside the Jenkins master container. The limits seem much higher than a server with only 7 jobs (all of which run on slaves) should need. What usually happens is that a build is created for a job but never does anything: it hangs with either no output or the same OOM exception.
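
      For reference, a minimal sketch of how that limit check could be made repeatable, assuming a Linux container where /proc/<pid>/limits and /proc/<pid>/status are readable for the Jenkins master JVM. The function names are hypothetical and not part of Jenkins or the Kubernetes plugin. The script compares the soft "Max processes" limit with the thread count of one process. Because that limit is accounted per user rather than per process, and other limits (for example a cgroup PID limit or native memory for thread stacks) may apply first, the result is only a rough upper bound.

```python
from pathlib import Path


def parse_max_processes(limits_text):
    """Return (soft, hard) 'Max processes' values from /proc/<pid>/limits text.

    A value of 'unlimited' is returned as None.
    """
    prefix = "Max processes"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns: soft limit, hard limit, units.
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max processes' line found")


def parse_thread_count(status_text):
    """Return the 'Threads:' value from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no 'Threads:' line found")


def thread_headroom(limits_text, status_text):
    """Soft limit minus this process's threads; None if the limit is unlimited.

    This is an optimistic figure: the limit also counts every other
    process and thread owned by the same user.
    """
    soft, _hard = parse_max_processes(limits_text)
    if soft is None:
        return None
    return soft - parse_thread_count(status_text)


def report(pid="self"):
    """Read /proc for the given pid (hypothetical helper, Linux only)."""
    proc = Path("/proc") / str(pid)
    limits_text = (proc / "limits").read_text()
    status_text = (proc / "status").read_text()
    return {
        "max_processes": parse_max_processes(limits_text),
        "threads": parse_thread_count(status_text),
        "headroom": thread_headroom(limits_text, status_text),
    }
```

      Calling report(pid) with the PID of the master's java process inside the container, a few times while slaves are connecting, would show whether the thread count climbs toward the limit before the exception appears.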

      I've included the process limits and relevant logs below. Is there anything else I can provide that would help solve the problem?

      Jenkins master process limits from inside the container: https://gist.github.com/agunnerson-ibm/dccb6c8c2edc6a498f9c377d96f43481#file-process_limits_master-txt
      Slave connection log: https://gist.github.com/agunnerson-ibm/dccb6c8c2edc6a498f9c377d96f43481#file-slave-log
      slave.jar log: https://gist.github.com/agunnerson-ibm/dccb6c8c2edc6a498f9c377d96f43481#file-slave-jar-log
      Hanging job log: https://gist.github.com/agunnerson-ibm/dccb6c8c2edc6a498f9c377d96f43481#file-job-log

            Assignee: Carlos Sanchez (csanchez)
            Reporter: Andrew Gunnerson (andrew_gunnerson_ibm)
            Votes: 0
            Watchers: 3
