JENKINS-48149

OOMKilled handling in Jenkins slaves


    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Component: kubernetes-plugin
    • Labels: None

      It's not at all easy to see what's going on when a child process run by a Jenkins slave created by the kubernetes-plugin allocates too much memory and is killed by the kernel OOM killer. The following items would help:

      1. The Kubernetes plugin should cleanly terminate the slave agent when it is done, rather than deleting its pod. This would allow Reason: OOMKilled to be seen in the pod status before the pod is deleted, and would also give a chance for a meaningful exit code to be set if wanted.

      2. The Kubernetes plugin should wait a grace period for the pod to exit after cleanly terminating the slave agent, then look for Reason: OOMKilled in the pod status. If this is seen, it should be logged clearly (see the sketch after this list).

      It would be really nice if this message could appear in the build log as well as, or instead of, the Jenkins master log. I don't know whether that's possible given the ordering (the build finishes, then the agent is terminated?), but anything would be better than nothing.

      The Kubernetes plugin can then delete the pod as it does today.

      3. It would help to be able to disable automatic pod deletion for debugging purposes, allowing slave logs to be recovered and the pod status to be investigated more easily.
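
      For illustration only, here is a rough sketch of what the check in item 2 might look like using the fabric8 Kubernetes client (which the plugin already depends on). This is not the plugin's actual code; the class name, namespace, pod name, grace period and logging are all made up for the example:

{code:java}
import io.fabric8.kubernetes.api.model.ContainerStateTerminated;
import io.fabric8.kubernetes.api.model.ContainerStatus;
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

import java.util.concurrent.TimeUnit;

public class OomKillCheck {

    // Assumed grace period; the real value would presumably be configurable.
    private static final long GRACE_PERIOD_SECONDS = 30;

    public static void main(String[] args) throws InterruptedException {
        String namespace = "jenkins";        // assumed namespace
        String podName = "jenkins-slave-x";  // assumed pod name

        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // Give the pod a chance to exit after the agent was terminated cleanly.
            TimeUnit.SECONDS.sleep(GRACE_PERIOD_SECONDS);

            Pod pod = client.pods().inNamespace(namespace).withName(podName).get();
            if (pod == null || pod.getStatus() == null) {
                return; // pod already gone; nothing to report
            }

            for (ContainerStatus status : pod.getStatus().getContainerStatuses()) {
                // OOMKilled is reported as the reason on the terminated container state.
                ContainerStateTerminated terminated =
                        status.getState() != null ? status.getState().getTerminated() : null;
                if (terminated == null && status.getLastState() != null) {
                    terminated = status.getLastState().getTerminated();
                }
                if (terminated != null && "OOMKilled".equals(terminated.getReason())) {
                    System.err.printf(
                            "Container %s in pod %s/%s was OOMKilled (exit code %d)%n",
                            status.getName(), namespace, podName, terminated.getExitCode());
                }
            }

            // Only after this check would the plugin delete the pod, as it does today.
        }
    }
}
{code}

      Ideally the message printed above would also surface in the build log, per item 2, but logging it on the master alone would already be an improvement.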



            Assignee: Adam Kaplan (adambkaplan)
            Reporter: Jim Minter (jim_minter)
            Votes: 6
            Watchers: 9
