JENKINS-65537

Jenkins runs out of memory (OOM) when the OpenStack plugin fails to provision


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: openstack-cloud-plugin
    • Labels: None
    • Environment: CloudBees Jenkins version: 2.277.3.1
      OpenStack Cloud Plugin version: 2.57

      Our Jenkins instance is constantly running out of memory (OOM). Before each OOM we usually see a large number of launch failures (caused by OpenStack project limits); the plugin does not seem to handle these failures, and the instance crashes. Each time the Jenkins instance OOMs we get the same pattern of logs, posted below (slightly edited since this is a public Jira). We opened a ticket with CloudBees and they think it might be a plugin issue:

       

      # Lots of these in the logs
      [05/03/21 22:56:19] SSH Launch of opensack-2xlarge-246 on xxx.xxx.xxx.xxx failed in 253 ms
      [05/03/21 22:56:19] SSH Launch of opensack-2xlarge-246 on xxx.xxx.xxx.xxx completed in 8,315 ms
      [05/03/21 
      

       

       

      Right before it dies, it is trying to clean up the OpenStack instances:

       

      2021-05-03 23:08:18.923+0000 [id=657]   INFO    j.p.o.compute.JCloudsComputer#setPendingDelete: Setting opensack-2xlarge-236 pending delete status to true
      2021-05-03 23:08:50.177+0000 [id=626]   INFO    j.p.o.compute.JCloudsComputer#setPendingDelete: Setting opensack-2xlarge273 pending delete status to true
      2021-05-03 23:09:11.565+0000 [id=685]   INFO    j.p.o.compute.JCloudsComputer#setPendingDelete: Setting opensack-2xlarge-246 pending delete status to true

       

      Syslogs:

       

      May  3 23:09:50 jenkins- kernel: [4165292.120922] Out of memory: Kill process 15236 (java) score 971 or sacrifice child
      May  3 23:09:50 jenkins kernel: [4165292.125523] Killed process 15236 (java) total-vm:39648372kB, anon-rss:29932904kB, file-rss:0kB, shmem-rss:0kB
      May  3 23:09:51 jenkins kernel: [4165293.313561] oom_reaper: reaped process 15236 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      May  3 23:09:51 jenkins cloudbees-core-cm: cloudbees-core-cm: fatal: client (pid 15236) killed by signal 9, exiting
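To put the kernel's numbers in perspective, the "Killed process" line above can be converted from kB to GiB. The following is only a small sketch for reading the excerpt; the helper names (`parse_oom_kill`, `kb_to_gib`) are made up for illustration, and the kernel's "kB" is assumed to mean 1024 bytes.

```python
import re

# Line copied from the syslog excerpt above.
KILL_LINE = (
    "May  3 23:09:50 jenkins kernel: [4165292.125523] Killed process 15236 (java) "
    "total-vm:39648372kB, anon-rss:29932904kB, file-rss:0kB, shmem-rss:0kB"
)

# Matches the four memory counters the kernel prints for the killed process.
FIELD = re.compile(r"(total-vm|anon-rss|file-rss|shmem-rss):(\d+)kB")


def parse_oom_kill(line):
    """Return the memory counters of a 'Killed process' line, in kB."""
    return {name: int(value) for name, value in FIELD.findall(line)}


def kb_to_gib(kb):
    """Convert kB (assumed to be 1024-byte units) to GiB."""
    return kb / (1024 * 1024)


counters = parse_oom_kill(KILL_LINE)
for name, kb in counters.items():
    print(f"{name}: {kb_to_gib(kb):.1f} GiB")
```

Read this way, the java process held roughly 28.5 GiB of anonymous resident memory (about 37.8 GiB virtual) when the kernel killed it.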
      

       

      Steps to reproduce:

      We notice this happens when the project hits its limits on OpenStack. The best way to test is to have an OpenStack project with a fixed VCPU limit, then ask Jenkins to spin up more workers than that limit allows to be provisioned in the project. The plugin should be able to handle project limits without OOMing the master node.
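When sizing a reproduction, it may help to work out how many worker requests will be rejected by the quota. The sketch below is plain arithmetic, not plugin or OpenStack code; the function name and all numbers (a 64 VCPU quota, 8 VCPUs per worker, 12 requested workers) are hypothetical and should be replaced with the values of the test project.

```python
def plan_provisioning(vcpu_quota, vcpus_in_use, vcpus_per_worker, workers_requested):
    """Return (workers_that_fit, workers_rejected) for a given VCPU quota.

    All inputs are assumptions supplied by whoever sets up the test;
    nothing here is read from OpenStack or Jenkins.
    """
    free_vcpus = max(vcpu_quota - vcpus_in_use, 0)
    fit = min(free_vcpus // vcpus_per_worker, workers_requested)
    return fit, workers_requested - fit


# Hypothetical example: empty project, 64 VCPU quota, 8 VCPUs per worker.
fit, rejected = plan_provisioning(
    vcpu_quota=64, vcpus_in_use=0, vcpus_per_worker=8, workers_requested=12
)
print(f"{fit} workers fit, {rejected} requests should fail on the quota")
```

With these example numbers, 8 workers fit and 4 requests exceed the quota, which is the condition under which the failures described in this report were observed.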

            Assignee: Oliver Gondža (olivergondza)
            Reporter: Steven (stevenla408)
            Votes: 0
            Watchers: 2