Jenkins / JENKINS-46465

Decrease connection attempts while host is in Maintenance Mode

      A vSphere host recently entered maintenance mode for several hours. During that time, Tasks & Events (in the vSphere Client software) periodically logged failed connection attempts, apparently caused by Jenkins, which seemed unaware of the maintenance operation.

      Although not severe, the observed negative effects were:

      • Eventually, the "flooding" rendered useless all tasks logged in vSphere before the maintenance operation
      • The task failures cause some overhead in Jenkins (logging exceptions) and might cause overhead in vSphere as well (for example, if there are several hosts, each with a few nodes, and Jenkins is attempting to start them all)

      The goal would be to improve the behavior in such scenarios. A couple of ideas follow:

      • Detect when the host for the desired node is in maintenance mode (possibly by using vSphere APIs) and avoid performing the current task (Power On, etc.); still, log a message in the Jenkins node log so that the situation is flagged
      • Introduce a new configuration setting for the polling interval to use while maintenance is underway; this could also take the form of a factor (1x to keep the current behavior, 2x to double the time between attempts, etc.)
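
To make the two ideas above more concrete, here is a minimal sketch in Java. It is not code from the vSphere Cloud plugin: the class MaintenanceAwareRetry, its methods, the log message text, and the 60-second base interval in the usage note are all hypothetical. The maintenance check is passed in as a BooleanSupplier; in a real implementation it could presumably be backed by the inMaintenanceMode property that the vSphere API exposes in a host's runtime info, but that wiring is an assumption and is left out here.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; none of these names exist in the vSphere Cloud plugin. */
final class MaintenanceAwareRetry {

    /** What the launcher should do for the current attempt. */
    enum Decision { POWER_ON, SKIP_HOST_IN_MAINTENANCE }

    private MaintenanceAwareRetry() {}

    /**
     * Idea 1: skip the task (Power On, etc.) while the host is in maintenance.
     * The supplier is an assumed probe for the host state.
     */
    static Decision decide(BooleanSupplier hostInMaintenance) {
        return hostInMaintenance.getAsBoolean()
                ? Decision.SKIP_HOST_IN_MAINTENANCE
                : Decision.POWER_ON;
    }

    /**
     * Idea 2: stretch the time between attempts by a configurable factor.
     * A factor of 1.0 keeps the current behavior, 2.0 doubles the wait.
     */
    static long nextDelaySeconds(long baseDelaySeconds,
                                 double maintenanceFactor,
                                 boolean inMaintenance) {
        if (baseDelaySeconds < 0) {
            throw new IllegalArgumentException("baseDelaySeconds must be >= 0");
        }
        if (maintenanceFactor < 1.0) {
            throw new IllegalArgumentException("maintenanceFactor must be >= 1.0");
        }
        if (!inMaintenance) {
            return baseDelaySeconds;
        }
        return Math.round(baseDelaySeconds * maintenanceFactor);
    }

    /** Builds a node log line so the skipped attempt is still flagged (wording is made up). */
    static String logLine(String vmName, Decision decision) {
        if (decision == Decision.SKIP_HOST_IN_MAINTENANCE) {
            return "[" + vmName + "] Host is in maintenance mode, skipping Power On";
        }
        return "[" + vmName + "] Powering on VM";
    }
}
```

For example, with a hypothetical base interval of 60 seconds and a factor of 2.0, nextDelaySeconds(60, 2.0, true) gives 120 seconds between attempts during maintenance, while nextDelaySeconds(60, 2.0, false) leaves the interval at 60. The two ideas are independent: the check alone removes the failed tasks from vSphere, and the factor alone only reduces how often they occur.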

      In terms of potentially related settings, the node has the following set:

      • Force VM Launch: Launches the virtual machine when necessary.
      • Wait for VMTools: [x]
        Delay between launch and boot complete: 120
      • Availability: Take this slave online when on-demand and off-line when idle

      Excerpt of the node log while the host was in maintenance:

      [MyVirtualMachineName] Starting Virtual Machine...
      [MyVirtualMachineName] Powering on VM
      [MyVirtualMachineName] EXCEPTION while starting VM
      org.jenkinsci.plugins.vsphere.tools.VSphereException: vSphere Error: VM cannot be started
      org.jenkinsci.plugins.vsphere.tools.VSphereException: vSphere Error: VM cannot be started
          at org.jenkinsci.plugins.vsphere.tools.VSphere.startVm(VSphere.java:383)
          at org.jenkinsci.plugins.vSphereCloudLauncher.launch(vSphereCloudLauncher.java:202)
          at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:253)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)
      ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
      java.lang.RuntimeException: org.jenkinsci.plugins.vsphere.tools.VSphereException: vSphere Error: VM cannot be started
          at org.jenkinsci.plugins.vSphereCloudLauncher.launch(vSphereCloudLauncher.java:253)
          at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:253)
          at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
          at java.util.concurrent.FutureTask.run(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.lang.Thread.run(Unknown Source)
      Caused by: org.jenkinsci.plugins.vsphere.tools.VSphereException: vSphere Error: VM cannot be started
          at org.jenkinsci.plugins.vsphere.tools.VSphere.startVm(VSphere.java:383)
          at org.jenkinsci.plugins.vSphereCloudLauncher.launch(vSphereCloudLauncher.java:202)
          ... 6 more

       

            Assignee: Unassigned
            Reporter: Helder Magalhães (heldermagalhaes)