JENKINS-23676

Nodes go offline shortly after build starts

      mansion-client seems to be removing nodes shortly after they go offline. Common stack traces include:

      https://gist.github.com/recampbell/fc711322922a99a7b3da

      and more frequently:

      13:43:24 FATAL: null
      13:43:24 java.lang.NullPointerException
      13:43:24 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:549)
      13:43:24 at hudson.model.Run.execute(Run.java:1665)
      13:43:24 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      13:43:24 at hudson.model.ResourceController.execute(ResourceController.java:88)
      13:43:24 at hudson.model.Executor.run(Executor.java:246)

      The basic problem seems to be that hudson.model.Executor#isIdle checks whether Executor#executable is null, while that field is assigned in a synchronized block in Executor#run:217.

      Fundamentally, a caller can't check for idleness and atomically ensure that the idle state still holds when it acts on the result: a build can be assigned to the executor between the idle check and the removal of the node. Any cloud plugin likely suffers from this problem, but mansion-cloud has a unique exposure since it adds and removes slaves so quickly.
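
The check-then-act gap described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyExecutor, tryStart, tryRetire and the RETIRED marker are hypothetical names invented for illustration, not part of hudson.model.Executor or any cloud plugin, and the compare-and-set approach is one possible way to make the idle check and the removal decision a single atomic step, not the fix adopted in Jenkins.

```java
import java.util.concurrent.atomic.AtomicReference;

final class IdleCheckRace {

    /** Toy stand-in for an executor (hypothetical; NOT hudson.model.Executor). */
    static final class ToyExecutor {
        // Hypothetical marker meaning "claimed for removal, accepts no builds".
        private static final String RETIRED = "<retired>";

        // null means idle, mirroring the null check on Executor#executable.
        private final AtomicReference<String> executable = new AtomicReference<>();

        /** Plain idle check: the answer can be stale as soon as it is returned. */
        boolean isIdle() {
            return executable.get() == null;
        }

        /** Scheduler side: claim the executor for a build only if it is idle. */
        boolean tryStart(String build) {
            return executable.compareAndSet(null, build);
        }

        /** Cloud side: check idleness and claim for removal in one atomic step. */
        boolean tryRetire() {
            return executable.compareAndSet(null, RETIRED);
        }
    }

    /** Scripted interleaving of the racy pattern: check, build starts, then act. */
    static boolean racyRemovalHitsBusyNode() {
        ToyExecutor executor = new ToyExecutor();
        boolean sawIdle = executor.isIdle();                  // cloud: looks idle
        boolean buildStarted = executor.tryStart("build-1");  // scheduler wins the gap
        // The cloud would now remove the node based on the stale sawIdle value.
        return sawIdle && buildStarted;
    }

    /** Same interleaving, but removal uses the atomic tryRetire instead. */
    static boolean atomicRetireHitsBusyNode() {
        ToyExecutor executor = new ToyExecutor();
        boolean buildStarted = executor.tryStart("build-1");
        boolean retired = executor.tryRetire();               // fails: executor is busy
        return buildStarted && retired;
    }

    public static void main(String[] args) {
        System.out.println("check-then-act removes a busy node: " + racyRemovalHitsBusyNode());
        System.out.println("atomic retire removes a busy node:  " + atomicRetireHitsBusyNode());
    }
}
```

In this model, whichever side wins the compare-and-set owns the executor: a retired executor rejects new builds, and a busy executor cannot be retired. Whether something equivalent can be retrofitted onto the real Executor is a separate question that this sketch does not answer.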

      Related to CloudBees ZD-19508

            Assignee: Unassigned
            Reporter: Ryan Campbell (recampbell)