Jenkins / JENKINS-7707

Multiple dead executors on slaves post 1.379 upgrade

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: remoting
    • Labels:
      None
    • Environment:
      CentOS Linux 5.x kernel 2.6.18-194.3.1.el5
      hudson.war 1.379 under Tomcat 5.5.28
      Slave OSs: CentOS Linux 5.x, Windows XP 32bit, Windows Server 2008 64bit

      Description

      Since upgrading to 1.379 we have been seeing dead executors on our slave systems. Prior to this release we had never encountered a dead executor on any system, master or slave. Immediately after deploying the 1.379 WAR, 6 executors spread across a variety of slave platforms (Linux, WinXP 32bit, Win2k8 64bit) died. Today one more died on a Linux slave. Restarting Hudson clears out the dead executors, but disconnecting and reconnecting the slaves does not. I have not yet tried rebooting the slaves themselves. The stack trace below has consistently been the output associated with the dead executors.

      java.lang.AbstractMethodError
      at hudson.model.Executor.getEstimatedRemainingTimeMillis(Executor.java:340)
      at hudson.model.queue.LoadPredictor$CurrentlyRunningTasks.predict(LoadPredictor.java:77)
      at hudson.model.queue.MappingWorksheet.<init>(MappingWorksheet.java:303)
      at hudson.model.Queue.pop(Queue.java:753)
      at hudson.model.Executor.grabJob(Executor.java:175)
      at hudson.model.Executor.run(Executor.java:113)

            Activity

            carlo_bonamico added a comment:

            I just noticed that at the time the issue appeared, I had both upgraded to 1.384 and set the maximum number of threads for SCM polling to 20. Apparently, removing the polling thread limit made the issue disappear. Also, the issue appeared to happen just after the SCM polling for a big project had taken place. I have about 40 projects on the server, and 4 slaves.

            mindless (Alan Harder) added a comment:

            The original reporter has mentioned not seeing this issue anymore. Does anyone else still see dead slaves with this exception on the latest Hudson release?

            java.lang.AbstractMethodError
            at hudson.model.Executor.getEstimatedRemainingTimeMillis(Executor.java:340)
            usammmy added a comment:

            Upgraded to 1.385. We haven't seen this issue for a while.

            carlo_bonamico added a comment:

            I am not seeing it on 1.385 with the latest Batch Task plugin.

            mindless (Alan Harder) added a comment:

            OK, thanks. Closing this out. Reopen if anyone sees this AbstractMethodError on a recent release.


              People

              • Assignee:
                Unassigned
                Reporter:
                dru_n dru_n
              • Votes:
                6
                Watchers:
                10