JENKINS-34712

"master is offline" preventing Pipeline from executing

      Description

      Our own Jenkins Pipeline projects seem to get stuck in a "master is offline" state when attempting to run on our clusters, which have zero executors assigned to the master node.

      It's unclear what, other than a service restart, will clear this up.

      Steps to reproduce:

      1. Start a Pipeline job.
      2. Force the master to run out of storage.
      3. Shut down the master and free up storage.
      4. Restart the master and confirm it's up.
      5. Observe that it is still marked as offline for a long time (30+ minutes).

            Activity

            rtyler R. Tyler Croy added a comment -

            Correction: a restart has not corrected the issue. The Pipeline is stuck in the build queue again.

            abayer Andrew Bayer added a comment -

            A few questions:

            • What versions of core and Pipeline plugins are running?
            • Is the jenkins.io job "running"? That is, is the job itself blinking, etc.? If it is, then it's stuck on executing part of itself; if it isn't, then it's stuck even before that.
            abayer Andrew Bayer added a comment -

            At first glance, I can't see how it'd ever have "master is offline" as a blocked reason (https://github.com/jenkinsci/workflow-job-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowJob.java#L314) but I may be missing something. Jesse Glick, any thoughts?

            danielbeck Daniel Beck added a comment -

            JENKINS-7291 should ensure master always has a computer.

            rtyler R. Tyler Croy added a comment -

            Andrew Bayer, the Environment section of this JIRA has the information answering question number one.

            As for the second, this is a Multibranch project. The "master" branch "job" is not blinking, and the "jenkins.io" folder is not blinking either, though I don't think it does that.

            Both you and Daniel Beck have access to this instance, so you can "see" it live, but as this is a managed host, please refrain from tinkering with settings and whatnot.

            jglick Jesse Glick added a comment -

            I have never heard of this problem before, and have no idea offhand how it could occur, since as Daniel Beck notes, there is always a MasterComputer even if you have configured zero heavyweight executors—WorkflowJob uses flyweights.

            As far as I know I lack administrative access to the server in question to do any live debugging.

            jglick Jesse Glick added a comment -

            Jenkins.instance.selfLabel.offline, which should never be possible.

            danielbeck Daniel Beck added a comment -

            Jesse Glick We learned a few hours ago that master was marked offline due to disk space, and since it has zero executors, it wasn't apparent from the UI (as an executor-less master isn't shown on the executors pane).

            For some reason that offline state was preserved across restarts, and apparently for longer than disk space cleanup + 30 minutes for the next monitor run, so maybe something was wrong there, but that was the offline cause.
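
Since the offline state was not visible in the executors pane, one way to check it without the UI is to read the node list from the remote API. The following is only a minimal sketch, not something taken from this instance: it assumes the JSON served at `/computer/api/json` exposes `displayName`, `offline`, `temporarilyOffline` and `offlineCauseReason` for each node, the helper names `offline_nodes` and `fetch_computers` are made up for illustration, and authentication is left out.

```python
import json
from urllib.request import urlopen


def offline_nodes(payload):
    """Return (name, reason) pairs for every node the payload reports as offline."""
    result = []
    for computer in payload.get("computer", []):
        if computer.get("offline") or computer.get("temporarilyOffline"):
            # An empty or missing reason is reported explicitly rather than hidden.
            reason = computer.get("offlineCauseReason") or "no reason given"
            result.append((computer.get("displayName", "?"), reason))
    return result


def fetch_computers(base_url):
    """Fetch the node list from a Jenkins base URL (assumed endpoint, no auth)."""
    with urlopen(base_url.rstrip("/") + "/computer/api/json") as response:
        return json.load(response)


if __name__ == "__main__":
    # Hand-written sample payload for illustration, not captured from a real server.
    sample = {
        "computer": [
            {
                "displayName": "master",
                "offline": True,
                "temporarilyOffline": True,
                "offlineCauseReason": "Disk space is too low",
            },
            {
                "displayName": "agent-1",
                "offline": False,
                "temporarilyOffline": False,
                "offlineCauseReason": "",
            },
        ]
    }
    for name, reason in offline_nodes(sample):
        print(f"{name}: {reason}")
```

Against a live server, `offline_nodes(fetch_computers(url))` would be the equivalent call, with `url` being the instance's base URL; on a secured instance the request would additionally need credentials.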

            danielbeck Daniel Beck added a comment -

            Looks a lot like Not A Defect to me. If the master is offline (especially for disk space reasons), no need to run any builds anywhere. The only RFE I could think of would be to not hide the executor-less master node in the executors sidepanel if it's marked offline.

            jglick Jesse Glick added a comment -

            Sounds like a core bug.

            danielbeck Daniel Beck added a comment -

            Jesse Glick What's the bug? That the node monitors work? That flyweight tasks don't run on marked-offline nodes?

            jglick Jesse Glick added a comment -

            I guess that the master node should be displayed when it is offline.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/src/main/resources/lib/hudson/executors.jelly
            http://jenkins-ci.org/commit/jenkins/b67a30f8daff936c91fd54b90bef6c366707a8f1
            Log:
            Merge pull request #3294 from dwnusbaum/JENKINS-34712

            JENKINS-34712 Always show the master node when it is offline

            Compare: https://github.com/jenkinsci/jenkins/compare/5c8cc45900bf...b67a30f8daff

            danielbeck Daniel Beck added a comment -

            Released in 2.108.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/src/main/resources/lib/hudson/executors.jelly
            http://jenkins-ci.org/commit/jenkins/20d44c5aa750f6fece96f83f0f7ed519e9df2e54
            Log:
            Merge pull request #3294 from dwnusbaum/JENKINS-34712

            JENKINS-34712 Always show the master node when it is offline

            (cherry picked from commit b67a30f8daff936c91fd54b90bef6c366707a8f1)


              People

              • Assignee: Devin Nusbaum (dnusbaum)
              • Reporter: R. Tyler Croy (rtyler)