Jenkins / JENKINS-24836

Ability to fail a build when no executors are available

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: core
    • Labels:
      None

      Description

      When using Jenkins to run automatic tasks, it's almost foolproof. Unfortunately, there's a loophole: if a slave goes offline, the job just stays in the waiting state. That's not acceptable for critical tasks (backups, etc.). They should either run on time or fail.

      Thus, this feature request. Please add an option to fail a build when it cannot run.

          Activity

          brein12 Ilya Ivanov added a comment -

          Well, generally a vanilla Jenkins job may fail to work properly (and not notify anyone about it) for the following reasons (I'm referring to bash here):

          1) The script isn't catching all errors. Jenkins launches the scripts with the "-xe" options by default. A more robust way is "-xeu -o pipefail". This is trivial to do.
          2) Not all commands return a correct exit code on failure. The "Log parser" plugin often helps here.
          3) The node isn't available at the time, as described in this issue. Right now I'm using https://wiki.jenkins-ci.org/display/JENKINS/Monitor+and+Restart+Offline+Slaves, but it's not enough.
          4) Some step hangs, so the build never completes and the next one sits in the queue forever. Fortunately, there's a "Build timeout" plugin for that.

          I think that's all.
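
To illustrate point 1, here is a minimal sketch (not part of the original comment) that runs the same two snippets under "-e" and under "-eu -o pipefail" and compares exit codes. The helper names `run_bash` and `demo` and the variable `NO_SUCH_VAR_FOR_DEMO` are hypothetical, "-x" is left out only to keep the output quiet, and it assumes `bash` is on the PATH.

```python
import os
import subprocess


def run_bash(flags, script):
    """Run `script` with `bash <flags> -c` and return the exit code."""
    # Drop the demo variable from the environment so that "-u" has
    # something unset to complain about.
    env = {k: v for k, v in os.environ.items() if k != "NO_SUCH_VAR_FOR_DEMO"}
    cmd = ["bash", *flags, "-c", script]
    return subprocess.run(cmd, capture_output=True, text=True, env=env).returncode


# The first stage of the pipeline fails, the last stage succeeds.
PIPELINE = "false | cat; echo reached"
# Expands a variable that is never set.
UNSET = 'echo "dir=${NO_SUCH_VAR_FOR_DEMO}"'

LENIENT = ["-e"]                    # error handling of the default "-xe"
STRICT = ["-eu", "-o", "pipefail"]  # the stricter options suggested above


def demo():
    """Return the exit code of each snippet under each option set."""
    return {
        "pipeline, lenient": run_bash(LENIENT, PIPELINE),
        "pipeline, strict": run_bash(STRICT, PIPELINE),
        "unset var, lenient": run_bash(LENIENT, UNSET),
        "unset var, strict": run_bash(STRICT, UNSET),
    }
```

Calling `demo()` should show both snippets exiting with 0 under the lenient options and with a non-zero code under the strict ones, which is the difference between a build that silently passes and one that fails.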

          danielbeck Daniel Beck added a comment -

          Please explain other conditions etc. so I get a more complete picture of what this is about. It may be possible to implement fairly easily.

          brein12 Ilya Ivanov added a comment -

          There are other conditions, but they can be worked around with plugins. This one cannot.
          No, the plugins that you mentioned don't exactly solve the problem. The node status isn't relevant. Job status is.
          (To elaborate: the node itself might be launched only periodically, or have issues with connectivity, or resources, or something else. In that case, a plugin that notifies about node availability will produce a stream of alerts, which will render the notification system useless.)

          danielbeck Daniel Beck added a comment -

          Before a build starts executing on an executor, there is no build, just a queue item. Queue items cannot have a result the way builds do, and nothing about them is recorded either. If you cancel one, no "failed build" or "canceled build" entry is created, and no build number is assigned.

          Is a node being or going offline just an example, or is it the one condition that's relevant? If the latter, plugins like https://wiki.jenkins-ci.org/display/JENKINS/Extreme+Notification+Plugin or http://jenkins-enterprise.cloudbees.com/docs/user-guide-bundle/nodes-plus.html would solve this.

          brein12 Ilya Ivanov added a comment -

          Not sure what you mean by "there's no build to fail". It's right there, in the queue, "job #12345", waiting for the next available executor.
          Anyway, I don't really see a difference. I'm asking for a way to be sure that either the job runs on time or I get a notification that it didn't. I figured failing it would be the natural way. A "queue watcher" would do just as well, I suppose.

          danielbeck Daniel Beck added a comment -

          There is no build to fail if there's no executor available to run on. Please explain in more detail what you are asking for.

          Wouldn't it be more useful to have a "queue watcher" that sends a notification when certain items have been queued longer than a certain timeout?
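
As a rough illustration of the "queue watcher" idea, the sketch below polls the build queue from outside Jenkins and reports items that have waited longer than a threshold. It is not an existing plugin or a proposed implementation. It assumes the queue is readable as JSON at `/queue/api/json` and that each item carries `inQueueSince` (epoch milliseconds), `task.name`, and `why`; the function names, the base URL passed in, and the threshold are hypothetical, and printing stands in for a real notification.

```python
import json
import time
import urllib.request


def overdue_items(queue, max_wait_seconds, now_ms=None):
    """Return (task name, seconds waited, reason) for items queued too long.

    `queue` is the parsed JSON of the queue endpoint. The field names used
    here (`items`, `inQueueSince`, `task.name`, `why`) are assumptions.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    overdue = []
    for item in queue.get("items", []):
        waited = (now_ms - item["inQueueSince"]) / 1000
        if waited > max_wait_seconds:
            name = item.get("task", {}).get("name", "<unknown>")
            overdue.append((name, waited, item.get("why")))
    return overdue


def fetch_queue(base_url):
    """Fetch the queue as JSON.

    Assumes anonymous read access; a real setup would likely need
    credentials.
    """
    url = base_url.rstrip("/") + "/queue/api/json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def report(overdue):
    """Stand-in for a real notification (mail, chat, pager)."""
    for name, waited, why in overdue:
        print(f"{name} has been queued for {waited:.0f}s: {why}")
```

Run periodically, for example from cron, something like `report(overdue_items(fetch_queue(base_url), 900))` would flag anything stuck for more than 15 minutes. Because it keys on how long a job has waited rather than on node status, a node that is only launched periodically does not generate alerts by itself.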


            People

            • Assignee:
              Unassigned
              Reporter:
              brein12 Ilya Ivanov
            • Votes:
              4
              Watchers:
              5
