Type: Improvement
Resolution: Unresolved
Priority: Minor
Component/s: core
Labels: None
Environment: 2.89.2

DiskSpaceMonitorDescriptor (used to check free space on the temp and workspace partitions) inherits (with a few intermediate classes) from AbstractNodeMonitorDescriptor, whose default scheduling interval is one hour. That means if an agent runs out of space it could take an hour before Jenkins detects the problem and takes the node offline. (There are some other code paths – such as onConnect – that can trigger an update, but I believe one hour remains the worst case.)

I've run into this multiple times: a job fills up an agent, a subsequent job fails, yet the agent is still marked as online.

I believe one hour is not a reasonable modern value for such a quick check, but I am unsure how to proceed:

Change AbstractNodeMonitorDescriptor to a "more reasonable" default?
Make this a configurable value?
Make it a configurable value per check?
A fancy dynamic scheduler with backoff?

My own inclination is that one minute would be a reasonable and unsurprising default.
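To make the "configurable value" option concrete, here is a minimal standalone sketch, not Jenkins code. It assumes a hypothetical system property (`example.diskSpaceMonitor.intervalSeconds`) and falls back to a one-minute default. The class name, property name, and 1 GiB threshold are all invented for illustration; the only library calls used are from the standard JDK (`File.getUsableSpace`, `ScheduledExecutorService`).

```java
import java.io.File;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DiskSpaceCheckSketch {
    // Hypothetical property name; this is NOT an existing Jenkins setting.
    static final String INTERVAL_PROPERTY = "example.diskSpaceMonitor.intervalSeconds";
    // The default proposed in this issue.
    static final Duration DEFAULT_INTERVAL = Duration.ofMinutes(1);

    // Parse the configured interval, falling back to the default when the
    // value is missing, malformed, or not strictly positive.
    static Duration resolveInterval(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return DEFAULT_INTERVAL;
        }
        try {
            long seconds = Long.parseLong(rawValue.trim());
            return seconds > 0 ? Duration.ofSeconds(seconds) : DEFAULT_INTERVAL;
        } catch (NumberFormatException e) {
            return DEFAULT_INTERVAL;
        }
    }

    // The check itself is cheap: a single query for usable space on the partition.
    static boolean hasEnoughSpace(File path, long thresholdBytes) {
        return path.getUsableSpace() >= thresholdBytes;
    }

    public static void main(String[] args) throws InterruptedException {
        Duration interval = resolveInterval(System.getProperty(INTERVAL_PROPERTY));
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        long thresholdBytes = 1024L * 1024L * 1024L; // 1 GiB, an arbitrary example threshold

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            if (!hasEnoughSpace(tmp, thresholdBytes)) {
                // A real monitor would take the node offline at this point.
                System.out.println("Low disk space on " + tmp);
            }
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);

        // Let the first check run, then stop so the demo terminates.
        Thread.sleep(100);
        scheduler.shutdownNow();
    }
}
```

With this shape, running with `-Dexample.diskSpaceMonitor.intervalSeconds=30` would check every 30 seconds, and an unset or invalid value would give the one-minute default. A per-check variant would only need a distinct property name per monitor.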

Assignee: Unassigned
Reporter: Chris Burroughs
Votes: 0
Watchers: 1
Created: 2017-12-22 22:44
Updated: 2017-12-22 22:44