Improvement
Resolution: Fixed
Major
None
Platform: All, OS: All
It appears that the idle timeout for slaves provisioned with EC2 is hard-coded
to 30 minutes.
I suspect it's this line in EC2RetentionStrategy.java:
if (idleMilliseconds > TimeUnit2.MINUTES.toMillis(30)) {
There is a lot of cost associated with spinning up a new instance. A fresh
checkout will generate more traffic than a simple update on an already running
instance. On the other hand, an idle instance may unnecessarily block other
instances from being launched. A way to configure the timeout, at least globally
and possibly per job, is highly desirable.
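As a rough illustration of the request, the hard-coded 30 could become a configured value. This is only a sketch: the class name IdleTimeoutPolicy and the field idleTerminationMinutes are hypothetical, not names from the plugin, and it uses java.util.concurrent.TimeUnit instead of the TimeUnit2 class seen in the quoted line so that it compiles on its own.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of the plugin): idle check with a configurable timeout. */
final class IdleTimeoutPolicy {
    /** Default equal to the value that is currently hard-coded. */
    static final int DEFAULT_IDLE_MINUTES = 30;

    /** Assumed to come from global (or per-job) configuration. */
    private final int idleTerminationMinutes;

    IdleTimeoutPolicy(int idleTerminationMinutes) {
        if (idleTerminationMinutes <= 0) {
            throw new IllegalArgumentException("idle timeout must be positive");
        }
        this.idleTerminationMinutes = idleTerminationMinutes;
    }

    /** Same comparison as the quoted line, with the constant replaced by the setting. */
    boolean shouldTerminate(long idleMilliseconds) {
        return idleMilliseconds > TimeUnit.MINUTES.toMillis(idleTerminationMinutes);
    }
}
```

With new IdleTimeoutPolicy(IdleTimeoutPolicy.DEFAULT_IDLE_MINUTES) the behavior would match what the ticket describes today; any other value changes only the threshold.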
An auto-adjusting timeout based on prior intervals between a job's completion
and the next job run would be nice to have. I'm not asking for any fancy queueing
theory work; just something simple where a slave gets terminated once it has been
idle for some margin longer than the median interval. The threshold could be one
standard deviation above the median or the third quartile; it doesn't really
matter, as long as long quiet periods (e.g., nights or weekends) don't lengthen
the timeout too much.
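One possible shape for such a heuristic, picking the third-quartile variant, is sketched below. Everything here is an assumption for illustration: the class AdaptiveIdleTimeout, the sliding window of recent gaps, the fallback used while there are few samples, and the hard upper cap are not taken from the plugin.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: idle timeout derived from recent gaps between builds. */
final class AdaptiveIdleTimeout {
    private final Deque<Long> gaps = new ArrayDeque<>();
    private final int window;          // how many recent gaps to remember
    private final long fallbackMillis; // used until enough samples exist
    private final long maxMillis;      // hard cap on the computed timeout

    AdaptiveIdleTimeout(int window, long fallbackMillis, long maxMillis) {
        this.window = window;
        this.fallbackMillis = fallbackMillis;
        this.maxMillis = maxMillis;
    }

    /** Record the time between one job finishing and the next one starting. */
    void recordGap(long gapMillis) {
        if (gaps.size() == window) {
            gaps.removeFirst(); // drop the oldest sample
        }
        gaps.addLast(gapMillis);
    }

    /** Third quartile (nearest-rank) of the recorded gaps, capped at maxMillis. */
    long currentTimeoutMillis() {
        if (gaps.size() < 4) {
            return fallbackMillis; // too few samples for a meaningful quartile
        }
        long[] sorted = gaps.stream().mapToLong(Long::longValue).sorted().toArray();
        int idx = (int) Math.ceil(0.75 * sorted.length) - 1;
        return Math.min(sorted[idx], maxMillis);
    }
}
```

A quartile is used rather than a mean because a single overnight or weekend gap lands above the third quartile and leaves it unchanged as long as such gaps are rare in the window; the cap covers the case where they are not.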
It also appears that there is no check on whether or not a user might be logged
in on the slave before terminating it. This is problematic when users try to
troubleshoot a build on the slave.
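A minimal sketch of such a check, assuming a Unix slave on which the output of the `who` command has already been captured as a string (how that command is run on the slave is left out, and the class and method names are hypothetical):

```java
/** Hypothetical guard: do not terminate a slave while someone is logged in. */
final class LoginSessionCheck {
    /**
     * `who` prints one line per login session and nothing when nobody is
     * logged in, so any non-blank line is treated as an active user.
     */
    static boolean hasLoggedInUsers(String whoOutput) {
        if (whoOutput == null) {
            return false;
        }
        for (String line : whoOutput.split("\\R")) {
            if (!line.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    /** Terminate only when the idle timeout has passed and no user is logged in. */
    static boolean mayTerminate(boolean idleTimeoutExceeded, String whoOutput) {
        return idleTimeoutExceeded && !hasLoggedInUsers(whoOutput);
    }
}
```

This only covers login sessions visible to `who`; Windows slaves or other kinds of access would need a different probe.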