My projects run on separate slave machines/nodes.
Some of these projects have testcases that change the system clock.
This causes Jenkins and the slave to immediately disconnect. It always happens.
I've tried to dig into it a bit and here's what I have:
I'm looking at hudson.remoting.PingThread.java
The timeout on the channel ping is enforced by calculating the difference between 2 counter values, each obtained with System.currentTimeMillis().
However System.currentTimeMillis() is not the right way of obtaining such counters - the values are not monotonically increasing.
If wall-clock time changes for any reason (like maybe Daylight savings or a manual change) then the value returned might become smaller than a moment ago.
Obtaining the counter values with System.nanoTime() would solve this.
STACK TRACE below:
FATAL: Unable to delete script file C:\WINDOWS\TEMP\hudson2959765976006672203.bat
00:44:00 hudson.util.IOException2: remote file operation failed: C:\WINDOWS\TEMP\hudson2959765976006672203.bat at hudson.remoting.Channel@62ec4e:UTM-RT5AUTOM027
00:44:00 at hudson.FilePath.act(FilePath.java:901)
00:44:00 at hudson.FilePath.act(FilePath.java:878)
00:44:00 at hudson.FilePath.delete(FilePath.java:1263)
00:44:00 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
00:44:00 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
00:44:00 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
00:44:00 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
00:44:00 at hudson.model.Build$BuildExecution.build(Build.java:199)
00:44:00 at hudson.model.Build$BuildExecution.doRun(Build.java:160)
00:44:00 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
00:44:00 at hudson.model.Run.execute(Run.java:1592)
00:44:00 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
00:44:00 at hudson.model.ResourceController.execute(ResourceController.java:88)
00:44:00 at hudson.model.Executor.run(Executor.java:237)
00:44:00 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
00:44:00 at hudson.remoting.Channel.send(Channel.java:494)
00:44:00 at hudson.remoting.Request.call(Request.java:129)
00:44:00 at hudson.remoting.Channel.call(Channel.java:672)
00:44:00 at hudson.FilePath.act(FilePath.java:894)
00:44:00 ... 13 more
00:44:00 Caused by: java.io.IOException: Unexpected termination of the channel
00:44:00 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
00:44:00 Caused by: java.io.EOFException
00:44:00 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
00:44:00 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
00:44:00 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
00:44:00 at hudson.remoting.Command.readFrom(Command.java:92)
00:44:00 at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
00:44:00 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
00:44:00 FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
00:44:00 hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
00:44:00 at hudson.remoting.Request.call(Request.java:174)
00:44:00 at hudson.remoting.Channel.call(Channel.java:672)
00:44:00 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
00:44:00 at $Proxy45.join(Unknown Source)
00:44:00 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
00:44:00 at hudson.Launcher$ProcStarter.join(Launcher.java:360)
00:44:00 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
00:44:00 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
00:44:00 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
00:44:00 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
00:44:00 at hudson.model.Build$BuildExecution.build(Build.java:199)
00:44:00 at hudson.model.Build$BuildExecution.doRun(Build.java:160)
00:44:00 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
00:44:00 at hudson.model.Run.execute(Run.java:1592)
00:44:00 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
00:44:00 at hudson.model.ResourceController.execute(ResourceController.java:88)
00:44:00 at hudson.model.Executor.run(Executor.java:237)
00:44:00 Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
00:44:00 at hudson.remoting.Request.abort(Request.java:299)
00:44:00 at hudson.remoting.Channel.terminate(Channel.java:732)
00:44:00 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
00:44:00 Caused by: java.io.IOException: Unexpected termination of the channel
00:44:00 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
00:44:00 Caused by: java.io.EOFException
00:44:00 at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
00:44:00 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
00:44:00 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
00:44:00 at hudson.remoting.Command.readFrom(Command.java:92)
00:44:00 at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
00:44:00 at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
- is related to
-
JENKINS-16962 Linux slave on different timezone - immediate ping timeout exception
-
- Resolved
-
Code changed in jenkins
User: James Nord
Path:
src/main/java/hudson/remoting/AtmostOneThreadExecutor.java
src/main/java/hudson/remoting/Channel.java
src/main/java/hudson/remoting/PingThread.java
src/main/java/hudson/remoting/Request.java
src/main/java/hudson/remoting/SingleLaneExecutorService.java
src/main/java/hudson/remoting/SynchronousExecutorService.java
http://jenkins-ci.org/commit/remoting/75fb3559e354418000760573eac19efcd86c267f
Log:
JENKINS-21251Initial calculation of wait time was susseptable toclock drift.
The calculation on how long to wait in various places on the code was
suseptable to the clock changing as multiple calls where made to
System.currentTimeMillis() for the initial calculation.
Changed this so that we use a single call to System.nanoTime and made just
a single call for the initial calculation.
There is still a potential issue for any callers of Channel.getLastHeard()
this will be addressed in a future change and has only been noted in this
commit.