Bug
Resolution: Cannot Reproduce
Blocker
None
safe-restart is waiting to close a registered connection, but that connection is waiting forever to be notified while the notifier thread is either dead or hanging. The wait has no timeout, so safe-restart hangs forever.
Our environment is Jenkins 1.538 with ssh-slaves plugin 1.9, but the issue I describe still seems to hold in the latest code base.
https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/src/main/java/hudson/plugins/sshslaves/PluginImpl.java
/**
 * Closes all the registered connections.
 */
private static synchronized void closeRegisteredConnections() {
    for (Connection connection : activeConnections) {
        LOGGER.log(Level.INFO, "Forcing connection to {0} [rest of the loop body was lost to a wiki macro error]
    }
    activeConnections.clear();
}
https://github.com/jenkinsci/trilead-ssh2/blob/master/src/com/trilead/ssh2/channel/ChannelManager.java
private void waitUntilChannelOpen(Channel c) throws IOException
{
    synchronized (c)
    {
        while (c.state == Channel.STATE_OPENING)
        {
            try
            {
                // Waits with no timeout: blocks forever if nobody notifies c.
                c.wait();
            }
            catch (InterruptedException ignore)
            {
                throw new InterruptedIOException();
            }
        }
        if (c.state != Channel.STATE_OPEN)
        {
            removeChannel(c.localID);
            throw ioException("Could not open channel (state:" + c.state + ")", c);
        }
    }
}
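For illustration only, here is a minimal sketch of what a bounded wait could look like. It is not the trilead-ssh2 implementation: DemoChannel, BoundedChannelWait and the timeoutMillis parameter are hypothetical names introduced for this example, and the state constants are stand-ins for those on com.trilead.ssh2.channel.Channel.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical stand-in for com.trilead.ssh2.channel.Channel; only the state field matters here. */
class DemoChannel {
    static final int STATE_OPENING = 1;
    static final int STATE_OPEN = 2;
    static final int STATE_CLOSED = 4;
    int state = STATE_OPENING;
}

/** Hypothetical helper showing a wait loop that gives up after a deadline. */
class BoundedChannelWait {
    static void waitUntilChannelOpen(DemoChannel c, long timeoutMillis) throws IOException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (c) {
            while (c.state == DemoChannel.STATE_OPENING) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    // Never call wait(0): that would mean "wait forever" again.
                    throw new IOException("Timed out waiting for channel to open");
                }
                try {
                    // Re-check the state after every wakeup, spurious or not.
                    c.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new InterruptedIOException();
                }
            }
            if (c.state != DemoChannel.STATE_OPEN) {
                throw new IOException("Could not open channel (state:" + c.state + ")");
            }
        }
    }
}
```

With a wait like this, a notifier thread that has died would turn the hang into an IOException after the timeout, and the caller would then leave its synchronized blocks and release the monitors it holds.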
Stack trace showing the "deadlock":
"safe-restart thread" prio=10 tid=0x00007fd87e4ab800 nid=0x5157 waiting for monitor entry [0x00007fd9aa8e7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.trilead.ssh2.Connection.getHostname(Connection.java:961)
- waiting to lock <0x0000000682669c88> (a com.trilead.ssh2.Connection)
at hudson.plugins.sshslaves.PluginImpl.closeRegisteredConnections(PluginImpl.java:70)
- locked <0x0000000674712e38> (a java.lang.Class for hudson.plugins.sshslaves.PluginImpl)
at hudson.plugins.sshslaves.PluginImpl.stop(PluginImpl.java:61)
at hudson.PluginWrapper.stop(PluginWrapper.java:376)
at hudson.PluginManager.stop(PluginManager.java:734)
at jenkins.model.Jenkins.cleanUp(Jenkins.java:2797)
at hudson.lifecycle.UnixLifecycle.restart(UnixLifecycle.java:71)
at jenkins.model.Jenkins$23.run(Jenkins.java:3400)
"Channel reader thread: xyz.com" prio=10 tid=0x00007fd870014800 nid=0xbaec in Object.wait() [0x00007fd80ceb8000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:110)
- locked <0x0000000685743cc0> (a com.trilead.ssh2.channel.Channel)
at com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:584)
at com.trilead.ssh2.Session.<init>(Session.java:42)
at com.trilead.ssh2.Connection.openSession(Connection.java:1129)
- locked <0x0000000682669c88> (a com.trilead.ssh2.Connection)
at com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
at com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
at hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1213)
- locked <0x0000000672021fc8> (a hudson.plugins.sshslaves.SSHLauncher)
at hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:456)
at hudson.remoting.Channel.terminate(Channel.java:831)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)
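The locking pattern in the two traces above can be reproduced in isolation. The sketch below is only a model of it, with assumed names: HangPatternDemo is hypothetical, and the two plain Objects stand in for the com.trilead.ssh2.Connection and com.trilead.ssh2.channel.Channel monitors. The "reader" thread holds the connection monitor while waiting on the channel with no timeout, and the "restart" thread then blocks on the connection monitor.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical model of the hang: one thread waits unnotified while holding a second monitor. */
class HangPatternDemo {
    /** Returns the state observed for the "restart" thread while the "reader" is parked. */
    static Thread.State reproduce() throws InterruptedException {
        final Object connection = new Object(); // stands in for the Connection monitor
        final Object channel = new Object();    // stands in for the Channel monitor
        final CountDownLatch readerHoldsLocks = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            synchronized (connection) {          // like Connection.openSession
                synchronized (channel) {         // like waitUntilChannelOpen
                    readerHoldsLocks.countDown();
                    try {
                        while (true) {
                            channel.wait();      // no timeout, and nobody ever notifies
                        }
                    } catch (InterruptedException e) {
                        // The demo interrupts this thread to clean up.
                    }
                }
            }
        }, "reader");

        Thread restart = new Thread(() -> {
            synchronized (connection) {          // like Connection.getHostname
                // Reached only after the reader releases the connection monitor.
            }
        }, "restart");

        reader.setDaemon(true);
        restart.setDaemon(true);
        reader.start();
        readerHoldsLocks.await();
        restart.start();

        // Poll for up to five seconds until the restart thread is stuck on the monitor.
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (restart.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        Thread.State observed = restart.getState();

        reader.interrupt();                      // unblock the demo; the real code has no such rescue
        reader.join();
        restart.join();
        return observed;
    }
}
```

Calling HangPatternDemo.reproduce() should report the restart thread as BLOCKED, matching the "safe-restart thread" entry in the dump. This is not a lock cycle in the strict sense, since the reader is waiting for a notification rather than for a monitor, which is why "deadlock" is in quotes.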