  1. Jenkins
  2. JENKINS-28962

Memory leak on slaves when using Jnlp startup, listeners are registered but not removed any more

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: core, remoting
    • Labels:
      None
    • Environment:
      Jenkins ver. 1.617

      Description

      When I start slaves on Windows or Linux using the JNLP startup method with the setting "take this slave offline when not needed", the slaves start with low memory usage, but their usage grows quickly until the slave node itself runs out of memory (OOM).

      When analyzing the slave's memory with Eclipse MAT, I found the following (see below).

      This indicates to me that the class JnlpSlaveRestarterInstaller adds a listener, but never removes it.

      As a result, data referenced from the JnlpSlaveRestarter is never freed because the old listeners are still registered, even when new restarters have been added in the meantime. This quickly eats up the available memory on the slave.
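
The pattern described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `FakeEngine`, `Listener` and `install` are hypothetical stand-ins, not the actual `hudson.remoting.Engine`, `EngineListener` or `JnlpSlaveRestarterInstaller` code. It shows how an installer that adds a listener on every (re)connect, without ever removing the previous one, makes the listener list (and everything each listener references) grow without bound.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical model of the suspected leak; not the real Jenkins remoting classes. */
final class ListenerLeakDemo {

    /** Stand-in for an engine listener (assumed shape, for illustration only). */
    interface Listener {
        void onDisconnect();
    }

    /** Stand-in for the engine: it keeps every listener it is given. */
    static final class FakeEngine {
        private final List<Listener> listeners = new CopyOnWriteArrayList<>();

        void addListener(Listener listener) {
            listeners.add(listener);
        }

        int listenerCount() {
            return listeners.size();
        }
    }

    /** Stand-in for the installer call that runs after each (re)connect. */
    static void install(FakeEngine engine) {
        // Each listener captures its own data, which stays reachable
        // for as long as the listener remains registered.
        byte[] relatedData = new byte[1024];
        engine.addListener(() -> System.out.println("restart, holding " + relatedData.length + " bytes"));
    }

    /** Runs the installer once per simulated reconnect and reports the listener count. */
    static int simulateReconnects(int reconnects) {
        FakeEngine engine = new FakeEngine();
        for (int i = 0; i < reconnects; i++) {
            install(engine);
        }
        return engine.listenerCount();
    }

    public static void main(String[] args) {
        // One listener per reconnect is retained, none is ever removed.
        System.out.println("listeners after 5 reconnects: " + simulateReconnects(5));
    }
}
```

In this model the number of retained listeners equals the number of reconnects, which matches the observation that memory usage keeps growing over time.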

            Activity

            centic centic added a comment -

            The stack trace of the periodic call is:

            Daemon Thread [pool-1-thread-87 for channel] (Suspended (breakpoint at line 156 in hudson.remoting.Engine))	
            	hudson.remoting.Engine.addListener(hudson.remoting.EngineListener) line: 156	
            	jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2.call() line: 70	
            	jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$2.call() line: 52	
            	hudson.remoting.UserRequest<RSP,EXC>.perform(hudson.remoting.Channel) line: 121	
            	hudson.remoting.UserRequest<RSP,EXC>.perform(hudson.remoting.Channel) line: 49	
            	hudson.remoting.Request$2.run() line: 325	
            	hudson.remoting.InterceptingExecutorService$1.call() line: 68	
            	java.util.concurrent.FutureTask<V>.run() line: not available	
            	java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) line: not available	
            	java.util.concurrent.ThreadPoolExecutor$Worker.run() line: not available	
            	hudson.remoting.Engine$1$1.run() line: 69	
            	java.lang.Thread.run() line: not available	
            
            centic centic added a comment -

            The invocation of the JnlpSlaveRestarterInstaller is triggered via this stack trace:

            Thread [Channel reader thread: channel] (Suspended (breakpoint at line 66 in hudson.remoting.InterceptingExecutorService))	
            	hudson.remoting.InterceptingExecutorService.wrap(java.lang.Runnable, V) line: 66	
            	hudson.remoting.InterceptingExecutorService.submit(java.lang.Runnable, T) line: 42	
            	hudson.remoting.InterceptingExecutorService.submit(java.lang.Runnable) line: 37	
            	hudson.remoting.UserRequest<RSP,EXC>(hudson.remoting.Request<RSP,EXC>).execute(hudson.remoting.Channel) line: 304	
            	hudson.remoting.Channel$2.handle(hudson.remoting.Command) line: 484	
            	hudson.remoting.SynchronousCommandTransport$ReaderThread.run() line: 60	
            
            danielbeck Daniel Beck added a comment -

            listeners are registered but not removed any more

            Has this ever worked? It looks like this has been the behavior since 1.55x, when this feature was first introduced.

            centic centic added a comment -

            I think this has been in there from the beginning. It only manifests itself if connections break repeatedly or are re-established often, which is likely the reason why it usually goes unnoticed.
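
Because the growth depends on how often the connection is re-established, one conceivable mitigation is to make the installation idempotent, so that repeated reconnects register at most one listener. The following is a minimal sketch of that idea, not a fix taken from Jenkins: `FakeEngine`, `Listener` and `installRestarterOnce` are hypothetical names introduced here for illustration only.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of an install-once guard; not the real Jenkins remoting classes. */
final class InstallOnceDemo {

    /** Stand-in for an engine listener (assumed shape, for illustration only). */
    interface Listener {
        void onDisconnect();
    }

    /** Stand-in for the engine, extended with a flag that remembers the installation. */
    static final class FakeEngine {
        private final List<Listener> listeners = new CopyOnWriteArrayList<>();
        private final AtomicBoolean restarterInstalled = new AtomicBoolean(false);

        /** Registers the listener only on the first call; later calls are no-ops. */
        boolean installRestarterOnce(Listener listener) {
            if (!restarterInstalled.compareAndSet(false, true)) {
                return false; // already installed by an earlier (re)connect
            }
            listeners.add(listener);
            return true;
        }

        int listenerCount() {
            return listeners.size();
        }
    }

    /** Runs the guarded installer once per simulated reconnect and reports the listener count. */
    static int simulateReconnects(int reconnects) {
        FakeEngine engine = new FakeEngine();
        for (int i = 0; i < reconnects; i++) {
            engine.installRestarterOnce(() -> System.out.println("restart requested"));
        }
        return engine.listenerCount();
    }

    public static void main(String[] args) {
        // With the guard, the count stays at one regardless of the number of reconnects.
        System.out.println("listeners after 5 reconnects: " + simulateReconnects(5));
    }
}
```

An alternative with the same effect would be to remove the previously registered listener before adding a new one; which approach fits depends on whether the listener has to pick up state from the new connection.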

            oleg_nenashev Oleg Nenashev added a comment -

            The issue still seems to be relevant. Tentatively assigning it to myself.

            oleg_nenashev Oleg Nenashev added a comment -

            Unfortunately, I have no capacity to work on Remoting in the medium term, so I will unassign it and let others take it. If somebody is interested in submitting a pull request, I will be happy to help get it reviewed and released.


              People

              • Assignee:
                Unassigned
                Reporter:
                centic centic
              • Votes:
                3
                Watchers:
                6
