Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-16429

ComThread is leaking even when its peer is dead

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Blocker
    • Resolution: Fixed
    • Environment:
    • Similar Issues:

      Description

      I get the following stack trace once a day:

      Exception:
      FATAL: unable to create new native thread
      java.lang.OutOfMemoryError: unable to create new native thread
          at java.lang.Thread.start0(Native Method)
          at java.lang.Thread.start(Thread.java:640)
          at hudson.Proc$LocalProc.<init>(Proc.java:264)
          at hudson.Proc$LocalProc.<init>(Proc.java:216)
          at hudson.Launcher$LocalLauncher.launch(Launcher.java:763)
          at hudson.Launcher$ProcStarter.start(Launcher.java:353)
          at hudson.plugins.perforce.HudsonP4DefaultExecutor.exec(HudsonP4DefaultExecutor.java:79)
          at com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:321)
          at com.tek42.perforce.parse.AbstractPerforceTemplate.getPerforceResponse(AbstractPerforceTemplate.java:292)
          at com.tek42.perforce.parse.Workspaces.getWorkspace(Workspaces.java:61)
          at hudson.plugins.perforce.PerforceSCM.getPerforceWorkspace(PerforceSCM.java:1467)
          at hudson.plugins.perforce.PerforceSCM.getPerforceWorkspace(PerforceSCM.java:1428)
          at hudson.plugins.perforce.PerforceSCM.checkout(PerforceSCM.java:792)
          at hudson.model.AbstractProject.checkout(AbstractProject.java:1325)
          at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:676)
          at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
          at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:581)
          at hudson.model.Run.execute(Run.java:1516)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
          at hudson.model.ResourceController.execute(ResourceController.java:88)
          at hudson.model.Executor.run(Executor.java:236)
      

      The URL http:/jenkinsserver/threadDump shows that there are hundreds of ComThreads:

      ComThread for RequestHandlerThread[#301]
      "ComThread for RequestHandlerThread[#301]" Id=50942 Group=main RUNNABLE (in native)
          at com4j.Win32Lock.suspend0(Native Method)
          at com4j.Win32Lock.suspend(Win32Lock.java:37)
          at com4j.ComThread.run0(ComThread.java:146)
          at com4j.ComThread.run(ComThread.java:134)
      

      The Monitoring plugin (https://wiki.jenkins-ci.org/display/JENKINS/Monitoring) shows that about 40 threads are added every hour. I tried to kill the ComThreads with the plugin but this did not work.

      As far as I understand the threads belong to the Active Directory plugin but I do not understand why they are created / never finish.

      I have activated logging for the Active Directory plugin by enabling the following loggers (all with log level all):

      • hudson.plugins.active_directory
      • org.acegisecurity
      • com4j

      The log contains a lot of entries like this:

      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter doFilter
      FINE: SecurityContextHolder now cleared, as request processing completed
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.providers.anonymous.AnonymousProcessingFilter doFilter
      FINE: Populated SecurityContextHolder with anonymous token: 'org.acegisecurity.providers.anonymous.AnonymousAuthenticationToken@7ceb964f: Username: anonymous; Password: [PROTECTED]; Authenticated: true; Details: org.acegisecurity.ui.WebAuthenticationDetails@1de6: RemoteIpAddress: ##.##.##.##; SessionId: null; Granted Authorities: '
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter storeSecurityContextInSession
      FINE: The HttpSession is currently null, and the HttpSessionContextIntegrationFilter is prohibited from creating an HttpSession (because the allowSessionCreation property is false) - SecurityContext thus not stored for next request
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.ui.basicauth.BasicProcessingFilter doFilter
      FINE: Authorization header: null
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.ui.ExceptionTranslationFilter doFilter
      FINE: Chain processed normally
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter doFilter
      FINE: New SecurityContext instance will be associated with SecurityContextHolder
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter readSecurityContextFromSession
      FINE: No HttpSession currently exists
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter doFilter
      FINE: SecurityContextHolder now cleared, as request processing completed
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.context.HttpSessionContextIntegrationFilter storeSecurityContextInSession
      FINE: The HttpSession is currently null, and the HttpSessionContextIntegrationFilter is prohibited from creating an HttpSession (because the allowSessionCreation property is false) - SecurityContext thus not stored for next request
      
      Jan 16, 2013 12:20:27 PM org.acegisecurity.ui.ExceptionTranslationFilter doFilter
      FINE: Chain processed normally
      

      A lot of traffic (about four requests per second) is caused by the desktop application Hudson Tray Tracker (https://code.google.com/p/hudson-tray-tracker/). This application does not log in (it uses anonymous access).

      I have also

      • enabled access logging, but found no unusual URL requests
      • connected with VisualVM (http://visualvm.java.net/) to the Jenkins JVM, but got no more informations
      • created a PowerShell script that logs the number of threads, handles, working set and the virtual size of the Java process, but it only shows that all numbers are constantly increasing:
      Date Time Pid Thread Handle
      Count
      WorkingSet VirtualSize
      2013-01-20 22:01:31 740 86 1380 474.285 MB 1,291.566 MB
      2013-01-20 23:01:31 740 111 1719 520.957 MB 1,309.094 MB
      2013-01-21 00:01:31 740 138 2094 576.602 MB 1,322.531 MB
      2013-01-21 01:01:30 740 161 2381 630.742 MB 1,334.031 MB
      2013-01-21 02:01:30 740 177 2563 635.363 MB 1,342.094 MB
      2013-01-21 03:01:30 740 203 2884 638.371 MB 1,355.031 MB
      2013-01-21 04:01:31 740 222 3192 660.984 MB 1,396.156 MB
      2013-01-21 05:01:30 740 243 3443 641.645 MB 1,406.609 MB
      2013-01-21 06:01:30 740 264 3719 648.918 MB 1,417.219 MB
      2013-01-21 07:01:31 740 294 4114 661.613 MB 1,432.156 MB
      2013-01-21 08:01:31 740 313 4357 647.117 MB 1,441.719 MB
      2013-01-21 09:01:30 740 344 4758 667.797 MB 1,457.219 MB
      2013-01-21 10:01:31 740 414 5701 742.559 MB 1,575.898 MB
      2013-01-21 11:01:31 740 502 6777 729.488 MB 1,620.813 MB
      2013-01-21 12:01:52 740 562 7533 740.715 MB 1,650.813 MB
      2013-01-21 13:01:31 740 616 8313 763.828 MB 1,677.922 MB

      The problem is related to system load: at weekend the numbers are increasing at a lower rate than at work days. Between Christmas and New Year it didn't happened at all as most developers have been on vacation.

      • created a Groovey script:
      def threads=com4j.ComThread.threads.toArray(new com4j.ComThread[com4j.ComThread.threads.size()]);
      def numberOfTerminatedPeerThreads=0
      
      println("The follwing ComThreads have a peer thread which has been terminated")
      for (t in threads) {
        if (t.getName().startsWith("ComThread for RequestHandlerThread[#") &&
            t.peer.getState().equals(Thread.State.TERMINATED) ) {
          println("ComThread: " + t.getName() + "  Peer: " + t.peer )
          numberOfTerminatedPeerThreads++
          //t.kill()
        }
      }
      
      println("\nnumber of terminated peer threads: " + numberOfTerminatedPeerThreads)
      

      result:

      The follwing ComThreads have a peer thread which has been terminated
      ComThread: ComThread for RequestHandlerThread[#357]  Peer: Thread[RequestHandlerThread[#357],5,]
      ComThread: ComThread for RequestHandlerThread[#258]  Peer: Thread[RequestHandlerThread[#258],5,]
      ...
      
      number of terminated peer threads: 506
      

        Attachments

          Issue Links

            Activity

            Hide
            dennys Dennys Hsieh added a comment -

            Sorry, I think the thread issue is fixed. Because I forget to remove the old com4j in WEB-INF\lib of jenkins.war (in order to run the Groovy script). After remove it, the thread is normal now.

            Show
            dennys Dennys Hsieh added a comment - Sorry, I think the thread issue is fixed. Because I forget to remove the old com4j in WEB-INF\lib of jenkins.war (in order to run the Groovy script). After remove it, the thread is normal now.
            Hide
            jglick Jesse Glick added a comment -

            @franks then you may be seeing an unrelated issue; file it separately (and use for example jmap -histo:live $pid to diagnose).

            Show
            jglick Jesse Glick added a comment - @franks then you may be seeing an unrelated issue; file it separately (and use for example jmap -histo:live $pid to diagnose).
            Hide
            walterk82 Walter Kacynski added a comment -

            I was looking at how this plugin is packaged and there is a copy of com4j-amd64.dll in Jenkins\plugins\active-directory\WEB-INF\lib and there is also a copy inside com4j-2.1.jar\com4j\

            Is this expected?

            Show
            walterk82 Walter Kacynski added a comment - I was looking at how this plugin is packaged and there is a copy of com4j-amd64.dll in Jenkins\plugins\active-directory\WEB-INF\lib and there is also a copy inside com4j-2.1.jar\com4j\ Is this expected?
            Hide
            walterk82 Walter Kacynski added a comment -

            The files are of identical size.

            Show
            walterk82 Walter Kacynski added a comment - The files are of identical size.
            Hide
            jglick Jesse Glick added a comment -

            @walterk82 probably a mistake but better to file it separately rather than confusing the discussion in this issue.

            Show
            jglick Jesse Glick added a comment - @walterk82 probably a mistake but better to file it separately rather than confusing the discussion in this issue.

              People

              • Assignee:
                kohsuke Kohsuke Kawaguchi
                Reporter:
                bernhardb Bernhard Berbuir
              • Votes:
                9 Vote for this issue
                Watchers:
                14 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: