JENKINS-37489

Jenkins UI slow page load (big TTFB) - possible LDAP issue


      Symptoms:

      Periodically the Jenkins UI becomes very slow and sometimes returns 504. Restarting the Jenkins service helps, but only for a short time. How often the problem occurs depends on the number of users using the UI. There is nothing unusual in the logs; only the Jenkins web UI response time becomes critical (nginx errors: "upstream timed out (110: Connection timed out) while reading response header from upstream").
      Average CPU load is < 10% and memory usage is ~500 MB (heap size ~2.5 GB). Active threads reach critical values (max=38), and the peaks coincide with the UI problems.
      Graphs (from the Monitoring plugin) are in the attachments.

      The Monitoring plugin also has a "Current requests" view, and there I found the cause of the problem: lots of pending requests for the same URLs, such as "/job/MY-JOB/changes GET" and "/job/MY-JOB/BUILD-ID/wfapi/changesets?=1471452733792 ajax GET".
      I tried to kill these requests (to free the threads) using the "Kill" button in the Monitoring plugin, and after that I found helpful log records.

      The second record shows that Jenkins calls loadUserByUsername, which is strange: I configured the Jenkins search query to look up users by uid (login), according to the official configuration guide, not by "FirstName LastName".

      Then I checked the LDAP server logs and discovered high CPU usage, a high load average, and tons of incorrect search queries from Jenkins in slapd.log:

      Aug 17 17:36:14 ldap-server slapd[1183]: conn=249907 fd=26 ACCEPT from IP=5.2.1.1:43078 (IP=0.0.0.0:636)
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 fd=26 TLS established tls_ssf=256 ssf=256
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" method=128
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" mech=SIMPLE ssf=0
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=0 RESULT tag=97 err=0 text=
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 deref=3 filter="(uid=firstname lastname)"
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 op=2 UNBIND
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249907 fd=26 closed
      Aug 17 17:36:15 ldap-server slapd[1183]: conn=249911 fd=25 ACCEPT from IP=5.2.1.1:43132 (IP=0.0.0.0:636)
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 fd=25 TLS established tls_ssf=256 ssf=256
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" method=128
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=0 BIND dn="cn=jenkins-user,ou=system,dc=example,dc=com" mech=SIMPLE ssf=0
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=0 RESULT tag=97 err=0 text=
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=1 SRCH base="ou=people,dc=example,dc=com" scope=2 deref=3 filter="(uid=firstname lastname)"
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 op=2 UNBIND
      Aug 17 17:36:16 ldap-server slapd[1183]: conn=249911 fd=25 closed
      ....
      

      More than 37000 incorrect searches per day, just for my user.
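A count like the one above can be reproduced from slapd.log with a short script. The following is a minimal Python sketch, assuming the log format shown in the excerpt (a `SRCH` line carrying the filter, followed by a `SEARCH RESULT` line with `nentries=` for the same `conn`/`op` pair). The function name `count_empty_searches` and the `SAMPLE` lines are illustrative, not part of Jenkins or OpenLDAP.

```python
import re
from collections import Counter

# SRCH line of an operation: captures connection id, op id and the filter.
SRCH_RE = re.compile(r'conn=(\d+) op=(\d+) SRCH .*filter="([^"]*)"')
# SEARCH RESULT line: captures connection id, op id and the entry count.
RESULT_RE = re.compile(r'conn=(\d+) op=(\d+) SEARCH RESULT .*nentries=(\d+)')


def count_empty_searches(lines):
    """Return a Counter mapping each search filter to its number of empty results."""
    pending = {}  # (conn, op) -> filter, for searches still awaiting a result
    empty = Counter()
    for line in lines:
        m = SRCH_RE.search(line)
        if m:
            pending[(m.group(1), m.group(2))] = m.group(3)
            continue
        m = RESULT_RE.search(line)
        if m:
            flt = pending.pop((m.group(1), m.group(2)), None)
            if flt is not None and m.group(3) == "0":
                empty[flt] += 1
    return empty


# Illustrative sample modeled on the excerpt above (two empty searches).
SAMPLE = [
    'slapd[1183]: conn=249907 op=1 SRCH base="ou=people,dc=example,dc=com" '
    'scope=2 deref=3 filter="(uid=firstname lastname)"',
    'slapd[1183]: conn=249907 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=',
    'slapd[1183]: conn=249911 op=1 SRCH base="ou=people,dc=example,dc=com" '
    'scope=2 deref=3 filter="(uid=firstname lastname)"',
    'slapd[1183]: conn=249911 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=',
]

for flt, n in count_empty_searches(SAMPLE).most_common():
    print(n, flt)
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the function only needs an iterable of lines.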

      Problem:

      It seems that Jenkins tries to check user privileges before executing (some?) requests, and when forming the LDAP search query it uses the "Full Name"/username rather than the ID/login/uid -> LDAP can't find anything -> empty result -> Jenkins tries once more to verify privileges -> loop -> busy Jenkins threads/workers/executors -> HTTP 504

      Workarounds:
      1. Not great, but possible: use a big cache for LDAP (in the Jenkins "Configure Global Security" preferences). It doesn't fix the problem, but it can minimize its impact (not 100% sure).
      2. Closer to a fix: use a custom LDAP search query (in the Jenkins "Configure Global Security" preferences), something like:
        (|(uid={0})(cn={0}))

        (don't forget to add the 'cn' attribute to the LDAP index)
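To see why the second workaround helps, it may be useful to look at how the search filter template expands. The sketch below is in Python and assumes that the `{0}` placeholder is replaced with the name being looked up, with special characters escaped as described in RFC 4515. The helper names `escape_filter_value` and `expand_filter` are hypothetical; this illustrates the substitution and is not the code Jenkins actually runs.

```python
def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def expand_filter(template, username):
    """Substitute the escaped username for every {0} placeholder."""
    return template.replace("{0}", escape_filter_value(username))


# A uid-only template produces the filter seen in slapd.log, which matches
# nothing when the lookup value is a full name instead of a login.
print(expand_filter("(uid={0})", "firstname lastname"))

# The workaround template also tries cn, so a full name can match an entry.
print(expand_filter("(|(uid={0})(cn={0}))", "firstname lastname"))
```

With the combined template, a lookup by full name can be answered by the `cn` clause, so the search no longer comes back empty, provided `cn` in the directory actually holds the full name.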

      Attachments:
        1. activeThreads.png (32 kB, Andrii Melnyk)
        2. cpu.png (18 kB, Andrii Melnyk)
        3. usedMemory.png (44 kB, Andrii Melnyk)

            kohsuke Kohsuke Kawaguchi
            amelnyk Andrii Melnyk