Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-20051

Winstone goes down on 1.509.4 (due to LDAP issues?)

    XMLWordPrintable

    Details

    • Similar Issues:

      Description

      We have experienced a strange issue with Jenkins 1.509.4. Web interface has gone offline after several days.

      • "RMI TCP Connection" thread is not present in the thread dump
      • "RMI Scheduler" and "RMI accept" exist in the thread
      • There is no info in jenkins.log
      • There's no BLOCKED threads
      • Jenkins internal jobs work well, only the web interface is not available

      After analysis of logs, seems that issue has been caused by "Too many open files" error:

      Oct 14, 2013 5:27:48 PM winstone.Logger logInternal
      SEVERE: Error during HTTP listener init or shutdown
      java.net.SocketException: Too many open files
      at java.net.PlainSocketImpl.socketAccept(Native Method)
      at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
      at java.net.ServerSocket.implAccept(ServerSocket.java:530)
      at java.net.ServerSocket.accept(ServerSocket.java:498)
      at winstone.HttpListener.run(HttpListener.java:139)
      at java.lang.Thread.run(Thread.java:724)

      According to logs, there is a huge number of LDAP errors

      WARNING: Failed to search LDAP for username=USER_NAME
      org.acegisecurity.ldap.LdapDataAccessException: Failed to obtain InitialDirContext due to unexpected exception; nested exception is javax.naming.ServiceUnavailableException: ru20waddc.internal.synopsys.com:389; socket closed
      at org.acegisecurity.ldap.DefaultInitialDirContextFactory.connect(DefaultInitialDirContextFactory.java:193)
      at org.acegisecurity.ldap.DefaultInitialDirContextFactory.newInitialDirContext(DefaultInitialDirContextFactory.java:261)
      at org.acegisecurity.ldap.DefaultInitialDirContextFactory.newInitialDirContext(DefaultInitialDirContextFactory.java:241)
      at org.acegisecurity.ldap.LdapTemplate.execute(LdapTemplate.java:123)
      at org.acegisecurity.ldap.LdapTemplate.searchForSingleEntry(LdapTemplate.java:246)
      at org.acegisecurity.ldap.search.FilterBasedLdapUserSearch.searchForUser(FilterBasedLdapUserSearch.java:119)
      at hudson.security.LDAPSecurityRealm$LDAPUserDetailsService.loadUserByUsername(LDAPSecurityRealm.java:591)
      at hudson.security.LDAPSecurityRealm$LDAPUserDetailsService.loadUserByUsername(LDAPSecurityRealm.java:553)
      at org.acegisecurity.ui.rememberme.TokenBasedRememberMeServices.loadUserDetails(TokenBasedRememberMeServices.java:308)
      at org.acegisecurity.ui.rememberme.TokenBasedRememberMeServices.autoLogin(TokenBasedRememberMeServices.java:218)
      at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:104)
      ...
      at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
      at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
      at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
      at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
      at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
      at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
      at winstone.RequestDispatcher.forward(RequestDispatcher.java:331)
      at winstone.RequestHandlerThread.processRequest(RequestHandlerThread.java:227)
      at winstone.RequestHandlerThread.run(RequestHandlerThread.java:150)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:724)
      Caused by: javax.naming.ServiceUnavailableException: LDAP_ADDRESS; socket closed
      at com.sun.jndi.ldap.Connection.readReply(Connection.java:454)
      at com.sun.jndi.ldap.LdapClient.ldapBind(LdapClient.java:364)
      at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:213)
      at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2740)
      at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:316)
      at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:193)
      at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:211)
      at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:154)
      at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:84)
      at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
      at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:307)
      at javax.naming.InitialContext.init(InitialContext.java:242)
      at javax.naming.InitialContext.<init>(InitialContext.java:216)
      at javax.naming.directory.InitialDirContext.<init>(InitialDirContext.java:101)
      at org.acegisecurity.ldap.DefaultInitialDirContextFactory.connect(DefaultInitialDirContextFactory.java:180)
      ... 39 more

        Attachments

          Issue Links

            Activity

            oleg_nenashev Oleg Nenashev created issue -
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Fixed logs

            Show
            oleg_nenashev Oleg Nenashev added a comment - Fixed logs
            oleg_nenashev Oleg Nenashev made changes -
            Field Original Value New Value
            Description We have experienced a strange issue with Jenkins 1.509.4. Web interface has gone offline after several days.

            * "RMI TCP Connection" thread is not present in the thread dump
            * "RMI Scheduler" and "RMI accept" exist in the thread
            * There is no info in jenkins.log
            * There's no BLOCKED threads
            * Jenkins internal jobs work well, only the web interface is not available

            After analysis of logs, seems that issue has been caused by "Too many open files" error:
            {quote}
            Oct 14, 2013 5:27:48 PM winstone.Logger logInternal
            SEVERE: Error during HTTP listener init or shutdown
            java.net.SocketException: Too many open files
            at java.net.PlainSocketImpl.socketAccept(Native Method)
            at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
            at java.net.ServerSocket.implAccept(ServerSocket.java:530)
            at java.net.ServerSocket.accept(ServerSocket.java:498)
            at winstone.HttpListener.run(HttpListener.java:139)
            at java.lang.Thread.run(Thread.java:724)
            {quote}

            According to logs, there is a huge number of LDAP errors
            {quote}
            WARNING: Failed to search LDAP for username=USER_NAME
            org.acegisecurity.ldap.LdapDataAccessException: Failed to obtain InitialDirContext due to unexpected exception; nested exception is javax.naming.ServiceUnavailableException: ru20waddc.internal.synopsys.com:389; socket closed
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.connect(DefaultInitialDirContextFactory.java:193)
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.newInitialDirContext(DefaultInitialDirContextFactory.java:261)
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.newInitialDirContext(DefaultInitialDirContextFactory.java:241)
            at org.acegisecurity.ldap.LdapTemplate.execute(LdapTemplate.java:123)
            at org.acegisecurity.ldap.LdapTemplate.searchForSingleEntry(LdapTemplate.java:246)
            at org.acegisecurity.ldap.search.FilterBasedLdapUserSearch.searchForUser(FilterBasedLdapUserSearch.java:119)
            at hudson.security.LDAPSecurityRealm$LDAPUserDetailsService.loadUserByUsername(LDAPSecurityRealm.java:591)
            at hudson.security.LDAPSecurityRealm$LDAPUserDetailsService.loadUserByUsername(LDAPSecurityRealm.java:553)
            at org.acegisecurity.ui.rememberme.TokenBasedRememberMeServices.loadUserDetails(TokenBasedRememberMeServices.java:308)
            at org.acegisecurity.ui.rememberme.TokenBasedRememberMeServices.autoLogin(TokenBasedRememberMeServices.java:218)
            at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:104)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.ui.basicauth.BasicProcessingFilter.doFilter(BasicProcessingFilter.java:174)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at jenkins.security.ApiTokenFilter.doFilter(ApiTokenFilter.java:64)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
            at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:66)
            at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
            at hudson.security.ChainedServletFilter.doFilter(ChainedServletFilter.java:76)
            at hudson.security.HudsonFilter.doFilter(HudsonFilter.java:164)
            at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
            at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
            at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
            at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
            at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
            at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
            at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
            at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
            at winstone.RequestDispatcher.forward(RequestDispatcher.java:331)
            at winstone.RequestHandlerThread.processRequest(RequestHandlerThread.java:227)
            at winstone.RequestHandlerThread.run(RequestHandlerThread.java:150)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
            at java.util.concurrent.FutureTask.run(FutureTask.java:262)
            at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:724)
            Caused by: javax.naming.ServiceUnavailableException: ${LDAP_ADDRESS}; socket closed
            at com.sun.jndi.ldap.Connection.readReply(Connection.java:454)
            at com.sun.jndi.ldap.LdapClient.ldapBind(LdapClient.java:364)
            at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:213)
            at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2740)
            at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:316)
            at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:193)
            at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:211)
            at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:154)
            at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:84)
            at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
            at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:307)
            at javax.naming.InitialContext.init(InitialContext.java:242)
            at javax.naming.InitialContext.<init>(InitialContext.java:216)
            at javax.naming.directory.InitialDirContext.<init>(InitialDirContext.java:101)
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.connect(DefaultInitialDirContextFactory.java:180)
            ... 39 more
            {quote}
            We have experienced a strange issue with Jenkins 1.509.4. Web interface has gone offline after several days.

            * "RMI TCP Connection" thread is not present in the thread dump
            * "RMI Scheduler" and "RMI accept" exist in the thread
            * There is no info in jenkins.log
            * There's no BLOCKED threads
            * Jenkins internal jobs work well, only the web interface is not available

            After analysis of logs, seems that issue has been caused by "Too many open files" error:
            {quote}
            Oct 14, 2013 5:27:48 PM winstone.Logger logInternal
            SEVERE: Error during HTTP listener init or shutdown
            java.net.SocketException: Too many open files
            at java.net.PlainSocketImpl.socketAccept(Native Method)
            at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
            at java.net.ServerSocket.implAccept(ServerSocket.java:530)
            at java.net.ServerSocket.accept(ServerSocket.java:498)
            at winstone.HttpListener.run(HttpListener.java:139)
            at java.lang.Thread.run(Thread.java:724)
            {quote}

            According to logs, there is a huge number of LDAP errors
            {quote}
            WARNING: Failed to search LDAP for username=USER_NAME
            org.acegisecurity.ldap.LdapDataAccessException: Failed to obtain InitialDirContext due to unexpected exception; nested exception is javax.naming.ServiceUnavailableException: ru20waddc.internal.synopsys.com:389; socket closed
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.connect(DefaultInitialDirContextFactory.java:193)
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.newInitialDirContext(DefaultInitialDirContextFactory.java:261)
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.newInitialDirContext(DefaultInitialDirContextFactory.java:241)
            at org.acegisecurity.ldap.LdapTemplate.execute(LdapTemplate.java:123)
            at org.acegisecurity.ldap.LdapTemplate.searchForSingleEntry(LdapTemplate.java:246)
            at org.acegisecurity.ldap.search.FilterBasedLdapUserSearch.searchForUser(FilterBasedLdapUserSearch.java:119)
            at hudson.security.LDAPSecurityRealm$LDAPUserDetailsService.loadUserByUsername(LDAPSecurityRealm.java:591)
            at hudson.security.LDAPSecurityRealm$LDAPUserDetailsService.loadUserByUsername(LDAPSecurityRealm.java:553)
            at org.acegisecurity.ui.rememberme.TokenBasedRememberMeServices.loadUserDetails(TokenBasedRememberMeServices.java:308)
            at org.acegisecurity.ui.rememberme.TokenBasedRememberMeServices.autoLogin(TokenBasedRememberMeServices.java:218)
            at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:104)
            ...
            at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:46)
            at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
            at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
            at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
            at winstone.FilterConfiguration.execute(FilterConfiguration.java:194)
            at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:366)
            at winstone.RequestDispatcher.forward(RequestDispatcher.java:331)
            at winstone.RequestHandlerThread.processRequest(RequestHandlerThread.java:227)
            at winstone.RequestHandlerThread.run(RequestHandlerThread.java:150)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
            at java.util.concurrent.FutureTask.run(FutureTask.java:262)
            at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:724)
            Caused by: javax.naming.ServiceUnavailableException: LDAP_ADDRESS; socket closed
            at com.sun.jndi.ldap.Connection.readReply(Connection.java:454)
            at com.sun.jndi.ldap.LdapClient.ldapBind(LdapClient.java:364)
            at com.sun.jndi.ldap.LdapClient.authenticate(LdapClient.java:213)
            at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2740)
            at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:316)
            at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:193)
            at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:211)
            at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:154)
            at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:84)
            at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
            at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:307)
            at javax.naming.InitialContext.init(InitialContext.java:242)
            at javax.naming.InitialContext.<init>(InitialContext.java:216)
            at javax.naming.directory.InitialDirContext.<init>(InitialDirContext.java:101)
            at org.acegisecurity.ldap.DefaultInitialDirContextFactory.connect(DefaultInitialDirContextFactory.java:180)
            ... 39 more
            {quote}
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Added files from monitoring

            Show
            oleg_nenashev Oleg Nenashev added a comment - Added files from monitoring
            oleg_nenashev Oleg Nenashev made changes -
            Attachment JENKINS-20051_HTTP_Sessions.png [ 24514 ]
            Attachment JENKINS-20051_OpennedFiles.png [ 24515 ]
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Unfortunately, issue has been reproduced on our server.
            As I see, the issue's origin was in LDAP server, which somehow caused hanging of LDAP requests till the timeout ()

            • Seems that each request spawns an additional unnamed thread, which opens a new connection to LDAP reads info from it
            • Due to LDAP service, this IO hangs
            • In case of huge number of requests to LDAP, number of opened files and threads goes beyond limits
            • No surprise that some threads in Jenkins collapse in such situations

            Partial workaround:

            • always use caching in LDAP

            BTW, it would be great to limit a number of parallel requests to LDAP

            Spawned LDAP authentication thread:

            "Thread-30370" daemon prio=10 tid=0x00002aaae0001000 nid=0x2849 runnable [0x000000005fb1c000]
            java.lang.Thread.State: RUNNABLE
            at java.net.SocketInputStream.socketRead0(Native Method)
            at java.net.SocketInputStream.read(SocketInputStream.java:152)
            at java.net.SocketInputStream.read(SocketInputStream.java:122)
            at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
            at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
            at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

            • locked <0x000000072b1de300> (a java.io.BufferedInputStream)
              at com.sun.jndi.ldap.Connection.run(Connection.java:853)
              at java.lang.Thread.run(Thread.java:724)
            Show
            oleg_nenashev Oleg Nenashev added a comment - Unfortunately, issue has been reproduced on our server. As I see, the issue's origin was in LDAP server, which somehow caused hanging of LDAP requests till the timeout () Seems that each request spawns an additional unnamed thread, which opens a new connection to LDAP reads info from it Due to LDAP service, this IO hangs In case of huge number of requests to LDAP, number of opened files and threads goes beyond limits No surprise that some threads in Jenkins collapse in such situations Partial workaround: always use caching in LDAP BTW, it would be great to limit a number of parallel requests to LDAP Spawned LDAP authentication thread: "Thread-30370" daemon prio=10 tid=0x00002aaae0001000 nid=0x2849 runnable [0x000000005fb1c000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) locked <0x000000072b1de300> (a java.io.BufferedInputStream) at com.sun.jndi.ldap.Connection.run(Connection.java:853) at java.lang.Thread.run(Thread.java:724)
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Reassigned issue to Stephen (issue has been caused by LDAP)

            Show
            oleg_nenashev Oleg Nenashev added a comment - Reassigned issue to Stephen (issue has been caused by LDAP)
            oleg_nenashev Oleg Nenashev made changes -
            Assignee Kohsuke Kawaguchi [ kohsuke ] stephenconnolly [ stephenconnolly ]
            Hide
            oleg_nenashev Oleg Nenashev added a comment -

            Workaround exists, so issue is not a blocker

            Show
            oleg_nenashev Oleg Nenashev added a comment - Workaround exists, so issue is not a blocker
            oleg_nenashev Oleg Nenashev made changes -
            Priority Blocker [ 1 ] Critical [ 2 ]
            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is related to JENKINS-18524 [ JENKINS-18524 ]
            oleg_nenashev Oleg Nenashev made changes -
            Component/s core [ 15593 ]
            Component/s www [ 15484 ]
            Hide
            danielbeck Daniel Beck added a comment -

            Seems to be LDAP alone; assuming core only due to use of former 'www' component.

            Show
            danielbeck Daniel Beck added a comment - Seems to be LDAP alone; assuming core only due to use of former 'www' component.
            danielbeck Daniel Beck made changes -
            Component/s core [ 15593 ]
            Hide
            stephenconnolly Stephen Connolly added a comment -

            I have not heard of this issue for some time... anyway no point in me being the assignee

            Show
            stephenconnolly Stephen Connolly added a comment - I have not heard of this issue for some time... anyway no point in me being the assignee
            stephenconnolly Stephen Connolly made changes -
            Status Open [ 1 ] Closed [ 6 ]
            Assignee stephenconnolly [ stephenconnolly ]
            Resolution Cannot Reproduce [ 5 ]
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 151573 ] JNJira + In-Review [ 207037 ]

              People

              • Assignee:
                Unassigned
                Reporter:
                oleg_nenashev Oleg Nenashev
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: