JENKINS-56625: Too many open file descriptors from embedded build status usage

    Details

    • Released As:
      v2.0.1

      Description

      We have some users with many Jenkins jobs that they monitor using a custom HTML dashboard containing 100+ embedded build status links, refreshing once a minute. After upgrading the embeddable-build-status plugin from 1.9 to 2.0, we started seeing Tomcat/Jenkins run into a lot of "too many open files" errors like the following:

      Mar 19, 2019 6:25:55 AM org.apache.tomcat.util.net.NioEndpoint$Acceptor run
      SEVERE: Socket accept failed
      java.io.IOException: Too many open files
              at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
              at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
              at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
              at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:482)
              at java.lang.Thread.run(Thread.java:748)
      
      Mar 19, 2019 6:25:56 AM sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
      WARNING: RMI TCP Accept-9009: accept loop for ServerSocket[addr=0.0.0.0/0.0.0.0,localport=9009] throws
      java.net.SocketException: Too many open files (Accept failed)
              at java.net.PlainSocketImpl.socketAccept(Native Method)
              at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
              at java.net.ServerSocket.implAccept(ServerSocket.java:545)
              at java.net.ServerSocket.accept(ServerSocket.java:513)
              at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:405)
              at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:377)
              at java.lang.Thread.run(Thread.java:748)
      

      We inspected the Tomcat process's open FDs using lsof and found that /srv/jenkins/plugins/embeddable-build-status/fonts/verdana.ttf accounts for most of them:

      # lsof -p $(pgrep java) | awk '{ print $9 }' | sort | uniq -c | sort -n | tail
            2 /usr/java/jre1.8.0_202/lib/ext/sunjce_provider.jar
            2 /usr/java/jre1.8.0_202/lib/ext/sunpkcs11.jar
            2 /usr/java/jre1.8.0_202/lib/jce.jar
            2 /usr/java/jre1.8.0_202/lib/jsse.jar
            2 /usr/java/jre1.8.0_202/lib/resources.jar
            2 /usr/java/jre1.8.0_202/lib/rt.jar
            5 /dev/urandom
            8 anon_inode
           16 pipe
          836 /srv/jenkins/plugins/embeddable-build-status/fonts/verdana.ttf
      

      This is one of the lower counts; we have seen up to 6200 open verdana.ttf file descriptors against the limit of 8192, which is both the Tomcat and the system-wide limit.

      To reproduce this, I took a test Jenkins system that had no open verdana.ttf FDs, wrote a quick loop to hammer a buildStatus link, and watched Tomcat's total file descriptor count shoot up. I downgraded the plugin to 1.9 and confirmed it does not have this problem, so this is new behavior in 2.0, probably due to the custom text features.
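
      The reproduction loop itself is not included in this report. As a rough sketch of the idea, a client along these lines could issue repeated requests against a badge link on a test instance while lsof is watched on the server. The class name BadgeHammer, the timeouts, and the command-line arguments are illustrative assumptions, not the reporter's script.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical load generator; not the loop used in the original report. */
final class BadgeHammer {
    private BadgeHammer() {}

    /** Issues sequential GET requests and returns how many answered HTTP 200. */
    static int hammer(URL url, int requests) throws IOException {
        int ok = 0;
        byte[] sink = new byte[4096];
        for (int i = 0; i < requests; i++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000); // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);    // assumed timeout, in milliseconds
            try {
                if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
                    ok++;
                    // Drain and close the body so the client does not leak sockets itself.
                    try (InputStream in = conn.getInputStream()) {
                        while (in.read(sink) != -1) {
                            // discard the badge bytes
                        }
                    }
                }
            } finally {
                conn.disconnect();
            }
        }
        return ok;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: a buildStatus URL on a test instance; args[1]: number of requests.
        URL url = new URL(args[0]);
        int requests = Integer.parseInt(args[1]);
        System.out.println(hammer(url, requests) + " of " + requests + " requests returned 200");
    }
}
```

      Running this against a disposable test instance and comparing the lsof totals before and after should show whether the server-side descriptor count grows with the number of requests.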

      Hitting the open file descriptor limit is not always fatal for Jenkins in low doses, but when it does hit 8k it often seems to leave the system unresponsive for extended periods of time. These open FDs do get cleaned up when garbage collection runs; however, in our use case garbage collection does not always reclaim them before the 8k limit is reached.
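
      The difference between explicit closing and waiting for the collector can be observed from inside a JVM. The following is a minimal sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like system where com.sun.management.UnixOperatingSystemMXBean is available; FdLeakDemo and its methods are hypothetical names. It shows the descriptor count rising while streams are left unclosed and falling as soon as they are closed explicitly.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.FileInputStream;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo class; assumes a Unix-like OS and a HotSpot/OpenJDK JVM. */
final class FdLeakDemo {
    private FdLeakDemo() {}

    /** Returns the JVM's open descriptor count, or -1 if the platform does not expose it. */
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Opens the file 'count' times without closing; the returned list keeps the streams reachable. */
    static List<FileInputStream> openWithoutClosing(Path file, int count) throws IOException {
        List<FileInputStream> streams = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            streams.add(new FileInputStream(file.toFile()));
        }
        return streams;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("leak-demo", ".bin");
        long before = openFds();
        List<FileInputStream> held = openWithoutClosing(file, 50);
        long during = openFds();
        // Closing explicitly releases each descriptor immediately.
        for (FileInputStream in : held) {
            in.close();
        }
        long after = openFds();
        System.out.printf("before=%d during=%d after=%d%n", before, during, after);
        Files.delete(file);
    }
}
```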

      We run lots of Jenkins systems and have seen this problem on more than one of them. I bring this up to point out that we don't have one massive monolithic Jenkins; we've scaled horizontally to spread jobs across systems, yet we're still encountering this problem on at least 2 of our production instances.

          Activity

          oleg_nenashev Oleg Nenashev added a comment -

          I believe it comes from here: https://github.com/jenkinsci/embeddable-build-status-plugin/blob/ac894bdf0953c82bbd193005f9e9cff121b77ae2/src/main/java/org/jenkinsci/plugins/badge/StatusImage.java#L182-L198. Indeed, streams are not handled correctly there when baseUrl comes from a file.
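
          For illustration only, and not the code from StatusImage.java or from any fix to it: a minimal sketch of reading a resource from a URL so that the stream is closed deterministically instead of being left for the garbage collector. ResourceReader and readAll are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

/** Hypothetical helper; not taken from the plugin. */
final class ResourceReader {
    private ResourceReader() {}

    /** Reads all bytes from the URL and always closes the underlying stream. */
    static byte[] readAll(URL url) throws IOException {
        // try-with-resources closes the stream on normal and exceptional exit,
        // so a file-backed URL releases its descriptor right away.
        try (InputStream in = url.openStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

          With this pattern the descriptor lifetime is tied to the block scope, so a burst of requests cannot accumulate unclosed handles to the same file.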
          diginc Adam BH added a comment -

          This has become a higher priority for us, so I made the required fix and tested a SNAPSHOT hpi:

          https://github.com/jenkinsci/embeddable-build-status-plugin/pull/42

          thomas_dee Thomas Döring added a comment -

          Adam BH, thank you. I just merged your pull request. A v2.0.1 bugfix release will follow soon.

          thomas_dee Thomas Döring added a comment -

          Thanks to Adam BH and Oleg Nenashev


            People

            • Assignee:
              thomas_dee Thomas Döring
              Reporter:
              diginc Adam BH
            • Votes:
              2
              Watchers:
              5
