Jenkins / JENKINS-45057

"too many files open": file handles leak, job output file not closed


    Details


      Description

      Jenkins seems to keep an open file handle to the log file (job output) for every single build, even those that have been discarded by the "Discard old builds" policy.

       

      This is a sample of the lsof output (whole file attached)

      java 8870 jenkins 941w REG 252,0 1840 1332171 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50063/log (deleted)
      java 8870 jenkins 942w REG 252,0 2023 402006 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50044/log (deleted)
      java 8870 jenkins 943w REG 252,0 2193 1332217 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50101/log
      java 8870 jenkins 944w REG 252,0 2512 1332247 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50106/log
      java 8870 jenkins 945w REG 252,0 1840 1703994 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50067/log (deleted)
      java 8870 jenkins 946w REG 252,0 2350 1332230 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50092/log (deleted)
      java 8870 jenkins 947w REG 252,0 1840 402034 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50049/log (deleted)
      java 8870 jenkins 948w REG 252,0 1840 927855 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50080/log (deleted)
      java 8870 jenkins 949w REG 252,0 2195 1332245 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50095/log (deleted)
      java 8870 jenkins 950w REG 252,0 2326 1332249 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50107/log
      java 8870 jenkins 952w REG 252,0 2195 1332227 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50102/log
      java 8870 jenkins 953w REG 252,0 2154 1332254 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50109/log
      java 8870 jenkins 954w REG 252,0 2356 1332282 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50105/log
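
Not part of the original report, but a minimal sketch of how to quantify the leak from an lsof dump like the one above. `filesopen.txt` stands in for the attached file; the two sample lines are copied from the excerpt.

```shell
# Count build-log handles still open on files that have already been deleted.
# filesopen.txt is a stand-in for the attached lsof dump; the sample lines
# below are taken from the excerpt in the description.
cat > filesopen.txt <<'EOF'
java 8870 jenkins 941w REG 252,0 1840 1332171 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50063/log (deleted)
java 8870 jenkins 943w REG 252,0 2193 1332217 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/50101/log
EOF
grep -c 'log (deleted)' filesopen.txt   # -> 1
```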
      

       

        Attachments

          Issue Links

            Activity

            oleg_nenashev Oleg Nenashev added a comment -

            Which Build Discarder do you use in your job?

            bbonacci Bruno Bonacci added a comment -

            I'm using the default build discarder.

            danielbeck Daniel Beck added a comment -

            Please provide a list of installed plugins, and a sample configuration file of an affected job. Does this happen with all jobs?

            jonasatwork Jonas Jonsson added a comment - - edited

            Here's a very simple way of getting this:

            Create a simple FreeStyle job (NOTHING else but default settings) in Jenkins that only contains the following System Groovy Script:
            /*
             * See if Jenkins/Groovy leaves files open.
             */
            import hudson.model.*

            def thr = Thread.currentThread()
            def build = thr?.executable
            def jobName = build.parent.builds[0].properties.get("envVars").get("JOB_NAME")
            def jobNr = build.parent.builds[0].properties.get("envVars").get("BUILD_NUMBER")
            println "This is " + jobName + " running for the $jobNr:th time"
             

            That's it. Every time I run this job, I get three (3!!) new open files in /proc/$PID_OF_JENKINS that point to the "log" file of the job.

            Linux (Ubuntu-14.04.5 LTS) 4.4.0-79 kernel
            Java version: 1.8.0_131-b11
            Jenkins-version: 2.66
            Groovy-plugin: 2.0
            System groovy version: 1.8.6

            jonasatwork Jonas Jonsson added a comment -

            The rationale for JENKINS-42934 was to avoid using close() on files; was this change taken a bit too far?

            jonasatwork Jonas Jonsson added a comment -

            Our problems started when stepping the Jenkins-version from 2.51 to 2.58.

            Currently our production Jenkins must be restarted after about ten days.

            danielbeck Daniel Beck added a comment -

            Jonas Jonsson Would be helpful if you could narrow this down further.

            jonasatwork Jonas Jonsson added a comment -

            I also notice that at the same time (May 3), we also updated the Groovy plugin from 1.30 to 2.0. A co-worker has tried the script above on another Jenkins, running 2.37 but with Groovy 1.30, without having this issue.

            Unfortunately I will probably not have more time before August to try to find out what's going on.

            danielbeck Daniel Beck added a comment -

            Our problems started when stepping the Jenkins-version from 2.51 to 2.58.

            and

            tried the script above on another Jenkins, running 2.37 but with Groovy 1.30, without having this issue

            This does not look like adding useful data. Are any of the versions wrong here?

            jonasatwork Jonas Jonsson added a comment -

            We (my colleagues and I) are starting to believe that this is caused by the Groovy plugin. I'll know more tomorrow morning.

            mdelaney Mike Delaney added a comment -

            I'm seeing this as well on Jenkins 2.60.1 LTS on Ubuntu 14.04. When using Jenkins 2.46.2 LTS

            oleg_nenashev Oleg Nenashev added a comment -

            It may possibly happen if Groovy overrides build log appenders via log decorators, but I am not sure why the Groovy plugin would need that.

            jonasatwork Jonas Jonsson added a comment -

            From a colleague:  The problem doesn't exist in Jenkins-2.51 (and Groovy-2.0).  Jenkins-2.52 has the problem.

            Hi Jonas, I didn't have a problem with Jenkins 2.51 and Groovy 2.0, but the problem occurred with Jenkins 2.52 and Groovy 2.0. I will downgrade Groovy to a previous version and try these two versions of Jenkins to work out the differences. Regards

            danielbeck Daniel Beck added a comment -

            Notably there's absolutely nothing of interest in 2.52: Just a major overhaul of the German localization, other localization fixes, removal of the most incomplete localizations, and this one change in the actual code:

            https://github.com/jenkinsci/jenkins/compare/jenkins-2.51...jenkins-2.52#diff-9fafdcd0712c5a5dab3acb4ea168515aR272

            So this seems to be unrelated to core.

            adamleggo Adam Leggo added a comment -

            I have found a solution for the code Jonas provided, I am not sure if it fixes the problem for Bruno since no groovy example has been provided.

            Problem code:

            import hudson.model.*

            def thr = Thread.currentThread()
            def build = thr?.executable
            def jobName = build.parent.builds[0].properties.get("envVars").get("JOB_NAME")
            def jobNr = build.parent.builds[0].properties.get("envVars").get("BUILD_NUMBER")
            println "This is " + jobName + " running for the $jobNr:th time"

             

            Fixed code:

            import hudson.model.*

            def jobName = build.environment.get("JOB_NAME")
            def jobNr = build.environment.get("BUILD_NUMBER")
            println "This is " + jobName + " running for the $jobNr:th time"

             

            No open files found after the fixed job is run.

            The build object is already available for the script to use, so getting it from the currentThread causes a problem. Not sure why.

            bbonacci Bruno Bonacci added a comment -

            Hi Jonas Jonsson, I've tried your test, and what I get is 4 new open files rather than the 3 you suggested.

             

            This is the output of the diff between two lsof executions, interleaved by one run of the job with your code:

            > java 19008 jenkins 587r REG 252,0 503 395865 /data/jenkins/jobs/automation/jobs/test-open-files/builds/7/log
            > java 19008 jenkins 589r REG 252,0 503 395865 /data/jenkins/jobs/automation/jobs/test-open-files/builds/7/log
            > java 19008 jenkins 590r REG 252,0 503 395865 /data/jenkins/jobs/automation/jobs/test-open-files/builds/7/log
            > java 19008 jenkins 592r REG 252,0 503 395865 /data/jenkins/jobs/automation/jobs/test-open-files/builds/7/log
            
            adamleggo Adam Leggo added a comment -

            Hi Bruno Bonacci,

            Can you post an example of your emr-termination-policy groovy code?
            Please provide a list of installed plugins and a sample configuration file of an affected job.

            bbonacci Bruno Bonacci added a comment - - edited

            Hi Adam Leggo,
            the emr-termination-policy is a Freestyle job with a simple (bash) shell script.
            So I've been digging down and I've narrowed down the problem.
            It looks like when the option Use secret text(s) or file(s) is active the file handle leaks.

            Steps to reproduce:

            1. create free style project
            2. add one step with shell script running "echo test"
            3. click on Use secret text(s) or file(s)
            4. save job
            5. count numbers of open files with lsof -p <pid> | wc -l
            6. build job
            7. count numbers of open files with lsof -p <pid> | wc -l
            8. repeat last two steps.

            In my environment 1 file (the build log) handle is always leaked.
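
Steps 5 through 7 above can be scripted. A minimal sketch, assuming a Linux master where /proc is available; `JENKINS_PID` is a placeholder (set here to the current shell's pid only so the sketch runs as-is):

```shell
# Snapshot the process's fd count before and after a build and print the delta.
# JENKINS_PID is a placeholder; point it at the Jenkins master process.
JENKINS_PID=$$   # using our own shell pid here only so the sketch is runnable
before=$(ls /proc/"$JENKINS_PID"/fd | wc -l)
# ... trigger the job build here and wait for it to finish ...
after=$(ls /proc/"$JENKINS_PID"/fd | wc -l)
echo "fd delta for this run: $((after - before))"
```

A persistent positive delta across repeated runs is the leak described above.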

            bbonacci Bruno Bonacci added a comment - - edited

            The "secrets" extensions has a feature for which if the secrets appear as output in the log they are replaced with "*******".
            I guess somewhere in there, the log file isn't closed properly and the file handle leaks.

            mdelaney Mike Delaney added a comment -

            I see this behavior without using "secrets" extension.

            abhishekmukherg Abhishek Mukherjee added a comment - - edited

            We're also seeing this behavior on our Jenkins master running 2.60.1. Happy to provide any relevant information if I can be of help, just not sure what to gather. To put it in perspective, we're having to restart our master every ~4 days for one of our very busy jobs, with the FD limit already increased to 10k. I believe we are seeing the same thing as Bruno, as we also have secrets bound to these jobs.

            abhishekmukherg Abhishek Mukherjee added a comment -

            We appear to have upgraded the relevant plugin (Credentials Binding Plugin) to 1.12, if that's relevant

            ifoundmyhappythought kyle evans added a comment - - edited

            We are also seeing this behavior on Jenkins 2.54 with credentials binding plugin 1.12.

             

            Edit: this also seems to be the same issue: https://issues.jenkins-ci.org/browse/JENKINS-43199

            Also, there is a discussion around a pull request here: https://github.com/jenkinsci/credentials-binding-plugin/pull/37

            oleg_nenashev Oleg Nenashev added a comment -

            There are more and more reports in JENKINS-43199, and the maintainer refuses to apply the hotfix in his plugin. So we may have to fix it in core ("we" === "Jenkins community"; feel free to contribute)

            shahmishal mishal shah added a comment -

            Is it possible to downgrade a plugin to resolve the fd leak? Has anyone tried downgrading from credentials binding plugin 1.12 to credentials binding plugin 1.11, did it help resolve this issue? Thanks! We have to restart our Jenkins once a day.

            abhishekmukherg Abhishek Mukherjee added a comment -

            My team has tried downgrading to 1.11, but it did not help any. We're hoping to take our first stab at making a pull request for this sometime this week – we've never done any jenkins core changes, however, so no promises

            shahmishal mishal shah added a comment -

            Abhishek Mukherjee Good Luck! Looking forward to your fix and Thanks! 

            andreasmandel Andreas Mandel added a comment -

            We are also hit hard by this running LTS 2.60.1. I wonder if this Blocker was fixed in 2.60.2? The changelog does not look promising, but how can an LTS version be released with this known issue? We do NOT have the Credentials Binding plugin installed at all.

            shahmishal mishal shah added a comment - - edited

            Bruno Bonacci, What is the process on getting this issue escalated to be fixed soon?

            oleg_nenashev Oleg Nenashev added a comment -

            mishal shah This is an open-source project, there is no escalation process. The best way to help with this issue is to Participate and Contribute. In the Jenkins community we always encourage it.

            P.S: If you want to do escalations, there are companies offering commercial support

             

            shahmishal mishal shah added a comment -

            Oleg Nenashev Thanks!

            alexraddas Alex Raddas added a comment -

            I am also encountering this issue in our production environment: the master hits 16k open FDs every 5 hours, which requires restarting the Jenkins service before that point.

            wheleph Volodymyr Sobotovych added a comment -

            We started seeing the issue after upgrading Jenkins from 2.46.2 LTS to 2.60.2 LTS and SSH Slaves plugin from 1.9 to 1.20. Credentials Binding plugin was NOT upgraded.

            If the bug is in the Credentials Binding plugin, why didn't it appear before?

            danielbeck Daniel Beck added a comment -

            Would be interesting to know whether this started between 2.52 (unaffected) and 2.53 (affected). If so, JENKINS-42934 would be a likely culprit. Jonas Jonsson reported that 2.52 was the first to be affected, I wonder whether that report was off by one.

            saretter Sascha Retter added a comment -

            Currently I have no answer for that question.

            What I can say the problem is reproducible on 2.60.2 but it isn't on 2.50.

            On both versions, if I start a build of a job with the above-mentioned Groovy script, the number of file handles used by the Jenkins process increases. On 2.50 it also decreases after a while (apparently not immediately after the job finishes), but on 2.60.2 it only increases and never decreases.

            I'll try to find some time to check for 2.52 and 2.53 or ask a colleague to do so.

            carlescapdevila Carles Capdevila added a comment - - edited

            EDIT: Earlier in this comment I was saying that the problem could be reproduced in 2.52. I was wrong. I accidentally shuffled the war's name and didn't notice the version. I apologize to anyone who took the time to verify this.

             

            I tested this on 2.52 and 2.53 as Sascha Retter asked me:

            In 2.52 I could not reproduce the problem.

            In 2.53 I could reproduce it.

            As Daniel Beck suggested, https://issues.jenkins-ci.org/browse/JENKINS-42934 might be related to this.

            batmat Baptiste Mathus added a comment -

            Bruno Bonacci Carles Capdevila would be great if you can use git bisect to find out even more precisely which commit introduced this. If you're unclear on how to use it, I can provide/write a documentation for it. Thanks!

            danielbeck Daniel Beck added a comment -

            use git bisect to find out even more precisely which commit introduced this

            Or just test https://github.com/jenkinsci/jenkins/commit/bde09f70afaf10d5e1453c257058a56b07556e8e which is assumed to break, and https://github.com/jenkinsci/jenkins/commit/0ddf2d5be77072264845a5f4cf197d91d32e4695 which is assumed to not break, to begin with, and see whether that's the cause.

            carlescapdevila Carles Capdevila added a comment - - edited

            Tested https://github.com/jenkinsci/jenkins/commit/bde09f70afaf10d5e1453c257058a56b07556e8e and it did indeed break, this one https://github.com/jenkinsci/jenkins/commit/0ddf2d5be77072264845a5f4cf197d91d32e4695 was OK.

             

            By the way, I'm using Windows' "handle -s -p <jenkinsPID>" command to detect the file handles. The https://wiki.jenkins.io/display/JENKINS/File+Leak+Detector+Plugin does not show anything once the builds are over, but handle does show an increased file count long after the builds are over.

             

            UPDATE: Tested https://github.com/jenkinsci/jenkins/commit/a3ef5b6048d66e59e48455b48623e30c14be8df4  - OK

            and then the next https://github.com/jenkinsci/jenkins/commit/f0cd7ae8ff269dd738e3377a62f3fbebebf9aef6 - has the issue, so this commit introduces the leak

            danielbeck Daniel Beck added a comment -

            Stephen Connolly PTAL
            stephenconnolly Stephen Connolly added a comment -

            Carles Capdevila any chance you could try the attached patch against HEAD (it should be easy to apply to most versions) and see if that resolves the issue? It seems there may be some paths where the run's log stream does not get closed correctly.

            jenkins-45057.patch

            carlescapdevila Carles Capdevila added a comment - edited

            Tested against HEAD with the patch applied, and no luck. I also tested against 2.53 and 2.60.1 (both with the patch) and the same: the leak doesn't go away. Thank you very much for the effort nevertheless.

            EDIT: I'm reproducing the issue according to Jonas Jonsson's comment: https://issues.jenkins-ci.org/browse/JENKINS-45057?focusedCommentId=304877&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-304877 (but in my case I'm on Windows, so I check the file usage with the "handle" command). Could it be due to some kind of interaction with the Groovy Plugin?

            stephenconnolly Stephen Connolly added a comment -

            Carles Capdevila so one interesting thing is that most of the file handles look OK, except for the emr-termination-policy files.

            There are 406 file handles open of the type:

            java    8870 jenkins  991w   REG              252,0        2194 1332198 /data/jenkins/jobs/automation/jobs/emr-termination-policy/builds/.50086/log (deleted)

            So these are file handles open on a file that appears to be deleted!

            406 of them to be precise:

            $ grep "(deleted)" filesopen.txt | wc -l
                 406

            And all but two of them are emr-termination-policy 

            $ grep "(deleted)" filesopen.txt | grep emr-termination-policy | wc -l
                 404
            $ grep "(deleted)" filesopen.txt | grep optimus | wc -l
                   2

            When we look at the file handles, these are WRITE file handles, so the file handle has to be opened inside Run.execute()

            Just to confirm, these two jobs are Freestyle jobs and not Pipeline jobs?

             

            stephenconnolly Stephen Connolly added a comment -

            Hmmm... digging some more: the Run.delete() method renames the build directory from XXX to .XXX, so this looks very much like delete() is not waiting for the running build to complete.
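            A quick illustration of why lsof shows these handles as "(deleted)". This is a minimal java.io sketch, not Jenkins code; the class name is hypothetical. On POSIX systems, unlinking (or renaming away) a file whose write handle is still open succeeds immediately, but the open handle keeps the inode alive until close():

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class DeletedHandleDemo {
    /** Opens a log-like file for write, unlinks it, and keeps writing.
     *  On POSIX filesystems this is exactly the "(deleted)" state lsof
     *  reports: the directory entry is gone, but the open descriptor
     *  pins the inode until the stream is closed. */
    static boolean run() throws IOException {
        File f = File.createTempFile("build-log", null);
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write("build output\n".getBytes());
            boolean unlinked = f.delete();            // succeeds immediately on POSIX
            out.write("still writable\n".getBytes()); // handle is still valid
            return unlinked;
        } // only here is the inode actually released
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unlinked while open: " + run());
    }
}
```

            So a build log that is "discarded" while its write handle is still open continues to consume a file descriptor until something closes it.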

            stephenconnolly Stephen Connolly added a comment -

            Hmmm I wonder if emr-termination-policy has:

            • a short log rotation period,
            • runs almost continually
            • uses a Notifier that perhaps should be a Publisher?

            What Post-build actions do you have configured, Carles Capdevila?

            stephenconnolly Stephen Connolly added a comment -

            Bruno Bonacci, sorry, I just realized that you are the one with the emr-termination-policy job.

            stephenconnolly Stephen Connolly added a comment -

            So https://github.com/jenkinsci/jenkins/pull/2953 should fix the credentials-binding plugin issue in any case... though I think it should be part of the contract of plugins that annotate the console that they pass close() through, so in that sense the core change is defensive, and Jesse Glick should just merge https://github.com/jenkinsci/credentials-binding-plugin/pull/37.

            jonasatwork Jonas Jonsson added a comment -

            Hi, finally back after some holiday.

            Yes, the job is a freestyle job, just as I wrote. I took a quick look at the changes that were made in JENKINS-42934 and noticed that in a few places the close() calls on created files have been removed completely; they previously lived in finally {} blocks. The reason for the change is https://bugs.openjdk.java.net/browse/JDK-8080225, but I don't read that as meaning the calls to close() should be removed.

            stephenconnolly Stephen Connolly added a comment -

            Jonas Jonsson, can you give pointers to the cases where you believe a handle is escaping?

            stephenconnolly Stephen Connolly added a comment -

            Jonas Jonsson keep in mind that we moved from 

            InputStream is = ...;
            try {
              ...
            } finally {
              is.close();
            }

            to try-with-resources:

            try (InputStream is = ...) {
              ...
            }

            So expect those close calls to be handled by try-with-resources
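            To illustrate the point (a minimal, self-contained sketch; `CountingClose` is a hypothetical stand-in, not Jenkins code): try-with-resources invokes close() automatically when the block exits, so the JENKINS-42934 migration should not by itself drop any close() calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CloseDemo {
    /** Wrapper that counts close() calls and forwards them to the wrapped stream. */
    static class CountingClose extends FilterOutputStream {
        static int closed = 0;
        CountingClose(OutputStream out) { super(out); }
        @Override public void close() throws IOException {
            closed++;
            super.close(); // pass the close through to the wrapped stream
        }
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources guarantees close() even without an explicit finally
        try (OutputStream os = new CountingClose(new ByteArrayOutputStream())) {
            os.write('x');
        }
        System.out.println("closed=" + CountingClose.closed);
    }
}
```

            The same guarantee holds when the body throws: the resource is closed before the exception propagates, which is exactly what the old finally {} blocks provided.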

             

            jonasatwork Jonas Jonsson added a comment -

            Sorry, I'm not really up to date with Java...

            stephenconnolly Stephen Connolly added a comment -

            Ok, so the Groovy leak appears to be an issue with Stapler!!!

            Groovy is querying the properties and discovers the `getLogText()` method, which results in Stapler opening the read handle using a FileInputStream... which is then left pending finalization, at which point the file handle will eventually be released...

            IOW this is a replica of JENKINS-42934 only against Stapler... it being https://github.com/stapler/stapler/blob/3ac71dce264da052186956ef06b772a91ca74d5e/core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java#L457-L467 that is responsible for the leak!!!

            jglick Jesse Glick added a comment -

            Well I suppose the workaround is to use the less obtuse

            def jobName = build.parent.builds[0].envVars.JOB_NAME

            If you are using a sandboxed script, well DefaultGroovyMethods.getProperties(Object) is already blacklisted so you could not make this mistake to begin with.

            andreasmandel Andreas Mandel added a comment -

            Looks like we were hit by a different side effect of the identified change: every OutputStream returned from an instance of hudson.console.ConsoleLogFilter must close the wrapped OutputStream when close() is called. It may have been expected before as well, but with core 2.53 and later, missing it leads to a leak of file handles.

            This change fixed the issue for our plugin: https://github.com/SoftwareBuildService/log-file-filter-plugin/commit/c1148435a454aa5a3a72bab05c3a6996ea5f42f5
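            The contract Andreas describes can be sketched with plain java.io (hypothetical classes, not the real hudson.console.ConsoleLogFilter API): a decorator that overrides close() without delegating leaves the wrapped log stream open, which is exactly this handle leak; the fix is to pass close() through.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class FilterCloseContract {
    /** Innermost stream; records whether close() ever reached it. */
    static class Sink extends OutputStream {
        boolean closed = false;
        @Override public void write(int b) { /* discard */ }
        @Override public void close() { closed = true; }
    }

    /** Buggy decorator: overrides close() without delegating, leaking the wrapped stream. */
    static class LeakyFilter extends FilterOutputStream {
        LeakyFilter(OutputStream out) { super(out); }
        @Override public void close() { /* bug: forgot to close the wrapped stream */ }
    }

    /** Fixed decorator: close() passes through to the wrapped stream. */
    static class FixedFilter extends FilterOutputStream {
        FixedFilter(OutputStream out) { super(out); }
        @Override public void close() throws IOException {
            super.close(); // FilterOutputStream.close() delegates to the wrapped stream
        }
    }

    public static void main(String[] args) throws IOException {
        Sink leaked = new Sink();
        new LeakyFilter(leaked).close();
        Sink released = new Sink();
        new FixedFilter(released).close();
        System.out.println("leaky closed=" + leaked.closed
                + ", fixed closed=" + released.closed);
    }
}
```

            In the buggy case the innermost stream (the build's log file in Jenkins) never sees close(), so its descriptor stays open for the life of the JVM.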

            oleg_nenashev Oleg Nenashev added a comment - edited

            It should be solved by https://github.com/jenkinsci/jenkins/pull/2954 in 2.73.

            jglick Jesse Glick added a comment -

            Should be fixed, but needs verification if there is a reproducible test case.

            oleg_nenashev Oleg Nenashev added a comment -

            It didn't get into 2.60.3 since it was fixed/integrated too late. It will be a candidate for the next baseline

            oleg_nenashev Oleg Nenashev added a comment -

            As Jesse Glick says, Credentials Binding 1.13 containing the patch has been released, so the partial fix can be applied via the plugin update.

             

            stevenatcisco Steven Christenson added a comment -

            Above is the change in file handle usage after upgrading to CloudBees Jenkins Enterprise 2.60.2.2-rolling. Our workaround until the core version is released is to set ulimit -n very large and reboot at least weekly.

            If a better interim solution is known, we'd love to hear it.

            oleg_nenashev Oleg Nenashev added a comment - edited

            Steven Christenson, what is causing it? If it is the Credentials Binding plugin, you can just update it (see the linked issues). The Jenkins core change is a generic fix for all cases, but plugins can be patched on their own without bumping the core. You can use http://file-leak-detector.kohsuke.org/ to triage the root cause.

            In Jenkins the patch will be available in 2.73.1 LTS. Regarding CloudBees Jenkins Enterprise, please contact the vendor's support

            jglick Jesse Glick added a comment -

            I suppose lts-candidate can be removed given that this is already in 2.73.

            Oleg Nenashev the File Leak Detector plugin (better than the linked standalone tool) would not be helpful here, since we already know where the file handle is opened: when the build starts. The issue is why it is not closed, which will depend on which console-affecting plugins are activated during the build.

            danielbeck Daniel Beck added a comment -

            Right, the Stapler one is tracked in JENKINS-45903.

            stevenatcisco Steven Christenson added a comment -

            Oleg Nenashev: We tried using the File Leak Detector Plugin, but it would not run; apparently it requires Oracle Java, and we are using OpenJDK. The standalone kohsuke leak detector crashed our Jenkins instance when run; it too seems to require Oracle Java.

            Here is the job we are running hourly, and the results

            /* JOB TO PERIODICALLY CHECK FILE HANDLES */
            node('master') {
                sh '''rm -f lsof.txt
                lsof -u jenkins > lsof.txt
                cut -f 1 /proc/sys/fs/file-nr > filehandles.txt
                echo "$(cat filehandles.txt)=handles |" > numfiles.txt
                echo "$(wc -l < lsof.txt)=JenkLSOF |" >> numfiles.txt
                echo "$(grep -Fc \'(deleted)\' lsof.txt)=deleted " >> numfiles.txt
                cat numfiles.txt
                '''
                archiveArtifacts allowEmptyArchive: true, artifacts: '*.txt', caseSensitive: false
                result = readFile 'numfiles.txt'
                currentBuild.description = result
                fileHandlesInUse = readFile 'filehandles.txt'
                deleteDir()
            } // node

            /******* RESULTS *******/
            Aug 30, 2017 6:56 AM   9472=handles | 10554=JenkLSOF | 3621=deleted
            Aug 30, 2017 5:56 AM   9568=handles | 10654=JenkLSOF | 3557=deleted
            Aug 30, 2017 4:56 AM   9376=handles | 10521=JenkLSOF | 3524=deleted
            Aug 30, 2017 3:56 AM   9312=handles | 10417=JenkLSOF | 3462=deleted
            Aug 30, 2017 2:56 AM   9216=handles | 10358=JenkLSOF | 3401=deleted
            Aug 30, 2017 1:56 AM   9184=handles | 10276=JenkLSOF | 3338=deleted
            Aug 30, 2017 12:56 AM  9312=handles | 10406=JenkLSOF | 3303=deleted
            Aug 29, 2017 11:56 PM  9216=handles | 10338=JenkLSOF | 3236=deleted
            Aug 29, 2017 10:56 PM  9408=handles | 10423=JenkLSOF | 3198=deleted
            Aug 29, 2017 9:56 PM   8896=handles | 10042=JenkLSOF | 3137=deleted
            Aug 29, 2017 8:56 PM   9024=handles | 10138=JenkLSOF | 3098=deleted
            Aug 29, 2017 7:56 PM   9024=handles | 10243=JenkLSOF | 3028=deleted
            Aug 29, 2017 6:56 PM   8896=handles | 9948=JenkLSOF  | 2981=deleted
            Aug 29, 2017 5:56 PM   8768=handles | 9879=JenkLSOF  | 2913=deleted
            Aug 29, 2017 4:56 PM   8832=handles | 9879=JenkLSOF  | 2844=deleted
            Aug 29, 2017 3:56 PM   8608=handles | 9731=JenkLSOF  | 2773=deleted
            Aug 29, 2017 2:56 PM   8448=handles | 9587=JenkLSOF  | 2741=deleted
            Aug 29, 2017 1:56 PM   8384=handles | 9556=JenkLSOF  | 2681=deleted
            Aug 29, 2017 12:56 PM  8192=handles | 9452=JenkLSOF  | 2650=deleted
            Aug 29, 2017 11:56 AM  8096=handles | 9306=JenkLSOF  | 2590=deleted
            Aug 29, 2017 1:56 AM   8064=handles | 8921=JenkLSOF  | 2081=deleted

            The "deleted" items are all log entries like those described in the original incident. 

            NOTE: I have opened an incident under our support contract, but have posted details here in case they may help to diagnose the root cause.  Is there another tool we can use?  Or would the LSOF output over many hours be sufficient?

            stevenatcisco Steven Christenson added a comment -

            Here is confirmation that the upgrade resolved the leak... mostly.

            We noticed 6 file handle leaks in the last 48 hours; previously that would have been in the hundreds.

            oleg_nenashev Oleg Nenashev added a comment -

            Even 6 leaks is quite suspicious, but I'd guess we cannot do anything about it without the File Leak Detector.

            wheleph Volodymyr Sobotovych added a comment -

            Oleg Nenashev After upgrading to Jenkins 2.73.3 the issue became less severe, but we still have to restart our Jenkins instance once a week (on 2.60 it was once a day).

            Here's a summary of two lsof runs taken one day apart. The top files by number of open handles:

            Nov-17:

            100632 slave.log
            32294 log
            7685 timestamps
            4193 random
            3635 urandom

            Nov-18:

            708532 log
            297707 timestamps
            98280 slave.log
            90675 Common.groovy
            85995 BobHelper.groovy
            

            Does that give you more information to find the cause? Unfortunately it's a bit hard for me to provide the File Leak Detector plugin output because we use OpenJDK.


              People

              • Assignee:
                jglick Jesse Glick
              • Reporter:
                bbonacci Bruno Bonacci
              • Votes:
                13
              • Watchers:
                29

              Dates

              • Created:
              • Updated:
              • Resolved: