Details

      Description

      We are seeing hung git processes that appear to be left over from Jenkins using Git to poll for changes. This is likely caused by an I/O problem (disk or network) during the polling attempt. Instead of hanging forever, the fetch should eventually time out.

            Activity

            recampbell Ryan Campbell added a comment -

            Perhaps the easiest approach is to set a global timeout for all Git operations (say 30 minutes) which could be overridden by a system property. This can be enforced in GitAPI.launchCommandIn() by using ProcStarter.joinWithTimeout() instead of just join().

            Instead of a global default for all commands, you could perhaps create an overloaded version of launchCommandIn which accepts a timeout, defaulting to Integer.MAX_VALUE. It looks like this method is already overloaded, so this may be messy.

            Or perhaps it is best to just perform the fetch in a Future/Task thread which could be joined with a timeout. It seems like polling is the one place where a timeout is required, since a build can/should time out through other mechanisms.
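
The Future-based option mentioned above could look roughly like the following sketch. It uses only standard java.util.concurrent classes, not the Jenkins Proc API, and the names TimeoutSketch and callWithTimeout are hypothetical, introduced here for illustration only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Jenkins or the git plugin.
final class TimeoutSketch {

    /**
     * Runs the task on a worker thread and waits at most the given time.
     * On timeout the worker thread is interrupted and TimeoutException is rethrown.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit)
            throws ExecutionException, InterruptedException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Interrupt the worker; this does not by itself kill a child process.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A task that finishes well within the limit.
        System.out.println(callWithTimeout(() -> "fetched", 1, TimeUnit.SECONDS));
    }
}
```

Interrupting the worker thread only stops the Java side. If the task had launched an external git process, that process would still need to be destroyed explicitly, which is presumably why a timeout on the process handle itself is the other option discussed here.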

            karol_depka Karol Depka Pradzinski added a comment - - edited

            I've started working on this bug. Some questions:
            1. But what if the git process is really doing something that takes longer than the timeout, e.g. pulling some big changes?
            1.1. Maybe another approach would be to limit the number of running/hanging git processes?
            2. Would my changes still be useful if this setting is not configurable at first?

            karol_depka Karol Depka Pradzinski added a comment -

            I don't see hudson.Launcher.ProcStarter.joinWithTimeout() ... Is this supposed to be written or is this already present in another branch?

            ndeloof Nicolas De Loof added a comment -

            ProcStarter.start() returns a Proc, which has this joinWithTimeout() method.

            recampbell Ryan Campbell added a comment -

            Answers:

            1. There should be some global timeout which you can tune with system properties. Default to 1 hour or so?
            1.1 While this may be a good idea, it still wouldn't solve polling which takes forever; it would just stop future polling.
            2. Yes, I think so. You can set it high enough that it won't bother people; it will just handle strange conditions.

            tfoote Tully Foote added a comment -

            Q3. At some point it does need to time out. We've found hung polling threads which were 2 weeks old, dating from when our git repo had some downtime.

            And until this is released, does anyone have a suggested way to clear these hung polling threads? I ended up rebooting the slaves which was obviously overkill, but did clear the jobs.

            c_kirschner c Kirschner added a comment -

            One year and four months later, this is still an issue. On our Jenkins, git processes regularly (two or more times a week) stall or hang forever. It's quite annoying when these fill all SCM slots and our entire build server does nothing.
            To get it running again I have to kill all git and git-remote-https processes.

            olivier Olivier Jolit added a comment -

            Same problem here: after each temporary network failure we have to restart our Jenkins installations. The assignee apparently has had no activity since October 2011, so I guess it's not "In Progress" anymore.

            karol_depka Karol Depka Pradzinski added a comment -

            Hi Guys. Due to time constraints, I was not able to fix this bug. Earlier I forgot to change the status and un-assign myself.

            brondsem Dave Brondsema added a comment -

            Is this fix in the 2.0 release or something else? I'm not seeing it mentioned in the changelog.

            joelmgallant Joel Gallant added a comment - - edited

            I noticed it mentioned as a system property in 1.4.6 - (org.jenkinsci.plugins.gitclient.Git.timeout).

            I tried setting this at launch, and it shows up in the environment variables, but it doesn't appear to do anything: my timeout is still at 10 minutes...

            joelmgallant Joel Gallant added a comment -

            Ah! Now I see, it's in the 1.4.x maintenance branch!

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Nicolas De Loof
            Path:
            src/main/java/org/jenkinsci/plugins/gitclient/CliGitAPIImpl.java
            http://jenkins-ci.org/commit/git-client-plugin/1b7fd2b18d626d8ca081933d8a004fd7b2279210
            Log:
            JENKINS-11286 run git commands with a time-out

            drdt Don Ross added a comment -

            Why, oh why, was this implemented without an option to disable it?

            sc1478 Steve Cohen added a comment - - edited

            Agree with Don Ross.

            I have a valid, albeit rare, use case for running without a timeout. Given a large job with git LFS objects that change infrequently, it might take 5 or 6 hours for the original download. Being able to specify no timeout would be a good thing here. Is there some value (say, -1) that means don't time out? Once the initial LFS content is checked out, timeouts are again reasonable.

            markewaite Mark Waite added a comment -

            Steve Cohen there is no value which means "don't time out".

            I don't see much difference for your use case between a very large number and no timeout. If you set the timeout to 1000000 minutes, it seems unlikely that you'll reach the timeout before either interrupting the job, restarting Jenkins, or rebooting the computer.
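
To put the suggested value in perspective, the arithmetic can be checked with a small sketch using java.time.Duration. The class and method names are hypothetical and exist only for this illustration.

```java
import java.time.Duration;

// Hypothetical helper for checking the size of a timeout given in minutes.
final class LargeTimeoutSketch {

    /** Converts a timeout in minutes to whole days, discarding any remainder. */
    static long minutesToWholeDays(long minutes) {
        return Duration.ofMinutes(minutes).toDays();
    }

    public static void main(String[] args) {
        // The value suggested in the comment above.
        System.out.println(minutesToWholeDays(1_000_000) + " days");
    }
}
```

One million minutes is about 694 days, a little under two years, so it is far beyond the 5 or 6 hours (300 to 360 minutes) mentioned for the initial LFS download.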

            sc1478 Steve Cohen added a comment - - edited

            True enough Mark Waite.  However, see JENKINS-47616, wherein I request making this parameterizable, since with Git-LFS the initial checkout is much slower than subsequent ones.

            Parameterization is key, since otherwise you must keep messing around with configurations.

            patrick_i Patrick B added a comment -

            Hey guys,

            In which file can I find this option? `org.jenkinsci.plugins.gitclient.Git.timeOut`

            I searched all the administrative areas and could not find it.
            -> Yes, I can add this to each project as additional configuration.

            But I want it set globally for all projects, so that people don't have to set it up every time.

            (On Windows, I checked config.xml, settings.xml, plugins-settings.xml and everything else.)

            markewaite Mark Waite added a comment -

            Patrick B if your users need to increase the timeout generally, then I think you may have missed an opportunity to help your users with faster clones. Refer to the Jenkins World 2017 15 minute talk on "git in the large" (slides).

            For example, the time needed to clone a large git repository can be significantly reduced with a reference repository, with a narrow refspec, or with a shallow clone.

            Even with the best of techniques, there still may be times when you choose to attempt to adjust the global git client plugin timeout value. That requires a change of the command line parameters used to start Jenkins. There is no user interface support for global adjustment of the git client plugin timeout value.

            In general, the Java process which starts Jenkins needs the argument -Dorg.jenkinsci.plugins.gitclient.Git.timeOut=12345

            Refer to my docker_run.py script as one example of a way to pass that argument to the java command which starts Jenkins. If your Jenkins starts from an init script on Ubuntu or Debian, you may be able to adjust command line arguments in /etc/default/jenkins. If your Jenkins starts from an init script on Red Hat, CentOS, OpenSUSE, or SUSE, you may be able to adjust command line arguments in /etc/sysconfig/jenkins. If your Jenkins is a service on Windows, I believe there is a configuration file that can be changed to add arguments to the Java command line which starts Jenkins.
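
For readers wondering how a -D argument like the one above reaches Java code, here is a minimal sketch of reading such a property with a fallback. This is not the plugin's actual source: the class name is hypothetical, and the default of 10 minutes is an assumption based on the 10-minute timeout reported earlier in this thread.

```java
// Hypothetical sketch of reading a JVM system property with a default.
final class GitTimeoutPropertySketch {

    // Property name as given in this thread.
    static final String PROPERTY = "org.jenkinsci.plugins.gitclient.Git.timeOut";

    // Assumed default, taken from the 10-minute timeout reported above.
    static final int ASSUMED_DEFAULT_MINUTES = 10;

    /** Returns the configured timeout in minutes, or the default if unset or not an integer. */
    static int timeoutMinutes() {
        return Integer.getInteger(PROPERTY, ASSUMED_DEFAULT_MINUTES);
    }

    public static void main(String[] args) {
        // Run with: java -Dorg.jenkinsci.plugins.gitclient.Git.timeOut=60 GitTimeoutPropertySketch.java
        System.out.println("timeout=" + timeoutMinutes());
    }
}
```

Java system property names are case-sensitive, so in a lookup like this a key ending in `Git.timeout` would be treated as a different property from one ending in `Git.timeOut` and the default would apply.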

            patrick_i Patrick B added a comment -

            @Mark Waite: Thank you very much.
            I understand that it is more professional to fix the problem in general instead of using larger timeouts.
            So I will check your slides in the next few weeks, but for the moment I have added this argument.

            For any other Windows users:
            C:\Program Files (x86)\Jenkins\jenkins.xml

            In that file, scroll down to service and arguments. There you can add this:

             -Dorg.jenkinsci.plugins.gitclient.Git.timeOut=60

            Then your log file shows # timeout=60 and everything is fine.


              People

              • Assignee:
                ndeloof Nicolas De Loof
                Reporter:
                recampbell Ryan Campbell
              • Votes:
                6
                Watchers:
                15
