JENKINS-20750

Git plugin 2.0 sometimes fails to fetch (timeouts) with weird error

      Description

      Last week I started to experience some weird problems with Git fetch. As far as I remember, I did not update Jenkins or any of its plugins, and I'm not aware of any changes to the company's Git setup or network. All of a sudden, jobs started to fail while fetching. The problem is that it happens randomly: sometimes it works as before, sometimes it doesn't.

      Here is a typical error I'm getting:

      Started by timer
      Building on master in workspace C:\Documents and Settings\Tester\.jenkins\jobs\litebox3d_tiff_32_64\workspace
      Updating svn://krivan/Ranorex/trunk/xSpector at revision '2013-11-25T02:14:53.975 +0100'
      At revision 260
      no change for svn://krivan/Ranorex/trunk/xSpector since the previous build
      Fetching changes from the remote Git repository
      Fetching upstream changes from git@swserv:litebox3d
      ERROR: Timeout after 10 minutes
      FATAL: Failed to fetch from git@swserv:litebox3d
      hudson.plugins.git.GitException: Failed to fetch from git@swserv:litebox3d
      at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:612)
      at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:836)
      at hudson.plugins.git.GitSCM.checkout(GitSCM.java:861)
      at org.jenkinsci.plugins.multiplescms.MultiSCM.checkout(MultiSCM.java:117)
      at hudson.model.AbstractProject.checkout(AbstractProject.java:1412)
      at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:652)
      at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
      at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:557)
      at hudson.model.Run.execute(Run.java:1679)
      at hudson.matrix.MatrixBuild.run(MatrixBuild.java:304)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:230)
      at hudson.model.OneOffExecutor.run(OneOffExecutor.java:43)
      Caused by: hudson.plugins.git.GitException: Command "fetch -t git@swserv:litebox3d +refs/heads/:refs/remotes/origin/" returned status code -1:
      stdout:
      stderr: Could not create directory 'c/Documents and Settings/Tester/.ssh'.

      at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:981)
      at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:920)
      at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.fetch(CliGitAPIImpl.java:187)
      at hudson.plugins.git.GitAPI.fetch(GitAPI.java:229)
      at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:610)
      ... 12 more

      It does not always happen! Just from time to time, but frequently enough to break our build/test workflow.

      What's really weird is this error:
      stderr: Could not create directory 'c/Documents and Settings/Tester/.ssh'.
      Note the missing colon after the drive letter in the path. The path itself exists (when written with the colon), so there should be no reason for such an error, unless the missing colon is the cause. But then why does it not happen every time?

      I also noticed that after this failure it is impossible to manually delete the contents of the workspace/.git folder because some files are locked. Stopping Jenkins does not help; the machine must be restarted before the remaining files can be deleted.

      I'm aware of issue JENKINS-20445 (Git plugin timeout too small), and while it appears to apply to me as well, the root cause seems to be something else. Until last week, I never experienced such a timeout. Please don't hesitate to contact me with questions about our Jenkins/jobs setup.

          Activity

          markewaite Mark Waite added a comment -

          Since it seems the problem may be outside Jenkins and outside the git plugin, would you be willing to close this as "Not a Bug", or do you think there is still a reasonable chance this will be a bug in Jenkins or the git plugin?

          odklizec Pavel Kudrys added a comment -

          I agree with Michael that this problem has something to do with concurrent processes. I guess it is related to Git polling? The solution that worked for me was switching to Jenkins Credentials. Previously, I did not use credentials stored in Jenkins. Since I stored the Git credentials in Jenkins and selected them in the Git plugin, everything works. So far, I have seen only one fetch timeout with the same error as before, and sure enough, there were multiple instances of git running at the same time. Once I killed them, all operations returned to normal.

          In my opinion, there is something wrong in either Jenkins or the Git plugin. The million-dollar question is what could cause such behavior. In any case, switching to Jenkins Credentials seems to reduce the error rate to a minimum.
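
The check for several git instances running at once can be sketched as a small script run outside Jenkins. This is only an illustration, not something taken from this issue: it assumes a Windows host where `tasklist /FO CSV /NH` is available, and the helper names (`parse_tasklist_csv`, `find_git_processes`, `list_running_git_processes`) are hypothetical.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (image_name, pid) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        # Expected columns: image name, PID, session name, session number, memory usage.
        # Lines without a numeric PID column (for example INFO messages) are skipped.
        if len(row) >= 2 and row[1].strip().isdigit():
            rows.append((row[0], int(row[1])))
    return rows


def find_git_processes(text, names=("git.exe", "ssh.exe")):
    """Return the processes whose image name matches one of `names` (case-insensitive)."""
    wanted = {n.lower() for n in names}
    return [(name, pid) for name, pid in parse_tasklist_csv(text) if name.lower() in wanted]


def list_running_git_processes():
    """Run tasklist (Windows only) and return the git/ssh processes it reports."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_git_processes(out)
```

Calling `list_running_git_processes()` on the build machine while a fetch appears to hang should show whether more than one `git.exe` or `ssh.exe` is alive at that moment. Which of them is safe to kill is a judgment call that the script does not make.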

          vynce Michael Vincent added a comment -

          Based on my analysis so far, I'm certain this is an msys (or Windows?) bug. Jenkins' usage patterns are quite different from those of a typical developer using git manually, and I think that's what is causing the issue to show up.

          Using Jenkins credentials is an interesting idea! I can see how that would enable ssh to work even with a corrupt environment. Running other git commands from a build script might still run into issues with a corrupt environment though.

          robduff Rob Duff added a comment - - edited

          Having spent days trying to track this down with our team, I thought I'd post in an effort to help others when this occurs. I'll explain as best I can, but I didn't actually fix the problem, so I may not have everything dead-on.

          In our case, we had contention between two instances of git running at the same time through SSH. The first instance would run, and the second would somehow get blocked when reading the known_hosts file and ask you to authenticate, causing the plugin to just sit there until the timeout occurs.

          This may be of use: http://www.joedog.org/2012/07/ssh-disable-known_hosts-prompt/
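
A hang on an ssh prompt that nobody can answer can, in principle, be turned into a prompt failure by running ssh in batch mode. The sketch below is an assumption-laden illustration rather than a fix from this issue: it relies on the `GIT_SSH_COMMAND` environment variable (Git 2.3 or newer) and on OpenSSH's `BatchMode` option, and the function names are hypothetical. It does not address the underlying contention on known_hosts.

```python
import os
import subprocess


def batch_mode_env(base_env=None):
    """Return a copy of the environment in which git runs ssh with BatchMode=yes."""
    env = dict(os.environ if base_env is None else base_env)
    # BatchMode=yes disables interactive prompts (passwords, host key confirmation),
    # so ssh exits with an error instead of waiting for input.
    env["GIT_SSH_COMMAND"] = "ssh -o BatchMode=yes"
    return env


def fetch_non_interactive(repo_dir, remote="origin", timeout_seconds=60):
    """Run `git fetch` so that an ssh prompt becomes an error instead of a hang."""
    return subprocess.run(
        ["git", "fetch", "--tags", remote],
        cwd=repo_dir,
        env=batch_mode_env(),
        stdin=subprocess.DEVNULL,   # nothing is available to answer a prompt
        capture_output=True,
        text=True,
        timeout=timeout_seconds,    # raises subprocess.TimeoutExpired if exceeded
    )
```

Unlike switching host key checking off, batch mode leaves verification in place, so an unknown or changed host key shows up as a failed fetch with a message in stderr rather than as a silent ten-minute wait.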

          drierp Peter Drier added a comment -

          We're having a similar problem: 2008 server, Jenkins 1.560, git-client-plugin 1.8.0.

          Git polling hangs, with strange errors about not being able to create the ~/.ssh folder (the folder is there). Jenkins runs as the jenkins user on the server, not as the system account. No HOMEDRIVE or HOMEPATH environment variables are set.
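
To see which home-related variables the Jenkins process actually passes on, a small diagnostic along these lines may help. It is a sketch under assumptions: which of `HOME`, `HOMEDRIVE`/`HOMEPATH`, or `USERPROFILE` a given git or ssh build consults depends on its version, and the function names are hypothetical.

```python
import os


def ssh_dir_candidates(env):
    """Return (source, path) pairs for places where a .ssh folder might be looked up."""
    candidates = []
    if env.get("HOME"):
        candidates.append(("HOME", env["HOME"]))
    if env.get("HOMEDRIVE") and env.get("HOMEPATH"):
        candidates.append(("HOMEDRIVE+HOMEPATH", env["HOMEDRIVE"] + env["HOMEPATH"]))
    if env.get("USERPROFILE"):
        candidates.append(("USERPROFILE", env["USERPROFILE"]))
    return [(source, os.path.join(home, ".ssh")) for source, home in candidates]


def report(env=None):
    """Describe each candidate .ssh folder and whether it exists on disk."""
    env = os.environ if env is None else env
    lines = []
    for source, path in ssh_dir_candidates(env):
        state = "exists" if os.path.isdir(path) else "missing"
        lines.append(f"{source}: {path} ({state})")
    if not lines:
        lines.append("no home-related variable is set")
    return lines
```

Running `print("\n".join(report()))` from a build step, as opposed to an interactive login, shows the environment that git inherits from Jenkins, which may differ from what the same user sees on the desktop.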

          We use this script to interrupt SCM polling threads that have been running for more than 3 minutes, which seems to get things going again fairly reliably.

          // Interrupt SCM polling threads that have been running for more than 3 minutes.
          // Intended for the Jenkins script console.
          import jenkins.model.Jenkins

          long maxMillis = 1000L * 60 * 3  // 1000 millis in a second * 60 seconds in a minute * 3 minutes

          Jenkins.instance.getTrigger("SCMTrigger").getRunners().each { item ->
            println(item.getTarget().name)
            println(item.getDuration())
            println(item.getStartTime())
            long millis = System.currentTimeMillis() - item.getStartTime()

            if (millis > maxMillis) {
              // Find the polling thread that belongs to this job and interrupt it
              Thread.getAllStackTraces().keySet().each { tItem ->
                if (tItem.getName().contains("SCM polling") && tItem.getName().contains(item.getTarget().name)) {
                  println "Interrupting thread " + tItem.getName()
                  tItem.interrupt()
                }
              }
            }
          }
          

          It would be nice if we could set the SCM polling timeout separately from the general GIT one. (1 minute should always be sufficient)


            People

            • Assignee: ndeloof Nicolas De Loof
            • Reporter: odklizec Pavel Kudrys
            • Votes: 1
            • Watchers: 8
