Jenkins / JENKINS-32479

multiple git repositories sometimes fail to checkout some of them into subdirectory

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Labels:
      None
    • Environment:
      Jenkins 1.643 Linux
      Multiple SCMs Plugin 0.5
      Git Plugin 2.5.0-beta3

      Description

      I have yet to find a reliable reproduction, and I am not sure whether the bug is in the Multiple SCMs plugin or in the Git plugin.

      I have 3 repositories: one checked out at the root of the workspace, and two checked out into the subdirectories src/A and src/B.

      Sometimes src/A or src/B fails to be checked out, even though the log looks perfectly normal and says it cloned and checked them out.

      Sometimes re-running the job simply fixes the issue.

      A sample of the log that I see when this issue occurs:

      Started by upstream project "ixgbe/gerrit_compat" build number 17
      originally caused by:
      Manually triggered by user jekeller for Gerrit: https://git-amr-1.devtools.intel.com/gerrit/74271
      Building remotely on rhel5sp11-60c709d8 (rhel5sp11 swarm linux rhel5) in workspace /home/jenkins/workspace/ixgbe/gerrit_compat/build/PROJECT/label/rhel5
      Cloning the remote Git repository
      Cloning repository ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat
      > git init /home/jenkins/workspace/ixgbe/gerrit_compat/build/PROJECT/label/rhel5/src/COMPAT # timeout=10
      Fetching upstream changes from ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat
      > git --version # timeout=10
      using GIT_SSH to set credentials ND Linux CI Server
      > git fetch --tags --progress ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat +refs/heads/:refs/remotes/origin/
      > git config remote.origin.url ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat # timeout=10
      > git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
      > git config remote.origin.url ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat # timeout=10
      Fetching upstream changes from ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat
      using GIT_SSH to set credentials ND Linux CI Server
      > git fetch --tags --progress ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-compat refs/changes/71/74271/1
      Checking out Revision 73f1f63fb1f4c5da8030d46b3cdf762cefc28e10 (master)
      > git config core.sparsecheckout # timeout=10
      > git checkout -f 73f1f63fb1f4c5da8030d46b3cdf762cefc28e10
      First time build. Skipping changelog.
      Cloning the remote Git repository
      Cloning repository ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe
      > git init /home/jenkins/workspace/ixgbe/gerrit_compat/build/PROJECT/label/rhel5/src/SHARED # timeout=10
      Fetching upstream changes from ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe
      > git --version # timeout=10
      using GIT_SSH to set credentials ND Linux CI Server
      > git fetch --tags --progress ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe +refs/heads/:refs/remotes/origin/
      > git config remote.origin.url ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe # timeout=10
      > git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
      > git config remote.origin.url ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe # timeout=10
      Fetching upstream changes from ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe
      using GIT_SSH to set credentials ND Linux CI Server
      > git fetch --tags --progress ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_shared-ixgbe +refs/heads/:refs/remotes/origin/
      Checking out Revision c0443ba24e527146ce7e8a331047aff366aaa578 (refs/remotes/origin/master)
      > git config core.sparsecheckout # timeout=10
      > git checkout -f c0443ba24e527146ce7e8a331047aff366aaa578
      First time build. Skipping changelog.
      Cloning the remote Git repository
      Cloning repository ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe
      > git init /home/jenkins/workspace/ixgbe/gerrit_compat/build/PROJECT/label/rhel5 # timeout=10
      Fetching upstream changes from ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe
      > git --version # timeout=10
      using GIT_SSH to set credentials ND Linux CI Server
      > git fetch --tags --progress ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe +refs/heads/:refs/remotes/origin/
      > git config remote.origin.url ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe # timeout=10
      > git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
      > git config remote.origin.url ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe # timeout=10
      Fetching upstream changes from ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe
      using GIT_SSH to set credentials ND Linux CI Server
      > git fetch --tags --progress ssh://jbrandeb-host.jf.intel.com/home/jenkins/mirrors/nd_linux-ixgbe +refs/heads/:refs/remotes/origin/
      Checking out Revision 08bb35e104c5b440e20933cec05f1ed91abcd6b0 (refs/remotes/origin/master)
      > git config core.sparsecheckout # timeout=10
      > git checkout -f 08bb35e104c5b440e20933cec05f1ed91abcd6b0
      First time build. Skipping changelog.
      [rhel5] $ /bin/bash -el /tmp/hudson1549523813705894193.sh
      + set -o pipefail
      ++ echo ixgbe/gerrit_compat/build=PROJECT,label=rhel5
      ++ sed s/label=//
      ++ sed s/BUILD_TYPE=//
      ++ tr /, -
      + project=ixgbe-gerrit_compat-build=PROJECT-rhel5
      ++ echo ixgbe-gerrit_compat-build=PROJECT-rhel5-17
      ++ tr '[:upper:]' '[:lower:]'
      + name=ixgbe-gerrit_compat-build=project-rhel5-17
      + logfile=ixgbe-gerrit_compat-build=project-rhel5-17.log
      + [[ PROJECT = ESX* ]]
      + which sparse
      which: no sparse in (/usr/kerberos/bin:/sbin:/usr/sbin:/bin:/usr/bin:/home/jenkins/bin)
      + C=0
      + CFLAGS='-Wno-missing-field-initializers -Wno-aggregate-return'
      + CHECKIDS_FLAGS=
      + W=1
      + make -C src -f build.mk TEST_BUILD=YES USE_ISYSTEM=YES CHECKIDS_FLAGS= BUILD=PROJECT
      + tee ixgbe-gerrit_compat-build=project-rhel5-17.log
      make: Entering directory `/home/jenkins/workspace/ixgbe/gerrit_compat/build/PROJECT/label/rhel5/src'
      build.mk:61: *** Cannot find COMPAT/compat.inc for kcompat flags. Stop.
      make: Leaving directory `/home/jenkins/workspace/ixgbe/gerrit_compat/build/PROJECT/label/rhel5/src'
      Build step 'Execute shell' marked build as failure
      [WARNINGS] Skipping publisher since build result is FAILURE
      Archiving artifacts
      Finished: FAILURE

      As you can see, 3 repositories are "checked out" according to the log, but the workspace for this job appears to include only the nd_linux-ixgbe project, and not the nd_shared-ixgbe or nd_linux-compat projects.

      This results in a compilation failure due to inability to locate the COMPAT files.
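
      One way to catch this condition before the build step runs is to verify that every expected checkout directory exists and is non-empty. The following is a minimal Python sketch under assumptions: the function name `missing_checkouts` is hypothetical, and the relative paths passed in (for example `src/COMPAT` and `src/SHARED`, as seen in the log above) must be adapted to the job.

```python
from pathlib import Path


def missing_checkouts(workspace, expected):
    """Return the expected relative paths that are absent or empty.

    `workspace` is the job workspace root; `expected` is a list of
    relative directories that the SCM step should have populated.
    """
    root = Path(workspace)
    missing = []
    for rel in expected:
        path = root / rel
        # A directory that was wiped after checkout is either gone or empty.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing
```

      Calling this at the start of the shell step and failing early when the returned list is non-empty would at least make the failure explicit instead of surfacing later as a make error.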

      What are the next steps I should take to help debug this issue?

          Activity

          jekeller Jacob Keller added a comment -

          This issue can occur if the contents of the submodule directory somehow get deleted, for example due to a retry. A submodule update will not recover them unless the submodule changes location and is configured to perform a checkout. One solution would be to perform a hard reset after the checkout completes. I don't believe this is currently done for submodules, even if the clean option is used.

          It seems that the primary cause is that we somehow delete the contents of the submodule when we perform an update. I am not sure how best to handle this yet. Somehow we end up deleting all the files, and then a future update does not correctly reset the working tree.

          jekeller Jacob Keller added a comment -

          The variant of this caused by a submodule failure is fixed by https://github.com/jenkinsci/git-client-plugin/pull/200

          I am still investigating exactly what causes failures in the case with just multiple work trees checked out to subdirectories.

          jekeller Jacob Keller added a comment -

          I believe, but have not been able to prove, that this is related to git-clean issues.

          djdevin Devin Zuczek added a comment -

          We have this issue as well: projects that were configured with "check out to a subdirectory" sometimes don't.

          We're also using Multi-SCM.

          jekeller Jacob Keller added a comment -

          Any chance you happen to have a smaller recreation we could use for testing? I'd like to try and sort out exactly what's happening here, as it recently happened again. I believe it may only occur when a fresh clone is initiated.

          Which version of command-line Git are you using, and which versions of the Git plugin and the Multiple SCMs plugin?

          jekeller Jacob Keller added a comment - - edited

          I believe the root cause is how the Git plugin handles clones: it uses "git init" + "git fetch" + "git checkout", which causes it to accidentally remove the other repositories. I am investigating this now.

          jekeller Jacob Keller added a comment -

          I found the root cause of the issue: at line 453 in the Git client plugin, it runs a recursive "delete workspace" command, so whether the Multiple SCM plugin fails depends entirely on the ordering of the repositories.

          I don't even think the options are necessary to use, since a checkout -f will overwrite any files to begin with, and the only other option is implementing our own "clean" setup which would be more expensive than just relying on the clean extension.
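
          The ordering dependence described above can be reproduced with a small simulation. This is only a sketch in Python, not the plugin's code: `simulated_clone` is a hypothetical stand-in that recursively deletes its target directory before populating it, and the marker file names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def simulated_clone(workspace, subdir, marker):
    """Mimic a clone that wipes its target directory before populating it."""
    target = Path(workspace) / subdir
    # Stand-in for the recursive "delete workspace" call (assumption).
    shutil.rmtree(target, ignore_errors=True)
    target.mkdir(parents=True)
    (target / marker).write_text("checked out\n")


def run_checkouts(order):
    """Run the simulated clones in order and list the files that survive."""
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp) / "workspace"
        for subdir, marker in order:
            simulated_clone(ws, subdir, marker)
        return sorted(
            p.relative_to(ws).as_posix() for p in ws.rglob("*") if p.is_file()
        )


# Same order as in the log: both subdirectories first, workspace root last.
bad = run_checkouts(
    [("src/COMPAT", "compat.inc"), ("src/SHARED", "shared.txt"), ("", "Makefile")]
)
# Workspace root first, subdirectories afterwards.
good = run_checkouts(
    [("", "Makefile"), ("src/COMPAT", "compat.inc"), ("src/SHARED", "shared.txt")]
)
```

          With the root cloned last, `bad` contains only the root repository's file, which matches the symptom in the report; with the root cloned first, `good` contains all three.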

          jekeller Jacob Keller added a comment -

          So the recursive delete call was originally added since "git clone" does not let you clone into a non-empty directory. The root of this problem is how MultiSCM orders its checkouts: there is no (obvious) way to tell which repositories belong in which directories from within the MultiSCM plugin, as this is part of the Git plugin settings.

          In addition, we can't really depend on a specific ordering from within the MultiSCM plugin, and we can't reliably remove the restriction that the work space be empty.

          I believe the problem is that MultiSCM should be made aware of the "checkout to separate directory" option, and indeed should actually implement it itself, since it is a very common or obvious problem when using multiple repositories. That would avoid the issues here without having to remove the constraint (which probably affects other SCMs as well!)

          jekeller Jacob Keller added a comment -

          An alternative would be to implement getModuleRoots for the Git Plugin to return the actual root when Git is using the extension, and then order checkouts based on this. I am thinking this is simpler and will try to implement this.
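
          The ordering idea can be sketched as follows. This is an illustration in Python rather than plugin code: it assumes each SCM can report its checkout directory relative to the workspace (the role getModuleRoots would play), and the function name `order_checkouts` is hypothetical.

```python
from pathlib import PurePosixPath


def order_checkouts(module_roots):
    """Sort relative checkout directories so parents come before children.

    A parent path always has fewer components than any path beneath it,
    so sorting by component count guarantees the workspace root ("") is
    checked out before any subdirectory.
    """
    def depth(root):
        return len(PurePosixPath(root).parts)

    return sorted(module_roots, key=lambda root: (depth(root), root))


ordered = order_checkouts(["src/COMPAT", "src/SHARED", ""])
```

          Here `ordered` places the workspace root first, so a checkout that wipes its target directory can no longer remove a subdirectory that was populated earlier.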

          jekeller Jacob Keller added a comment -

          Can we get some input from developers of Git plugin and Multiple-SCM plugin?

          The issue stems from the ordering of repositories, since the Git plugin's "clone" operation performs a full delete of the current workspace. I have a possible solution: make the Git plugin report its actual subdirectory using the getModuleRoot() interface, and then have Multiple-SCM "sort"/"order" the repositories so that it avoids checking out into a subdirectory first.

          Another option would be for Multiple SCM to implement "checkout to a subdirectory" itself instead of relying on the plugins to do so. I believe this would be preferred since it would be much more universal and easier to get right, whereas my fix requires changes to both the Git plugin and the Multiple SCM plugin.

          Finally, it may be possible to avoid removing the working directory when implementing clone() in the Git plugin. However, this requires continued use of the init+fetch setup, which currently breaks other uses such as Git-LFS and should be reverted to clone. There does not appear to be a good way to make git-clone allow cloning into a non-empty directory.

          tewing Terry Ewing added a comment -

          I think this issue is the same as https://issues.jenkins-ci.org/browse/JENKINS-27668.

          rodrigc Craig Rodrigues added a comment -

          I suggest that you migrate to the Pipeline plugin, which offers a supported way of checking out from multiple SCMs:
          https://wiki.jenkins-ci.org/display/JENKINS/Pipeline+Plugin

          jekeller Jacob Keller added a comment -

          Unfortunately this would still result in the same problem, because the git plugin wipes out the workspace when cloning, so it is still possible to have the issue occur. It is easier to avoid in the pipeline as long as the pipeline steps are ordered correctly.

          The user just has to know that the Git plugin will wipe out the workspace path, and must therefore ensure that the repositories are ordered correctly. In general, for pipeline this is obvious, but it is not obvious for the multi-scm plugin.


            People

            • Assignee:
              rodrigc Craig Rodrigues
              Reporter:
              jekeller Jacob Keller
            • Votes:
              0
              Watchers:
              4
