JENKINS-41320

Concurrent git fetches cause OOM

      Description

      When multiple jobs trigger git fetches at the same time, our system runs out of memory (OOM) and dies, leaving the workspaces broken because the git lockfile left behind by a dead git process is still present.
      We have "# of executors" for the node set to 1 to avoid this sort of problem. That does stop the actual jobs from running in parallel, but Jenkins will still perform git fetches in parallel, and there doesn't seem to be any way to stop it from doing so (does anyone know a workaround?).

      This was triggered recently when the GitHub API went down and then came back up, causing all of our open pull requests to be rebuilt at the same time. That ran ~10 git fetches on large repos, which caused the whole instance to die and need a hard reboot.

          Activity

          markewaite Mark Waite added a comment -

          I see this same condition frequently at startup of my lts-with-plugins docker instance. The instance includes job definitions for repositories which have a bug verification job on each branch. When the instance starts, a series of parallel git operations are started which overload the master node.

          I don't know of a work-around for the problem, and suspect that it will require some form of coordination to share polling results for a single repository. Stephen Connolly or Jesse Glick may have better suggestions for a technique.

          tom_artomatix Tom Mason added a comment -

          I'd be happy to have it just block git operations until all other git processes have finished. I would wrap git in a shell script that used flock to do that, but I suspect that would cause Jenkins to think it timed out.
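
The flock idea above could look roughly like the following. This is a minimal sketch, not something tested against Jenkins: it is written in Python rather than shell, and the lock path, the real git path, and the names `run_serialized` and `main` are all hypothetical. It assumes a POSIX system where `fcntl.flock` is available.

```python
#!/usr/bin/env python3
"""Hypothetical git wrapper: let only one git process run at a time."""
import fcntl
import os
import subprocess
import sys

# Assumed locations, not taken from the issue; adjust for the node in question.
LOCK_PATH = os.environ.get("GIT_WRAPPER_LOCK", "/tmp/git-wrapper.lock")
REAL_GIT = os.environ.get("GIT_WRAPPER_REAL_GIT", "/usr/bin/git")


def run_serialized(argv, lock_path=LOCK_PATH):
    """Run argv while holding an exclusive flock on lock_path.

    Returns the exit code of the command.
    """
    with open(lock_path, "w") as lock_file:
        # Blocks here until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            return subprocess.call(argv)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def main(args):
    """Forward all arguments to the real git binary, one invocation at a time."""
    return run_serialized([REAL_GIT] + list(args))
```

To use it as a wrapper script, the file would end with `sys.exit(main(sys.argv[1:]))` and be placed ahead of the real git wherever Jenkins looks for it. The kernel drops an flock when the holding process exits, so a wrapper killed by the OOM killer should not leave this lock stuck. The timeout concern still applies: time spent waiting for the lock counts against whatever timeout the caller puts on the git command.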

          jglick Jesse Glick added a comment -

          No idea, without knowing (a) what is actually triggering the fetch in the observed case, (b) why even a bunch of concurrent fetches would require so much heap as to trigger an OOME.

          tom_artomatix Tom Mason added a comment -

          The GitHub pull request builder plugin causes the large number of jobs to run. A bunch of concurrent fetches on large repos causes an OOM... because git uses a lot of memory when fetching large repos.


            People

            • Assignee:
              Unassigned
              Reporter:
              tom_artomatix Tom Mason