  1. Jenkins
  2. JENKINS-54495

Better handling of GitHub Organization folder scan to avoid API quota

      Description

      With a large GitHub organization (hundreds of repos, each with hundreds of branches and tags), refreshing the whole organization is not possible, or takes ages, because the GitHub API quota is hit.

      This is particularly bad when trying to add a new repo: it can take days for the repo to show up, which is completely impractical.
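
To make the scale concrete, here is a rough back-of-the-envelope estimate. It assumes the standard authenticated REST limit of 5,000 requests per hour and, as a simplification, one API request per branch or tag. The number of calls the plugin really makes per ref is not stated in this issue, and the repo and ref counts below are illustrative only.

```python
def hours_to_scan(repos, refs_per_repo, requests_per_ref=1, hourly_quota=5000):
    """Estimate how many hours of API quota a full organization scan consumes.

    requests_per_ref is an assumption, not a measured value for the plugin.
    """
    total_requests = repos * refs_per_repo * requests_per_ref
    return total_requests / hourly_quota


# Illustrative numbers: 300 repos with 300 branches/tags each.
print(f"{hours_to_scan(repos=300, refs_per_repo=300):.1f} hours")  # 18.0 hours
```

Even under these optimistic assumptions a single full scan uses many hours of quota, so a newly added repo can wait a long time before it is discovered.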

      There are several solutions to this issue that I can think of:

      • Use GitHub GraphQL API to query the whole thing in one (or very few) request(s)
      • Make a "shallow scan" that only discovers repos. Each repo can then be refreshed separately, which would (1) allow new repos to be added quickly and (2) spread the refresh API bursts over time, making it less likely to hit the API quota
      • Add a separate function to only discover one repo specified by the user
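
As an illustration of the GraphQL idea, the sketch below builds a request body that asks for up to 100 repos per page, each with up to 100 branches and 100 tags, in a single query. The helper name `build_org_scan_request` is hypothetical and this is not how the plugin currently works. The GraphQL API has its own point-based rate limit, so the actual saving would need to be measured.

```python
ORG_SCAN_QUERY = """
query($login: String!, $cursor: String) {
  organization(login: $login) {
    repositories(first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes {
        name
        isArchived
        branches: refs(refPrefix: "refs/heads/", first: 100) {
          totalCount
          nodes { name }
        }
        tags: refs(refPrefix: "refs/tags/", first: 100) {
          totalCount
          nodes { name }
        }
      }
    }
  }
}
"""


def build_org_scan_request(org, cursor=None):
    """Return the JSON body for one page of the organization scan.

    Pass the previous page's endCursor as `cursor` to fetch the next page.
    """
    return {"query": ORG_SCAN_QUERY, "variables": {"login": org, "cursor": cursor}}


# "example-org" is a placeholder organization name.
body = build_org_scan_request("example-org")
```

The body would be sent as a POST to `https://api.github.com/graphql` with an authorization token. Repos with more than 100 branches or tags would still need follow-up paginated queries.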

          Activity

          bitwiseman Liam Newman added a comment -

          All of these seem like interesting options.

          Targeted scan - This would involve some work with Jelly and the Jenkins UI, but it might be easier to implement because of the narrow target.
          Shallow scan - At the very least, this could be a breadth-first scan that then requests scans of the child repos over time.
          GraphQL - This option would be a massive undertaking, but it might be manageable if you switched only the top-level repo scan, or some other targeted scenario, to GraphQL.

          They're all viable in different ways. Perhaps you could file a separate issue for each one and then work on them separately?
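
A minimal sketch of the breadth-first idea, assuming the top-level scan only returns repo names and the per-repo scans are queued afterwards. The function names and the `scans_per_hour` budget are hypothetical, not existing plugin settings.

```python
def order_for_scanning(discovered, already_known):
    """Put newly discovered repos first so they appear quickly."""
    new = [repo for repo in discovered if repo not in already_known]
    old = [repo for repo in discovered if repo in already_known]
    return new + old


def plan_child_scans(repos, scans_per_hour):
    """Spread per-repo scans evenly over time.

    Returns a list of (start_offset_in_seconds, repo) pairs.
    """
    if scans_per_hour <= 0:
        raise ValueError("scans_per_hour must be positive")
    spacing = 3600 / scans_per_hour
    return [(round(index * spacing), repo) for index, repo in enumerate(repos)]


# Illustrative repo names: "b" is new, "a" and "c" were already known.
queue = order_for_scanning(["a", "b", "c"], already_known={"a", "c"})
print(plan_child_scans(queue, scans_per_hour=4))
# [(0, 'b'), (900, 'a'), (1800, 'c')]
```

Spacing the child scans this way trades freshness of existing repos for a steady request rate, which is the point of the shallow scan proposal.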

          kivagant Eugene G added a comment -

          This is very important. If an organization has tons of old repos and more than 50 repos with Jenkinsfiles it becomes crazy to maintain the list of repos in this small filter field.


            People

            • Assignee:
              Unassigned
              Reporter:
              lucasocio Leandro Lucarella
            • Votes:
              4
              Watchers:
              6
