  1. Infrastructure
  2. INFRA-376

Don't crawl bintray to compute packer and groovy catalogs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: etc
    • Labels:
      None

      Description

      At the end of August, the catalog used to download packer binaries broke because the recorded URL was something like

      http://dl.bintray.com/mitchellh/packer/#packer_VERSION_linux_amd64.zip

      instead of

      http://dl.bintray.com/mitchellh/packer/%packer_VERSION_linux_amd64.zip

      It is now broken again because the URL is

      http://dl.bintray.com/mitchellh/packer/:packer_VERSION_linux_amd64.zip

      Bintray regularly makes this change (# -> % -> : -> ...) to deter web crawlers.
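
      A minimal sketch of why a scraper tied to one marker character stops working when the marker changes. The HTML fragment, the regular expression, and the function names are hypothetical and are not taken from the actual crawler scripts; only the file name and the three marker characters come from this issue.

```python
import re
from typing import List

# Hypothetical scraper pattern: it hard-codes '#' as the marker that
# precedes the file name in the listing page.
HARDCODED_MARKER = re.compile(r'href="#(packer_[^"]+\.zip)"')


def scrape_file_names(html: str) -> List[str]:
    """Return the file names matched by the hard-coded pattern."""
    return HARDCODED_MARKER.findall(html)


def listing(marker: str) -> str:
    """Build a made-up one-link listing page that uses the given marker."""
    return f'<a href="{marker}packer_VERSION_linux_amd64.zip">packer</a>'


if __name__ == "__main__":
    # Only the '#' listing yields a file name; '%' and ':' yield nothing,
    # so a catalog built from this scraper would silently become empty.
    for marker in "#%:":
        print(repr(marker), scrape_file_names(listing(marker)))
```

      Loosening the pattern to accept any marker would only postpone the problem, which is why the description goes on to recommend the API.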

      After discussion with the JFrog team, the solution is to stop scraping HTML and use the Bintray APIs instead.

      Three tool installer catalogs use Bintray nowadays: SBT, groovy and packer.

      Only the SBT script uses the Bintray APIs.
      The groovy and packer scripts use HTML scraping and should be updated to use the Bintray APIs so that no URL pattern is hard-coded.
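
      A rough sketch of the API-based approach, under stated assumptions: it takes an already-parsed file listing (as an API call might return it) and builds catalog entries without inspecting any HTML. The field names "path" and "version", the {"list": [...]} catalog shape with "id"/"name"/"url", the function name, and the sample data are all assumptions made for illustration; the network call itself is left out. The download host is the one from the URLs quoted above.

```python
from typing import Dict, List
from urllib.parse import quote

# Download host as it appears in the URLs quoted in this issue.
DOWNLOAD_ROOT = "http://dl.bintray.com"


def catalog_from_files(subject: str, repo: str,
                       files: List[Dict[str, str]],
                       suffix: str) -> Dict[str, List[Dict[str, str]]]:
    """Build catalog entries from a parsed file listing.

    Assumption: each item in `files` carries a "path" and a "version" key.
    Only files whose path ends with `suffix` are kept.
    """
    entries = []
    for item in files:
        path = item["path"]
        if not path.endswith(suffix):
            continue
        entries.append({
            "id": item["version"],
            "name": f"{repo} {item['version']}",
            # The URL is derived from the reported path, not from page markup.
            "url": f"{DOWNLOAD_ROOT}/{subject}/{repo}/{quote(path)}",
        })
    return {"list": entries}


# Made-up sample listing; the second entry exists only to show filtering.
SAMPLE_FILES = [
    {"path": "packer_VERSION_linux_amd64.zip", "version": "VERSION"},
    {"path": "packer_VERSION_SHA256SUMS", "version": "VERSION"},
]

if __name__ == "__main__":
    print(catalog_from_files("mitchellh", "packer", SAMPLE_FILES,
                             "_linux_amd64.zip"))
```

      Because the URL is assembled from the path the service reports, a change of marker character in the HTML listing would not affect it.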

      Note that this won't fully solve the inability to download packer (and groovy) installers, because:

      • Our catalog is updated every 4 hours, so there may be a window during which URLs in the catalog are broken because they changed on the Bintray side.
      • AFAIK users cannot force a catalog update on a Jenkins instance. It happens when Jenkins is restarted (and maybe periodically?), so they have to wait for a restart (or a scheduled update) to retrieve the fixed version of a catalog.


          Activity

          aheritier Arnaud Héritier added a comment -

          Yes Daniel Beck, it's still not a perfect solution, but I don't see a better one, at least in the short term. The only alternative is to drop the catalog and have the plugin get the available versions directly from Bintray. Data would be up to date (even with a small cache), but it would add many requests to Bintray.

          danielbeck Daniel Beck added a comment -

          Yoann Dubreuil So the URLs do not change randomly when retrieved like this?

          ydubreuil Yoann Dubreuil added a comment -

          Daniel Beck No, the URL will always be valid; that is the purpose of the API.

          rtyler R. Tyler Croy added a comment -

          Is this ticket still being worked on, or should we try to find somebody else?

          aheritier Arnaud Héritier added a comment -

          R. Tyler Croy the PR was merged. I think this is good to be closed (cc Yoann Dubreuil Daniel Beck)


            People

            • Assignee:
              ydubreuil Yoann Dubreuil
              Reporter:
              aheritier Arnaud Héritier
            • Votes:
              0
              Watchers:
              3
