JENKINS-33412

Jenkins locks when started in HTTPS mode on a host with 37+ processors

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Component/s: winstone-jetty
    • Environment:
      Jenkins 1.652
      org.jenkins-ci:winstone 2.9
      Tested on Linux JDK 1.7 and 1.8 as well as Solaris JDK 1.7 and 1.8 (both OpenJDK and Oracle JDK). Reproduces on Ubuntu, Debian, CentOS and SmartOS.

      Description

      Summary
      With Winstone 2.9 (i.e. the embedded Jetty wrapper) or earlier, Jenkins will not run in HTTPS mode on hosts with 37 or more cores/processors. The problem reproduces regardless of the JDK or the operating system.

      Reproduction
      The easiest way to reproduce the error is to use qemu to virtualize a system with 37 or more cores. You can do that with the -smp <cores> parameter. For example, for testing I ran:

      qemu-system-x86_64 -hda ubuntu.img -m 4096 -smp 48
      

      Once you have a VM with 37 or more cores set up, install Jenkins 1.652 and configure it to use HTTPS. Start it and try to connect to either the HTTP or the HTTPS port. The connection times out on both ports, and the server stays effectively locked until you send it a SIGTERM. Please refer to the attached log file to see its startup sequence.
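
The "connection will time out" symptom can be checked from a second shell with a small probe. This is only a sketch: the class name HangProbe, the method responds, the host, and the 5 second timeout are all illustrative assumptions, and port 8080 is assumed to be the plain HTTP port configured for the test instance. It sends a plaintext request, so it is meant for the HTTP port only.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HangProbe {

    /**
     * Returns true if the server sends back at least one byte within
     * timeoutMillis. A refused connection, a connect timeout, or a read
     * timeout (the hang described above) all count as "not responding".
     */
    static boolean responds(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // bound the wait for the first response byte
            String request = "GET / HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
            socket.getOutputStream().write(request.getBytes(StandardCharsets.US_ASCII));
            socket.getOutputStream().flush();
            return socket.getInputStream().read() != -1;
        } catch (IOException e) {
            // SocketTimeoutException is an IOException, so timeouts land here too.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed values: adjust the host and port to the instance under test.
        boolean ok = responds("localhost", 8080, 5000);
        System.out.println(ok ? "server responded" : "no response (possible hang)");
    }
}
```

On an affected host the probe would be expected to report no response once Jenkins has been started in HTTPS mode.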

      Why is this important?
      You may ask: who is running Jenkins on a server that big? With containerization technologies (e.g. Docker) taking center stage, we are seeing more and more deployments where no VM is involved, so a container gets a slice of CPU but can see all of the processors on the system. The official Docker image of Jenkins suffers from this defect. A user can of course set up their own reverse proxy to run TLS through, but that adds unneeded complexity for users looking to containerize their Jenkins environment.
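
To see whether a given host or container falls in the affected range, it can help to print the processor count that the JVM itself reports. The sketch below uses the standard Runtime.availableProcessors() call; the class name ProcessorCheck, the method atRisk, and the constant REPORTED_THRESHOLD are illustrative names, and the value 37 is simply the threshold observed in this report, not a constant taken from Winstone or Jetty.

```java
public class ProcessorCheck {

    // Threshold observed in this issue report (an assumption for illustration,
    // not a value read from Winstone or Jetty).
    static final int REPORTED_THRESHOLD = 37;

    /** Returns true if the given processor count is in the range reported as affected. */
    static boolean atRisk(int processors) {
        return processors >= REPORTED_THRESHOLD;
    }

    public static void main(String[] args) {
        // The number of processors visible to this JVM.
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("JVM sees " + processors + " processors");
        if (atRisk(processors)) {
            System.out.println("At or above the reported threshold of "
                    + REPORTED_THRESHOLD + "; HTTPS mode may hang with Winstone 2.9 or earlier");
        }
    }
}
```

Running this inside the same container as Jenkins shows the count that the embedded server would see there.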

      Solution
      I've narrowed down the surface area for root cause analysis. I removed the Jenkins WAR entirely from the Winstone runner (https://github.com/jenkinsci/winstone) and ran a simple hello-world WAR instead. With HTTPS enabled, the issue still reproduced, which points at Winstone rather than at Jenkins itself. It took a lot of fiddling to determine that the hang starts at exactly 37 cores.

      Lastly, I tried the exact same reproduction steps with winstone-3.1. Luckily, the upgraded embedded Jetty in version 3.1 resolves the issue.

      Can we upgrade the next Jenkins release to use the winstone-3.1 component?

      This would be the easiest and the best fix. I would be happy to contribute to any efforts that would allow for us to get this into a release.

            Activity

            skeenan Shaun Keenan added a comment -

            Another ping - when will this make it into LTS?

            danielbeck Daniel Beck added a comment -

            As this was not nominated as an LTS candidate, it will not be in 2.138.3 next week. I expect it'll be in the next LTS 2.1xx.1 scheduled for December 5.

            For reference https://jenkins.io/download/lts/#backporting-process

            skeenan Shaun Keenan added a comment -

            thank you!

            mikescholze Mike Scholze added a comment -

            It is fixed with 2.138.2!

            https://jenkins.io/changelog-stable/

            Update Winstone-Jetty from 4.4 to 5.0 to fix HTTP/2 support and threading problems on hosts with 30+ cores. (issue 53239, issue 52804, issue 51136, issue 52358)

            danielbeck Daniel Beck added a comment -

            Sorry about that. Because the duplicates of this issue were never collapsed into one, only one of them got the label and the others did not.


              People

              • Assignee: Olivier Lamy (olamy)
              • Reporter: Elijah Zupancic (elijah)
              • Votes: 0
              • Watchers: 12
