Jenkins / JENKINS-34201

Pipeline plugin can't handle large numbers of parallel build jobs

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Critical
    • Resolution: Duplicate
    • Component/s: pipeline
    • Labels:
      None
    • Environment:
      Jenkins 1.656 on RHEL7 with Pipeline plugin 2.0
      Description

      Consider the following snippet:

      stage name: 'foo', concurrency: 10
      foo = [:]
      foo['failFast'] = true
      
      for (int i = 0; i < 25; i++) {
          for (int j = 0; j < 4; j++) {
              foo["branch${i}-${j}"] = {
                  node {
                      build job: 'job1', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                          [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                          [ $class: 'StringParameterValue', name: 'baz', value: 'z' ]
                      ], quietPeriod: 0
                  }
      
                  node {
                      build job: 'job2', parameters: [
                          [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                          [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                          [ $class: 'StringParameterValue', name: 'baz', value: 'z' ]
                      ], quietPeriod: 0
                  }
              }
          }
      
          parallel foo
      }
      

      It starts to build the requested jobs just fine.

      INFO: job1 #125 main build action completed: SUCCES
      INFO: job1 #127 main build action completed: SUCCES
      INFO: job1 #126 main build action completed: SUCCES
      INFO: job2 #124 main build action completed: SUCCES
      ...

      However, once all of the jobs are done, Pipeline seems unable to merge the results and gets stuck. After 24h it is still hanging, seemingly waiting for a parallel job to finish.

      When I remove the outermost for loop, things run just fine.

      for (int j = 0; j < 4; j++) {
              foo["branch${j}"] = {
                      ...
              }
      }
      

      Removing the inner for loop and increasing the outer loop to 100 results in foo[] being too big for Jenkins to handle. The same happens without the for loops, obviously, which led me to start using them.

      for (int i = 0; i < 100; i++) {
              foo["branch${i}"] = {
                      ...
              }
      }
      

      There's probably a better way to handle this.
      Any pointers on how to get there?

            Activity

            tomdevylder Tom De Vylder created issue -
            jglick Jesse Glick added a comment -

            Why are you running build inside node? That makes no sense—just wasting an executor slot for no reason.

            Part of the issue could be the insufficiently unique build parameters. If you schedule a build when a queue item for that job already exists with a given set of parameters, the attempt to reschedule will simply be ignored. As of JENKINS-28063 that should not cause a hang, though.

            Reproducible from scratch somehow?
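
The two suggestions in this comment could be sketched roughly as follows. This is an illustrative sketch only, not a verified fix for the hang. It assumes the intent was 25 sequential batches of 4 parallel branches, and it introduces a hypothetical `branchId` string parameter that `job1` and `job2` would have to declare so that each scheduled build has a distinct parameter set.

```groovy
// Sketch under stated assumptions; 'branchId' is a hypothetical parameter
// that is not part of the original report.
for (int i = 0; i < 25; i++) {
    // Fresh map per batch, so each parallel call only contains this batch's branches.
    def branches = [:]
    branches['failFast'] = true

    for (int j = 0; j < 4; j++) {
        // Copy the loop counters into a per-iteration String so each closure
        // sees its own value rather than the final value of i and j.
        String branchId = "${i}-${j}"

        branches['branch' + branchId] = {
            // No node block: the build step only triggers and waits for the
            // downstream build, so it does not need an executor of its own.
            build job: 'job1', parameters: [
                [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                [ $class: 'StringParameterValue', name: 'baz', value: 'z' ],
                // Distinct value per branch, so a queued item with identical
                // parameters does not cause this request to be ignored.
                [ $class: 'StringParameterValue', name: 'branchId', value: branchId ]
            ], quietPeriod: 0

            build job: 'job2', parameters: [
                [ $class: 'StringParameterValue', name: 'foo', value: 'f' ],
                [ $class: 'StringParameterValue', name: 'bar', value: 'b' ],
                [ $class: 'StringParameterValue', name: 'baz', value: 'z' ],
                [ $class: 'StringParameterValue', name: 'branchId', value: branchId ]
            ], quietPeriod: 0
        }
    }

    // Run this batch and wait for it before starting the next one.
    parallel branches
}
```

Whether batching like this avoids the reported hang depends on the underlying bug tracked in JENKINS-28063; the sketch only removes the unneeded executor usage and the duplicate parameter sets mentioned above.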

            jglick Jesse Glick made changes -
            Link This issue is related to JENKINS-28063 [ JENKINS-28063 ]
            jglick Jesse Glick made changes -
            Labels pipeline
            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 170256 ] JNJira + In-Review [ 183826 ]
            abayer Andrew Bayer made changes -
            Component/s pipeline-general [ 21692 ]
            abayer Andrew Bayer made changes -
            Component/s workflow-plugin [ 18820 ]
            jglick Jesse Glick added a comment -

            Probably due to a known bug already fixed.

            jglick Jesse Glick made changes -
            Link This issue duplicates JENKINS-28063 [ JENKINS-28063 ]
            jglick Jesse Glick made changes -
            Status Open [ 1 ] Resolved [ 5 ]
            Resolution Duplicate [ 3 ]
            jglick Jesse Glick made changes -
            Link This issue is related to JENKINS-28063 [ JENKINS-28063 ]
            jonasschneider Jonas Schneider added a comment -

            We're seeing the same behavior with a roughly similar pipeline, running Docker image `jenkinsci/blueocean:1.0.1` (Jenkins ver. 2.46.1) for master and slave. Abridged pipeline script, thread dump and process dump at:

            https://gist.github.com/jonasschneider/ed81faffdd96d3e541cb6f487871029a

            After the `docker run` finishes, Jenkins for some reason does not reap the `docker` processes, as can be seen in the ps output. This is a blocker, since it hangs our ~10-minute builds for multiple hours on end. We've only seen it appear under some amount of load, that is, when multiple builds of the same job are running.

            Is there any way to better debug what's going on here?

             

            jonasschneider Jonas Schneider added a comment -

            (see other comment)

            jonasschneider Jonas Schneider made changes -
            Resolution Duplicate [ 3 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            jonasschneider Jonas Schneider made changes -
            Priority Minor [ 4 ] Critical [ 2 ]
            jonasschneider Jonas Schneider added a comment -

            Sorry, I meant to reopen JENKINS-37730. Closing this again.

            jonasschneider Jonas Schneider made changes -
            Status Reopened [ 4 ] Resolved [ 5 ]
            Resolution Duplicate [ 3 ]

              People

              • Assignee: jglick Jesse Glick
              • Reporter: tomdevylder Tom De Vylder
              • Votes: 0
              • Watchers: 3
