  1. Jenkins
  2. JENKINS-56544

failFast option for parallel stages sets build status to ABORTED when failure is inside of a stage with an agent

    Details

    • Released As:
      pipeline-model-definition 1.3.7

      Description

      Symptom

      With a Pipeline that uses parallel stages with failFast true, when a stage with an agent fails, the final build result is ABORTED instead of FAILURE.

      A similar bug, JENKINS-55459, was recently fixed. That fix corrected the build result for non-nested parallel stages with failFast true, but it does not seem to cover the case where a nested stage inside one of the parallel stages fails.

      Evidence

      The fix for JENKINS-55459 was delivered in version 1.3.5 of the Pipeline: Declarative Plugin; I am testing version 1.3.6.

      I started up a brand new Jenkins LTS 2.150.3 instance with the 'recommended' plugins, including Pipeline: Declarative Plugin version 1.3.6.

      I ran the test case from JENKINS-55459 and the build status is correctly marked as failed, so the fix from JENKINS-55459 works for that case.

      When I run the attached Jenkinsfile, the build still shows as ABORTED:

      ...
      ERROR: script returned exit code 1
      Finished: ABORTED
      

      Here is the full log: log

      I expected the build status to be marked as FAILURE, not ABORTED.

      Hypothesis

      I believe the fix from JENKINS-55459 works, but it does not account for failures inside stages with agents within the parallel stages.
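
      The attached Jenkinsfile is not reproduced here. As a rough sketch of the shape described above (a failFast parallel block in which one branch has an agent and contains a failing nested stage), something like the following could be used. The stage names, the sh 'exit 1' step, and the sleep duration are assumptions for illustration, not the reporter's actual pipeline.

      ```groovy
      // Hypothetical sketch, not the attached Jenkinsfile.
      pipeline {
          agent none
          stages {
              stage('Parallel') {
                  failFast true
                  parallel {
                      stage('Branch A') {
                          agent any
                          stages {
                              // Nested stage that fails; a non-zero exit from sh
                              // fails the step with "script returned exit code 1".
                              stage('Nested failing stage') {
                                  steps {
                                      sh 'exit 1'
                                  }
                              }
                          }
                      }
                      stage('Branch B') {
                          agent any
                          steps {
                              // Long enough to still be running when Branch A fails.
                              sleep 10
                          }
                      }
                  }
              }
          }
      }
      ```

      The expected outcome for a pipeline of this shape is Finished: FAILURE.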

        Attachments

        1. Jenkinsfile
          2 kB
        2. log
          2 kB
        3. pipeline__Jenkins__and_Mozilla_Firefox.png
          100 kB
        4. test.png
          48 kB

            Activity

            dnusbaum Devin Nusbaum added a comment -

            I spent some time trying to reproduce this today but was not able to. Here is what I came up with as a base. I experimented with various modifications to nesting and post stages, and swapped out error for a failing sh step, but everything seemed to work fine.

            I do think there could be some issues with certain kinds of agents based on LabelScript.groovy, DockerPipelineFromDockerfileScript.groovy (1, 2, and 3), and DockerPipelineScript.groovy (1 and 2), but I don't think those places would matter for the Jenkinsfile you posted (unless maybe LabelScript is being used to run the builds on your machine and that try/catch block is being triggered).

            One thing that is strange to me is that in your logs we see that the post conditions for failure and aborted both ran:

            [Pipeline] { (Declarative: Post Actions)
            [Pipeline] echo
             **** Pipeline ALWAYS ****
            [Pipeline] echo
             **** Pipeline Aborted **** 
            [Pipeline] echo
             **** Pipeline FAILURE **** 
            

            There is probably something wrong in Failure.meetsCondition that is causing those two conditions to overlap in some cases. Maybe the new errorResult computation needs to be moved before this line, so that the line can check whether the errorResult matches Aborted specifically rather than just checking whether error is null. Either way, I think this is just a symptom of the build result being aborted in the first place, so it might not matter in practice once we figure out the other issue.
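
            For context, log output like the excerpt above would come from a pipeline-level post section along these lines. This is a sketch inferred from the log, not the attached Jenkinsfile: the echo strings are copied from the log, while the placeholder stage and the overall layout are assumptions.

            ```groovy
            // Hypothetical post section; only the echo strings come from the log.
            pipeline {
                agent none
                stages {
                    stage('placeholder') {
                        steps {
                            echo 'stages omitted'
                        }
                    }
                }
                post {
                    always {
                        echo ' **** Pipeline ALWAYS ****'
                    }
                    aborted {
                        // Expected to run only when the build result is ABORTED.
                        echo ' **** Pipeline Aborted **** '
                    }
                    failure {
                        // Expected to run only when the build result is FAILURE.
                        echo ' **** Pipeline FAILURE **** '
                    }
                }
            }
            ```

            A build ends with a single result, so aborted and failure would normally not both run in the same build, which is why seeing both messages stands out.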

            dnusbaum Devin Nusbaum added a comment -

            OK, I was able to reproduce the issue. Here is the minimal reproduction case:

            pipeline {
                agent none
                stages {
                    stage("foo") {
                        failFast true
                        parallel {
                            stage("first") {
                                steps {
                                    error "First branch"
                                }
                            }
                            stage("second") {
                                agent any
                                steps {
                                    sleep 10
                                    echo "Second branch"
                                }
                            }
                        }
                    }
                }
            }
            

            Note that this reproduction does not have any nested stages. I think the root of the issue is the agent any in the branch that gets terminated early: as noted in my previous comment, LabelScript sets the build result explicitly.
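
            As a control, and assuming the diagnosis above is right, the same pipeline with the agent any line removed from the second branch would be expected to finish as FAILURE on versions that include the JENKINS-55459 fix. This variant is an illustration derived from the reproduction case, not a result reported in this ticket.

            ```groovy
            // Control variant (assumption): identical to the reproduction case,
            // except that the "second" branch has no agent of its own.
            pipeline {
                agent none
                stages {
                    stage("foo") {
                        failFast true
                        parallel {
                            stage("first") {
                                steps {
                                    error "First branch"
                                }
                            }
                            stage("second") {
                                steps {
                                    sleep 10
                                    echo "Second branch"
                                }
                            }
                        }
                    }
                }
            }
            ```

            Comparing the final result of the two variants on the same plugin version is one way to check whether the agent is the deciding factor.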

            rkivisto Ray Kivisto added a comment -

            I'm not sure why you were not able to reproduce with my original testcase. I just tried again: I started a brand new Jenkins LTS 2.164.1 instance (the latest LTS as of today), installed the "recommended plugins", and ran the attached Jenkinsfile without any changes to it or to the Jenkins configuration (meaning the `agent any` blocks ran on the master). The end of the build log shows:

            Finished: ABORTED

             

            rkivisto Ray Kivisto added a comment -

            I should also mention that your new reduced testcase reproduces the issue as well, with the build finishing as:

            Finished: ABORTED

            dnusbaum Devin Nusbaum added a comment -

            "I'm not sure why you were not able to reproduce with my original testcase"

            I probably didn't clean my plugin work directory or something and so was running code that didn't match 1.3.6, and then immediately started trying to simplify and got rid of the key piece that causes the issue. I think your reproduction should work fine, thanks for coming up with it!

            dnusbaum Devin Nusbaum added a comment -

            Filed https://github.com/jenkinsci/pipeline-model-definition-plugin/pull/322, which I think should fix the problem, and updated the ticket title/description with what seems like the crux of the issue: the fix in JENKINS-55459 didn't work for any stage with an agent.

            abayer Andrew Bayer added a comment -

            Merged, releasing as 1.3.7 right now.

            rkivisto Ray Kivisto added a comment -

            Verified fixed in Pipeline: Declarative version 1.3.7, thanks!


              People

              • Assignee:
                dnusbaum Devin Nusbaum
                Reporter:
                rkivisto Ray Kivisto