Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-34212

Visualization shows stage failed when an exception is caught

    Details

    • Similar Issues:

      Description

      I have a stage with a try/catch block. It attempts to copy an artifact from a previous build. If it can't, the step throws an exception which is caught and handled. In the catch block it copies it from a different location (successfully) and my pipeline continues. However, in visualization, the stage is marked as failed. I would expect that regardless of exceptions thrown, if the stage completes successfully it should be green in visualization.

      This is the error that is thrown/caught:

      hudson.AbortException: Unable to find project for artifact copy: null
      This may be due to incorrect project name or permission settings; see help for project name in job configuration.
      

        Attachments

          Issue Links

            Activity

            Hide
            svanoort Sam Van Oort added a comment -

            Hi Michael,
            Please, can you provide a sample pipeline that reproduces this issue? I suspect this is a result of limitations in how the pipeline is currently scanned for errors (which should be addressed by https://issues.jenkins-ci.org/browse/JENKINS-34038 since the implementation is being completely replaced with a more robust algorithm). However, I would like to confirm this and have a test case I can use to verify.

            Show
            svanoort Sam Van Oort added a comment - Hi Michael, Please, can you provide a sample pipeline that reproduces this issue? I suspect this is a result of limitations in how the pipeline is currently scanned for errors (which should be addressed by https://issues.jenkins-ci.org/browse/JENKINS-34038 since the implementation is being completely replaced with a more robust algorithm). However, I would like to confirm this and have a test case I can use to verify.
            Hide
            mscharp Michael Scharp added a comment - - edited

            Hi Sam,

            This issue is related to JENKINS-33840 which has an example use case that produces the failed stage, even though the error is caught.

            Here is that test case:

            node() {
                stage "fail"
                try {
                    sh "false"
                }
                catch(err) {
                    stage "cleanup"
                    sh "true"
                }
                stage "done"
                echo "done"
            }
            
            Show
            mscharp Michael Scharp added a comment - - edited Hi Sam, This issue is related to JENKINS-33840 which has an example use case that produces the failed stage, even though the error is caught. Here is that test case: node() { stage "fail" try { sh " false " } catch (err) { stage "cleanup" sh " true " } stage "done" echo "done" }
            Hide
            svanoort Sam Van Oort added a comment -

            Michael Scharp I've got the root cause, unfortunately it's deeper than it would seem (not visualization). The issue is that an ErrorAction is attached to the executed shell step even though the failure is caught.

            For stages, we scan to see if the stage has any nodes with errors on them, and if so we mark the stage as failed: https://github.com/jenkinsci/pipeline-stage-view-plugin/blob/master/rest-api/src/main/java/com/cloudbees/workflow/rest/external/StageNodeExt.java#L133-L135

            Now, we could look at steps following this to see if it's safe to ignore the error. But I think that probably the better fix here lies in the actual pipeline execution (and ensuring we have some way to see in the flowgraph if a failure within a step has been caught in another block).

            Show
            svanoort Sam Van Oort added a comment - Michael Scharp I've got the root cause, unfortunately it's deeper than it would seem (not visualization). The issue is that an ErrorAction is attached to the executed shell step even though the failure is caught. For stages, we scan to see if the stage has any nodes with errors on them, and if so we mark the stage as failed: https://github.com/jenkinsci/pipeline-stage-view-plugin/blob/master/rest-api/src/main/java/com/cloudbees/workflow/rest/external/StageNodeExt.java#L133-L135 Now, we could look at steps following this to see if it's safe to ignore the error. But I think that probably the better fix here lies in the actual pipeline execution (and ensuring we have some way to see in the flowgraph if a failure within a step has been caught in another block).
            Hide
            jglick Jesse Glick added a comment -

            This is clearly a bug, in that we are able to tell from the flow graph whether or not an error was caught within a stage.

            But there is also an RFE to allow the script to indicate that a stage should be considered “failed” in some sense despite having caught and handled the exception, and proceeded to the end of the build or subsequent stages. See JENKINS-26522.

            Show
            jglick Jesse Glick added a comment - This is clearly a bug, in that we are able to tell from the flow graph whether or not an error was caught within a stage. But there is also an RFE to allow the script to indicate that a stage should be considered “failed” in some sense despite having caught and handled the exception, and proceeded to the end of the build or subsequent stages. See JENKINS-26522 .
            Hide
            svanoort Sam Van Oort added a comment -

            Jesse Glick Yes, clearly it is a bug that the step shows as errored at the FlowNode level when the error was caught; this represents an undeniable mismatch between the script handling and the DAG representation of the execution. We may assume that the error was ignored somehow if the next node executed successfully, but that is hacky and I think the assumption is likely to bite us in the future.

            JENKINS-26522 is one way to handle this - I do like the notion of custom step annotations by attaching an Action to the flownode, but disagree that making it a distinct step is the right course (for cases like this, the best handling is for the execution to attach appropriate information automatically).

            My ideal would be either a new StatusAction that takes an arbitrary string (with some standard ones defined), or better yet, a StatusCode field on the FlowNode that does the same (with deserialization logic handling the format before this). Normally I'd suggest an enum, but this use allows definition of custom states specific to plugins. Alternately, status could be an enum with a 'SPECIAL' status option and the ability to attach an action with more status details to support custom states for plugin use.

            Show
            svanoort Sam Van Oort added a comment - Jesse Glick Yes, clearly it is a bug that the step shows as errored at the FlowNode level when the error was caught; this represents an undeniable mismatch between the script handling and the DAG representation of the execution. We may assume that the error was ignored somehow if the next node executed successfully, but that is hacky and I think the assumption is likely to bite us in the future. JENKINS-26522 is one way to handle this - I do like the notion of custom step annotations by attaching an Action to the flownode, but disagree that making it a distinct step is the right course (for cases like this, the best handling is for the execution to attach appropriate information automatically). My ideal would be either a new StatusAction that takes an arbitrary string (with some standard ones defined), or better yet, a StatusCode field on the FlowNode that does the same (with deserialization logic handling the format before this). Normally I'd suggest an enum, but this use allows definition of custom states specific to plugins. Alternately, status could be an enum with a 'SPECIAL' status option and the ability to attach an action with more status details to support custom states for plugin use.
            Hide
            svanoort Sam Van Oort added a comment -

            Error handling behavior would want to mimic freestyle builds

            Show
            svanoort Sam Van Oort added a comment - Error handling behavior would want to mimic freestyle builds
            Hide
            svanoort Sam Van Oort added a comment -

            Jesse Glick I realized there's a major wrinkle in the workaround here: if you're using block-scoped-steps within the try block, the BlockEndNode will always get an ErrorAction. So, you can't just look at the following step after the one generating the error, you need to have full awareness of the block structure and look for the next step following the block containing the error. Doable but 100% requires JENKINS-34038 & block-scoped analysis to work, so not trivial.

            The other flipside is that you won't know if the error was caught until the first step in the catch block is generated and completes successfully. I really think we really need to annotate the flownodes with caught errors with an appropriate action (at execution time), or make try/catch an actual step.

            Show
            svanoort Sam Van Oort added a comment - Jesse Glick I realized there's a major wrinkle in the workaround here: if you're using block-scoped-steps within the try block, the BlockEndNode will always get an ErrorAction. So, you can't just look at the following step after the one generating the error, you need to have full awareness of the block structure and look for the next step following the block containing the error. Doable but 100% requires JENKINS-34038 & block-scoped analysis to work, so not trivial. The other flipside is that you won't know if the error was caught until the first step in the catch block is generated and completes successfully. I really think we really need to annotate the flownodes with caught errors with an appropriate action (at execution time), or make try/catch an actual step.
            Hide
            svanoort Sam Van Oort added a comment -

            Resolved via a dirty hack – but we should really offer an option to annotate the build with information in the event of errors.

            Show
            svanoort Sam Van Oort added a comment - Resolved via a dirty hack – but we should really offer an option to annotate the build with information in the event of errors.
            Hide
            svanoort Sam Van Oort added a comment -

            Tested, reviewed and merged - goes live with next release.

            Show
            svanoort Sam Van Oort added a comment - Tested, reviewed and merged - goes live with next release.
            Hide
            svanoort Sam Van Oort added a comment -

            Michael Scharp Released with v 1.6

            Show
            svanoort Sam Van Oort added a comment - Michael Scharp Released with v 1.6
            Hide
            clockman Piotr Zegar added a comment -

            Thank you for taking out the only way to set Success/Fail status to individual stages.
            I will stick to 1.5 until JENKINS-26522 will be implemented.

            The only thing I wanted is to run bunch of tasks from one job and see with one succeed and with one failed.

            Show
            clockman Piotr Zegar added a comment - Thank you for taking out the only way to set Success/Fail status to individual stages. I will stick to 1.5 until JENKINS-26522 will be implemented. The only thing I wanted is to run bunch of tasks from one job and see with one succeed and with one failed.
            Hide
            svanoort Sam Van Oort added a comment -

            Piotr Rogoża I'm sorry to hear that this fix broke your implementation. There are a several appropriate ways to accomplish what you're trying to do, which will be future-proof:

            • Use try/catch, and in the case of a caught error, collect a list of failures to report at the end (setting currentBuild.result to UNSTABLE at that time, so you know that there was partial failure).
            • Use parallel blocks without failfast for each case (where the components can execute concurrently). This will reduce overall runtime, as well as isolating each case. You will get an overall pass/fail/unstable for the stage containing the parallel.
            • Use subjobs for components which should have their own pass/fail statuses to report (but where you want to allow the pipeline to be able to continue).

            Any one of these will let you do what you want in one fashion or another. You are, of course, welcome to use a legacy plugin version and wait for JENKINS-26522 if you like. In general though, it isn't a great idea to make an implementation dependent on an obvious bug. While we try not to break users, responsible maintainers do have a duty to prioritize fixes for the general use case over an individual user's custom hack.

            Show
            svanoort Sam Van Oort added a comment - Piotr Rogoża I'm sorry to hear that this fix broke your implementation. There are a several appropriate ways to accomplish what you're trying to do, which will be future-proof: Use try/catch, and in the case of a caught error, collect a list of failures to report at the end (setting currentBuild.result to UNSTABLE at that time, so you know that there was partial failure). Use parallel blocks without failfast for each case (where the components can execute concurrently). This will reduce overall runtime, as well as isolating each case. You will get an overall pass/fail/unstable for the stage containing the parallel. Use subjobs for components which should have their own pass/fail statuses to report (but where you want to allow the pipeline to be able to continue). Any one of these will let you do what you want in one fashion or another. You are, of course, welcome to use a legacy plugin version and wait for JENKINS-26522 if you like. In general though, it isn't a great idea to make an implementation dependent on an obvious bug . While we try not to break users, responsible maintainers do have a duty to prioritize fixes for the general use case over an individual user's custom hack.
            Hide
            clockman Piotr Zegar added a comment -

            Well, making implementation that depend on workaround is also not great idea.
            I stick to old version because I like to see greed/red status for each stage, with new version all stages become red, but those that were red previously are darker. When using currentBuild.result that stages lose that darker color .

            As this is mainly information for me, I going to stick with current version and wait JENKINS-26522.
            With parallel task there is one problem, they don't visible directly in Stage View .
            And I like a current Stage View, nice to see specific steps build, ut, component tests, target tests, ... See how much time were needed for each stage...

            Show
            clockman Piotr Zegar added a comment - Well, making implementation that depend on workaround is also not great idea. I stick to old version because I like to see greed/red status for each stage, with new version all stages become red, but those that were red previously are darker. When using currentBuild.result that stages lose that darker color . As this is mainly information for me, I going to stick with current version and wait JENKINS-26522 . With parallel task there is one problem, they don't visible directly in Stage View . And I like a current Stage View, nice to see specific steps build, ut, component tests, target tests, ... See how much time were needed for each stage...

              People

              • Assignee:
                svanoort Sam Van Oort
                Reporter:
                mscharp Michael Scharp
              • Votes:
                5 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: