JENKINS-51454

Pipeline retry operation doesn't retry when there is a timeout inside of it

      Description

      When a timeout fires inside a retry block, retry does not run its body again and the job is aborted. The only way to make it work is to wrap the timeout step in a try/catch.

      without try/catch

      Log output

      Cancelling nested steps due to timeout
      

      Execution result

      Timeout has been exceeded
      Finished: ABORTED
      

      with try/catch

      Log output

      Timeout set to expire after 2 sec without activity
      Sleeping for 2 sec
      Cancelling nested steps due to timeout
      ERROR: catched timeout! org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
      Retrying
      

      Execution result

      Finished: SUCCESS
      

       

      Examples to reproduce the issue

      Failing example

      node {
          def timeoutSeconds = 3
          stage('Preparation') { // for display purposes
              retry(3){
                  timeout(activity: true, time: 2, unit: 'SECONDS') {
                      sleep(timeoutSeconds)
                  }
                  timeoutSeconds--
              }
          }
      }
      

      Working example

      node {
          def timeoutSeconds = 3
          stage('Preparation') { // for display purposes
              retry(3){
                  try{
                      timeout(activity: true, time: 2, unit: 'SECONDS') {
                          sleep(timeoutSeconds)
                      }
                  }catch(err){
                      timeoutSeconds--
                      script{
                        def user = err.getCauses()[0].getUser()
                        error "[${user}] catched timeout! $err"
                      }
                  }
              }
          }
      }
      

            Activity

            nepomuk_seiler Nepomuk Seiler added a comment -

            It looks like the org.jenkinsci.plugins.workflow.steps.FlowInterruptedException exception could be the cause. If the timeout step threw a different exception, would that solve the issue?

            luispiedra Luis Piedra-Márquez added a comment -

            If you catch `FlowInterruptedException`, retry will also run again when the job is aborted for any other reason, such as a manual cancel. The timeout step should definitely throw a different exception.

            basil Basil Crow added a comment -

            I can reproduce this error in a scripted pipeline as well.

            basil Basil Crow added a comment -

            As a workaround, I was able to catch FlowInterruptedException and rethrow a more generic exception:

            import org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
            
            retry(3) {
              try {
                timeout(time: 10, unit: 'MINUTES') {
                  [..]
                }
              } catch (FlowInterruptedException e) {
                // Work around https://issues.jenkins-ci.org/browse/JENKINS-51454
                error 'Timeout has been exceeded'
              }
            }
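
            A possible refinement of this workaround, sketched under assumptions: inspect the causes of the FlowInterruptedException and convert it into a plain failure only when a timeout caused it. This way a manual abort is rethrown unchanged and does not trigger another attempt. The sketch assumes the timeout step reports its interruption with a cause of type TimeoutStepExecution.ExceededTimeout, and that a sandboxed pipeline is allowed to reference that class (script approval may be required). Neither assumption has been verified against a particular plugin version.

            ```groovy
            import org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
            import org.jenkinsci.plugins.workflow.steps.TimeoutStepExecution

            retry(3) {
              try {
                timeout(time: 10, unit: 'MINUTES') {
                  // Something that can fail
                }
              } catch (FlowInterruptedException e) {
                // Assumption: a timeout appears as an ExceededTimeout cause.
                boolean timedOut = false
                for (cause in e.causes) {
                  if (cause instanceof TimeoutStepExecution.ExceededTimeout) {
                    timedOut = true
                  }
                }
                if (!timedOut) {
                  // Manual abort or another interruption: propagate it unchanged.
                  throw e
                }
                // Timeout: fail with a generic error so that retry runs the body again.
                error 'Timeout has been exceeded'
              }
            }
            ```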
            
            dnusbaum Devin Nusbaum added a comment -

            I think there are two desired behaviors depending on the placement of timeout and retry relative to each other.

            First case (this ticket):

            retry(3) {
              timeout(time: 5, unit: 'MINUTES') {
                // Something that can fail
              }
            }
            

            In this case, if the timeout triggers, I think the desired behavior is for retry to run its body again. This is not the current behavior.

            Second case (not this ticket):

            timeout(time: 5, unit: 'MINUTES') {
              retry(3) {
                // Something that can fail
              }
            }
            

            In this case, if the timeout triggers, I think the desired behavior is for retry to be aborted without retrying anything. This is the current behavior, and as far as I understand, is working as-designed after JENKINS-44379.

            Switching timeout to use a different kind of exception would fix the first case, but break the second case. To support both use cases, something more complex would be needed (see PR 81 for a possible approach, although retry would need to be updated as well). Basil Crow noted that a fix for JENKINS-60354 might overlap a bit with this issue.
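
            The trade-off can be illustrated with a small plain-Groovy model. This is not Jenkins code: the classes Interrupted and TimedOut, the retry function, and the runsBeforeGivingUp helper are all hypothetical stand-ins, and the model only assumes that retry reruns its body on ordinary failures and lets interruptions propagate.

            ```groovy
            // Plain-Groovy model, NOT Jenkins code. All names here are hypothetical.
            class Interrupted extends RuntimeException {}  // stands in for FlowInterruptedException
            class TimedOut extends RuntimeException {}     // a hypothetical "different" timeout exception

            // Model of retry: reruns the body on ordinary failures, never on Interrupted.
            def retry(int count, Closure body) {
                for (int attempt = 1; attempt <= count; attempt++) {
                    try {
                        return body()
                    } catch (Interrupted e) {
                        throw e                        // an interruption ends the loop at once
                    } catch (Exception e) {
                        if (attempt == count) throw e  // out of attempts
                    }
                }
            }

            // Count how often the body runs when every run ends in the given exception.
            def runsBeforeGivingUp(Exception thrown) {
                int runs = 0
                try {
                    retry(3) { runs++; throw thrown }
                } catch (Exception ignored) { }
                return runs
            }

            assert runsBeforeGivingUp(new Interrupted()) == 1  // body is not rerun
            assert runsBeforeGivingUp(new TimedOut()) == 3     // body is rerun until attempts run out
            ```

            In this model, a single run corresponds to the current behavior: unwanted when timeout sits inside retry (first case), wanted when timeout wraps retry (second case). Three runs correspond to the switched exception type: the first case would retry as hoped, but in the second case retry would keep rerunning its body after the enclosing timeout had already expired.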


              People

              • Assignee: Unassigned
              • Reporter: daconstenla David Constenla
              • Votes: 9
              • Watchers: 11