Jenkins / JENKINS-54643

A connection interruption causes the pipeline to fail when USE_WATCHING=true

      Description

      Run Jenkins with -Dorg.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep.USE_WATCHING=true. Add an agent launched via SSH (the launch method may not be important; this is just what I've observed the issue with).
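For reference, one way to pass the property is sketched below. It assumes Jenkins is started directly from a WAR file; the `jenkins.war` path and the `JENKINS_WAR` variable are placeholders, not part of the original report. If Jenkins runs as a service, the same `-D` option would normally go into the service's Java options instead.

```shell
# Assumption: Jenkins is launched from a WAR file; adjust the path as needed.
JENKINS_WAR="${JENKINS_WAR:-jenkins.war}"

# System property named in this report.
WATCH_PROP="org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep.USE_WATCHING"

# JVM system properties must come before -jar, otherwise they are passed
# to Jenkins as application arguments instead of to the JVM.
CMD="java -D${WATCH_PROP}=true -jar ${JENKINS_WAR}"

# Print the command for inspection; run it manually once it looks right.
echo "$CMD"
```

Setting the value to `false` in the same command gives the comparison case described at the end of this report.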

      Add a pipeline job with this script:

      node('mynode') {
          sh '''#!/bin/sh -e
              for n in $(seq 100); do
                  echo "$n"
                  sleep 1
              done
          '''
          sh 'echo OK'
      }
      

      Run the pipeline. When it starts printing numbers to the log, disconnect the master from the network. After 30 seconds, reconnect it.

      For a while (I haven't measured, but it feels like a couple of minutes) nothing new appears in the log. After that, the job completes instantly, but:

      • Some of the output is missing from the log.
      • The "echo OK" step doesn't run.
      • The pipeline fails with an EOFException.

      I'm attaching a full example log.

      By contrast, with USE_WATCHING=false the log resumes a few seconds after the reconnection, no output is skipped and the job succeeds.
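To quantify the missing output, a small helper along these lines can list which of the 100 expected numbers never reached the log. This is a sketch, not part of Jenkins: the function name `missing_numbers` is hypothetical, and it assumes the console log has been saved to a file in which each number appears on a line of its own (no timestamps or other prefixes).

```shell
# Print every number from 1 to 100 that does not appear on a line of its own
# in the given log file. Hypothetical helper for checking the reproduction.
missing_numbers() {
    log_file="$1"
    for n in $(seq 100); do
        # -x: match the whole line, -F: treat the pattern literally, -q: quiet
        grep -qxF "$n" "$log_file" || echo "$n"
    done
}
```

Calling `missing_numbers console.log` on a saved log (the file name is a placeholder) prints nothing when the output is complete, as expected with `USE_WATCHING=false`, and prints the skipped numbers otherwise.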

            Activity

            rdonchen_intel Roman Donchenko created issue -
            vivek Vivek Pandey made changes -
            Field | Original Value | New Value
            Labels | regression | regression triaged-2018-11
            jglick Jesse Glick made changes -
            Assignee Jesse Glick [ jglick ]
            jglick Jesse Glick made changes -
            Link This issue relates to JENKINS-52165 [ JENKINS-52165 ]
            jglick Jesse Glick made changes -
            Link This issue relates to JENKINS-41854 [ JENKINS-41854 ]
            rdonchen_intel Roman Donchenko made changes -
            Description Typo corrected ("When is starts" changed to "When it starts"); the text is otherwise identical to the description above.
            jglick Jesse Glick made changes -
            Link This issue relates to JENKINS-56851 [ JENKINS-56851 ]
            jglick Jesse Glick made changes -
            Status Open [ 1 ] Resolved [ 5 ]
            Resolution Duplicate [ 3 ]

              People

              • Assignee: jglick Jesse Glick
              • Reporter: rdonchen_intel Roman Donchenko
              • Votes: 0
              • Watchers: 3
