JENKINS-9189: truncation or corruption of zip workspace archive from slave

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core
    • Labels:
      None
    • Environment:
      Hudson/Jenkins >= 1.378, Debian, slave connected via SSH
      Description

      Downloading a ZIP archive of the workspace of a project that was built on a slave appears to be broken since Hudson 1.378 (found by bisection between 1.365 and 1.406-SNAPSHOT). It worked, and still works, when the project is built on the master, where no remoting takes place.

      How to reproduce:

      1. Set up a free-style project that just creates a few files in the workspace, such as:
        env > env.txt
        ls -la > ls-la.txt
        dd if=/dev/urandom of=random.bin bs=512 count=2048
        
      2. Restrict this project to run on a slave (connected via SSH in my case).
      3. Run this project.
      4. Using the "(all files in zip)" link, download the workspace and verify the downloaded Zip archive. With 1.377 and earlier, the download-and-verify step can be run in a loop from the command line 100 times in a row without error. Since 1.378, it usually fails on the second attempt; at first glance, a hexdump of the failed download looks like a correct but truncated Zip archive. The script that I used for testing is this:
        $ cat test.sh 
        i=0
        while [ $i -lt 100 ]; do
                i=`expr $i + 1`
                echo $i
                wget -q -O test.zip 'http://localhost/jenkins/job/test/ws/*zip*/test.zip' && \
                unzip -l test.zip > /dev/null || exit $?
        done
        exit 0
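
      To tell a truncated download apart from other kinds of corruption without reading a hexdump, one rough check is whether the file still contains the Zip "end of central directory" signature (bytes 50 4b 05 06) near its end. The helper below is only a sketch: the name has_eocd is hypothetical, it looks for the signature alone rather than validating the record, and `unzip -l` remains the authoritative check.

```shell
#!/bin/sh
# has_eocd FILE (hypothetical helper name)
# Succeeds if the Zip "end of central directory" signature (50 4b 05 06)
# appears within the last 65557 bytes of FILE, i.e. the 22-byte record
# plus the longest possible archive comment (65535 bytes).
# This is a heuristic: it does not validate the record itself.
has_eocd() {
    tail -c 65557 "$1" |
        od -An -v -tx1 |     # one hex pair per byte, no offsets, no '*' folding
        tr '\n' ' ' |        # join od's output lines
        tr -s ' ' |          # single space between bytes, so matches are byte-aligned
        grep -q '50 4b 05 06'
}
```

      In the test loop it could be called when `unzip -l` fails, for example `has_eocd test.zip || echo "no end-of-central-directory signature: likely truncated"`.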
        

      Known workaround:

      • Run the job on the Jenkins master. (This isn't an option in our setup.)

      Possibly related issues:

      The changelog of 1.378 mentions JENKINS-5977 "Improving the master/slave communication to avoid pipe clogging problem." and I suspect that this change introduced the problem. A later changelog entry for 1.397 mentions that it fixed "a master/slave communication problem since 1.378" (JENKINS-7745). However, using the steps described above I can still reproduce at least this issue, even in the current version 1.404 and the latest snapshot.

      As suggested in comments on other issues concerning master/slave communication, it seems reasonable to assume that this issue is caused by a missing flush operation on an output stream, or something to that effect. Another possibility, whether likely or not, is the suspected thread concurrency problem noted in remoting/src/main/java/hudson/remoting/PipeWindow.java, which also mentions the issues JENKINS-7745 or JENKINS-7581.
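
      The commit log further down in this issue attributes the bug to a race between an I/O operation and the return of the Channel.call() execution. The following is only a toy analogy of that kind of race in plain shell, not the remoting code: async_write, read_unsynced, and read_synced are hypothetical names, and the one-second delay stands in for data that is still in flight when the call returns.

```shell
#!/bin/sh
# Hypothetical stand-in for a remote write: the data lands in FILE some time
# after the function that triggered it has already returned.
async_write() {
    ( sleep 1; printf 'complete payload' > "$1" ) >/dev/null 2>&1 &
}

# Racy variant: reads FILE as soon as the "call" returns, so it will
# typically see an empty or partial file.
read_unsynced() {
    : > "$1"
    async_write "$1"
    cat "$1"
}

# Synced variant: waits for the pending write to finish before reading,
# so the reader always sees the complete data.
read_synced() {
    : > "$1"
    async_write "$1"
    wait "$!"
    cat "$1"
}
```

      The only difference between the two readers is the wait before the read, which is the general shape of making the return of a call synchronize with the I/O it started.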

            Activity

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            remoting/src/main/java/hudson/remoting/ProxyOutputStream.java
            remoting/src/main/java/hudson/remoting/Request.java
            http://jenkins-ci.org/commit/jenkins/ef2c8a7d119611a40dfbca91e8c26af9ad8dcbb5
            Log:
            [FIXED JENKINS-9189] fixed the race condition between I/O operation and the return of the Channel.call() execution in more fundamental way.
            (cherry picked from commit 9cdd9cc0c5640beeb6bf36a4b26fa1ddcce7fd60)

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            remoting/src/main/java/hudson/remoting/ProxyOutputStream.java
            remoting/src/main/java/hudson/remoting/Request.java
            http://jenkins-ci.org/commit/jenkins/9cdd9cc0c5640beeb6bf36a4b26fa1ddcce7fd60
            Log:
            [FIXED JENKINS-9189] fixed the race condition between I/O operation and the return of the Channel.call() execution in more fundamental way.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            src/main/java/hudson/remoting/ProxyOutputStream.java
            src/main/java/hudson/remoting/Request.java
            http://jenkins-ci.org/commit/remoting/8ffed0da4996934bfc28bf6b08c258d367a1c526
            Log:
            [JENKINS-11251 JENKINS-9189] Resurrecting what's deleted in e0e154d12d7a10759287b187467389c6e643c12b

            When communicating with remoting < 2.15, this allows them to continue to
            perform some degree of syncing, so that they can still enjoy the fix for
            JENKINS-9189.

            None of these code is exposed via API outside remoting, so at some point
            we can revert this change to simplify the code a bit and eliminate the
            redundancy, because as long as >= 2.15 remoting talk to each other,
            PipeWriter does everything we need.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            src/main/java/hudson/remoting/Channel.java
            src/main/java/hudson/remoting/Pipe.java
            src/main/java/hudson/remoting/PipeWriter.java
            src/main/java/hudson/remoting/ProxyOutputStream.java
            src/main/java/hudson/remoting/Request.java
            src/main/java/hudson/remoting/Response.java
            http://jenkins-ci.org/commit/remoting/e0e154d12d7a10759287b187467389c6e643c12b
            Log:
            [FIXED JENKINS-11251] reimplemented I/O and Request/Response sync

            See PipeWriter javadoc for the discussion and the context of this.
            This change re-implements the original fix for JENKINS-9189.


              People

              • Assignee:
                kohsuke Kohsuke Kawaguchi
                Reporter:
                ustuehler Uwe Stuehler
              • Votes:
                6
                Watchers:
                7
