  1. Jenkins
  2. JENKINS-63574

Write temporary build files to tmpdir, then copy to build directory


    • Improvement
    • Resolution: Unresolved
    • Minor
    • core
    • None

      We are having trouble scaling our Jenkins instance, mostly because of the I/O load on the master from streaming logs from our somewhat large cluster of ~50 build nodes. We've profiled the master extensively and concluded that most Jenkins threads eventually block while writing temporary files to the build directory. Here is example output from ext4slower (part of bpfcc-tools) showing every ext4 file operation that took longer than 100 ms:

      ci@ubuntu:~$ sudo ext4slower-bpfcc 100
      Tracing ext4 operations slower than 100 ms
      TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
      10:11:05 Running CpsFlo 1479   S 0       0         109.06 atomic5035202831358212690tmp
      10:11:05 Running CpsFlo 1479   S 0       0         102.48 atomic3566442655078485945tmp
      10:11:06 Running CpsFlo 1479   S 0       0         237.83 atomic4864556625324838693tmp
      10:11:06 Running CpsFlo 1479   S 0       0         232.48 atomic4542753629040480946tmp
      10:11:07 Running CpsFlo 1479   S 0       0         148.58 atomic4916498716229563983tmp
      10:11:07 Running CpsFlo 1479   S 0       0         107.19 atomic1691172645144959043tmp
      10:11:08 Running CpsFlo 1479   S 0       0         788.03 atomic2832058604570969326tmp
      10:11:08 Running CpsFlo 1479   S 0       0         125.40 atomic626139877724251546tmp
      10:11:09 Running CpsFlo 1479   S 0       0         128.26 atomic7892056760532110895tmp
      10:11:10 Running CpsFlo 1479   S 0       0         130.23 atomic4647402446487272860tmp
      10:11:10 Running CpsFlo 1479   S 0       0         118.76 atomic5612779110408076458tmp
      10:11:10 Running CpsFlo 1479   S 0       0         112.79 atomic3667236636019672640tmp
      

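      My understanding is that these atomic files are created as build logs are streamed from the nodes, and that they are flushed frequently to ensure data integrity. The S in the T column marks each of the slow operations above as a sync, which is consistent with that.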

      I understand the need for atomic file writes, and I am not sure whether AtomicFileWriter.java can be optimized further. An alternative might be to write these files to java.io.tmpdir and copy them to the build directory once they have been fully written. Users who are willing to risk data integrity in the name of performance (like myself) could then place the tmpdir on a ramdisk for much faster I/O, with a single flush when the file is copied to its final location.
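
      To make the proposal concrete, here is a minimal sketch of the idea. It is not how AtomicFileWriter.java works today, and the class and method names (TmpDirThenCopy, writeViaTmpDir) are hypothetical. It assumes the staging file lives in java.io.tmpdir, is copied next to the target, flushed once, and then renamed into place. The rename step is my own addition so that readers never see a half-copied file; whether an atomic move replaces an existing target is platform-specific.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Jenkins core.
class TmpDirThenCopy {

    // Stage the content in java.io.tmpdir, then publish it to 'target'
    // with a single flush to the build directory's filesystem.
    static void writeViaTmpDir(Path target, String content) throws IOException {
        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        Path staged = Files.createTempFile(tmpDir, "atomic", "tmp");
        try {
            // All incremental writes would go here; no sync is forced,
            // so a ramdisk-backed tmpdir keeps this step cheap.
            Files.write(staged, content.getBytes(StandardCharsets.UTF_8));

            // Copy into the target's directory under a temporary name.
            Path parent = target.toAbsolutePath().getParent();
            Path sibling = Files.createTempFile(parent, "atomic", "tmp");
            try {
                Files.copy(staged, sibling, StandardCopyOption.REPLACE_EXISTING);

                // The one and only flush against the build directory.
                try (FileChannel ch = FileChannel.open(sibling, StandardOpenOption.WRITE)) {
                    ch.force(true);
                }

                // Rename within the same directory. On POSIX filesystems this
                // replaces an existing target; elsewhere it may throw instead.
                Files.move(sibling, target, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                Files.deleteIfExists(sibling);
            }
        } finally {
            Files.deleteIfExists(staged);
        }
    }

    public static void main(String[] args) throws IOException {
        Path buildDir = Files.createTempDirectory("build");
        Path target = buildDir.resolve("log.txt");
        writeViaTmpDir(target, "example log line\n");
        System.out.print(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

      If the process dies before the copy, the staged data is lost, which is the integrity trade-off described above.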

      I'm not well-versed in Jenkins internals, so please forgive me if my understanding of the atomic files used by the build logs is incorrect.

            Unassigned
            nre_ableton Nik Reiman
            Votes: 0
            Watchers: 2
