JENKINS-15120

Minimize round trips for slave class loading

      Description

      Currently each attempt to load a class in a remote JVM makes a round-trip request to the master, which over a laggy network can make class loading quite slow, thus adding considerable overhead to the first build on a new slave.

      Two possible solutions have been put forward.

      Optimistic prefetch

      The idea: when sending a class file to be loaded, scan its bytecode for other statically linked classes which have not yet been loaded, and send those along as well. In the common case that a network of classes is loaded around the same time, this would avoid some round trips.

      Details

      The first part is for one side to keep track of what classes the other side has already loaded (into which class loader), which is basically memorizing the response from IClassLoader.fetch2.

      The second part is to add Collection<ClassFile> IClassLoader.fetch3 that works like fetch2, except it will also parse the class, figure out some of the referenced classes that are not yet loaded by the other side, then send them along.
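
      As a rough illustration of the "parse the class" step, the sketch below lists the class names that appear in a class file's constant pool, using only the JDK. The names ConstantPoolScanner and referencedClasses are hypothetical and not part of remoting. A real fetch3 would still need to drop the classes the other side already has and resolve each remaining name against the right class loader.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not remoting API): finds classes named in a class file. */
final class ConstantPoolScanner {

    /** Returns internal names (e.g. "java/lang/Object") of all CONSTANT_Class entries. */
    static Set<String> referencedClasses(byte[] classFile) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        // Constant pool indices start at 1; long and double occupy two slots.
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                    // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class
                case 8: case 16: case 19: case 20: in.skipBytes(2); break;
                case 15: in.skipBytes(3); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.skipBytes(4); break;
                case 5: case 6: in.skipBytes(8); i++; break;
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new LinkedHashSet<>();
        for (int nameIndex : classNameIndex) {
            if (nameIndex != 0 && utf8[nameIndex] != null) {
                names.add(utf8[nameIndex]); // may include array descriptors such as "[B"
            }
        }
        return names;
    }
}
```

      The result includes the class itself and its superclass, so the sender would subtract whatever it has recorded as already loaded remotely before deciding what to send along.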

      Those prefetched class files would need to be remembered by RemoteClassLoader so that when those are actually requested it can load a class in the right classloader without calling back RemoteClassLoader.proxy. (Assuming there are no side effects, it could also eagerly call RemoteClassLoader.loadClassFile on the prefetched classes.)
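
      A minimal sketch of the "remember prefetched class files" idea, under simplifying assumptions: PrefetchCacheSketch and its Fetcher interface are hypothetical names, class loader identity is ignored, and the remote call is modelled as a function that returns the requested class together with any extras. It only shows how a later request can be served without another round trip.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration, not the RemoteClassLoader implementation. */
final class PrefetchCacheSketch {

    /** Stand-in for the remote call: the reply holds the requested class plus extras. */
    interface Fetcher {
        Map<String, byte[]> fetch(String className);
    }

    private final Map<String, byte[]> prefetched = new HashMap<>();
    private final Fetcher remote;
    private int roundTrips;

    PrefetchCacheSketch(Fetcher remote) {
        this.remote = remote;
    }

    /** Returns the class bytes, going over the network only on a cache miss. */
    byte[] classBytes(String className) {
        byte[] cached = prefetched.remove(className);
        if (cached != null) {
            return cached; // served locally, no round trip
        }
        roundTrips++;
        Map<String, byte[]> reply = remote.fetch(className);
        byte[] requested = reply.get(className);
        if (requested == null) {
            throw new IllegalStateException("remote did not return " + className);
        }
        // Keep everything else that came along for later requests.
        for (Map.Entry<String, byte[]> e : reply.entrySet()) {
            if (!e.getKey().equals(className)) {
                prefetched.put(e.getKey(), e.getValue());
            }
        }
        return requested;
    }

    int roundTrips() {
        return roundTrips;
    }
}
```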

      The remoting layer supports talking to an earlier version of the remoting layer. We do this by a bitmask in Capability, so this needs one more bit defined there. There is no point in tracking the classes the other side has loaded if the other side will never call fetch3.
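
      The capability check could look roughly like the following. This is a sketch only: CapabilitySketch, MASK_PREFETCH and the bit position are made up for illustration and do not reflect the actual constants in Capability.

```java
/** Hypothetical illustration of a feature bitmask; not the real Capability class. */
final class CapabilitySketch {

    /** Assumed bit for "understands fetch3"; the real bit position is not specified here. */
    static final long MASK_PREFETCH = 1L << 3;

    private final long mask;

    CapabilitySketch(long mask) {
        this.mask = mask;
    }

    boolean supportsPrefetch() {
        return (mask & MASK_PREFETCH) != 0;
    }

    /** Tracking the other side's loaded classes only pays off if both ends know fetch3. */
    static boolean shouldTrackRemoteClasses(CapabilitySketch local, CapabilitySketch remote) {
        return local.supportsPrefetch() && remote.supportsPrefetch();
    }
}
```

      An older peer simply never sets the bit, so a newer peer falls back to the fetch2 behaviour without any extra negotiation.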

      Bulk transfer

      Send entire JAR files at a time, rather than individual classes; can wind up transferring more than is needed, but the reduction in latency is probably worth it. Since arbitrary class loader graphs might be in use, not just a flat classpath, some custom code needs to be run remotely which will implement the class loader delegation model without hitting the network for each class.

      Details

      An API sketch:

      class Channel {
        void setClassLoaderTrafficCop(TrafficCop cop);
      }
      interface TrafficCop {
        /** do I know/control/own this classloader? */
        boolean controls(ClassLoader cl);
        Set<JarFile> getJarFilesOf(ClassLoader cl);
        RemotePartOfTrafficCop getRemotePart();
      }
      class JarFile {
        String checksum();
        InputStream data();
      }
      /** runs in remote agent */
      interface RemotePartOfTrafficCop extends Serializable {
        /** given a class/resource and an originating loader, what is the defining loader? */
        RemoteClassLoader trafficControl(RemoteClassLoader origin, String resourceName);
      }
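
      The checksum() method in the sketch above suggests that the remote side can skip a transfer when it already holds a JAR with the same checksum. A minimal sketch of that idea follows; JarCacheSketch is a hypothetical name, SHA-256 is an assumed choice of digest, and the download is modelled as a Supplier.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical illustration of a checksum-keyed JAR cache on the remote side. */
final class JarCacheSketch {

    private final Map<String, byte[]> byChecksum = new HashMap<>();
    private int transfers;

    /** Hex-encoded SHA-256 of the JAR contents (the digest algorithm is an assumption). */
    static String checksum(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the cached JAR for this checksum, downloading it only the first time. */
    byte[] resolve(String checksum, Supplier<byte[]> download) {
        return byChecksum.computeIfAbsent(checksum, key -> {
            transfers++;
            return download.get();
        });
    }

    int transfers() {
        return transfers;
    }
}
```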
      

            Activity

            jglick Jesse Glick created issue -
            jglick Jesse Glick added a comment -

            Can add a dependency from remoting on ASM via http://kohsuke.org/2012/03/03/potd-package-renamed-asm/ if needed for prefetch.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Jesse Glick
            Path:
            src/main/java/hudson/remoting/Channel.java
            http://jenkins-ci.org/commit/remoting/2b1ec8ab152805f01b4063dabc4dcdef64421fed
            Log:
            JENKINS-15120 Kohsuke’s explanation of why preloadJar does not really help.

            jglick Jesse Glick made changes -
            Field    Original Value   New Value
            Status   Open [ 1 ]       In Progress [ 3 ]
            jglick Jesse Glick added a comment - https://github.com/jenkinsci/remoting/pull/10
            jglick Jesse Glick added a comment -

            https://github.com/jenkinsci/mock-slave-plugin is useful for testing the impact on performance more controllably than simply connecting to some node in the cloud.

            jglick Jesse Glick added a comment -

            remoting #37248f1 also suggests using

            sudo tc qdisc add dev lo root netem delay 100ms
            

            to simulate a laggy network.

            jglick Jesse Glick made changes -
            Link This issue is related to JENKINS-16261 [ JENKINS-16261 ]
            jglick Jesse Glick added a comment -

            Using mock-slave with 10ms latency and building a multimodule Maven project in current trunk I get

            Loading Type   Time (s)   Count
            Classes        223.8      2993
            Resources      2.7        23

            With prefetch-JENKINS-15120 from Jenkins core checked out (which pulls in a branch of the same name from remoting), this was

            Loading Type   Time (s)   Count
            Classes        165.5      3361 (prefetch cache: 1651)
            Resources      2.3        26

            though timing is not exactly comparable since the branch currently enables rather verbose logging which slows down the connection.

            jglick Jesse Glick added a comment -

            Retesting with svn up for both builds (originally the first used co), and with verbose logging turned off in the branch. The trunk build takes 6:02 min (of which the Maven build itself was 1:26):

            Loading Type   Time (s)   Count
            Classes        243.2      3320
            Resources      2.6        26

            I tried to run the branch build again but this time it failed with a java.lang.OutOfMemoryError: PermGen space (on the master) which is disconcerting; did the branch introduce some kind of class loader leak?

            I retried it, this time succeeding in 4:10 (Maven build 1:06):

            Loading Type   Time (s)   Count
            Classes        141.0      3323 (prefetch cache: 1633)
            Resources      2.2        26

            So that is a 31% reduction in build time (6:02 = 362 s down to 4:10 = 250 s), which I would say is pretty good.

            BTW using https://svn.codehaus.org/mojo/trunk/mojo/nbm-maven as the test project, more or less arbitrarily. Intentionally using a native Maven project since that puts far more load on the remoting layer than a freestyle project.

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            changelog.html
            core/src/main/java/hudson/ClassicPluginStrategy.java
            maven-plugin/src/main/java/hudson/maven/Maven3Builder.java
            pom.xml
            http://jenkins-ci.org/commit/jenkins/f7330d7a158eff6705706b1f812993a9b918c351
            Log:
            [FIXED JENKINS-15120]

            • integrated the newer release of remoting
            • jar caching won't work with class file directory, so plugin
              WEB-INF/classes are now exploded as WEB-INF/lib/classes.jar
              (This should also solve the problem of slow plugin extraction in the
              presence of anti-virus software on Windows.)
            • because the structure of the exploded jar file has changed, I changed
              the up-to-date check timestamp file name to force re-extraction in
              existing installations.

            Compare: https://github.com/jenkinsci/jenkins/compare/b1f5d28f90ca...f7330d7a158e

            scm_issue_link SCM/JIRA link daemon made changes -
            Status In Progress [ 3 ] Resolved [ 5 ]
            Resolution Fixed [ 1 ]
            dogfood dogfood added a comment -

            Integrated in jenkins_main_trunk #2582
            [FIXED JENKINS-15120] (Revision f7330d7a158eff6705706b1f812993a9b918c351)

            Result = UNSTABLE
            kohsuke : f7330d7a158eff6705706b1f812993a9b918c351
            Files :

            • maven-plugin/src/main/java/hudson/maven/Maven3Builder.java
            • pom.xml
            • core/src/main/java/hudson/ClassicPluginStrategy.java
            • changelog.html
            jglick Jesse Glick added a comment - Also: https://github.com/jenkinsci/jenkins/compare/f7330d7a158eff6705706b1f812993a9b918c351...cabbe941c73a0db63e59492ff2e3722c41b239f0
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            src/main/java/org/jenkinsci/maven/plugins/hpi/AbstractHpiMojo.java
            src/main/java/org/jenkinsci/maven/plugins/hpi/HpiMojo.java
            http://jenkins-ci.org/commit/maven-hpi-plugin/603535119122e820751607812bb7d0bd3b8c3556
            Log:
            Modified to produce a jar file instead of WEB-INF/classes

            Because of the prefetching change in remoting (see JENKINS-15120), it
            is desirable now to produce class files in a jar file, not in a classes
            directory, so that slaves can prefetch them and cache them efficiently.

            Compare: https://github.com/jenkinsci/maven-hpi-plugin/compare/4d27f2880fa5...603535119122

            hx_unbanned Linards L added a comment (edited) -

            Would this work affect freestyle projects using ant / nant / batch builds?

            jglick Jesse Glick added a comment -

            Would this work affect freestyle projects using ant / nant / batch builds?

            Some, though not as much as for Maven projects. Depends on the plugins in use during the build, particularly how large their transitive dependencies are.

            drulli Ulli Hafner made changes -
            Link This issue is blocking JENKINS-18405 [ JENKINS-18405 ]
            drulli Ulli Hafner made changes -
            Link This issue is related to JENKINS-18405 [ JENKINS-18405 ]
            drulli Ulli Hafner added a comment (edited) -

            Seems that this new feature breaks class loading of the findbugs plug-in, see JENKINS-18405 and JENKINS-18394 for details...

            kutzi kutzi added a comment -

            Maybe also the cause of JENKINS-18401

            kutzi kutzi made changes -
            Link This issue is related to JENKINS-18401 [ JENKINS-18401 ]
            drulli Ulli Hafner added a comment -

            Reopening this issue since it breaks class loading of the findbugs plug-in and Maven jobs. I can reproduce the problem on my machine; is there anything I can add to my plug-in to prevent these incompatible class exceptions?

            drulli Ulli Hafner made changes -
            Resolution Fixed [ 1 ]
            Status Resolved [ 5 ] Reopened [ 4 ]
            kutzi kutzi made changes -
            Link This issue is related to JENKINS-18459 [ JENKINS-18459 ]
            kohsuke Kohsuke Kawaguchi added a comment -

            Process-wise, let's leave this bug closed. Instead, if you come across bugs that appear to be related, please link them.

            kohsuke Kohsuke Kawaguchi made changes -
            Status Reopened [ 4 ] Resolved [ 5 ]
            Assignee Jesse Glick [ jglick ] Kohsuke Kawaguchi [ kohsuke ]
            Resolution Fixed [ 1 ]
            kohsuke Kohsuke Kawaguchi made changes -
            Link This issue is related to JENKINS-18394 [ JENKINS-18394 ]
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            src/main/java/hudson/remoting/ResourceImageDirect.java
            http://jenkins-ci.org/commit/remoting/781ccaec2e26797bc8fa9e6c982a031988d381e0
            Log:
            JENKINS-18405

            This fixes the 'Unknown url shema' error in the FindBugs plugin.

            Before JENKINS-15120, we used to faithfully recreate the expected
            resource path in a temporary resource file. For some reason, we lost
            that. This change brings it back by recreating the directory structure.

            Note that this change doesn't address
            "java.lang.IncompatibleClassChangeError: Implementing class" error
            reported also in JENKINS-18405. That is still under investigation.

            kutzi kutzi made changes -
            Link This issue is related to JENKINS-18525 [ JENKINS-18525 ]
            yyuu Yuu Yamashita made changes -
            Link This issue is related to JENKINS-18528 [ JENKINS-18528 ]
            yyuu Yuu Yamashita added a comment -

            This also affects plugins written in JRuby. I filed the issue for Ruby plugins as JENKINS-18528.

            kutzi kutzi made changes -
            Link This issue is related to JENKINS-18533 [ JENKINS-18533 ]
            drulli Ulli Hafner made changes -
            Link This issue is related to JENKINS-18836 [ JENKINS-18836 ]
            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Nicolas De Loof
            Path:
            src/main/resources/hudson/plugins/ec2/EC2Computer/configure.jelly
            http://jenkins-ci.org/commit/ec2-plugin/995b20442314a3edb615caf803123e77c7c2c9fb
            Log:
            due to JENKINS-15120 relative path isn't supported anymore by st:include

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Francis Upton
            Path:
            src/main/resources/hudson/plugins/ec2/EC2Computer/configure.jelly
            http://jenkins-ci.org/commit/ec2-plugin/8d5be49e298d130fd4f16db9d5b28a9272825cbe
            Log:
            Merge pull request #62 from ndeloof/master

            due to JENKINS-15120 relative path isn't supported anymore by st:include

            Compare: https://github.com/jenkinsci/ec2-plugin/compare/6a374e309c22...8d5be49e298d

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Kohsuke Kawaguchi
            Path:
            src/main/java/hudson/maven/Maven3Builder.java
            http://jenkins-ci.org/commit/maven-plugin/97b452ecc95a5546c471198126834f770a63a249
            Log:
            [FIXED JENKINS-15120]

            • integrated the newer release of remoting
            • jar caching won't work with class file directory, so plugin
              WEB-INF/classes are now exploded as WEB-INF/lib/classes.jar
              (This should also solve the problem of slow plugin extraction in the
              presence of anti-virus software on Windows.)
            • because the structure of the exploded jar file has changed, I changed
              the up-to-date check timestamp file name to force re-extraction in
              existing installations.

            Originally-Committed-As: f7330d7a158eff6705706b1f812993a9b918c351

            rtyler R. Tyler Croy made changes -
            Workflow JNJira [ 145840 ] JNJira + In-Review [ 191653 ]

              People

              • Assignee: Kohsuke Kawaguchi (kohsuke)
              • Reporter: Jesse Glick (jglick)
              • Votes: 2
              • Watchers: 12
