Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-2913

svn polling only working for a small subset of jobs in 1.277

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: Critical Critical
    • subversion-plugin
    • None
    • Platform: All, OS: All

      After upgrading from Hudson 1.270 to 1.277 I have started to see a major problem
      with Subversion polling in our environment.

      It appears to be caused, or at least catalyzed, by this statement on line 852 of
      java/hudson/scm/SubversionSCM.java:

      Thread.sleep(60*1000); // debug

      Judging from svn and hudson changelogs this line was introduced to
      SubversionSCM.java somewhere around 1.273.

      I commented out this line, compiled my own SubversionSCM.class file, and patched
      it into my Hudson 1.277 environment. After applying this change, polling seems
      to be working fine again. There is a clear difference in the Subversion polling
      logs. The polling duration has reduced from > 1 minute to 1-2 seconds, which is
      where it was before with 1.270.

      Configuration in our environment which is worth mentioning:

      • concurrent svn polling limit of 12 threads in Hudson system config
      • using ssh-slaves plugin to manage about 20 remote nodes
      • about 90 jobs which are configured to poll svn on differing schedules: every
        24 hours, every 10 minutes, every 5 minutes, every 3 minutes, and every 1 minute
        depending on the job.

      I attached to the Hudson JVM using jconsole and monitored the threads used for
      svn polling. As I expected, a pool of 12 threads was created. But
      surprisingly, only 4 of these threads would poll svn. The same 4 threads were
      just repeatedly polling the same 4 jobs. The rest of the threads appeared to be
      waiting for locks held by these 4 threads:

      "pool-2-thread-12" Id=249 WAITING on
      java.util.concurrent.locks.ReentrantLock$NonfairSync@18b9d48 owned by
      "pool-2-thread-8" Id=243
      at sun.misc.Unsafe.park(Native Method)

      • waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@18b9d48

      I think I understand why the 4 active threads kept polling the same jobs: these
      were jobs configured to poll every minute and the Hudson polling Java method was
      sleeping for 60 seconds during every call. So every time the code would leave
      the pollChanges method, the same job would immediately require another polling
      attempt.

      I had other jobs (not the 4 lucky ones) polling hours apart when it should
      have been minutes. The other 8 threads were always waiting for locks held by
      the 4 active threads. Perhaps this has exposed a bigger problem with
      synchronization between polling threads. It makes no sense to me that only 4 of
      the 12 threads would be actively polling as I have a large number of projects
      polling all the time. Maybe the sleep statement in 1.270 was exposing the
      problem already documented for earlier builds in issue 2516
      (https://hudson.dev.java.net/issues/show_bug.cgi?id=2516) in my environment.
      Yesterday I added a comment to this issue with a thread dump from my environment.

      If the sleep statement was checked in accidentally (the "debug" comment makes me
      wonder) it should be removed for a future Hudson release.

      -Andrew

      Versions:
      Hudson: 1.277 (winstone container)
      JDK: 1.6.0_06

            Unassigned Unassigned
            andrewgoktepe Andrew Goktepe
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

              Created:
              Updated:
              Resolved: