JENKINS-33219

Job.updateNextBuildNumber can cause deadlock

      The attached thread dump shows two deadlocked threads:

      • "jenkins.util.Timer 7" daemon prio=5 BLOCKED
      • "Executor #1 for master : executing DSL Job Builder #390" daemon prio=5 BLOCKED

      Environment: Jenkins 1.650

      This happens as follows:

      1. I use the Job DSL plugin to create a bunch of jobs. Many have the SCM Poll trigger enabled, causing them to run as soon as they are created (possibly also when they are updated, judging by the stack trace).
      2. As part of job creation, Job.updateNextBuildNumber is called (Next Build Number plugin integrated with Job DSL).

      Together, these two events create a deadlock.

      The first thread (the Executor running the DSL job) locks the Job, since updateNextBuildNumber() is synchronized. It then calls AbstractLazyLoadRunMap.getByNumber() via getLastBuild() and blocks there. The synchronized block in this method is new as of 1.646 (https://github.com/jenkinsci/jenkins/commit/d5167025a204750633c931ea8c1fff8d7561ab9c#diff-383116e240993025e5b727359e61db09)

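      The second thread (jenkins.util.Timer) calls AbstractLazyLoadRunMap.getByNumber() first, which loads a build and ends up calling AbstractProject.save(). That call blocks because the job is an AbstractProject, which extends Job, so save() needs the monitor of the same Job instance that the first thread already holds. Each thread holds the lock the other is waiting for, so: deadlock.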
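
      For anyone who wants to see the pattern in isolation, here is a minimal sketch of the same lock-order inversion. It is not Jenkins code: the two plain objects `job` and `runMap` are hypothetical stand-ins for the Job instance and the AbstractLazyLoadRunMap instance, and the latch only exists to force the unlucky interleaving that happens by chance in the real system. It assumes both real code paths use ordinary object monitors, as the BLOCKED states in the dump suggest.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderSketch {
    // Hypothetical stand-ins for the two monitors named in the report.
    static final Object job = new Object();     // plays the Job / AbstractProject instance
    static final Object runMap = new Object();  // plays the AbstractLazyLoadRunMap instance

    /** Runs two daemon threads that take the monitors in opposite order.
     *  Returns true if the JVM reports a monitor deadlock within about 5 seconds. */
    static boolean reproduce() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread executor = daemon("executor-sketch", () -> {
            synchronized (job) {              // like the synchronized updateNextBuildNumber()
                awaitBoth(bothHoldFirstLock);
                synchronized (runMap) { }     // like getByNumber(), reached via getLastBuild()
            }
        });
        Thread timer = daemon("timer-sketch", () -> {
            synchronized (runMap) {           // like getByNumber() loading a build
                awaitBoth(bothHoldFirstLock);
                synchronized (job) { }        // like save() on the same job instance
            }
        });
        executor.start();
        timer.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            // Returns null while no threads are deadlocked on object monitors.
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    // Signals "I hold my first lock" and waits until the other thread holds its own.
    static void awaitBoth(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Daemon threads let the JVM exit even though the two threads never finish.
    static Thread daemon(String name, Runnable body) {
        Thread t = new Thread(body, name);
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduce());
    }
}
```

      In the sketch, taking the two monitors in the same order in both threads makes the deadlock go away, which is why the call to save() from inside the locked getByNumber() path looks like the problematic part.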

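      Here are the relevant parts of the thread dump for convenience. The frames involved in the deadlock are Job.updateNextBuildNumber and AbstractLazyLoadRunMap.getByNumber in the first thread, and AbstractLazyLoadRunMap.getByNumber and AbstractProject.save in the second: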

      "Executor #1 for master : executing DSL Job Builder #390" daemon prio=5 BLOCKED
      jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:356)
      jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:332)
      jenkins.model.lazy.AbstractLazyLoadRunMap.newestBuild(AbstractLazyLoadRunMap.java:274)
      jenkins.model.lazy.LazyBuildMixIn.getLastBuild(LazyBuildMixIn.java:238)
      hudson.model.AbstractProject.getLastBuild(AbstractProject.java:993)
      hudson.model.AbstractProject.getLastBuild(AbstractProject.java:144)
      hudson.model.Job.updateNextBuildNumber(Job.java:422)
      org.jvnet.hudson.plugins.nextbuildnumber.JobDslExtension.notifyItemUpdated(JobDslExtension.java:30)

      "jenkins.util.Timer 7" daemon prio=5 BLOCKED
      hudson.model.AbstractProject.save(AbstractProject.java:305)
      hudson.model.Job.addProperty(Job.java:523)
      hudson.model.AbstractProject.addProperty(AbstractProject.java:785)
      hudson.plugins.disk_usage.DiskUsageUtil.addProperty(DiskUsageUtil.java:58)
      hudson.plugins.disk_usage.BuildDiskUsageAction.<init>(BuildDiskUsageAction.java:38)
      hudson.plugins.disk_usage.DiskUsageBuildActionFactory.createFor(DiskUsageBuildActionFactory.java:31)
      hudson.plugins.disk_usage.DiskUsageBuildActionFactory.createFor(DiskUsageBuildActionFactory.java:21)
      hudson.model.Actionable.createFor(Actionable.java:107)
      hudson.model.Actionable.getAllActions(Actionable.java:98)
      hudson.model.Run.onLoad(Run.java:343)
      hudson.model.RunMap.retrieve(RunMap.java:224)
      hudson.model.RunMap.retrieve(RunMap.java:56)
      jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:479)
      jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:461)
      jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:367)
      jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:332)
      jenkins.model.lazy.AbstractLazyLoadRunMap.newestBuild(AbstractLazyLoadRunMap.java:274)
      jenkins.model.lazy.LazyBuildMixIn.getLastBuild(LazyBuildMixIn.java:238)
      hudson.model.AbstractProject.getLastBuild(AbstractProject.java:993)
      hudson.model.AbstractProject.getLastBuild(AbstractProject.java:144)
      hudson.views.AbstractBuildTrendFilter.matches(AbstractBuildTrendFilter.java:71)
      hudson.views.AbstractIncludeExcludeJobFilter.doFilter(AbstractIncludeExcludeJobFilter.java:68)
      hudson.views.AbstractIncludeExcludeJobFilter.filter(AbstractIncludeExcludeJobFilter.java:57)
      hudson.model.ListView.getItems(ListView.java:195)
      hudson.model.ListView.getItems(ListView.java:67)
      jenkins.advancedqueue.jobinclusion.strategy.ViewBasedJobInclusionStrategy.isJobInView(ViewBasedJobInclusionStrategy.java:182)
      jenkins.advancedqueue.jobinclusion.strategy.ViewBasedJobInclusionStrategy.contains(ViewBasedJobInclusionStrategy.java:149)
      jenkins.advancedqueue.PriorityConfiguration.getJobGroup(PriorityConfiguration.java:241)
      jenkins.advancedqueue.PriorityConfiguration.getPriorityInternal(PriorityConfiguration.java:225)
      jenkins.advancedqueue.PriorityConfiguration.getPriority(PriorityConfiguration.java:203)
      jenkins.advancedqueue.sorter.AdvancedQueueSorter.onNewItem(AdvancedQueueSorter.java:136)
      jenkins.advancedqueue.sorter.AdvancedQueueSorterQueueListener.onEnterWaiting(AdvancedQueueSorterQueueListener.java:46)
      hudson.model.Queue$WaitingItem.enter(Queue.java:2348)
      hudson.model.Queue.scheduleInternal(Queue.java:599)
      hudson.model.Queue.schedule2(Queue.java:555)
      jenkins.model.ParameterizedJobMixIn.scheduleBuild2(ParameterizedJobMixIn.java:138)
      jenkins.model.ParameterizedJobMixIn.scheduleBuild(ParameterizedJobMixIn.java:94)
      hudson.model.AbstractProject.scheduleBuild(AbstractProject.java:836)

      My Job DSL setup is rather complicated and I have not been able to extract a simple test for this - if I do, I'll attach it.

      Update: 1.645 also deadlocks. AbstractLazyLoadRunMap.load() is synchronized there too. I am confused about why this started happening once I upgraded from 1.645 but continues to happen now that I have downgraded back. Maybe a newer version of some other plugin is making this more likely (e.g. the SVN plugin polls more aggressively, etc.)

            Unassigned Unassigned
            akom Alexander Komarov