Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Component/s: core
    • Labels:
      None
    • Environment:
      Jenkins 1.611, prioritysorter 3.2

      Description

      Unknown if core or plugin issue.
      Had a group of jobs stuck waiting in the queue.
      Hovering over them shows triggers and wait time as usual, but no indication of what they're waiting for.

      Suspicious log entry:
      May 21, 2015 4:43:19 PM hudson.util.DescribableList buildDependencyGraph
      SEVERE: Failed to build dependency graph for hudson.model.FreeStyleProject@57fdabe[MyStuckJob]
      java.lang.NullPointerException
      at hudson.tasks.Fingerprinter$FingerprintAction.getFingerprints(Fingerprinter.java:373)
      at hudson.tasks.Fingerprinter$FingerprintAction.getDependencies(Fingerprinter.java:403)
      at hudson.tasks.Fingerprinter$FingerprintAction.getDependencies(Fingerprinter.java:390)
      at hudson.tasks.Fingerprinter.buildDependencyGraph(Fingerprinter.java:157)
      at hudson.util.DescribableList.buildDependencyGraph(DescribableList.java:219)
      at hudson.model.Project.buildDependencyGraph(Project.java:207)
      at hudson.model.DependencyGraph.build(DependencyGraph.java:95)
      at jenkins.model.Jenkins.rebuildDependencyGraph(Jenkins.java:3748)
      at jenkins.model.Jenkins$25.call(Jenkins.java:3770)
      at jenkins.model.Jenkins$25.call(Jenkins.java:3766)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)

      Possibly caused by deleting some other jobs - they weren't supposed to be in the dependency tree, but coincidentally included identical files that were fingerprinted at some point.

      Queuing strategy is Weighted Fair Queuing.
      Weren't in the queue on startup, unless lack of JENKINS-28486 pulling them in would count.

          Activity

          emsa23 Magnus Sandberg added a comment -

          PrioritySorter will only order the Jobs in the Queue and will not interfere with execution/scheduling (if you do not use "Run Exclusive") so this does look like a problem somewhere else.

          Totally off topic: you are the first person I have seen using anything other than "Absolute" sorting. I would be happy if you could share your thoughts on the queueing strategy; feel free to email me. Thanks,

          jameshowe James Howe added a comment - - edited

          Downgraded plugin to 2.12. Restarted.
          Dependency graph errors not seen. Items already stuck in queue went through, though a few hours later they were stuck again.
          Status now shows as (pending—???)

          Manually deleted from disk fingerprints that mentioned jobs I'd recently deleted. Reloaded.
          As above.

          Opened and saved the config of a job that was stuck without making changes.
          Immediately all jobs that were stuck began executing.
          Next time the job came through the queue, it and some others got stuck again.

          jameshowe James Howe added a comment - - edited

          Doesn't appear to have gotten stuck again since then.

          Jobs got stuck again for a few hours. Fewer than before.
          This time clicking on one of them was enough to have them immediately execute.

          jameshowe James Howe added a comment -

          Got stuck with a blank status, and a config save no longer pushes them through.
          Maybe there are two related issues here.

          May 28, 2015 2:12:32 PM FINE PrioritySorter.Queue.Items
          Blocking: Id: 1093, JobName: MyJob, jobGroupId: 1, reason: <none>, priority: 4, weight: 5.9999997E-4, status: BLOCKED
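
For anyone scanning a larger log for the same symptom, here is a minimal sketch that parses lines of this shape and flags items that are BLOCKED with no stated reason. It is not part of PrioritySorter: the field layout is inferred from the single line above, and `ITEM_PATTERN`, `parse_item`, and `is_unexplained_block` are hypothetical names.

```python
import re

# Field layout assumed from the one sample line in this report.
ITEM_PATTERN = re.compile(
    r"Id: (?P<id>\d+), JobName: (?P<job>[^,]+), jobGroupId: (?P<group>\d+), "
    r"reason: (?P<reason>.*?), priority: (?P<priority>\d+), "
    r"weight: (?P<weight>[-+0-9.eE]+), status: (?P<status>\w+)"
)


def parse_item(line):
    """Return the queue item fields as a dict, or None if the line does not match."""
    match = ITEM_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "id": int(fields["id"]),
        "job": fields["job"],
        "group": int(fields["group"]),
        # "<none>" is how the sample line marks a missing reason.
        "reason": None if fields["reason"] == "<none>" else fields["reason"],
        "priority": int(fields["priority"]),
        "weight": float(fields["weight"]),
        "status": fields["status"],
    }


def is_unexplained_block(item):
    """True for items that are BLOCKED without any recorded reason."""
    return item is not None and item["status"] == "BLOCKED" and item["reason"] is None


sample = (
    "Blocking: Id: 1093, JobName: MyJob, jobGroupId: 1, reason: <none>, "
    "priority: 4, weight: 5.9999997E-4, status: BLOCKED"
)
item = parse_item(sample)
print(item["job"], is_unexplained_block(item))
```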

          emsa23 Magnus Sandberg added a comment -

          The log line you have only tells us that Jenkins core has changed the status of the job to BLOCKED, nothing more.

          jameshowe James Howe added a comment -

          I think the non-fixable blocking may be due to a detected circular dependency.
          Job A has job B listed as upstream, but job B should actually be a few layers downstream of job A.
          Both jobs, and all jobs downstream and in-between, are blocked with no message.

          This would have been computed using historical fingerprints?
          Any way to sort this out?
          At the moment I just have to cancel them all and manually trigger the bottom one.
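
The suspected situation can be illustrated with a small sketch. This is not Jenkins code: the intermediate job X, the `downstream_of` mapping, and `find_cycle` are hypothetical. It models the expected A → X → B chain plus the spurious edge that makes B upstream of A, and finds the resulting cycle with a depth-first search.

```python
from typing import Dict, List, Optional


def find_cycle(downstream_of: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of job names, or None if acyclic."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(job: str) -> Optional[List[str]]:
        state[job] = IN_PROGRESS
        path.append(job)
        for nxt in downstream_of.get(job, []):
            nxt_state = state.get(nxt, UNVISITED)
            if nxt_state == IN_PROGRESS:
                # nxt is already on the current path, so we have closed a loop.
                return path[path.index(nxt):] + [nxt]
            if nxt_state == UNVISITED:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[job] = DONE
        return None

    for job in list(downstream_of):
        if state.get(job, UNVISITED) == UNVISITED:
            found = visit(job)
            if found:
                return found
    return None


# Intended chain: A triggers X, X triggers B.
intended = {"A": ["X"], "X": ["B"], "B": []}
# Same chain plus the nonsensical edge where B is listed as upstream of A.
with_spurious_edge = {"A": ["X"], "X": ["B"], "B": ["A"]}

print(find_cycle(intended))
print(find_cycle(with_spurious_edge))
```

In this model every job on the reported cycle is waiting on another job in the same cycle, which would be consistent with the whole group staying blocked.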

          jameshowe James Howe added a comment -

          I disabled hudson.tasks.Fingerprinter.enableFingerprintsInDependencyGraph, which I had not realised had been set.
          No further problems.

          I suspect it's purely using the timestamp of a fingerprint rather than any other relationship, which led to my nonsensical dependencies.
          Perhaps there should be a way to detect these cycles, to remove the offending fingerprint, or to override the detected dependency.

          danielbeck Daniel Beck added a comment -

          James Howe Could you explain how to reproduce the issue on a new Jenkins instance? Even if that happened, there should be a user-visible explanation of the behavior. "???" isn't exactly helpful.

          jameshowe James Howe added a comment -

          Not easily. This instance has been up for many years, and anything could have happened to mix up the fingerprints.

          There are also two different cases I've detailed above.
          The one with the blank reason, and the one with "???".
          Each has a different workaround to recover.


            People

            • Assignee:
              emsa23 Magnus Sandberg
              Reporter:
              jameshowe James Howe
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated: