  1. Jenkins
  2. JENKINS-44056

Pipeline jobs waiting in queue on master (available executors)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Component/s: job-dsl-plugin
    • Environment:
      Docker version 1.9.1, build 78ee77d/1.9.1
      Operating System: Linux 3.10.0-327.13.1.el7.x86_64 GNU/Linux
      Jenkins ver. 2.19.3
      Plugins:
       - job-dsl:1.47
       - Pipeline:2.4




      Description

      We have a problem with Pipeline jobs that start on the master node and are then redirected to specific slaves. After starting, they wait on the last executor doing nothing for about 5-10 minutes (see 01.png); after that, the job is redirected to its slave and the checkout starts.

      On our master node we have "***Generator" jobs that generate jobs through the Job DSL plugin; they are triggered automatically by SVN polling. The master node has about 20 free executors (see 02.png). When all of them are free, Pipeline jobs are not blocked. But if one or more DSL jobs are running on the master node, Pipeline jobs wait in the queue on the last executor and do nothing for about 5-10 minutes; after that, the job is redirected to its slave and the process continues normally.

      Why does one DSL job block all Pipeline jobs even when there are 19 free executors?

       

      UPDATE:

      In 03.png you can see how deploy jobs accumulate below the last free executor (they are not even assigned to a specific executor; they just wait below the last master executor).

      • job-dsl:1.63

      I deduced that the blocking of executors by DSL jobs is related to an increasing build time trend. After a Jenkins restart, the DSL jobs run slower and slower. See 04.png and 05.png (these two screenshots show the build time trend for two different job generators). As more and more generators take longer, one week after a Jenkins restart the DSL generators run for a very long time on the master. If at least two of them are running and the estimated time to generate jobs is about 7-10 minutes, other Pipeline jobs wait in the queue below the last executor.

      So the question now is why the build time trend of the DSL jobs increases after a Jenkins restart.
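
      One way to put a number on the build time trend described above is to fit a straight line through the durations of consecutive generator builds. The following is only a minimal sketch and not part of the original report: the function name `duration_slope` and the sample durations are hypothetical, and the durations are assumed to have been collected already (for example from the build history) and converted to seconds.

```python
from statistics import mean


def duration_slope(durations_s):
    """Return the least-squares slope of build duration (seconds) per build.

    durations_s: durations of consecutive builds, oldest first.
    A clearly positive slope means each build tends to take longer
    than the previous one.
    """
    n = len(durations_s)
    if n < 2:
        raise ValueError("need at least two builds to estimate a trend")
    xs = range(n)  # build index 0, 1, 2, ...
    x_bar = mean(xs)
    y_bar = mean(durations_s)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, durations_s))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical durations (seconds) of one generator job after a restart.
sample = [60, 75, 90, 105, 120]
print(duration_slope(sample))
```

      Comparing the slope shortly after a restart with the slope a week later would show whether the slowdown is steady or accelerating.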

       

        Attachments

        1. 01.png (8 kB)
        2. 02.png (19 kB)
        3. 03.png (29 kB)
        4. 04.png (67 kB)
        5. 05.png (93 kB)

          Activity

          danielbeck Daniel Beck added a comment -

          It's not clear what the problem is or whether there's in fact a bug at all. Your description is not sufficient to tell.

          I recommend you ask on the Jenkins users mailing list for advice.

          mmorawski Mikołaj Morawski added a comment -

          Description updated (more information).

          daspilker Daniel Spilker added a comment -

          Post a minimal DSL script that reproduces the problem.

          Also check the JVM memory stats for your master. There are several reports about memory leaks (e.g. JENKINS-46687 and JENKINS-46514). If this problem is caused by memory pressure, this issue can be closed as duplicate.


            People

            • Assignee:
              daspilker Daniel Spilker
              Reporter:
              mmorawski Mikołaj Morawski
            • Votes:
              0
              Watchers:
              3
