Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-19743

Massive parallel builds sometimes cause errors in LogRotation

    Details

    • Similar Issues:

      Description

      We submit about 120 parallel builds with 15sec execution time => issue may be caused by frequent log rotation without locks.

      Jobs seem to be OK, but there is a lot of SEVERE messages in the Jenkins log.

      SEVERE: Executor threw an exception
      java.util.NoSuchElementException
      at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:154)
      at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1067)
      at java.util.AbstractMap$2$1.next(AbstractMap.java:385)
      at hudson.util.RunList.subList(RunList.java:143)
      at hudson.tasks.LogRotator.perform(LogRotator.java:119)
      at hudson.model.Job.logRotate(Job.java:404)
      at hudson.model.Run.execute(Run.java:1655)
      at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
      at hudson.model.ResourceController.execute(ResourceController.java:88)
      at hudson.model.Executor.run(Executor.java:237)ct.AbstractIterator.next(AbstractIterator.java:154)

        Attachments

        1. BuildFlowPluginTest.cs
          0.7 kB
        2. LogRotator.txt
          1 kB
        3. Run.txt
          1 kB
        4. StressTest_Job_config_1.png
          StressTest_Job_config_1.png
          116 kB
        5. StressTest_Job_config_2.png
          StressTest_Job_config_2.png
          68 kB
        6. StressTest_Job_config_3.png
          StressTest_Job_config_3.png
          85 kB
        7. StressTest_SubJob_config_1.png
          StressTest_SubJob_config_1.png
          102 kB
        8. StressTest_SubJob_config_2.png
          StressTest_SubJob_config_2.png
          87 kB
        9. StressTest_SubJob_config_3.png
          StressTest_SubJob_config_3.png
          57 kB

          Activity

          Hide
          phraktle Viktor Szathmary added a comment -

          Using Jenkins ver. 1.636, we are seeing the following exception on a job allowing concurrent builds and having lots of short jobs:

          hudson.model.Run execute
          SEVERE: Failed to rotate log
          java.util.NoSuchElementException
          at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:76)
          at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:63)
          at java.util.AbstractMap$2$1.next(AbstractMap.java:396)
          at hudson.util.RunList.subList(RunList.java:139)
          at hudson.tasks.LogRotator.perform(LogRotator.java:125)
          at hudson.model.Job.logRotate(Job.java:467)
          at hudson.model.Run.execute(Run.java:1805)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:98)
          at hudson.model.Executor.run(Executor.java:410)

          Show
          phraktle Viktor Szathmary added a comment - Using Jenkins ver. 1.636, we are seeing the following exception on a job allowing concurrent builds and having lots of short jobs: hudson.model.Run execute SEVERE: Failed to rotate log java.util.NoSuchElementException at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:76) at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.next(LazyLoadRunMapEntrySet.java:63) at java.util.AbstractMap$2$1.next(AbstractMap.java:396) at hudson.util.RunList.subList(RunList.java:139) at hudson.tasks.LogRotator.perform(LogRotator.java:125) at hudson.model.Job.logRotate(Job.java:467) at hudson.model.Run.execute(Run.java:1805) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:410)
          Hide
          phraktle Viktor Szathmary added a comment -

          We are also seeing this one (reported above as well). So it seems log rotation for concurrent jobs has a race condition.

          Nov 06, 2015 6:01:49 AM hudson.model.Run execute
          SEVERE: Failed to rotate log
          java.io.IOException:Redacted #173213: /redacted/builds/173213 looks to have already been deleted; siblings: [....lots of job ids....]
          at hudson.model.Run.delete(Run.java:1483)
          at hudson.tasks.LogRotator.perform(LogRotator.java:144)
          at hudson.model.Job.logRotate(Job.java:467)
          at hudson.model.Run.execute(Run.java:1805)
          at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
          at hudson.model.ResourceController.execute(ResourceController.java:98)
          at hudson.model.Executor.run(Executor.java:410)

          Show
          phraktle Viktor Szathmary added a comment - We are also seeing this one (reported above as well). So it seems log rotation for concurrent jobs has a race condition. Nov 06, 2015 6:01:49 AM hudson.model.Run execute SEVERE: Failed to rotate log java.io.IOException:Redacted #173213: /redacted/builds/173213 looks to have already been deleted; siblings: [....lots of job ids....] at hudson.model.Run.delete(Run.java:1483) at hudson.tasks.LogRotator.perform(LogRotator.java:144) at hudson.model.Job.logRotate(Job.java:467) at hudson.model.Run.execute(Run.java:1805) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:410)
          Hide
          schniedergers Klaus Schniedergers added a comment -

          I am seeing the same (SEVERE: Failed to rotate log), on 1.625.3 (most recent LTS).

          Once the SEVERE log entries start, Jenkins service starts to use all CPU it can get and becomes very slow/unresponsive.

          Show
          schniedergers Klaus Schniedergers added a comment - I am seeing the same (SEVERE: Failed to rotate log), on 1.625.3 (most recent LTS). Once the SEVERE log entries start, Jenkins service starts to use all CPU it can get and becomes very slow/unresponsive.
          Hide
          columbusmonkey Gavin Shelley added a comment -

          We see this on 1.611. We too have 100s of parallel jobs running. This pollutes the logs with scary messages. Not sure if it actually affects us apart from logs, but it would be nice to see this resolved in case it is the cause of the occasional unexplained problem.

          Show
          columbusmonkey Gavin Shelley added a comment - We see this on 1.611. We too have 100s of parallel jobs running. This pollutes the logs with scary messages. Not sure if it actually affects us apart from logs, but it would be nice to see this resolved in case it is the cause of the occasional unexplained problem.
          Hide
          mcsf M Chon added a comment - - edited

          I am getting these errors on the machine we build pull requests on:

           

          Nov 09, 2017 10:11:53 AM SEVERE hudson.model.Run executeFailed to rotate logjava.io.IOException: My-Build #1000: /var/lib/jenkins/jobs/My-Build/builds/1000 looks to have already been deleted; siblings: [1021, 1027, 1047, 1091, 1010, 1034, 1040, 1013, 1041, 1083, 1016, 1049, 1030, 1070, 1004, 1099, 1073, 1024, 1009, 1084, 1039, 1001, 1094, 1100, 1057, 1003, 1007, 1052, 1065, lastSuccessfulBuild, 1045, 1026, 1022, 1061, 1054, 1044, 1093, .1000, 1087, 1063, 1072, 1018, 1096, 1074, 1019, lastUnstableBuild, 1092, 1031, 1033, 1005, 1051, 1043, 1068, 1075, 1095, 1079, 1036, 1032, 1029, 1048, 1042, legacyIds, lastFailedBuild, lastUnsuccessfulBuild, 1025, 1078, 1080, 1090, 1046, 1069, 999, 1014, 1020, lastStableBuild, 1067, 1053, 1028, 1002, 1064, 1059, 1082, 1056, 1017, 1071, 1077, 1097, 1037, 1086, 1076, 1008, 1006, 1081, 1088, 1058, 998, 1023, 1060, 1050, 1012, 1062, 1066, 1038, 1098, 1085, 1055, 1015, 1011, 1089, 1035]        at hudson.model.Run.delete(Run.java:1483)        at hudson.tasks.LogRotator.perform(LogRotator.java:131)        at hudson.model.Job.logRotate(Job.java:474)        at hudson.model.Run.execute(Run.java:1784)        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)        at hudson.model.ResourceController.execute(ResourceController.java:98)        at hudson.model.Executor.run(Executor.java:404) 

           

          Nov 09, 2017 10:11:53 AM SEVERE hudson.model.Run executeFailed to rotate logjava.io.IOException: /var/lib/jenkins/jobs/My-Build/builds/1000 is in use        at hudson.model.Run.delete(Run.java:1503)        at hudson.tasks.LogRotator.perform(LogRotator.java:131)        at hudson.model.Job.logRotate(Job.java:474)        at hudson.model.Run.execute(Run.java:1784)        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)        at hudson.model.ResourceController.execute(ResourceController.java:98)        at hudson.model.Executor.run(Executor.java:404)

          Show
          mcsf M Chon added a comment - - edited I am getting these errors on the machine we build pull requests on:   Nov 09, 2017 10:11:53 AM SEVERE hudson.model.Run execute Failed to rotate log java.io.IOException: My-Build #1000: /var/lib/jenkins/jobs/My-Build/builds/1000 looks to have already been deleted; siblings: [1021, 1027, 1047, 1091, 1010, 1034, 1040, 1013, 1041, 1083, 1016, 1049, 1030, 1070, 1004, 1099, 1073, 1024, 1009, 1084, 1039, 1001, 1094, 1100, 1057, 1003, 1007, 1052, 1065, lastSuccessfulBuild, 1045, 1026, 1022, 1061, 1054, 1044, 1093, .1000, 1087, 1063, 1072, 1018, 1096, 1074, 1019, lastUnstableBuild, 1092, 1031, 1033, 1005, 1051, 1043, 1068, 1075, 1095, 1079, 1036, 1032, 1029, 1048, 1042, legacyIds, lastFailedBuild, lastUnsuccessfulBuild, 1025, 1078, 1080, 1090, 1046, 1069, 999, 1014, 1020, lastStableBuild, 1067, 1053, 1028, 1002, 1064, 1059, 1082, 1056, 1017, 1071, 1077, 1097, 1037, 1086, 1076, 1008, 1006, 1081, 1088, 1058, 998, 1023, 1060, 1050, 1012, 1062, 1066, 1038, 1098, 1085, 1055, 1015, 1011, 1089, 1035]         at hudson.model.Run.delete(Run.java:1483)         at hudson.tasks.LogRotator.perform(LogRotator.java:131)         at hudson.model.Job.logRotate(Job.java:474)         at hudson.model.Run.execute(Run.java:1784)         at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)         at hudson.model.ResourceController.execute(ResourceController.java:98)         at hudson.model.Executor.run(Executor.java:404)     Nov 09, 2017 10:11:53 AM SEVERE hudson.model.Run execute Failed to rotate log java.io.IOException: /var/lib/jenkins/jobs/My-Build/builds/1000 is in use         at hudson.model.Run.delete(Run.java:1503)         at hudson.tasks.LogRotator.perform(LogRotator.java:131)         at hudson.model.Job.logRotate(Job.java:474)         at hudson.model.Run.execute(Run.java:1784)         at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)         at hudson.model.ResourceController.execute(ResourceController.java:98)         at hudson.model.Executor.run(Executor.java:404)

            People

            • Assignee:
              Unassigned
              Reporter:
              oleg_nenashev Oleg Nenashev
            • Votes:
              12 Vote for this issue
              Watchers:
              18 Start watching this issue

              Dates

              • Created:
                Updated: