Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-33168

Junit plugin took 7 hours to complete recording test failures

    XMLWordPrintable

    Details

    • Similar Issues:

      Description

      We heavily use the Junit plugin in our jobs to record our test results as part of our CD/CI pipeline
      Recently we hit over 10154 tests now, which is contained in about 50 junit.xml files we copy to the jenkins workspace of the job to have them recorded

      We use the jenkins junit post build step to record with these params
      Test report XMLs = */.xml
      Health report amplification factor =1.0
      Additional test report features = Allow claiming of failed tests & Test stability History

      We have had some regressions in our pipeline causing lots of tests to fail, today we had 235 failures, and the jenkins junit plugin blocked our job for 7 hours to complete the recording of the test results... this was crazy long, and "Aborting/canceling" the jenkins job did nothing, so we just had to wait it out..... this completely blocks our pipeline

      I didn't see any other bugs like this in the queue, so sorry if this is a duplicate. I am wondering if one of the "AddOns" we have like Test stability History could cause this?

      The Job console log literally took 7 hours to complete
      Recording test results
      Found 5922 test results
      Found failed test junit/com.ibm.test..........

      Anything we can do to fix this or make it so this doesn't happen we are willing to try, or ideas around this, as it pretty much blocks our entire pipeline, and if you want more info or debug etc, just let me know what you want and I can look to collect it

        Attachments

          Issue Links

            Activity

            Hide
            tsniatowski Tomasz Śniatowski added a comment -

            We're also seeing this problem on Jenkins ver. 2.7.2, junit plugin 1.18, Ubuntu 14.04 and it's causing a significant disruption because the slow parsing cannot be aborted.

            We're seeing threads in states like

            "Executor #0 for aa-01 : executing foo #141" Id=3787231 Group=main RUNNABLE
                at com.thoughtworks.xstream.converters.reflection.FieldDictionary.buildMap(FieldDictionary.java:122)
                -  locked com.thoughtworks.xstream.converters.reflection.FieldDictionary@4e6649e2
                at com.thoughtworks.xstream.converters.reflection.FieldDictionary.fieldOrNull(FieldDictionary.java:113)
                at com.thoughtworks.xstream.converters.reflection.PureJavaReflectionProvider.getFieldOrNull(PureJavaReflectionProvider.java:194)
                at hudson.util.RobustReflectionConverter.fieldDefinedInClass(RobustReflectionConverter.java:347)
                at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:284)
                at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:229)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
                at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
                at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71)
                at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85)
                at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80)
                at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
                at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
                at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:352)
                at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:290)
                at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:229)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
                at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
                at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71)
                at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85)
                at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80)
                at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
                at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
                at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:352)
                at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:290)
                at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:229)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
                at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
                at com.thoughtworks.xstream.core.TreeUnmarshaller.start(TreeUnmarshaller.java:134)
                at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.unmarshal(AbstractTreeMarshallingStrategy.java:32)
                at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1189)
                at hudson.util.XStream2.unmarshal(XStream2.java:114)
                at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1173)
                at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1053)
                at hudson.XmlFile.read(XmlFile.java:142)
                at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:208)
                at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:145)
                -  locked hudson.tasks.junit.TestResultAction@5c319ff8
                at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
                at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:248)
                at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
                at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:298)
                at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:447)
                at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:54)
                at de.esailors.jenkins.teststability.StabilityTestDataPublisher.buildUpInitialHistory(StabilityTestDataPublisher.java:149)
                at de.esailors.jenkins.teststability.StabilityTestDataPublisher.getTestData(StabilityTestDataPublisher.java:101)
                at hudson.tasks.junit.TestDataPublisher.contributeTestData(TestDataPublisher.java:75)
                at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:182)
                -  locked hudson.model.FreeStyleBuild@2aaa1e47
                at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:78)
                at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
                at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:782)
                at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:723)
                at hudson.model.Build$BuildExecution.post2(Build.java:185)
                at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:668)
                at hudson.model.Run.execute(Run.java:1763)
                at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
                at hudson.model.ResourceController.execute(ResourceController.java:98)
                at hudson.model.Executor.run(Executor.java:410)
            
            Show
            tsniatowski Tomasz Śniatowski added a comment - We're also seeing this problem on Jenkins ver. 2.7.2, junit plugin 1.18, Ubuntu 14.04 and it's causing a significant disruption because the slow parsing cannot be aborted. We're seeing threads in states like "Executor #0 for aa-01 : executing foo #141" Id=3787231 Group=main RUNNABLE at com.thoughtworks.xstream.converters.reflection.FieldDictionary.buildMap(FieldDictionary.java:122) - locked com.thoughtworks.xstream.converters.reflection.FieldDictionary@4e6649e2 at com.thoughtworks.xstream.converters.reflection.FieldDictionary.fieldOrNull(FieldDictionary.java:113) at com.thoughtworks.xstream.converters.reflection.PureJavaReflectionProvider.getFieldOrNull(PureJavaReflectionProvider.java:194) at hudson.util.RobustReflectionConverter.fieldDefinedInClass(RobustReflectionConverter.java:347) at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:284) at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:229) at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72) at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50) at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71) at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85) at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80) at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76) at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72) at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66) at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:352) at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:290) at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:229) at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72) at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50) at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71) at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85) at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80) at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76) at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72) at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66) at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:352) at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:290) at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:229) at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72) at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66) at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50) at com.thoughtworks.xstream.core.TreeUnmarshaller.start(TreeUnmarshaller.java:134) at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.unmarshal(AbstractTreeMarshallingStrategy.java:32) at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1189) at hudson.util.XStream2.unmarshal(XStream2.java:114) at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1173) at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1053) at hudson.XmlFile.read(XmlFile.java:142) at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:208) at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:145) - locked hudson.tasks.junit.TestResultAction@5c319ff8 at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63) at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:248) at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146) at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:298) at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:447) at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:54) at de.esailors.jenkins.teststability.StabilityTestDataPublisher.buildUpInitialHistory(StabilityTestDataPublisher.java:149) at de.esailors.jenkins.teststability.StabilityTestDataPublisher.getTestData(StabilityTestDataPublisher.java:101) at hudson.tasks.junit.TestDataPublisher.contributeTestData(TestDataPublisher.java:75) at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:182) - locked hudson.model.FreeStyleBuild@2aaa1e47 at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:78) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:782) at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:723) at hudson.model.Build$BuildExecution.post2(Build.java:185) at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:668) at hudson.model.Run.execute(Run.java:1763) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:410)
            Hide
            tsniatowski Tomasz Śniatowski added a comment -

            The problem does not reliably reproduce when we take a "bad" output xml and try to process it with the junit plugin on an idle test instance of Jenkins.

            Show
            tsniatowski Tomasz Śniatowski added a comment - The problem does not reliably reproduce when we take a "bad" output xml and try to process it with the junit plugin on an idle test instance of Jenkins.
            Hide
            migi Michal Pryc added a comment -

            Hi,

            I can reproduce this bug.

            When many tests are failing junit takes ages to finish.

            Show
            migi Michal Pryc added a comment - Hi, I can reproduce this bug. When many tests are failing junit takes ages to finish.
            Hide
            orrc Christopher Orr added a comment -

            Michal Pryc: Do you also have the Test Stability plugin enabled for this build?

            Show
            orrc Christopher Orr added a comment - Michal Pryc : Do you also have the Test Stability plugin enabled for this build?
            Hide
            migi Michal Pryc added a comment -

            Christopher Orr Yes I do have Test Stability history added as "Additional test report features" with "Health report amplification factor" set to 1.
            I have the "Retain long standard output/error" unset.
            Do not fail the build on empty test results is unset.

            Show
            migi Michal Pryc added a comment - Christopher Orr Yes I do have Test Stability history added as "Additional test report features" with "Health report amplification factor" set to 1. I have the "Retain long standard output/error" unset. Do not fail the build on empty test results is unset.
            Hide
            seanf Sean Flanigan added a comment -

            Thanks for including those details Tomasz Śniatowski. Very helpful.

            I've been investigating a bit (in the code and the issue tracker), and it looks like the JUnit plugin's method CaseResult.getPreviousResult (which is called by the Test Stability plugin) gets really slow when there isn't enough heap memory to hold enough JUnit results in memory. TestResultAction uses WeakReferences to hold the actual results, so if there's not enough memory to go around, the system will spend a lot of time re-reading results from disk.

            It would get worse when tests are failing because JUnit retains stdout/stderr for each failing test, thus using more memory. An idle Jenkins would have less memory pressure, which would explain why it's harder to reproduce the problem.

            I'm not sure if there's an easy way of reducing the memory requirements, so for now throwing more heap memory at the problem might be the only workaround.

            A proper solution will probably require setting up some sort of indexed database to hold test results for efficient queries, updated after each test run.

            Although, it might help to iterate over the TestResult s differently, such that each build's TestResult s is examined exactly once. Right now, I think it may be loading previous builds' complete TestResult s many times, and also calling TestObject.findCorrespondingResult(id) many times.

            Refs: https://issues.jenkins-ci.org/browse/JENKINS-24457 and https://issues.jenkins-ci.org/browse/JENKINS-10175 (including comments)

            Show
            seanf Sean Flanigan added a comment - Thanks for including those details Tomasz Śniatowski . Very helpful. I've been investigating a bit (in the code and the issue tracker), and it looks like the JUnit plugin's method CaseResult.getPreviousResult (which is called by the Test Stability plugin) gets really slow when there isn't enough heap memory to hold enough JUnit results in memory. TestResultAction uses WeakReferences to hold the actual results, so if there's not enough memory to go around, the system will spend a lot of time re-reading results from disk. It would get worse when tests are failing because JUnit retains stdout/stderr for each failing test, thus using more memory. An idle Jenkins would have less memory pressure, which would explain why it's harder to reproduce the problem. I'm not sure if there's an easy way of reducing the memory requirements, so for now throwing more heap memory at the problem might be the only workaround. A proper solution will probably require setting up some sort of indexed database to hold test results for efficient queries, updated after each test run. Although , it might help to iterate over the TestResult s differently, such that each build's TestResult s is examined exactly once. Right now, I think it may be loading previous builds' complete TestResult s many times, and also calling TestObject.findCorrespondingResult(id) many times. Refs: https://issues.jenkins-ci.org/browse/JENKINS-24457 and https://issues.jenkins-ci.org/browse/JENKINS-10175 (including comments)
            Hide
            tsniatowski Tomasz Śniatowski added a comment -

            FWIW we worked around the issue by limiting the amount of history kept for the affected jobs (which is unfortunate as it limits the usefulness of the entire setup)

            Show
            tsniatowski Tomasz Śniatowski added a comment - FWIW we worked around the issue by limiting the amount of history kept for the affected jobs (which is unfortunate as it limits the usefulness of the entire setup)
            Hide
            seanf Sean Flanigan added a comment -

            This memory pressure problem tends to trigger JENKINS-31660 in the JUnit plugin if the number of builds is big enough to exceed the stack.

            Show
            seanf Sean Flanigan added a comment - This memory pressure problem tends to trigger JENKINS-31660 in the JUnit plugin if the number of builds is big enough to exceed the stack.

              People

              • Assignee:
                seanf Sean Flanigan
                Reporter:
                lorelei Lorelei McCollum
              • Votes:
                6 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated: