Jenkins / JENKINS-5185

Improve performance of parsers for long log files

      Description

      Currently, parsing performance is poor for some parsers on long log files. Maybe we should use another regexp library, or strip some text from the log before starting the parser.

      Here are some performance results on a log file of 840 lines:

      AcuCobol Compiler: 2439ms
      Ada Compiler (gnat): 21ms
      Buckminster Compiler: 12ms
      Coolflux DSP Compiler: 959ms
      Doxygen: 18ms
      Eclipse Java Compiler: 19ms
      Erlang Compiler: 11ms
      Flex SDK Compilers (compc & mxmlc): 113ms
      GNU compiler (gcc): 710ms
      GNU compiler 4 (gcc): 19ms
      GNU compiler 4 (ld): 700ms
      IAR compiler (C/C++): 14ms
      Intel compiler: 977ms
      Java Compiler: 137ms
      JavaDoc: 17ms
      MSBuild: 4491ms
      Oracle Invalids: 53ms
      PC-Lint: 4678ms
      PHP Runtime Warning: 1569ms
      Perforce Compiler: 2531ms
      Robocopy (please use /V in your commands!): 1733ms
      SUN C++ Compiler: 14ms
      Texas Instruments Code Composer Studio (C/C++): 3ms

      Here are the results after the optimization:

      AcuCobol Compiler: 15ms
      Ada Compiler (gnat): 27ms
      Buckminster Compiler: 19ms
      Coolflux DSP Compiler: 2ms
      Doxygen: 18ms
      Eclipse Java Compiler: 29ms
      Erlang Compiler: 24ms
      Flex SDK Compilers (compc & mxmlc): 6ms
      GNU compiler (gcc): 96ms
      GNU compiler 4 (gcc): 20ms
      GNU compiler 4 (ld): 68ms
      IAR compiler (C/C++): 19ms
      Intel compiler: 5ms
      Java Compiler: 53ms
      JavaDoc: 13ms
      MSBuild: 72ms
      Oracle Invalids: 6ms
      PC-Lint: 99ms
      PHP Runtime Warning: 2ms
      Perforce Compiler: 3ms
      Robocopy (please use /V in your commands!): 2ms
      SUN C++ Compiler: 4ms
      Texas Instruments Code Composer Studio (C/C++): 3ms
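The issue does not say how these timings were collected. As a minimal sketch only, a per-parser timing over the lines of a log could be taken as below; the class and method names (ParserTiming, countMatches, timeMillis) are hypothetical and not part of the plug-in, and a single wall-clock pass like this is only a rough measurement.

```java
import java.util.List;
import java.util.regex.Pattern;

public class ParserTiming {
    /** Counts the lines in which the pattern is found (Matcher.find semantics). */
    static int countMatches(Pattern pattern, List<String> lines) {
        int count = 0;
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                count++;
            }
        }
        return count;
    }

    /** Returns the elapsed wall-clock time in milliseconds for one pass over the lines. */
    static long timeMillis(Pattern pattern, List<String> lines) {
        long start = System.nanoTime();
        countMatches(pattern, lines);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```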

          Activity

          fchateau fchateau added a comment -

          I just fixed this issue.
          The problem occurred because most regular expressions were far too permissive: the matched substring could begin anywhere in the line.
          By anchoring the regular expression to the beginning of the line, the cost of matching decreases by roughly a factor of n (n being the length of the line). In other words, the regexp engine no longer has to attempt a full match starting at each character of the line, one after another.
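To illustrate the point about permissive patterns, here is a small sketch using java.util.regex. The warning format and both patterns are made up for the example and are not taken from the plug-in. With Matcher.find(), the unanchored pattern is retried at later start positions and ends up matching in the middle of a line, while the anchored one can only match from the first character.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    // Hypothetical warning format: "<file>:<line>: warning: <message>"
    static final Pattern UNANCHORED = Pattern.compile("(\\S+):(\\d+): warning: (.*)");
    static final Pattern ANCHORED = Pattern.compile("^(\\S+):(\\d+): warning: (.*)$");

    /** Returns true if the pattern matches somewhere in the line (Matcher.find semantics). */
    static boolean found(Pattern pattern, String line) {
        return pattern.matcher(line).find();
    }

    public static void main(String[] args) {
        String warning = "Foo.java:12: warning: unused variable";
        String noise = "[javac] note about Foo.java:12: warning: unused variable";

        // Both patterns accept a line that starts with the warning.
        System.out.println(found(UNANCHORED, warning));
        System.out.println(found(ANCHORED, warning));

        // Only the unanchored pattern matches when the warning text is buried in the line.
        System.out.println(found(UNANCHORED, noise));
        System.out.println(found(ANCHORED, noise));
    }
}
```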

          In the future we should check that every pattern begins with ^ and ends with $, and that there are no pipes '|' at the topmost level. Anchors do not apply to all alternatives unless the alternatives are enclosed in a group.
          In other words:
          ^a|b$ is wrong (it means "begins with a" or "ends with b")
          ^a$|^b$ is good, but redundant
          ^(?:a|b)$ is better
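The three patterns above can be compared directly in Java. The sketch below also includes a rough check for a top-level '|', as one way the proposed rule might be verified. The helper hasTopLevelAlternation is hypothetical and not part of the plug-in, and it deliberately ignores corner cases such as \Q...\E quoting or a ']' placed first in a character class.

```java
import java.util.regex.Pattern;

public class AlternationDemo {
    /** Returns true if the regex matches somewhere in the line (Matcher.find semantics). */
    static boolean found(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    /**
     * Rough, illustrative check: returns true if the regex contains a '|'
     * outside any group and outside any character class.
     */
    static boolean hasTopLevelAlternation(String regex) {
        int depth = 0;
        boolean inClass = false;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (inClass) {
                if (c == ']') {
                    inClass = false;
                }
            } else if (c == '[') {
                inClass = true;
            } else if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == '|' && depth == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "^a|b$" accepts any line that begins with a OR ends with b.
        System.out.println(found("^a|b$", "axx"));
        System.out.println(found("^a|b$", "xxb"));
        // The grouped form accepts only the whole lines "a" and "b".
        System.out.println(found("^(?:a|b)$", "axx"));
        System.out.println(found("^(?:a|b)$", "b"));
        System.out.println(hasTopLevelAlternation("^a|b$"));
        System.out.println(hasTopLevelAlternation("^(?:a|b)$"));
    }
}
```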

          drulli Ulli Hafner added a comment -

          Thanks for improving the regular expressions!

          drulli Ulli Hafner added a comment -

          Integrated in Hudson Plug-ins #70
          JENKINS-5185: Fixed. Added beginning-of-line '^' and end-of-line '$' anchors to all regular expressions. Just adding these anchors achieves a tremendous speedup (which is logical because it decreases the algorithm's complexity by one degree).

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in hudson
          User: fchateau
          Path:
          trunk/hudson/plugins/warnings/src/main/java/hudson/plugins/warnings/parser/DoxygenParser.java
          trunk/hudson/plugins/warnings/src/test/java/hudson/plugins/warnings/parser/DoxygenParserTest.java
          http://jenkins-ci.org/commit/31336
          Log:
          JENKINS-5185: Fixed CheckStyle warnings introduced by [31231].

          drulli Ulli Hafner added a comment -

          Integrated in Hudson Plug-ins #71
          JENKINS-5185: Fixed CheckStyle warnings introduced by [31231].


            People

            • Assignee:
              drulli Ulli Hafner
              Reporter:
              drulli Ulli Hafner
            • Votes:
              5
              Watchers:
              6