Jenkins / JENKINS-38335

Search API too slow for dashboard

    Details

    • Sprint:
      pacific, atlantic, 1.0-b05/b-06

      Description

      Search API (page limit 26)

      • Took 7 seconds on average (in other cases, 12 seconds for 11 KB of data, per Ben Walding)
      • To sort by name and to exclude certain items from flattening, all possible items are loaded into memory
        • jenkins.getAllItems()
        • Exclude items that are children of multi-branch and matrix projects
        • Don’t see how we can avoid this unless the jobs are stored in some kind of DB with an indexed column, which would speed up sorting and return a limited set of records instead of all of them
      • new PipelineContainerImpl().getPipelines(items) should be pagination-aware
      • At present it creates a BluePipeline object for every possible item (ouch!)

      Also: check that runs/activity benefits from the same pagination-aware optimisation
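
A rough sketch of what "pagination-aware" could mean here, assuming item names are cheap to read and the wrapper object is the expensive part. `PagedPipelines`, `Wrapper` and `page` are hypothetical names used only for illustration; they are not Blue Ocean classes, and `Wrapper` merely stands in for something like BluePipeline. All names are still sorted in memory, but only the requested page is wrapped:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class PagedPipelines {
    /** Hypothetical stand-in for an expensive wrapper such as BluePipeline. */
    public static final class Wrapper {
        /** Counts how many wrappers have been constructed. */
        public static final AtomicInteger CREATED = new AtomicInteger();
        public final String name;

        public Wrapper(String name) {
            this.name = name;
            CREATED.incrementAndGet();
        }
    }

    /** Sorts the cheap names first, then wraps only the requested page. */
    public static List<Wrapper> page(List<String> itemNames, int start, int limit) {
        return itemNames.stream()
                .sorted(Comparator.naturalOrder())
                .skip(start)
                .limit(limit)
                .map(Wrapper::new) // runs only for elements that survive skip/limit
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            names.add(String.format("job-%04d", i));
        }
        List<Wrapper> firstPage = page(names, 0, 26);
        System.out.println("items on page: " + firstPage.size());
        System.out.println("wrappers created: " + Wrapper.CREATED.get());
    }
}
```

With 1000 names and a limit of 26, this constructs 26 wrappers rather than 1000. The sort still touches every name, so it does not remove the in-memory cost noted above; it only avoids building the heavyweight objects for items that are never returned.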

            Activity

            michaelneale Michael Neale created issue -
            michaelneale Michael Neale made changes -
            Field Original Value New Value
            Link This issue relates to JENKINS-38087 [ JENKINS-38087 ]
            michaelneale Michael Neale made changes -
            Epic Link JENKINS-35759 [ 171771 ]
            michaelneale Michael Neale added a comment - ping James Dumay
            michaelneale Michael Neale made changes -
            Link This issue blocks JENKINS-38079 [ JENKINS-38079 ]
            michaelneale Michael Neale made changes -
            Sprint 1.0-b05/b-06 [ 111 ]
            jamesdumay James Dumay made changes -
            Assignee Vivek Pandey [ vivek ]
            jamesdumay James Dumay made changes -
            Rank Ranked higher
            vivek Vivek Pandey made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            vivek Vivek Pandey added a comment -

            Michael Neale: https://issues.jenkins-ci.org/browse/JENKINS-38087 is a duplicate; one of them should be closed.

            michaelneale Michael Neale made changes -
            Link This issue is duplicated by JENKINS-38087 [ JENKINS-38087 ]
            vivek Vivek Pandey made changes -
            Attachment search-perf-dogfood-remote.tiff [ 34031 ]
            vivek Vivek Pandey made changes -
            Attachment search-perf-dogfood-local.tiff [ 34032 ]
            vivek Vivek Pandey added a comment - - edited

            Keith Zantow, my initial investigation report:

            • PipelineSearch calls Jenkins.getInstance().getAllItems(). Jenkins keeps all of these items in memory, so this is a purely in-memory cost.
            • PipelineSearch makes a few passes (at least 2 iterations) over all items and wraps them in BluePipeline implementation objects.
            • Only one page of results, for example 26 BluePipeline items, gets serialized over the wire.
            • Most of the cost is paid during serialization.
              • The real cost is in things like getting the build id, the last build, run details, actions attached to the pipeline job, actions attached to jenkins.model.Run, etc.
            • I tried the ci.blueocean.io jobs locally (using a downloaded copy of the dogfood CI server's Jenkins home) as well as on ci.blueocean.io itself, but neither shows expensive search API calls: 95 ms locally, 364 ms at ci.blueocean.io. See attached images.
            • I tried profiling a Blue Ocean instance running locally with the dogfood home directory, but the cost of fetching pipeline results was negligible.
            • I suspect some plugins contribute actions that cause the slow behavior, so if Ben Walding could share his workload, that would help us identify why it is slow on his instance.

            search-perf-dogfood-remote.tiff
            search-perf-dogfood-local.tiff
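
Since the report places the cost in getters invoked at serialization time, the set of properties actually requested matters. The following is only a sketch of that idea, not how Jenkins or Stapler implement it: `TreeFilter` and `serialize` are hypothetical names, and the property getters are modelled as plain `Supplier`s so that an unrequested getter is never called.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class TreeFilter {
    /**
     * Evaluates only the getters whose names appear in the requested list.
     * Getters that were not requested are never invoked, so they cost nothing.
     */
    public static Map<String, Object> serialize(
            Map<String, Supplier<Object>> getters, List<String> requested) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : requested) {
            Supplier<Object> getter = getters.get(field);
            if (getter != null) {
                out.put(field, getter.get());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Supplier<Object>> getters = new LinkedHashMap<>();
        getters.put("name", () -> "job-a"); // cheap property
        getters.put("slowProperty", () -> { // hypothetical expensive property
            System.out.println("expensive getter invoked");
            return 0;
        });
        // Only "name" is requested, so the expensive getter is skipped.
        System.out.println(serialize(getters, Arrays.asList("name")));
    }
}
```

Requesting one property at a time in this way is also a cheap method for finding which getter dominates the response time.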

            michaelneale Michael Neale made changes -
            Assignee Vivek Pandey [ vivek ] Keith Zantow [ kzantow ]
            michaelneale Michael Neale added a comment -

            Punting this to Keith for now as Vivek is away. It looks like, until we get a JENKINS_HOME that reproduces the search slowness, there are probably other avenues to explore for optimisation.

            jamesdumay James Dumay made changes -
            Sprint 1.0-b05/b-06 [ 111 ] 26-september, 1.0-b05/b-06 [ 101, 111 ]
            michaelneale Michael Neale made changes -
            Epic Link JENKINS-35759 [ 171771 ] JENKINS-37957 [ 174099 ]
            michaelneale Michael Neale made changes -
            Sprint pacific, 1.0-b05/b-06 [ 101, 111 ] pacific, atlantic, 1.0-b05/b-06 [ 101, 106, 111 ]
            vivek Vivek Pandey made changes -
            Assignee Keith Zantow [ kzantow ] Vivek Pandey [ vivek ]
            vivek Vivek Pandey added a comment -

            Looks like computing numberOfRunningPipelines is causing this extremely slow behavior. Below is my analysis:

            Running the following GET call takes anywhere from 13 sec to 30 sec, or sometimes returns a 504 error.

            GET https://JENKINS_HOST/blue/rest/search/?q=type:pipeline;excludedFromFlattening:jenkins.branch.MultiBranchProject,hudson.matrix.MatrixProject&filter=no-folders&start=0&limit=26&tree=class,name,_links,displayName,fullName,permissions,numberOfQueuedPipelines,organization,branchNames,latestRun,numberOfRunningPipelines
            

            or

            GET https://JENKINS_HOST/blue/rest/search/?q=type:pipeline;excludedFromFlattening:jenkins.branch.MultiBranchProject,hudson.matrix.MatrixProject&filter=no-folders&start=0&limit=26&tree=numberOfRunningPipelines
            

            Running the query below takes 148 ms. Notice that no numberOfRunningPipelines element is serialized. Clearly, computing this for a multi-branch pipeline means going over all builds of each branch and then filtering for running builds. I will get back on this after more investigation.

            GET https://JENKINS_HOST/blue/rest/search/?q=type:pipeline;excludedFromFlattening:jenkins.branch.MultiBranchProject,hudson.matrix.MatrixProject&filter=no-folders&start=0&limit=26&tree=class,name,_links,displayName,fullName,permissions,numberOfQueuedPipelines,organization,branchNames,latestRun
            

            I noticed that the UI is not even using numberOfRunningPipelines, and there is no optimal way to compute this number, so I am going to remove it from the backend. I will submit a PR and get the UI team to take a look at it.
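
To illustrate why this count is costly, here is a minimal sketch of the scan described above, under the assumption that the only way to find running builds is to visit every build of every branch. `RunningCount`, `Branch` and `Build` are hypothetical stand-ins, not Jenkins or Blue Ocean types, and the real implementation may differ.

```java
import java.util.Arrays;
import java.util.List;

public class RunningCount {
    /** Hypothetical stand-in for a build; it only records whether it is still running. */
    public static final class Build {
        public final boolean building;

        public Build(boolean building) {
            this.building = building;
        }
    }

    /** Hypothetical stand-in for one branch job of a multi-branch project. */
    public static final class Branch {
        public final List<Build> builds;

        public Branch(List<Build> builds) {
            this.builds = builds;
        }
    }

    /** Returns {running builds, builds visited}; the work grows with the total build count. */
    public static long[] countRunning(List<Branch> branches) {
        long running = 0;
        long visited = 0;
        for (Branch branch : branches) {
            for (Build build : branch.builds) {
                visited++;
                if (build.building) {
                    running++;
                }
            }
        }
        return new long[] {running, visited};
    }

    public static void main(String[] args) {
        List<Branch> branches = Arrays.asList(
                new Branch(Arrays.asList(new Build(false), new Build(true))),
                new Branch(Arrays.asList(new Build(false), new Build(false), new Build(true))));
        long[] result = countRunning(branches);
        System.out.println("running=" + result[0] + ", visited=" + result[1]);
    }
}
```

The number of builds visited equals the total number of builds across all branches, and this scan would be repeated for each of the 26 pipelines on a page, which is consistent with the gap between 148 ms and 13 to 30 sec reported above.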

            vivek Vivek Pandey made changes -
            Link This issue is blocked by JENKINS-38981 [ JENKINS-38981 ]
            vivek Vivek Pandey made changes -
            Status In Progress [ 3 ] In Review [ 10005 ]
            michaelneale Michael Neale added a comment -

            YAY, looks like this solves a lot...

            vivek Vivek Pandey made changes -
            Status In Review [ 10005 ] Resolved [ 5 ]
            Resolution Fixed [ 1 ]
            jbriden Jenn Briden made changes -
            Status Resolved [ 5 ] Closed [ 6 ]

              People

              • Assignee:
                vivek Vivek Pandey
                Reporter:
                michaelneale Michael Neale