  1. Jenkins
  2. JENKINS-48434

Perforce based Pipelines with polling can Delete Jenkins Home directory


    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Component: p4-plugin
    • Environment: P4 Plugin 1.7.0, Jenkins 2.63

      A race condition can cause polling for changes to wipe the Jenkins home directory. While polling, the plugin sets the Perforce workspace root to the Jenkins install directory, so if another job uses the same workspace for any reason, the overlap can delete your server.

      To reproduce with two different Jenkins jobs, configure them as follows:

      1. Create two pipeline jobs that sync their Groovy Jenkinsfile from Perforce.
      2. Configure both jobs to use the same workspace. For example, set the workspace to "jenkins-${NODE_NAME}".
      3. Turn on polling for one of the pipeline jobs, with a short interval such as 5 minutes.
      4. Set the other job to run continuously, as fast as it can.
      5. On both jobs, make sure the Perforce settings wipe the workspace before syncing.

      Within a few hours to a day, your JENKINS_HOME directory will be wiped.

       

      Here's why this is possible:

      1. Perforce polling appears to use JENKINS_HOME as the workspace root. Whenever polling runs, it sets the root of the workspace in question to JENKINS_HOME.
      2. When two jobs share a workspace like this, their edits to the Perforce workspace race against each other. Because polling sets the workspace root to JENKINS_HOME, that change will inevitably, at some point, land while the other job is attempting to sync and run.

      Typically, a job that runs after polling resets the workspace root to the job's working directory. But if the timing is just right, the job resets the root, the polling job then sets the root back to JENKINS_HOME, and the other job then wipes the local workspace and syncs. At that point you lose your Jenkins server.
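The interleaving described above can be sketched as a toy model. This is only an illustration of the ordering problem, not the plugin's code: it uses no Perforce or Jenkins APIs, the class and method names are hypothetical, and the paths are assumed example values.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the interleaving; no Perforce or Jenkins APIs are used.
class SharedWorkspaceRace {
    // Hypothetical paths, chosen only for illustration.
    static final String JENKINS_HOME = "/var/lib/jenkins";
    static final String JOB_DIR = "/var/lib/jenkins/workspace/job-b";

    // Stands in for the root of the single Perforce workspace both jobs share.
    static String clientRoot = JOB_DIR;

    // Records each directory a "wipe before sync" would have deleted.
    static final List<String> wiped = new ArrayList<>();

    // The building job points the shared workspace at its own directory.
    static void buildSetsRoot() { clientRoot = JOB_DIR; }

    // Polling points the same shared workspace at JENKINS_HOME.
    static void pollSetsRoot() { clientRoot = JENKINS_HOME; }

    // The building job wipes whatever the root is at this moment, then syncs.
    static void buildWipesAndSyncs() { wiped.add(clientRoot); }

    // Runs one build, optionally with a poll landing between its two steps.
    static List<String> run(boolean pollInterleaves) {
        clientRoot = JOB_DIR;
        wiped.clear();
        buildSetsRoot();
        if (pollInterleaves) {
            pollSetsRoot();
        }
        buildWipesAndSyncs();
        return new ArrayList<>(wiped);
    }

    public static void main(String[] args) {
        System.out.println("no overlap:   wipes " + run(false));
        System.out.println("poll overlap: wipes " + run(true));
    }
}
```

The build's two steps are not atomic with respect to the shared workspace, so a poll that lands between them redirects the wipe to JENKINS_HOME.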

      This flaw took down our production Jenkins server, which has 6000+ build jobs, and it took nearly 48 hours to identify the root cause and create a reproduction.

      My request: please make the p4-plugin check whether it is being asked to sync to JENKINS_HOME, and hard fail if it is. Alternatively, change the design of the polling system so that it does not set the workspace root to JENKINS_HOME when polling (leave it as the workspace root of the job that needs polling). Either approach resolves the issue.
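A minimal sketch of the first option, the hard-fail check, might look like the following. It is not the plugin's implementation: the class and method names are hypothetical, and it assumes both the workspace root and JENKINS_HOME are available as path strings before any wipe or sync runs.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical guard; not part of the p4-plugin API.
class WorkspaceRootGuard {

    // Returns true when wiping `root` would also delete `jenkinsHome`,
    // that is, when root equals JENKINS_HOME or is one of its ancestors.
    static boolean endangersHome(String root, String jenkinsHome) {
        Path r = Paths.get(root).toAbsolutePath().normalize();
        Path home = Paths.get(jenkinsHome).toAbsolutePath().normalize();
        // Path.startsWith compares whole name elements, so
        // /var/lib/jenkins2 is not treated as a prefix of /var/lib/jenkins.
        return home.startsWith(r);
    }

    // Hard-fails before any destructive operation is attempted.
    static void requireSafeRoot(String root, String jenkinsHome) {
        if (endangersHome(root, jenkinsHome)) {
            throw new IllegalStateException(
                "Refusing to wipe or sync: workspace root " + root
                + " contains JENKINS_HOME (" + jenkinsHome + ")");
        }
    }
}
```

The check also rejects ancestors of JENKINS_HOME, since wiping a parent directory is just as destructive. It does not resolve symbolic links; a real check would probably need to do so.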

      This becomes even more dangerous if you run more than one Jenkins master server and teams have jobs of the same name on each server. You can then run into this problem with the default workspace setting of "jenkins-${NODE_NAME}-${JOB_NAME}", which is the scenario that actually happened to us. The "fix" in the latest Perforce plugin, which adds ${EXECUTOR_NUMBER}, does not cover this multi-server scenario, because two jobs with the same name may end up on the same executor number on their respective Jenkins servers.
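To make the collision concrete, here is a small sketch that expands the workspace name template for two servers. The `expand` helper is a simplified stand-in for variable substitution, not the plugin's real logic, and the node, job, and executor values are assumed for illustration.

```java
import java.util.Map;

// Illustrative only; the plugin's actual expansion code is not shown here.
class WorkspaceNameCollision {

    // Minimal ${VAR} substitution over a fixed set of variables.
    static String expand(String template, Map<String, String> env) {
        String out = template;
        for (Map.Entry<String, String> e : env.entrySet()) {
            out = out.replace("${" + e.getKey() + "}", e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        String template = "jenkins-${NODE_NAME}-${JOB_NAME}-${EXECUTOR_NUMBER}";

        // Two separate Jenkins servers; values are hypothetical.
        Map<String, String> serverA = Map.of(
            "NODE_NAME", "master", "JOB_NAME", "build-app", "EXECUTOR_NUMBER", "0");
        Map<String, String> serverB = Map.of(
            "NODE_NAME", "master", "JOB_NAME", "build-app", "EXECUTOR_NUMBER", "0");

        String a = expand(template, serverA);
        String b = expand(template, serverB);
        System.out.println(a);
        System.out.println(b);
        System.out.println("collision: " + a.equals(b));
    }
}
```

Nothing in the template identifies the Jenkins server itself, so both servers resolve to the same workspace name whenever node name, job name, and executor number coincide.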

      What makes this problem worse is that there is no easy way to detect that it is happening. Because of this flaw in the plugin's design, I can't prevent teams from making this mistake in the future.

       

            p4paul Paul Allen
            maxfields2000 Maxfield Stewart