  1. Jenkins
  2. JENKINS-62142

Race condition during init between jobs and agent


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • branch-api-plugin
    • None
    • org.jenkins-ci.main:jenkins-war:2.222.3
      org.jenkins-ci.main:remoting:4.2
      ec2-plugin:1.49.1
      configuration-as-code:1.39
      job-dsl:1.77

      There appears to be a race condition between the initialization of jobs and the initialization of nodes. This appears to be within remoting, but I have included my ec2-plugin version because we see this on EC2 agents.

      After a reboot, we see jobs being deleted from nodes. This appears to be caused by WorkspaceLocatorImpl.java in branch-api-plugin: when a computer comes online, it checks for jobs that exist on the computer but do not exist in Jenkins (via getItemByFullName). If the jobs have not finished loading at that point, jobs that are still valid look like they no longer exist.

      It seems that either branch-api-plugin needs a change to wait for jobs to be loaded, or Jenkins should wait for jobs to be loaded before launching nodes.
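
      To make the suspected ordering problem concrete, here is a minimal sketch in plain Java. It does not use any Jenkins or branch-api-plugin classes: WorkspaceCleanupSketch, loadJob, markJobsLoaded, jobExists, orphansUnguarded and orphansGuarded are all hypothetical names invented for illustration, and jobExists merely stands in for a lookup such as getItemByFullName. It assumes the cleanup treats every directory on the agent whose job cannot be found as an orphan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the race; none of these names are Jenkins APIs. */
public class WorkspaceCleanupSketch {
    // Jobs the controller has loaded so far (assumption: keyed by full name).
    private final Set<String> loadedJobs = new HashSet<>();
    // Becomes true only once job loading has finished.
    private boolean jobsFullyLoaded = false;

    public void loadJob(String fullName) {
        loadedJobs.add(fullName);
    }

    public void markJobsLoaded() {
        jobsFullyLoaded = true;
    }

    /** Stand-in for a lookup such as getItemByFullName returning non-null. */
    public boolean jobExists(String fullName) {
        return loadedJobs.contains(fullName);
    }

    /** Unguarded check: anything on the agent without a known job is an orphan. */
    public List<String> orphansUnguarded(List<String> jobsOnAgent) {
        List<String> orphans = new ArrayList<>();
        for (String name : jobsOnAgent) {
            if (!jobExists(name)) {
                orphans.add(name);
            }
        }
        return orphans;
    }

    /** Guarded check: report nothing until job loading has finished. */
    public List<String> orphansGuarded(List<String> jobsOnAgent) {
        if (!jobsFullyLoaded) {
            return new ArrayList<>();
        }
        return orphansUnguarded(jobsOnAgent);
    }

    public static void main(String[] args) {
        WorkspaceCleanupSketch controller = new WorkspaceCleanupSketch();
        List<String> onAgent = Arrays.asList("folder/job-a", "folder/job-b");

        // The agent comes online while only one of the two jobs is loaded.
        controller.loadJob("folder/job-a");
        System.out.println("unguarded, mid-load: " + controller.orphansUnguarded(onAgent));
        System.out.println("guarded, mid-load:   " + controller.orphansGuarded(onAgent));

        // Loading completes; the same check now finds nothing to delete.
        controller.loadJob("folder/job-b");
        controller.markJobsLoaded();
        System.out.println("guarded, loaded:     " + controller.orphansGuarded(onAgent));
    }
}
```

      In this model the unguarded check flags a job that simply has not been loaded yet, while the guarded variant defers the decision until loading is complete. Whether the real fix belongs in the plugin or in the order in which Jenkins launches nodes is the open question above; the sketch only shows why the ordering matters.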

       

      As an aside, we found this issue because it manifests for us as a very long startup time. Jenkins was running out of heap space because large objects were allocated while it was connected to nodes, in order to receive the stack traces of exceptions thrown on those nodes. The exceptions were caused by Jenkins trying to delete the folder of a job in progress that it did not have permission to delete. From here I found that this was caused by the remoting plugin trying to delete the build

            Assignee: Unassigned
            Reporter: Nigel Armstrong (legonigel)
            Votes: 1
            Watchers: 3
