Jenkins / JENKINS-32321

"The reference task was not found" prevents Jenkins from starting

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Component: amazon-ecs-plugin
    • Labels: None
    • Environment: Jenkins 1.6.4.1 on Ubuntu 14.04

      On every Jenkins start or restart, the ECS plugin throws a "The referenced task was not found" exception, which prevents Jenkins from starting correctly.

      I've also noticed that the same exception sometimes occurs when you apply or save the Jenkins configuration; it bubbles up to the UI and stops you from saving.

      The full stack trace is here:

      hudson.util.HudsonFailedToLoad: com.amazonaws.services.ecs.model.ClientException: The referenced task was not found. (Service: AmazonECS; Status Code: 400; Error Code: ClientException; Request ID: 041ee786-b4c1-11e5-a864-d7bdaaa4b5cd)
      	at hudson.WebAppMain$3.run(WebAppMain.java:237)
      Caused by: com.amazonaws.services.ecs.model.ClientException: The referenced task was not found. (Service: AmazonECS; Status Code: 400; Error Code: ClientException; Request ID: 041ee786-b4c1-11e5-a864-d7bdaaa4b5cd)
      	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1181)
      	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:766)
      	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:485)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:306)
      	at com.amazonaws.services.ecs.AmazonECSClient.invoke(AmazonECSClient.java:2199)
      	at com.amazonaws.services.ecs.AmazonECSClient.stopTask(AmazonECSClient.java:1874)
      	at com.cloudbees.jenkins.plugins.amazonecs.ECSCloud.deleteTask(ECSCloud.java:205)
      	at com.cloudbees.jenkins.plugins.amazonecs.ECSSlave._terminate(ECSSlave.java:90)
      	at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
      	at hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:58)
      	at hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:42)
      	at hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:717)
      	at hudson.model.Queue._withLock(Queue.java:1346)
      	at hudson.model.Queue.withLock(Queue.java:1229)
      	at hudson.slaves.SlaveComputer.setNode(SlaveComputer.java:714)
      	at hudson.model.AbstractCIBase.updateComputer(AbstractCIBase.java:118)
      	at hudson.model.AbstractCIBase.access$000(AbstractCIBase.java:44)
      	at hudson.model.AbstractCIBase$2.run(AbstractCIBase.java:186)
      	at hudson.model.Queue._withLock(Queue.java:1346)
      	at hudson.model.Queue.withLock(Queue.java:1229)
      	at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:169)
      	at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1247)
      	at jenkins.model.Jenkins.<init>(Jenkins.java:844)
      	at hudson.model.Hudson.<init>(Hudson.java:83)
      	at hudson.model.Hudson.<init>(Hudson.java:79)
      	at hudson.WebAppMain$3.run(WebAppMain.java:225)
      

      When containers / tasks execute correctly and aren't spawning a lot of builds, it happens less often. But when the `jenkins-slave` entrypoint fails to come up correctly, or the server rejects the JNLP agent (because of a key issue or something similar), it happens a lot and basically stops you from using the configuration UI.

      A workaround on Jenkins startup is to delete /var/lib/jenkins/plugins/amazon-ecs, after which startup proceeds normally.

      I'm thinking of putting a try / catch around com.cloudbees.jenkins.plugins.amazonecs.ECSCloud.deleteTask (ECSCloud.java:205), but I'm not sure whether this is appropriate (it would certainly stop the issues above, though).
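
A minimal sketch of what such a try / catch might look like, not the plugin's actual code. To stay self-contained it uses hypothetical stand-ins: `ClientException` mimics the unchecked `com.amazonaws.services.ecs.model.ClientException` seen in the stack trace, `TaskStopper` stands in for the ECS client's `stopTask` call, and `deleteTaskQuietly` is an illustrative name. The idea is simply to log the failure and carry on, so that a task which no longer exists cannot abort Jenkins startup or a configuration save.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class DeleteTaskSketch {
    private static final Logger LOGGER = Logger.getLogger(DeleteTaskSketch.class.getName());

    /** Hypothetical stand-in for com.amazonaws.services.ecs.model.ClientException (unchecked). */
    static class ClientException extends RuntimeException {
        ClientException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the ECS client's stopTask call. */
    interface TaskStopper {
        void stopTask(String taskArn);
    }

    /**
     * Tries to stop the task. If ECS rejects the request (for example because
     * the task is already gone), logs a warning instead of propagating.
     *
     * @return true if the stop request succeeded, false if it was rejected
     */
    static boolean deleteTaskQuietly(TaskStopper ecs, String taskArn) {
        try {
            ecs.stopTask(taskArn);
            return true;
        } catch (ClientException e) {
            // Swallow the error so node termination can continue.
            LOGGER.log(Level.WARNING, "Could not stop ECS task " + taskArn + "; continuing", e);
            return false;
        }
    }

    public static void main(String[] args) {
        // A stopper that succeeds, and one that fails like the trace above.
        TaskStopper ok = arn -> { };
        TaskStopper missing = arn -> {
            throw new ClientException("The referenced task was not found.");
        };
        System.out.println(deleteTaskQuietly(ok, "example-task-1"));
        System.out.println(deleteTaskQuietly(missing, "example-task-2"));
    }
}
```

Catching only the client-side exception, rather than every `Exception`, keeps unrelated failures visible; whether that is the right scope for the real plugin is an open question.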

      Thoughts before I make a PR?

            Assignee: Nicolas De Loof (ndeloof)
            Reporter: Lee Webb (nullify005)