JENKINS-47327

Offline node (due to being in DOS) does not re-launch when the node is available again.


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • None

      (I'm almost certain the component above is wrong.)

      In my environment, Jenkins runs (or rather, is being set up to run) our SSD testing lab, which works like this:

       

      All nodes are booted into Linux.  When the test begins, some scripts are run to copy files around, set up logging, etc.  Then, the node is booted into FreeDOS (which has NO networking available) to run the tests.  This takes anywhere from 10 minutes to 10 days (or more).

       

      When the test is complete, the node is booted back into Linux, Jenkins is notified that the stage is complete, and then Jenkins needs to run another script on the node.

       

      But Jenkins, which by this time has noticed that the node is offline, NEVER tries to restart the node.  If I go and hit 'relaunch agent' it restarts fine and the build finishes.

      I tried hitting 'reload' in the Nodes view in hopes that it would do something, but nothing happened.

      I have to have this work - the whole idea here is to automate the build, not require a manual step to make it continue.  So, either I need to know a way to get Jenkins to notice when the node becomes available again, or I need to have a URL I can curl to make Jenkins notice (remember, the node is up and running Linux, so I can do just about anything on the node...)
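
One possible workaround, sketched under assumptions rather than offered as a confirmed fix: the 'relaunch agent' button appears to POST to the agent's launchSlaveAgent action, so a script on the node could hit the same URL once Linux is back up. JENKINS_URL, JENKINS_USER and JENKINS_TOKEN below are hypothetical placeholders (a user name and API token with permission to connect agents), the agent name is assumed to need no URL-encoding, and depending on the Jenkins version a CSRF crumb may also be required.

```shell
#!/bin/sh
# Hypothetical helper, meant to run on the node after it boots back into Linux.
# JENKINS_URL, JENKINS_USER and JENKINS_TOKEN are assumed to be set by the caller.

# Build the URL of the agent's "launch" action.
# $1 = Jenkins base URL, $2 = agent (node) name as shown in the Nodes view
relaunch_url() {
    # ${1%/} drops one trailing slash so the path is not doubled up
    printf '%s/computer/%s/launchSlaveAgent\n' "${1%/}" "$2"
}

# Ask Jenkins to relaunch the named agent.
# curl -f turns HTTP error responses into a non-zero exit status.
relaunch_agent() {
    curl -fsS -X POST -u "$JENKINS_USER:$JENKINS_TOKEN" \
        "$(relaunch_url "$JENKINS_URL" "$1")"
}
```

A script such as the one that notifies Jenkins at the end of a stage could call `relaunch_agent "$HOST"` first, so the agent is reconnecting by the time the Pipeline tries to run the next step on it.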

       

      Here's a greatly trimmed version of the Pipeline script which is running this thing:

      node("master") {
        result = "OK"
        // ...stuff - note that most of the below won't work exactly, as I've removed most of the variable getting stuff...
        sh "~/jenkins/jenkins_test_run_setup_and_status_update.sh $TESTRUNCONFIGFILE  $HOST $BASENAME"
      }

      try {
        node(HOST) {
          stage("setup") {
            cmdline = "~/bin/jenkins_perform_job_setup_on_me.sh $FWLIST"
            sh "date ; $cmdline ; date"
          }
        }
        node("master") {
          def tests = readFile "TESTRUNCONFIGFILE"
          print tests
          tests = Arrays.asList(tests.split("\\r?\n"))
          for (line in tests) {
            fields = line.split(",")
            // skip lines that are commented out in the test config
            if (fields[0][0] != '#') {
              hook = registerWebhook()
              hookurl = hook.getURL()
              // beware that we are still on node master as well as switching to HOST:
              node(HOST) {
                cmdline = "~/bin/run_stage.sh $batfile "
                sh "date ; $cmdline "
              } // end of 'node host'
              // block until the node reports that the stage is complete
              data = waitForWebhook hook
              ok = 0
              if (data == "OK") ok = 1
            }
          }
          // end loop running through the testconfig here.
        }
      } catch (all) {
        result = "FAIL"
      }
      // once ALL the stages are done, run finish:
      node(HOST) {
        stage("finish") {
          sh "date ; /home/engineer/bin/finish $result --notmanual --jenkins ; date"
        }
      }

       

            Assignee: Unassigned
            Reporter: Rusty Carruth (rustycar54)
            Votes: 0
            Watchers: 1
