Jenkins / JENKINS-42166

ProcessLiveness.workingLaunchers heuristic is flaky

      Description

      After running the docker-workflow demo recently, I saw numerous warnings:

      ... org.jenkinsci.plugins.durabletask.ProcessLiveness isAlive
      WARNING: hudson.Launcher$LocalLauncher@... on hudson.remoting.LocalChannel@... does not seem able to determine whether processes are alive or not
      

      I suspect that I had simply gotten to the point of having launched >10k processes in my current OS session, and so _isAlive(..., 9999, ...) correctly returned true.

      On the one hand this fake PID seems much too low; on the other I am not sure how high a valid pid_t might be. And turning off workingLaunchers entirely would mean either always trusting Liveness on nondecorated launchers—which could cause big problems if libc is not loadable properly—or never trusting it—which means a dead controller process (incl. reboot) will not be detected.
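To make the failure mode concrete, here is a minimal Python sketch, not the plugin's Java code. It assumes the heuristic trusts a launcher only when the fake PID looks dead; `launcher_seems_working`, `FAKE_PID`, and `linux_pid_max` are hypothetical names introduced for illustration. Only the probe value 9999 comes from the description. On Linux, the configured PID ceiling can be read from `/proc/sys/kernel/pid_max`, which gives one way to judge whether a given fake PID is reachable.

```python
import os

FAKE_PID = 9999  # probe value taken from the issue description


def is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def launcher_seems_working(check=is_alive, fake_pid: int = FAKE_PID) -> bool:
    """Hypothetical stand-in for the heuristic: trust the checker only if it
    reports the fake PID as dead. A real process that happens to hold that
    PID makes a perfectly good checker look broken."""
    return not check(fake_pid)


def linux_pid_max(path: str = "/proc/sys/kernel/pid_max"):
    """Return the Linux PID ceiling, or None where the file is unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

Passing a checker that reports PID 9999 as alive, as happens once a real process receives that PID, makes `launcher_seems_working` return `False` even though the checker itself is correct.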

      Or we could always use the ps trick, even on supposedly local launchers. This means checking extra carefully that the command is POSIX-compliant.

            Activity

            jglick Jesse Glick added a comment -

            This tip suggests ps -p $PID, and there are nearby tips for Windows which could help with JENKINS-25053.
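A minimal Python sketch of the `ps -p $PID` approach follows. The function name `ps_reports_alive` and the injectable `run` parameter are hypothetical, added so the logic can be exercised without a real `ps`. It assumes `ps` exits nonzero when no process matches, which holds for procps but should be verified for other implementations.

```python
import subprocess


def ps_reports_alive(pid: int, run=subprocess.run):
    """Return True/False from the exit status of `ps -p PID`,
    or None if ps cannot be launched at all."""
    try:
        result = run(
            ["ps", "-p", str(pid)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:  # includes FileNotFoundError when no ps binary is installed
        return None
    return result.returncode == 0
```

Returning `None` separately from `False` keeps "cannot tell" distinct from "process is dead", which is the distinction the warning in this issue is about.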

            pgeorgiev Pavel Georgiev added a comment -

            Recently I started seeing this warning every second. I'm on Linux running Jenkins 2.32.3. Any ideas what is wrong? I tried restarting Jenkins, but it made no difference. It's really annoying, as I cannot see the important messages.

             
            Mar 06, 2017 3:24:29 PM WARNING org.jenkinsci.plugins.durabletask.ProcessLiveness
            hudson.Launcher$RemoteLauncher@7b1f9078 on hudson.remoting.Channel@509f76f6:slave1 does not seem able to determine whether processes are alive or not
            Mar 06, 2017 3:24:30 PM WARNING org.jenkinsci.plugins.durabletask.ProcessLiveness
            hudson.Launcher$RemoteLauncher@301b1441 on hudson.remoting.Channel@509f76f6:slave1 does not seem able to determine whether processes are alive or not

            csanchez Carlos Sanchez added a comment - - edited

            Related problems occur when running in BusyBox or Alpine (e.g. the Docker image jenkinsci/jnlp-slave:alpine): ps -o pid=9999 always succeeds.

            The Docker image may not even have ps, as shown in JENKINS-43881.

            csanchez Carlos Sanchez added a comment -

            Example:

            docker run -ti --rm --entrypoint bash jenkinsci/jnlp-slave:alpine -c "ps -o pid=9999"; echo $?
            9999
                   1
                    0
            
            ps --help
            BusyBox v1.25.1 (2016-10-26 16:15:20 GMT) multi-call binary.
            Usage: ps [-o COL1,COL2=HEADER]
            Show list of processes
            -o COL1,COL2=HEADER	Select columns for display
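In the output above, BusyBox printed 9999 as the column header rather than using it as a PID filter, so the exit status says nothing about PID 9999. One way around this, sketched in Python under the assumption that the caller captures the output of plain `ps -o pid`, is to search the listed rows instead of trusting the exit code. `pid_in_ps_listing` is a hypothetical helper, not part of any plugin.

```python
def pid_in_ps_listing(pid: int, listing: str) -> bool:
    """Search the rows of `ps -o pid` output for pid, skipping the header line."""
    rows = listing.splitlines()[1:]  # the first line is the column header
    return any(row.strip() == str(pid) for row in rows)


# Output shaped like the BusyBox run above: header "9999", one process with PID 1.
busybox_output = "9999\n    1\n"
print(pid_in_ps_listing(9999, busybox_output))  # False: 9999 is only the header
print(pid_in_ps_listing(1, busybox_output))     # True
```

Note that with procps, `ps -o pid` without `-e` or `-A` lists only a subset of processes, so the exact command would still need checking per platform.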
            
            jglick Jesse Glick added a comment -

            Regarding Docker containers, see JENKINS-40101—a known bug.

            jglick Jesse Glick added a comment -

            Obsolete as of JENKINS-47791.


              People

              • Assignee:
                Unassigned
                Reporter:
                jglick Jesse Glick
              • Votes:
                7
                Watchers:
                9
