  1. Jenkins
  2. JENKINS-39231

WinSW: Automatically terminate runaway processes in Windows services

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: core

      Description

      In Jenkins projects we have many users complaining that the slave/agent is "already connected", because they have a runaway slave/agent process. It happens when WinSW gets terminated without executing the process shutdown logic (force kill) or when WinSW fails to terminate the process.

      As a part of WinSW 2.0, it would be great to add logic that:

      • records the PID of the created process to disk
      • performs a status check of the previously spawned process upon restart
      • terminates the runaway process if required

      It can be done via a WinSW 2 "plugin".
      Issue: https://github.com/kohsuke/winsw/issues/125
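
      The three steps requested above can be illustrated with a minimal Java sketch. This is not the WinSW implementation (WinSW is a separate .NET project); the class name `RunawayProcessSketch`, its method names, and the timeout parameter are all hypothetical and exist only for illustration. It assumes Java 9 or later for the `ProcessHandle` API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.OptionalLong;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of PID-file based runaway process handling. */
final class RunawayProcessSketch {

    /** Step 1: record the PID of the freshly spawned process on disk. */
    static void recordPid(Path pidFile, long pid) throws IOException {
        Files.write(pidFile, Long.toString(pid).getBytes(StandardCharsets.UTF_8));
    }

    /** Reads the PID left by a previous run; empty if the file is missing or malformed. */
    static OptionalLong readPid(Path pidFile) {
        try {
            String text = new String(Files.readAllBytes(pidFile), StandardCharsets.UTF_8).trim();
            return OptionalLong.of(Long.parseLong(text));
        } catch (IOException | NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    /**
     * Steps 2 and 3: on restart, check the previously spawned process and
     * request its termination if it is still alive.
     * Returns true if a live process was found and asked to terminate.
     */
    static boolean terminateRunaway(Path pidFile, long timeoutMillis)
            throws IOException, InterruptedException {
        OptionalLong pid = readPid(pidFile);
        if (!pid.isPresent()) {
            return false; // nothing usable was recorded
        }
        Optional<ProcessHandle> handle = ProcessHandle.of(pid.getAsLong());
        boolean terminated = false;
        if (handle.isPresent() && handle.get().isAlive()) {
            ProcessHandle process = handle.get();
            process.destroy(); // ask politely first
            try {
                process.onExit().get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException | ExecutionException e) {
                process.destroyForcibly(); // escalate after the timeout
            }
            terminated = true;
        }
        Files.deleteIfExists(pidFile); // the recorded PID is stale either way
        return terminated;
    }
}
```

      A wrapper would call `terminateRunaway` before spawning a new process and `recordPid` right after spawning it. One caveat the sketch ignores: the operating system can reuse PIDs, so a real implementation would also need to verify that the recorded PID still belongs to the process the wrapper started.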

            Activity

            oleg_nenashev Oleg Nenashev created issue -
            oleg_nenashev Oleg Nenashev made changes -
            Field Original Value New Value
            Remote Link This issue links to "winsw/issues/125 (Web Link)" [ 14981 ]
            oleg_nenashev Oleg Nenashev made changes -
            Assignee Oleg Nenashev [ oleg_nenashev ]
            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is related to JENKINS-28492 [ JENKINS-28492 ]
            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is related to JENKINS-26020 [ JENKINS-26020 ]
            mariem_baccar mariem baccar made changes -
            Attachment Slave errors [ 34659 ]
            mariem_baccar mariem baccar added a comment -

            I have the same issue with the slave: "agent is already connected". The whole message is attached. I think this issue degrades Jenkins performance because it consumes a lot of CPU. We are waiting for your help!

            oleg_nenashev Oleg Nenashev made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            oleg_nenashev Oleg Nenashev added a comment -

            There is a pull request for this feature: https://github.com/kohsuke/winsw/pull/133
            The pull request also references the release on GitHub with the binary file for evaluation: https://github.com/kohsuke/winsw/pull/133

            oleg_nenashev Oleg Nenashev made changes -
            Remote Link This issue links to "WinSW Pull 133 (Web Link)" [ 15073 ]
            mariem_baccar mariem baccar added a comment -

            This problem is related to JENKINS-39078: there is a problem in Docker Slave Plugin 1.0.5 (fix: https://github.com/jenkinsci/docker-slaves-plugin/commit/451929125fd8ff39c6f84c30476c26cccb912140). You can uninstall this plugin if it is not needed. Be careful: this plugin is also responsible for many other bugs (JENKINS-39214, docker compose fails after polling...).
            oleg_nenashev Oleg Nenashev added a comment -

            mariem baccar Not sure how your comment is related to Windows Service wrapper. Have you modified the wrong issue?

            mariem_baccar mariem baccar added a comment - - edited

            I am modifying the right issue. Below is a fuller explanation of my problem with the slave:
            - Before installing the "Docker Slave v1.0.5" plugin, the slave operated correctly without any problems.
            - After installing this plugin, I encountered many problems. One of them concerns the slave: I always get the message "slave agent is already connected". The whole message is attached.
            In the end, I discovered that this recently installed plugin was the source of all my new problems in Jenkins 2.19.1 LTS. After uninstalling it, all problems were resolved.

            oleg_nenashev Oleg Nenashev added a comment -

            mariem baccar I'm pretty sure you're modifying the wrong issue. This is a Feature Request to the Windows Service Wrapper, which has nothing to do with the problem you describe.

            If you see the "slave agent is already connected" issue only after the Docker Slaves installation, please comment in JENKINS-28492 or create another issue to Docker Slaves plugin

            krogan mark mann added a comment -

            Jenkins master is on 2.32.1.
            Master and slaves are running Win2012.
            The symptoms sound very familiar: we have a Jenkins slave up, then we reboot the Windows server (slave).
            When the server returns and the slave is automatically started, it hangs around for about 30 seconds and then terminates the connection, which kills our job.
            We've also seen the hosting Windows service, WinSW 1.17 (which auto-upgrades to 1.18), bomb out but leave the java process running.
            The java process keeps the slave active to the master for an indeterminate amount of time (anywhere between 20 seconds and 2 hours) before eventually dying of its own accord, with no fresh jobs sent and no interaction with the Windows service.

            oleg_nenashev Oleg Nenashev made changes -
            Epic Link JENKINS-38833 [ 175240 ]
            oleg_nenashev Oleg Nenashev added a comment -

            mark mann Smells like another issue.
            Anyway, makes sense to retest after the WinSW 2 integration. You can install the new Windows service wrapper manually

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            src/main/resources/org/jenkinsci/modules/windows_slave_installer/jenkins-slave.xml
            http://jenkins-ci.org/commit/windows-slave-installer-module/8dcf02da16e7c95c67c7de95fd078089a8ecf8df
            Log:
            JENKINS-39231 - Integrate the Runaway Process Killer extension to terminate runaway processes

            scm_issue_link SCM/JIRA link daemon added a comment -

            Code changed in jenkins
            User: Oleg Nenashev
            Path:
            core/pom.xml
            core/src/main/resources/windows-service/jenkins-slave.xml
            core/src/main/resources/windows-service/jenkins.xml
            war/pom.xml
            http://jenkins-ci.org/commit/jenkins/e698d1de41d4311bf5f8b1d2c40b591109e696e2
            Log:
            Update Windows Agent Installer to 1.7 and WinSW to 2.0.2 (#2765)

            WinSW changes

            The update includes many fixes and improvements; the full list is provided in the [WinSW changelog](https://github.com/kohsuke/winsw/blob/master/CHANGELOG.md). Several issues are referenced in the Jenkins bug tracker:

            • JENKINS-22692 (https://issues.jenkins-ci.org/browse/JENKINS-22692) - Connection reset issues when WinSW gets terminated due to the system shutdown
            • JENKINS-23487 (https://issues.jenkins-ci.org/browse/JENKINS-23487) - Support of shared directories in WinSW
            • JENKINS-39231 (https://issues.jenkins-ci.org/browse/JENKINS-39231) - Enable Runaway Process Killer by default
            • JENKINS-39237 (https://issues.jenkins-ci.org/browse/JENKINS-39237) - Auto-upgrade of JNLP agent versions on the slaves

            Windows Agent Installer changes

            • Adapt the default configurations to pick fixes above
            • Slave => Agent renaming where possible

            Jenkins core changes

            • Modify the configuration template, reference advanced options
            • Enable Runaway Process Killer by default
            • Update Windows Agent Installer to 1.7
            • Remove the obsolete jenkins-slave.xml file from the core. Now it is within windows-slave-installer
            • Use the deployed Snapshot for CI
            • Pick the release version of windows-slave-installer-1.7
            oleg_nenashev Oleg Nenashev added a comment -

            Released in Jenkins 2.50. See the upgrade guidelines for more info

            oleg_nenashev Oleg Nenashev made changes -
            Status In Progress [ 3 ] Resolved [ 5 ]
            Resolution Fixed [ 1 ]
            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is related to JENKINS-22024 [ JENKINS-22024 ]
            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is duplicated by JENKINS-29825 [ JENKINS-29825 ]
            oleg_nenashev Oleg Nenashev made changes -
            Link This issue is related to JENKINS-24155 [ JENKINS-24155 ]

              People

              • Assignee: Oleg Nenashev (oleg_nenashev)
              • Reporter: Oleg Nenashev (oleg_nenashev)
              • Votes: 4
              • Watchers: 7
