  1. Jenkins
  2. JENKINS-26558

createSlave Node name collision avoidance creates dead nodes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Component/s: swarm-plugin
    • Labels:
      None
    • Environment:
      Jenkins 1.590 on Linux with Swarm Plugin 1.21

      Description

      In the swarm server-side slave creation logic, when a node with the provided name already exists, '-$IP' is appended to the name in an effort to end up with a unique name (see https://github.com/jenkinsci/swarm-plugin/blob/master/plugin/src/main/java/hudson/plugins/swarm/PluginImpl.java#L59).
      However, as far as I can tell, that new name is never provided to the slave, so it doesn't seem possible for the slave to connect under that name. In my experience, I've seen hundreds of collision-avoidance nodes in this setup and have never once seen one online or connected in any way.

      These dead "hyphen nodes" don't hurt running builds, but they are visual noise and false positives in our offline slave metrics, so it'd be nice if they could be avoided.
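
The behavior reported above can be modeled roughly as follows. This is a minimal sketch, not the plugin's actual code: the class `LegacyCollisionSketch`, its `createSlave` signature, and the in-memory node set are hypothetical stand-ins for the server-side logic in `PluginImpl.java`.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the reported behavior; not the swarm plugin's real implementation.
final class LegacyCollisionSketch {
    // Stand-in for the list of nodes the Jenkins master already knows about.
    private final Set<String> nodes = new LinkedHashSet<>();

    // Server side: register a node, appending "-<ip>" when the requested name is taken.
    // The returned name is what the server stores; per the report, the client is never
    // told about it and keeps using the name it originally asked for.
    String createSlave(String requestedName, String remoteIp) {
        String name = requestedName;
        if (nodes.contains(name)) {
            name = name + "-" + remoteIp;
        }
        nodes.add(name);
        return name;
    }

    Set<String> nodes() {
        return nodes;
    }
}
```

In this model, a second registration of the same name leaves an extra "hyphen node" in the node set that no client ever connects to, which matches the dead nodes described in the report.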

          Activity

          ernetas Ernestas Lukoševičius added a comment - - edited

          That's actually a high priority for some... All it takes to take down a cluster is unplugging the network cable for a second: Jenkins swarm clients lose their connections and then try to reconnect while Jenkins has not yet forgotten the previous slaves. I think that checking whether a node (a slave, in my vocabulary) is online should be done more often and more reliably, especially when there are name collisions.

          stephenconnolly Stephen Connolly added a comment -

          https://github.com/jenkinsci/swarm-plugin/pull/26 should fix this

          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Stephen Connolly
          Path:
          client/src/main/java/hudson/plugins/swarm/Client.java
          client/src/main/java/hudson/plugins/swarm/SwarmClient.java
          plugin/src/main/java/hudson/plugins/swarm/PluginImpl.java
          http://jenkins-ci.org/commit/swarm-plugin/ab37bc84eb9639888f3a66c68a9b1536c5882c88
          Log:
          [FIXED JENKINS-26558] Clients should provide a unique ID to be used for name collision avoidance

          • The current name collision avoidance uses the request's address, which could very likely be the same for all clients,
            as they could be routed through an HTTP proxy (or two), so that is not a good disambiguator
          • We use a digest of the client's interfaces and MAC addresses and the remoteFSRoot to try to give a consistent ID
          • We ALWAYS append the ID if we have it, as otherwise during reconnect the slaves with the same name will shuffle around,
            which defeats a lot of the logic that Jenkins has internally based on slaves having a consistent name
          • In the event of legacy clients that do not have the ID, we will let them connect with their name as long as there
            is no online slave with that name. This does mean that where there are multiple legacy swarm clients with the
            same name, only one can be online at any moment in time, but that is an improvement on the current behavior, where
            once a shuffle starts, none can stay online
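
The approach in the commit log above could be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the class and method names are hypothetical, SHA-256 and the 8-character truncation are arbitrary choices (the log only says "a digest"), and rejecting a legacy client with an exception is an assumed way of refusing the connection.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of the approach in the commit log; names and hash choice are assumptions.
final class SwarmIdSketch {

    // Client side: collect interface names and MAC addresses, keyed by name for a stable order.
    static Map<String, byte[]> localInterfaces() {
        Map<String, byte[]> result = new TreeMap<>();
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all == null) {
                return result;
            }
            for (NetworkInterface ni : Collections.list(all)) {
                byte[] mac = ni.getHardwareAddress(); // may be null, e.g. for loopback
                result.put(ni.getName(), mac == null ? new byte[0] : mac);
            }
        } catch (SocketException e) {
            // Keep whatever was collected; the ID then depends mostly on remoteFsRoot.
        }
        return result;
    }

    // Client side: digest the interfaces, MAC addresses and remoteFsRoot into a short ID.
    static String clientId(Map<String, byte[]> interfaces, String remoteFsRoot) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            // Copy into a TreeMap so the iteration order does not depend on the caller's map type.
            for (Map.Entry<String, byte[]> e : new TreeMap<>(interfaces).entrySet()) {
                md.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                md.update(e.getValue());
            }
            md.update(remoteFsRoot.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.substring(0, 8); // shortened for readable node names (assumption)
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Server side: always append the ID when present; legacy clients (no ID) keep their
    // name only if no online node already uses it.
    static String resolveNodeName(String requested, String clientId, Set<String> onlineNodes) {
        if (clientId != null && !clientId.isEmpty()) {
            return requested + "-" + clientId;
        }
        if (onlineNodes.contains(requested)) {
            throw new IllegalStateException("A node named " + requested + " is already online");
        }
        return requested;
    }
}
```

A client would call `clientId(localInterfaces(), remoteFsRoot)` once and send the result with every registration, so the same machine and working directory map to the same node name across reconnects.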
          scm_issue_link SCM/JIRA link daemon added a comment -

          Code changed in jenkins
          User: Peter Jönsson
          Path:
          client/src/main/java/hudson/plugins/swarm/Client.java
          client/src/main/java/hudson/plugins/swarm/SwarmClient.java
          plugin/src/main/java/hudson/plugins/swarm/PluginImpl.java
          http://jenkins-ci.org/commit/swarm-plugin/5f622da0e1eb54ec84626dbc9aceaa1aafb4a0ac
          Log:
          Merge pull request #26 from stephenc/jenkins-26558

          [FIXED JENKINS-26558] Clients should provide a unique ID to be used for ...

          Compare: https://github.com/jenkinsci/swarm-plugin/compare/bef8ccbe8906...5f622da0e1eb

          oleg_nenashev Oleg Nenashev added a comment -

          KK does not maintain this plugin anymore. Moving to unassigned to set the expectation


            People

            • Assignee:
              Unassigned
            • Reporter:
              kylec Kyle C
            • Votes:
              1
            • Watchers:
              5
