Uploaded image for project: 'Jenkins'
  1. Jenkins
  2. JENKINS-55738

NodeMonitor out of disk space message is wrong for master

    XMLWordPrintable

    Details

    • Similar Issues:
    • Released As:
      Jenkins 2.164

      Description

      When the master goes out of disk space, it is put offline by the hudson.node_monitors.AbstractDiskSpaceMonitor system.

      The message has a blank where the node name should be:

      Putting back online as there is enough disk space again

      I think this is just because c.getName() returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

      Acceptance criteria

      • This should show master instead of a space as the node name, i.e.

      Putting back master online as there is enough disk space

        Attachments

        1. Capture1.JPG
          Capture1.JPG
          59 kB
        2. Capture1.JPG
          Capture1.JPG
          59 kB
        3. Capture2.JPG
          Capture2.JPG
          125 kB
        4. Capture2 - Copy.JPG
          Capture2 - Copy.JPG
          193 kB
        5. Capture3.JPG
          Capture3.JPG
          121 kB
        6. Capture3 - Copy.JPG
          Capture3 - Copy.JPG
          181 kB
        7. Capture4 - Copy.JPG
          Capture4 - Copy.JPG
          127 kB
        8. Capture5 - Copy.JPG
          Capture5 - Copy.JPG
          159 kB
        9. image-2019-02-09-00-30-55-101.png
          image-2019-02-09-00-30-55-101.png
          193 kB

          Activity

          batmat Baptiste Mathus created issue -
          batmat Baptiste Mathus made changes -
          Field Original Value New Value
          Description When the master goes out of disk space, it is put offline by the {{hudson.node_monitors.AbstractDiskSpaceMonitor}} system.

          The message has a blank where the node name should be:

          {{Putting back online as there is enough disk space again}}


          I think this is just because {{c.getName()}} returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

          h3. Acceptance criteria

          * This should show {{master}} instead of a space as the node name
          When the master goes out of disk space, it is put offline by the {{hudson.node_monitors.AbstractDiskSpaceMonitor}} system.

          The message has a blank where the node name should be:

          {{Putting back online as there is enough disk space again}}


          I think this is just because {{c.getName()}} returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

          h3. Acceptance criteria

          * This should show {{master}} instead of a space as the node name
          batmat Baptiste Mathus made changes -
          Description When the master goes out of disk space, it is put offline by the {{hudson.node_monitors.AbstractDiskSpaceMonitor}} system.

          The message has a blank where the node name should be:

          {{Putting back online as there is enough disk space again}}


          I think this is just because {{c.getName()}} returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

          h3. Acceptance criteria

          * This should show {{master}} instead of a space as the node name
          When the master goes out of disk space, it is put offline by the {{hudson.node_monitors.AbstractDiskSpaceMonitor}} system.

          The message has a blank where the node name should be:

          {{Putting back online as there is enough disk space again}}


          I think this is just because {{c.getName()}} returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

          h3. Acceptance criteria

          * This should show {{master}} instead of a space as the node name
          Hide
          nisarg14 Nisarg Shah added a comment -

          I would like to work on this issue. So can I assign this to me?

          Show
          nisarg14 Nisarg Shah added a comment - I would like to work on this issue. So can I assign this to me?
          Hide
          batmat Baptiste Mathus added a comment -

          Nisarg Shah absolutely, please do. Thanks! If you need help, please reach out to us either on IRC, on on the developers mailing list.

          Thanks again!

          Show
          batmat Baptiste Mathus added a comment - Nisarg Shah absolutely, please do. Thanks! If you need help, please reach out to us either on IRC, on on the developers mailing list. Thanks again!
          nisarg14 Nisarg Shah made changes -
          Assignee Nisarg Shah [ nisarg14 ]
          Hide
          nisarg14 Nisarg Shah added a comment -

          Sure. Thank you!!

          Actually I am new to open source and also very much excited to contribute.

          Show
          nisarg14 Nisarg Shah added a comment - Sure. Thank you!! Actually I am new to open source and also very much excited to contribute.
          Hide
          jglick Jesse Glick added a comment -

          I think it would suffice to use Computer.getDisplayName, which is overridden for MasterComputer.

          Show
          jglick Jesse Glick added a comment - I think it would suffice to use Computer.getDisplayName , which is overridden for MasterComputer .
          Hide
          nisarg14 Nisarg Shah added a comment -

          I want to ask few doubts. So which gitter room should I join to discuss? Or should I ask and discuss here?

          Show
          nisarg14 Nisarg Shah added a comment - I want to ask few doubts. So which gitter room should I join to discuss? Or should I ask and discuss here?
          Hide
          jglick Jesse Glick added a comment -

          https://gitter.im/jenkinsci/jenkins is always available for live chat, though I generally feel that if you have a specific question about a specific JIRA issue, commenting in that issue is best unless you think it is a broader topic that many people might want to participate in.

          Show
          jglick Jesse Glick added a comment - https://gitter.im/jenkinsci/jenkins is always available for live chat, though I generally feel that if you have a specific question about a specific JIRA issue, commenting in that issue is best unless you think it is a broader topic that many people might want to participate in.
          Hide
          nisarg14 Nisarg Shah added a comment -

          Okay. Thanks

          I will go through some necessary part of the code and will post comments here if required.

          Show
          nisarg14 Nisarg Shah added a comment - Okay. Thanks I will go through some necessary part of the code and will post comments here if required.
          Hide
          nisarg14 Nisarg Shah added a comment -

          If we pass "Putting back online as there is enough disk space again" in Logger.info() then would it be a better behaviour and would it work? As if we change getName() method of Computer class than it would create a problem as this method is used in many places.

          Show
          nisarg14 Nisarg Shah added a comment - If we pass "Putting back online as there is enough disk space again" in Logger.info() then would it be a better behaviour and would it work? As if we change getName() method of Computer class than it would create a problem as this method is used in many places.
          Hide
          jglick Jesse Glick added a comment -

          I suspect replacing the two occurrences of c.getName() in the abovementioned code section with c.getDisplayName() would fix this. Most of the work would be figuring out how to verify the fix.

          Show
          jglick Jesse Glick added a comment - I suspect replacing the two occurrences of c.getName() in the abovementioned code section with c.getDisplayName() would fix this. Most of the work would be figuring out how to verify the fix.
          Hide
          nisarg14 Nisarg Shah added a comment -

          Yes it may be. But i looked each and every line where c.getName() is called and i think that it would create issue editing that method. So either we have to call another Display method or we should pass directly the string as it is. And in c.getDisplayName() it gets override by returning nodeName which exactly c.getName() does. So I am not sure but it may probably not work. There are two functions of getDisplayName in which one gets override by returning nodeName.

          Show
          nisarg14 Nisarg Shah added a comment - Yes it may be. But i looked each and every line where c.getName() is called and i think that it would create issue editing that method. So either we have to call another Display method or we should pass directly the string as it is. And in c.getDisplayName() it gets override by returning nodeName which exactly c.getName() does. So I am not sure but it may probably not work. There are two functions of getDisplayName in which one gets override by returning nodeName.
          Hide
          jglick Jesse Glick added a comment -

          The reported issue is about MasterComputer, which overrides getDisplayName as previously noted.

          Show
          jglick Jesse Glick added a comment - The reported issue is about MasterComputer , which overrides getDisplayName as previously noted.
          Hide
          nisarg14 Nisarg Shah added a comment -

          Yes issue is of MasterComputer. But if we replace c.getName() with c.getDisplayName() than would it fix this issue? 

          Show
          nisarg14 Nisarg Shah added a comment - Yes issue is of MasterComputer. But if we replace c.getName() with c.getDisplayName() than would it fix this issue? 
          Hide
          jglick Jesse Glick added a comment -

          Maybe you should try it and find out.

          Show
          jglick Jesse Glick added a comment - Maybe you should try it and find out.
          nisarg14 Nisarg Shah made changes -
          Comment [ Okay and Can I get just one help that how can I test my change in code. Like how would i run that code and check whether that issue has solved or not? ]
          Hide
          nisarg14 Nisarg Shah added a comment -

          https://github.com/jenkinsci/jenkins/pull/3863

          In these pull request 4 existing tests got failed. Can you please help me in figuring out what happened? As no code of that modules were changed.

          Show
          nisarg14 Nisarg Shah added a comment - https://github.com/jenkinsci/jenkins/pull/3863 In these pull request 4 existing tests got failed. Can you please help me in figuring out what happened? As no code of that modules were changed.
          Hide
          nisarg14 Nisarg Shah added a comment -

          Can anyone tell me how can I run tests on my local computer?

          Show
          nisarg14 Nisarg Shah added a comment - Can anyone tell me how can I run tests on my local computer?
          Hide
          jglick Jesse Glick added a comment -
          mvn -DskipTests clean install && mvn -f test surefire:test -Dtest=WhateverTest\#method
          

          if there is a specific test you are interested in, though you can also just use CI (failures look unrelated to your code). In this case I doubt there is any relevant automated test coverage—the issue would need to be verified manually by actually simulating an out of disk space event.

          Show
          jglick Jesse Glick added a comment - mvn -DskipTests clean install && mvn -f test surefire:test -Dtest=WhateverTest\#method if there is a specific test you are interested in, though you can also just use CI (failures look unrelated to your code). In this case I doubt there is any relevant automated test coverage—the issue would need to be verified manually by actually simulating an out of disk space event.
          Hide
          nisarg14 Nisarg Shah added a comment -

          Okay Thanks...

          I think here the behaviour of changed code is not okay and the tests failed are unrelated to my code. So now how should I approach to solve this?

          Show
          nisarg14 Nisarg Shah added a comment - Okay Thanks... I think here the behaviour of changed code is not okay and the tests failed are unrelated to my code. So now how should I approach to solve this?
          Hide
          nisarg14 Nisarg Shah added a comment -

          What about passing a string message directly into Logger.info(). I think by this change no behaviour of code will get changed and problem can also be solved?

          Show
          nisarg14 Nisarg Shah added a comment - What about passing a string message directly into Logger.info(). I think by this change no behaviour of code will get changed and problem can also be solved?
          Hide
          nisarg14 Nisarg Shah added a comment - - edited

          In my pull request yesterday testing was being done and it passed all the tests. And a green tick was given that all tests cases passed and this commit can be build. But today tests were running once again and one error occurred which is different from the errors occurred in earlier test. Error is of Remote call on jenkinsinfra-highmem8ca11 failed

          Show
          nisarg14 Nisarg Shah added a comment - - edited In my pull request yesterday testing was being done and it passed all the tests. And a green tick was given that all tests cases passed and this commit can be build. But today tests were running once again and one error occurred which is different from the errors occurred in earlier test. Error is of Remote call on jenkinsinfra-highmem8ca11 failed
          Hide
          nisarg14 Nisarg Shah added a comment -

          I just need to verify the changes done by doing tests as I had a look on code once again and thought that the changes done in the pr looks fine just needed to get verify. Are there any specific tests needed for this through which I can get to know that whether it works or not?

           

          Show
          nisarg14 Nisarg Shah added a comment - I just need to verify the changes done by doing tests as I had a look on code once again and thought that the changes done in the pr looks fine just needed to get verify. Are there any specific tests needed for this through which I can get to know that whether it works or not?  
          Hide
          jglick Jesse Glick added a comment -

          here the behaviour of changed code is not okay

          Why do you say that?

          Error is of Remote call on jenkinsinfra-highmem8ca11 failed

          Likely a problem in CI infrastructure, not the code being tested.

          Are there any specific tests needed for this through which I can get to know that whether it works or not?

          As previously mentioned, I am not aware of any. The issue relates solely to the user interface and test coverage of the UI is low.

          Show
          jglick Jesse Glick added a comment - here the behaviour of changed code is not okay Why do you say that? Error is of Remote call on jenkinsinfra-highmem8ca11 failed Likely a problem in CI infrastructure, not the code being tested. Are there any specific tests needed for this through which I can get to know that whether it works or not? As previously mentioned, I am not aware of any. The issue relates solely to the user interface and test coverage of the UI is low.
          Hide
          nisarg14 Nisarg Shah added a comment -

          Okay. So I think I should check from where .getDisplayName() is assigned to the particular message. From that we can be sure whether this particular change will work or not.

          Show
          nisarg14 Nisarg Shah added a comment - Okay. So I think I should check from where .getDisplayName() is assigned to the particular message. From that we can be sure whether this particular change will work or not.
          Hide
          nisarg14 Nisarg Shah added a comment -

          The tests which got failed while checking got passed now, but now I am only facing some problem to verify the fix by manually creating an out of disk space event.

          Show
          nisarg14 Nisarg Shah added a comment - The tests which got failed while checking got passed now, but now I am only facing some problem to verify the fix by manually creating an out of disk space event.
          Hide
          batmat Baptiste Mathus added a comment - - edited

          Nisarg Shah are you on Windows or Linux?

          On Linux, the usual way for this is simply to use fallocate or dd, see https://stackoverflow.com/a/5688625/345845.

          On Windows, the simplest low-tech way I did that in the past is the following:

          • find a big enough file, or very big if you have a lot of free disk space
          • copy paste it as many times as needed, even using the UI and Ctrl-C Ctrl-V should be quick enough
            • i.e. if you have 10 GB of free disk space, find a big file, like 1GB, and copy-paste it like nine or ten times

          Or maybe try https://blogs.msdn.microsoft.com/oldnewthing/20150710-00/?p=45171 but didn't try it myself.

          HTH

          Show
          batmat Baptiste Mathus added a comment - - edited Nisarg Shah are you on Windows or Linux? On Linux, the usual way for this is simply to use fallocate or dd , see https://stackoverflow.com/a/5688625/345845 . On Windows, the simplest low-tech way I did that in the past is the following: find a big enough file, or very big if you have a lot of free disk space copy paste it as many times as needed, even using the UI and Ctrl-C Ctrl-V should be quick enough i.e. if you have 10 GB of free disk space, find a big file, like 1GB, and copy-paste it like nine or ten times Or maybe try https://blogs.msdn.microsoft.com/oldnewthing/20150710-00/?p=45171 but didn't try it myself. HTH
          Hide
          nisarg14 Nisarg Shah added a comment -

          Baptiste Mathus Okay, great I had a thought of copying and pasting big files and manually create an out of disk event. But actually after this I have a doubt that how to run jenkins code on local and test that fixed part?

          Show
          nisarg14 Nisarg Shah added a comment - Baptiste Mathus Okay, great I had a thought of copying and pasting big files and manually create an out of disk event. But actually after this I have a doubt that how to run jenkins code on local and test that fixed part?
          Hide
          nisarg14 Nisarg Shah added a comment - - edited

          Baptiste Mathus Jesse Glick

          Due to some issues I have deleted my previous repo and due to that my old PR got deleted.

          Here is new PR : https://github.com/jenkinsci/jenkins/pull/3874

          Show
          nisarg14 Nisarg Shah added a comment - - edited Baptiste Mathus Jesse Glick Due to some issues I have deleted my previous repo and due to that my old PR got deleted. Here is new PR :  https://github.com/jenkinsci/jenkins/pull/3874
          Hide
          nisarg14 Nisarg Shah added a comment -

          Baptiste Mathus

          Can you suggest me the command(applied on Jenkins Folder) of test through which I can check this fix?

          I have checked some flow of the code and want to check whether this fix would actually work or not.

          Show
          nisarg14 Nisarg Shah added a comment - Baptiste Mathus Can you suggest me the command(applied on Jenkins Folder) of test through which I can check this fix? I have checked some flow of the code and want to check whether this fix would actually work or not.
          Hide
          batmat Baptiste Mathus added a comment -

          Answered in the open GitHub Pull Request.

          Show
          batmat Baptiste Mathus added a comment - Answered in the open GitHub Pull Request.
          batmat Baptiste Mathus made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          nisarg14 Nisarg Shah made changes -
          Attachment Capture1.JPG [ 45958 ]
          nisarg14 Nisarg Shah made changes -
          Comment [ I have tested this fix and I am attaching the screenshot of the result obtained. !Capture1.JPG! ]
          nisarg14 Nisarg Shah made changes -
          Comment [ [~batmat]

          I have tested the fix and I am attaching the screenshot of the result obtained.

          !Capture1.JPG|thumbnail! ]
          nisarg14 Nisarg Shah made changes -
          Attachment Capture2.JPG [ 45959 ]
          nisarg14 Nisarg Shah made changes -
          Attachment Capture1.JPG [ 45960 ]
          Hide
          nisarg14 Nisarg Shah added a comment -

          Baptiste Mathus

          I have tested he fix and attaching screenshot of the output of Jenkins Window and CLI.

          And I think this fix has resolved the issue of displaying null message.

           

           

          Show
          nisarg14 Nisarg Shah added a comment - Baptiste Mathus I have tested he fix and attaching screenshot of the output of Jenkins Window and CLI. And I think this fix has resolved the issue of displaying null message.    
          batmat Baptiste Mathus made changes -
          Description When the master goes out of disk space, it is put offline by the {{hudson.node_monitors.AbstractDiskSpaceMonitor}} system.

          The message has a blank where the node name should be:

          {{Putting back online as there is enough disk space again}}


          I think this is just because {{c.getName()}} returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

          h3. Acceptance criteria

          * This should show {{master}} instead of a space as the node name
          When the master goes out of disk space, it is put offline by the {{hudson.node_monitors.AbstractDiskSpaceMonitor}} system.

          The message has a blank where the node name should be:

          {{Putting back online as there is enough disk space again}}


          I think this is just because {{c.getName()}} returns an empty string in https://github.com/jenkinsci/jenkins/blob/9d61a9a13171c94e55779d166d81599cdd0f9cb7/core/src/main/java/hudson/node_monitors/AbstractDiskSpaceMonitor.java#L43-L71.

          h3. Acceptance criteria

          * This should show {{master}} instead of a space as the node name, i.e.

          {{Putting back *master* online as there is enough disk space}}
          Hide
          batmat Baptiste Mathus added a comment -

          Nisarg Shah the goal here is to fix the message explained in the description.

          See the acceptance criteria above.
          You need to see how Jenkins reacts when you remove the big files and have enough disk space again.

          Show
          batmat Baptiste Mathus added a comment - Nisarg Shah the goal here is to fix the message explained in the description. See the acceptance criteria above. You need to see how Jenkins reacts when you remove the big files and have enough disk space again.
          nisarg14 Nisarg Shah made changes -
          Attachment Capture3.JPG [ 45977 ]
          nisarg14 Nisarg Shah made changes -
          Attachment Capture3 - Copy.JPG [ 45978 ]
          Hide
          nisarg14 Nisarg Shah added a comment -

          Baptiste Mathus

          I am attaching the screenshot of the message before the fix : 

          I have marked the issue with red marker.

          Show
          nisarg14 Nisarg Shah added a comment - Baptiste Mathus I am attaching the screenshot of the message before the fix :  I have marked the issue with red marker.
          nisarg14 Nisarg Shah made changes -
          Attachment Capture4 - Copy.JPG [ 45979 ]
          nisarg14 Nisarg Shah made changes -
          Attachment Capture2 - Copy.JPG [ 45980 ]
          nisarg14 Nisarg Shah made changes -
          Attachment image-2019-02-09-00-30-55-101.png [ 45981 ]
          nisarg14 Nisarg Shah made changes -
          Comment [ [~batmat]

          I am attaching screenshot of message after the fix.

          First photo is when there was an out of disk event detected.

          Second photo is when master node came online as the disk gets some empty space.

          And I think this fulfills the acceptance criteria.

           

          !image-2019-02-09-00-30-55-101.png|width=735,height=370!

           

          !Capture4 - Copy.JPG|width=598,height=263!

           

            ]
          nisarg14 Nisarg Shah made changes -
          Attachment Capture5 - Copy.JPG [ 45982 ]
          Hide
          nisarg14 Nisarg Shah added a comment -

          Baptiste Mathus

          I am attaching the screenshot of message after the fix.

          And I think the issue is solved as per acceptance criteria.

           

          Show
          nisarg14 Nisarg Shah added a comment - Baptiste Mathus I am attaching the screenshot of message after the fix. And I think the issue is solved as per acceptance criteria.  
          batmat Baptiste Mathus made changes -
          Status In Progress [ 3 ] In Review [ 10005 ]
          oleg_nenashev Oleg Nenashev made changes -
          Labels evergreen-triggered fosdem2019 newbie-friendly evergreen-triggered fosdem2019 lts-candidate newbie-friendly
          Hide
          oleg_nenashev Oleg Nenashev added a comment -

          The fix has been released in Jenkins 2.164. Thanks Nisarg Shah!

          Show
          oleg_nenashev Oleg Nenashev added a comment - The fix has been released in Jenkins 2.164. Thanks Nisarg Shah !
          oleg_nenashev Oleg Nenashev made changes -
          Status In Review [ 10005 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Released As Jenkins 2.164
          olivergondza Oliver Gondža made changes -
          Labels evergreen-triggered fosdem2019 lts-candidate newbie-friendly evergreen-triggered fosdem2019 newbie-friendly

            People

            • Assignee:
              nisarg14 Nisarg Shah
              Reporter:
              batmat Baptiste Mathus
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: