Jenkins / JENKINS-53401

Random FileNotFoundException when creating lots of agents in parallel threads



      Description

      Upon creating lots of agents in parallel (cloud-provisioned containers), I sometimes see random exceptions reported while moving temporary files to the node's config.xml.

      Also:   java.nio.file.NoSuchFileException: /var/jenkins_home/nodes/myagent-5pr7b/atomic4488666319135941520tmp -> /var/jenkins_home/nodes/myagent-5pr7b/config.xml
      		at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
      		at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
      		at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
      		at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
      		at java.nio.file.Files.move(Files.java:1395)
      		at hudson.util.AtomicFileWriter.commit(AtomicFileWriter.java:191)
      java.nio.file.NoSuchFileException: /var/jenkins_home/nodes/myagent-5pr7b/atomic4488666319135941520tmp
      	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
      	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
      	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
      	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
      	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
      	at java.nio.file.Files.move(Files.java:1395)
      	at hudson.util.AtomicFileWriter.commit(AtomicFileWriter.java:206)
      	at hudson.XmlFile.write(XmlFile.java:198)
      	at jenkins.model.Nodes.save(Nodes.java:289)
      	at hudson.util.PersistedList.onModified(PersistedList.java:173)
      	at hudson.util.PersistedList.replaceBy(PersistedList.java:85)
      	at hudson.model.Slave.<init>(Slave.java:198)
      	at hudson.slaves.AbstractCloudSlave.<init>(AbstractCloudSlave.java:51)
      	at org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave.<init>(KubernetesSlave.java:116)
      	at org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave$Builder.build(KubernetesSlave.java:408)
      	at com.cloudbees.jenkins.plugins.kube.PlannedKubernetesSlave.call(PlannedKubernetesSlave.java:122)
      	at com.cloudbees.jenkins.plugins.kube.PlannedKubernetesSlave.call(PlannedKubernetesSlave.java:35)
      	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
      	at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      

      I tracked the root cause to the nodeProperties field in hudson.model.Slave.

      If a lot of agents are created from different threads, each thread ends up calling Jenkins.get().getNodesObject().save(). This method is not thread-safe and rewrites the storage for all nodes, so in some threads save() throws an exception because the same node has already been processed by another thread.
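      To make the fan-out concrete, here is a minimal, self-contained sketch (plain Java, not the actual Jenkins code; SharedNodesStore and ParallelAgentCreation are invented names for illustration): every per-node change triggers a save of all nodes from whichever thread made the change, so parallel agent creation rewrites the same config.xml files concurrently.

      import java.io.IOException;
      import java.nio.charset.StandardCharsets;
      import java.nio.file.Files;
      import java.nio.file.Path;
      import java.nio.file.StandardCopyOption;
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;

      // Stand-in for a shared nodes store whose save() rewrites every node's config.xml.
      class SharedNodesStore {
          final Map<String, String> nodes = new ConcurrentHashMap<>();
          final Path root;
          SharedNodesStore(Path root) { this.root = root; }

          // Not synchronized: concurrent callers rewrite the same files at the same time.
          void save() throws IOException {
              for (Map.Entry<String, String> e : nodes.entrySet()) {
                  Path dir = root.resolve(e.getKey());
                  Files.createDirectories(dir);
                  Path tmp = Files.createTempFile(dir, "atomic", "tmp");
                  Files.write(tmp, e.getValue().getBytes(StandardCharsets.UTF_8));
                  Files.move(tmp, dir.resolve("config.xml"), StandardCopyOption.REPLACE_EXISTING);
              }
          }
      }

      public class ParallelAgentCreation {
          public static void main(String[] args) throws Exception {
              SharedNodesStore store = new SharedNodesStore(Files.createTempDirectory("nodes"));
              ExecutorService pool = Executors.newFixedThreadPool(8);
              for (int i = 0; i < 200; i++) {
                  String name = "agent-" + i;
                  pool.submit(() -> {
                      store.nodes.put(name, "<slave/>");
                      store.save();   // every agent creation saves *all* nodes from its own thread
                      return null;
                  });
              }
              pool.shutdown();
              pool.awaitTermination(1, TimeUnit.MINUTES);
          }
      }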

      In JENKINS-31055, Stephen made Node implement Saveable, so the persisted lists should be tied to the node itself instead of the Nodes object. The corresponding save() operation is fine-grained, which would avoid this issue entirely.
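      For comparison, here is a hedged sketch of that fine-grained direction (illustrative only, not the actual patch; Saveable below only mirrors the role of hudson.model.Saveable, and SimplePersistedList/NodeStandIn are invented names): when the persisted list is owned by the node itself, a change to one node's properties saves only that node's config.xml.

      import java.io.IOException;
      import java.nio.charset.StandardCharsets;
      import java.nio.file.Files;
      import java.nio.file.Path;
      import java.nio.file.StandardCopyOption;
      import java.util.ArrayList;
      import java.util.Collection;
      import java.util.List;

      // Mirrors the role of hudson.model.Saveable.
      interface Saveable {
          void save() throws IOException;
      }

      // Simplified stand-in for a persisted list: notifies its owner when modified.
      class SimplePersistedList<T> {
          private final List<T> data = new ArrayList<>();
          private final Saveable owner;
          SimplePersistedList(Saveable owner) { this.owner = owner; }
          synchronized void replaceBy(Collection<? extends T> items) throws IOException {
              data.clear();
              data.addAll(items);
              owner.save();   // fine-grained: only the owning node gets saved
          }
      }

      // Stand-in for a Node that implements Saveable and owns its own property list.
      class NodeStandIn implements Saveable {
          final Path dir;
          final SimplePersistedList<String> nodeProperties = new SimplePersistedList<>(this);
          NodeStandIn(Path dir) { this.dir = dir; }

          @Override public void save() throws IOException {
              Files.createDirectories(dir);
              Path tmp = Files.createTempFile(dir, "atomic", "tmp");
              Files.write(tmp, "<slave/>".getBytes(StandardCharsets.UTF_8));
              // Only this node's config.xml is touched by this save().
              Files.move(tmp, dir.resolve("config.xml"), StandardCopyOption.REPLACE_EXISTING);
          }
      }

      With this ownership, a call like new NodeStandIn(dir).nodeProperties.replaceBy(List.of("prop")) writes only that node's directory, so parallel creation of other nodes cannot interfere with the temp-file-to-config.xml move.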

        Attachments

          Issue Links

            Activity

            nuzz Matt Nuzzaco added a comment -

            This sounds very similar to what I was seeing in a few heavily parallelized jobs. We can easily kick off 200-500 agents in a very short period of time. I've tested v2.143 and so far I haven't seen the failure we were seeing before. Crossing fingers this was the solution. Thanks for the patch.

            danielbeck Daniel Beck added a comment -

            Addressed in 2.143.


              People

              • Assignee:
                vlatombe Vincent Latombe
                Reporter:
                vlatombe Vincent Latombe
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: