I don't know that this is limited to the EC2 plugin. I'm seeing a similar issue with a simple Linux JNLP agent. The first job that runs on the agent takes considerably longer than it normally should. Here's an image that shows my results.
Builds 31 and 36 are the first builds after the agent was rebooted. Each build performs essentially the same operations:
- sh 'env'
- sh w/simple multi-line command (pwd, ls -al, for loop with print/sleep)
- writeFile the multi-line command to disk to be used as input for sshScript
- sshScript to a remote instance and execute the same multi-line command
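For reference, the pipeline I'm running looks roughly like the following sketch. The agent label, remote host, user, and key path are placeholders, not my actual values, and the multi-line command is an approximation of what I described above:

```groovy
// Hypothetical reconstruction of the job described above.
// Labels, hostnames, credentials, and paths are placeholders.
pipeline {
    agent { label 'jnlp-linux' }  // placeholder label for the JNLP agent

    stages {
        stage('Env') {
            steps {
                sh 'env'
            }
        }
        stage('Multi-line sh') {
            steps {
                sh '''
                    pwd
                    ls -al
                    for i in 1 2 3; do echo "iteration $i"; sleep 1; done
                '''
            }
        }
        stage('Remote script') {
            steps {
                // Write the same commands to disk for use by sshScript
                writeFile file: 'script.sh', text: '''pwd
ls -al
for i in 1 2 3; do echo "iteration $i"; sleep 1; done
'''
                // sshScript from the SSH Steps plugin; the remote map is a placeholder
                sshScript remote: [name: 'target', host: 'remote-host.example.com',
                                   user: 'ec2-user', identityFile: '/path/to/key',
                                   allowAnyHosts: true],
                          script: 'script.sh'
            }
        }
    }
}
```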
The main difference between the steps is where they run: the first two run on the master node, while the sshScript step runs from a remote JNLP agent.
For builds 31 and 36, the execution timings show that it takes almost 20 seconds for a 'sh' step to be loaded and started. The 'sshScript' step that follows takes about three minutes from the time the prior 'sh' step completes until any output is logged. Under normal circumstances, each of these operations logs some form of activity within about two seconds.
Observing the output of 'top' and checking CloudWatch metrics for the instance, I don't see high resource usage or anything else that would explain why the first job after a reboot suffers such poor performance.