I took a look at this more closely.
Change under consideration
The original PR #2302 attempted by Nicolas De Loof tries to change the encoding of the console log file that the master keeps. It is currently written in the encoding of the build agent that runs the build, and the PR attempted to change this to UTF-8.
One-off-executor plugin would like to launch a new build agent synchronously from within Executor that's carrying out a build (such as creating a new container on Kubernetes.) In general, such an activity might produce some logs that logically should go to Run.getLogFile().
Run.execute() currently determines (A) the platform encoding of the build agent, then creates (B) a log file and starts writing. This ordering is problematic for the OOE plugin because (C) the the launch activitiy needs to happen to be able to determine (A), and (C) requires (B) to be able to log any progress/failure during the launch. So there's cyclic dependency between (A)->(B)->(C)->(A).
Upon closer look, I don't think such encoding change is feasible without various regressions, and here is why; Run.charset has an intimate relationship with the encoding of Run.getLogFile(). (D) Too many code makes an assumption that Run.getLogFile() is encoded in Run.charset (correctly as of today), such as ConsoleCommand.run(). And likewise, (E) too many code makes an assumption that TaskListener.getLogger() receives text encoded in Run.charset (also correctly as of today), such as Maven.perform() where MavenConsoleAnnotator filters log output. In the world where logs are stored in UTF-8, the encoding in which Run.getLogFile() is written vs the encoding of bytes that stream through TaskListener.getLogger() will be different, and this will break (D) or (E). The only way out of (D) vs (E) dilemma is to use UTF-8 for everything from TaskListener.getLogger() to Run.getLogFile() by pushing re-encoding (from the platform encoding of the build agent to UTF-8) into Launcher (for example by tweaking the contract that ProcStarter.stdout deals with UTF-8), but doing this now creates IMO undeisrable interaction between Launcher and Executor, and also breaks an use case where Launcher is used to fork off a process whose I/O is not text-based (such as java -jar slave.jar)
Given the difficulty in the original attempted fix, my suggestion to this problem is as follows:
(F) Define RunListener.onInitialize(Run) that is fired in Run.execute() before Computer.getDefaultCharset() is called to provide the OOE plugin an opportunity to perform a launch and avoid relying on a fragile assumption that Computer.getDefaultCharset() is the point in which provisioning has to happen.
(G) Run.execute() will open log file with the append flag, so that the activity in (F) can open and write to Run.getLogFile() without getting overwritten. Because of the (A)->(B)->(C)->(A) cycle above, in general we cannot be certain that (H) the encoding in which the launch activity is written and (A) are identical, but I think this is the least of the evil because:
- In a homogeneous environment where the platform encoding of the master & agents are the same, this is not a problem
- OOE plugin can stick to ASCII charset to reduce the likelihood of this problem
- There might be out-of-the-band information to help plugins like OOE determine the eventual platform encoding of the build agent. For example, in CJK countries, Linux servers have been traditionally run with UTF-8. Docker containers appear to be mostly packaged with the "POSIX" locale (meaning ASCII), and UTF-8 at best (source), so for Docker slave plugin, it can write the log file in UTF-8 to minimize the encoding mismatch problem between (A) and (H).
I'm going to write a PR based on this suggested solution.