perf: use BufWriter<StdoutLock> for all verbose/listing output#163
perf: use BufWriter<StdoutLock> for all verbose/listing output#163kaladron wants to merge 1 commit intouutils:mainfrom
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #163 +/- ##
===========================================
+ Coverage 60.83% 76.30% +15.46%
===========================================
Files 7 9 +2
Lines 789 941 +152
Branches 24 24
===========================================
+ Hits 480 718 +238
+ Misses 309 223 -86
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Replace println! in list, create, and extract operations with writeln! on a BufWriter<StdoutLock>. This acquires stdout's mutex once per operation instead of once per write call, and batches writes to reduce write(2) syscalls. Also propagates write errors (e.g. broken pipe) gracefully instead of panicking.
|
@sylvestre I'm looking at your locking comment (I didn't know that the Rust standard library still did locking even on single-threaded programs, thanks!) and I'm wondering if your experience with Coreutils gives suggestion for how far to take this. For instance, in this CL I'm wrapping all the major operations since create/extract/list could all generate a significant amount of stdio work. Is there a good threshold? I'd appreciate guidance! |
|
|
||
| // Create Builder instance | ||
| let mut builder = Builder::new(file); | ||
| let mut out = BufWriter::new(io::stdout().lock()); |
There was a problem hiding this comment.
io::Stdout is internally a LineWriter; lock() hands you a StdoutLock that writes through that line writer. Wrapping it in a BufWriter adds an 8 KB buffer above the line writer, so on a terminal the user no longer sees output line-by-line. For tar -tvf big.tar interactively, output is delayed until the buffer fills or the function returns — a noticeable regression for slow archives where users currently get streaming feedback.
The cleaner fix that keeps the lock-once perf win without breaking interactive output is to drop the BufWriter and use io::stdout().lock() directly (the line writer already coalesces within a line), or use LineWriter::new(io::stdout().lock()) explicitly. You'd still get the mutex-once benefit; you'd lose the cross-line batching for piped output, but that's much smaller than the lock contention savings.
| })?; | ||
| } | ||
|
|
||
| out.flush().map_err(TarError::Io)?; |
There was a problem hiding this comment.
Just FYI, return TarError::Io, which UError::code reports as 2, with the io error printed to stderr - is also non-conventional. GNU tar on tar tf foo.tar | head -1 exits via SIGPIPE (status 141) silently
There was a problem hiding this comment.
Thanks. Can you please file that as an issue? I think it'll get lost in here.
|
|
||
| // Create Archive instance | ||
| let mut archive = Archive::new(file); | ||
| let mut out = BufWriter::new(io::stdout().lock()); |
There was a problem hiding this comment.
We only use this inside of verbose blocks, can we avoid creating it in non-verbose mode?
It also avoids panic with |
Replace println! in list, create, and extract operations with writeln! on a BufWriter. This acquires stdout's mutex once per operation instead of once per write call, and batches writes to reduce write(2) syscalls.
Also propagates write errors (e.g. broken pipe) gracefully instead of panicking.