Redirecting stdout/stderr
File Descriptor 102
Order, Order!
Now that we are versed in the language of file descriptors we can look at some shell subtleties.
We know that IO redirections are performed left to right so that:
ls > foo > bar
means that we first open (and truncate) foo and point the stdout of ls at it, then immediately open (and truncate) bar and point stdout there instead. All of this happens before ls runs, so its output goes only to bar and foo is left empty.
A slightly wasteful open and truncation of foo but maybe that's what you meant.
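The left-to-right behaviour can be checked with a small experiment. This is only a sketch: echo stands in for ls so the output is predictable, the scratch directory and the variable names are illustrative, and a POSIX-style shell such as sh or bash is assumed (zsh with its default MULTIOS option deliberately writes to both files instead).

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Both files are opened and truncated, left to right, before echo runs;
# stdout ends up pointing at bar, the last redirection.
echo hello > foo > bar

foo_size=$(wc -c < foo | tr -d ' ')   # foo was created but never written to
bar_text=$(cat bar)                   # bar received the output

echo "foo: $foo_size bytes, bar: $bar_text"
```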
Let's do something with both stdout and stderr:
ls > foo 2>&1
Here we open foo and redirect the output of ls to it. We then dup stdout onto stderr, so stderr goes wherever stdout is currently going (hint: foo).
OK, now the famous mistake:
ls 2>&1 > foo
Rookies (not us, no sir!) expect the same thing as the previous example but that's not what happens, of course. Here, the first thing we do is dup stderr to wherever stdout is currently going (which we don't know, of course) and then we open foo and send the output of ls there. But we did nothing more with stderr while fiddling with stdout and foo: it's still going to wherever stdout was originally going (which we still don't know, nor shall we ever).
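The two orderings can be compared side by side. In this sketch the function both is a hypothetical stand-in for a command that writes to stdout and stderr, and the command substitution plays the part of "wherever stdout was originally going" so that we can, for once, see what ends up there.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# stdout moves to foo first, then stderr is duplicated from it:
# foo receives both lines.
both > foo 2>&1
foo_lines=$(wc -l < foo | tr -d ' ')

# stderr is duplicated from the original stdout first (here, the command
# substitution), then stdout alone moves to bar.
leaked=$(both 2>&1 > bar)
bar_text=$(cat bar)

echo "foo has $foo_lines lines; bar has '$bar_text'; leaked: '$leaked'"
```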
Note
There is a small possibility that stdout was already going to foo when we started and so by chance stderr will end up in the same place. But not by the design of this one command line!
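In these (suspicious?) circumstances the file descriptor associated with stdout and foo was probably not at offset zero in the file and so the > operator, which opens and truncates the file, will have an interesting effect on users' expectations. For a start any existing content will have been deleted. Secondly, the file descriptors already open on foo do not move back to offset zero [1]: each keeps its old offset, so the next write through one of them lands beyond the new end of the file and the gap before it reads back as zero bytes.
[1] | Truncating a file changes its length, not the offsets of descriptors already open on it. Only descriptors opened in append mode (>>) always write at the current end of the file. |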
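What truncation does to a descriptor that is already open can be checked with a short experiment. This is a sketch assuming a POSIX-style shell and filesystem; descriptor 3 and the scratch directory are arbitrary choices for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

exec 3> foo          # open foo on descriptor 3; its offset starts at 0
echo hello >&3       # 6 bytes written; descriptor 3 is now at offset 6
: > foo              # a separate open-and-truncate; foo is empty again
echo world >&3       # descriptor 3 still writes at its old offset, 6
exec 3>&-            # close descriptor 3

size=$(wc -c < foo | tr -d ' ')
echo "foo is $size bytes"
```

If the offset had been reset by the truncation, foo would be 6 bytes long. Instead it is 12: six zero bytes followed by the second line.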
Pipelines
Pipelines seem to mess with this order stuff. The classic example is:
ls 2>&1 | less
It looks like I have to redirect stderr before the pipeline redirects stdout!
Not so! You need to revisit how you see pipelines. A pipeline, although it clearly and obviously binds the stdout of the left hand command to the stdin of the right hand command, is unrelated to the IO redirection we've been discussing.
It would be better to think of a pipeline as doing a bit of magic before the (sub)shell starts doing anything with the command and any IO redirection. Bear in mind, too, the standing problem that as any program runs it does not know what its own stdin/stdout/stderr are connected to: the terminal, a file, /dev/null? [2] The program can ask what kind of thing a descriptor is (a terminal, a regular file, a pipe) but it cannot, in general, determine which file it is or what sits at the other end (unless it pokes about in the kernel, runs lsof or some equivalent, which is decidedly cheating).
[2] | To complement the discussion the script's own stdout could be a pipe! |
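About the most a command can portably learn is whether a descriptor is a terminal at all. The sketch below uses the standard test -t check; report_stdout is a hypothetical helper that writes its verdict to stderr so that the answer does not disturb the stdout being examined.

```shell
# Hypothetical helper: say on stderr whether stdout is a terminal.
report_stdout() {
    if [ -t 1 ]; then
        echo "stdout is a terminal" >&2
    else
        echo "stdout is not a terminal" >&2
    fi
}

report_stdout                 # depends on how the script itself was started
report_stdout > /dev/null     # not a terminal
report_stdout | cat           # not a terminal either (it is a pipe)
```

Note that the helper cannot tell the last two cases apart, let alone say what is reading from the pipe.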
Creating a pipeline is much the same. Take each of the "command plus IO redirection" blobs, shown as ... in:
... | ... | ...
they each run, like ourselves, oblivious to where their initial stdin/stdout/stderr connections are.
With that thought rattling about, our original pipeline example makes more sense. The left hand side:
ls 2>&1
is simply sending the stderr of ls to wherever stdout is currently going. For a script that is, um, wherever it is. For a command [3] in a pipeline we, the script author, happen to know that the command's stdout is one end of a pipe but the command doesn't (and can't [4]).
[3] | Technically, a subshell! |
[4] | And figuring out the details of a pipe from the kernel/lsof is messier and less useful than for a file. |
For IO redirection purposes it might be better to say:
ls 2>&1 *magic* less
where the two commands are more evidently unrelated for IO redirection purposes although clearly the *magic* is all about binding the two together.
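A small check of this view, as a sketch: both is a hypothetical command that writes one line to each of stdout and stderr, and wc -l counts what actually arrives through the pipe.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# The pipe is already in place as stdout when 2>&1 is processed,
# so both lines travel down it.
with_dup=$(both 2>&1 | wc -l | tr -d ' ')

# Without 2>&1 only stdout uses the pipe; stderr is discarded here
# purely to keep the demonstration quiet.
without_dup=$(both 2>/dev/null | wc -l | tr -d ' ')

echo "with 2>&1: $with_dup lines; without: $without_dup line"
```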
Note
As an alternative (getting further from reality but closer to pure IO redirection):
*pre-magic* ls 2>&1 *post-magic* less
where you could imagine *pre-magic* as performing > $tmp and *post-magic* as performing < $tmp. As they are left-most on the line their IO redirections are performed first and any that you supply as part of your command plus IO redirection snippet override them.
*pre-magic* and *post-magic* are far from the truth, however, as they assume that their respective commands run to completion, whereas in reality the commands can take arbitrary amounts of time to run and, in practice, the pipe itself acts as a blocking buffer.
Think of:
zcat verylargefile.gz | less
the chances are that you will quit from less long before zcat has finished expanding the file. That wouldn't be possible with *pre-magic* and *post-magic*.
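The same effect can be seen without a large file to hand. In this sketch yes stands in for a producer that never finishes on its own and head for a reader that quits early.

```shell
# yes would print "y" forever on its own; head exits after three lines,
# the pipe closes, and yes is stopped the next time it tries to write.
first_three=$(yes | head -n 3)
echo "$first_three"
```

With a temporary file in the middle this would never return, because the left hand command would have to finish first.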
Of course, you'll immediately realise that, as each side of a pipeline performs its IO redirection as though it were a standalone script, each is free to ignore the pipe: the left hand side can send its stdout elsewhere and the right hand side can take its stdin from elsewhere:
ls > foo | wc
will result in wc seeing an empty stdin as you've chosen to send the output of ls into foo.
The pipe that was the command's original stdout hasn't gone anywhere; we've just chosen not to use it. The same is true of any command that redirects its output elsewhere: whatever the original stdout was connected to is still there. In this particular case, because ls sends its output to foo and never writes to the pipe, wc reads end-of-file straight away and sees nothing at its end.
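This is easy to confirm. The sketch below uses echo in place of ls for a predictable result and counts the bytes that reach wc; the scratch directory and variable names are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# echo's stdout starts out as the pipe, then > foo points it at the file,
# so nothing is ever written to the pipe.
piped_bytes=$(echo hello > foo | wc -c | tr -d ' ')
foo_text=$(cat foo)

echo "wc saw $piped_bytes bytes; foo contains '$foo_text'"
```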
If you like race conditions you may manage to make:
ls > foo | wc < foo
do the right thing. But it's unlikely!
Note
The problem is that the shell sets up the entire pipeline and then unleashes both commands at more or less the same time; nothing makes wc wait for ls. The pipeline will probably be constructed left to right, in which case the > foo on the left creates (or truncates) a zero-length foo, and the < foo on the right then opens that still-empty file as stdin for wc. When the pipeline is unleashed, wc will usually read its stdin and hit end-of-file long before ls has had a chance to examine the current directory and print a listing.
You can get somewhere with:
ls > foo | { sleep 1; wc < foo; }
where the right hand command is a compound group command and the wc is delayed by a second. That is usually time enough for ls to do its thing, although it is still a race, just a heavily rigged one.
But be honest, what were you trying to achieve?
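If the aim was to keep a copy of the output in foo and count it at the same time (an assumption about the intent), tee does that without any race. In this sketch printf stands in for ls so that the result is predictable.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# tee writes a copy to foo and passes the same bytes down the pipe,
# so wc never has to read a half-written file.
count=$(printf 'a\nb\nc\n' | tee foo | wc -l | tr -d ' ')
saved=$(wc -l < foo | tr -d ' ')

echo "wc counted $count lines; foo holds $saved lines"
```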