Author: Joe Barr
Recap
The first step in the CLI for noobies series was “alias cat and pipe meet grep.” In it, we took a look at a few simple commands we could use to make handy CLI tools. We also learned how we could rename those tools with the alias command, and where to keep them.
The topic for week two was “file this.” It started the current mini-series on files. Episode one concentrated on commands and tools useful in dealing with files: finding them, renaming them, and so on.
Last week we continued our file exploration. In “man for hier,” we covered the basics of the Linux file system hierarchy and learned a little bit about what sorts of files live where, as well as why that matters to us.
Forward through the fog
Looking at the world through the eyes of the Linux kernel, everything is a file. Devices like modems, monitors, CD-ROMs, hard drives, even keyboards and printers are files. You (what you type on the keyboard, at least) are a file, the data that appears on your screen in response to what you’ve typed is a file, and even any error messages that might result from what you’ve typed are a file. In fact, those last three files form something of a trinity that deserves special attention: standard input, standard output, and standard error.
It’s like this, C
Linux is written (mostly) in the C language, just like Unix. In C, most programs are able to read from the keyboard and write to the console monitor thanks to a set of standard I/O definitions for three standard streams. Streams, naturally, are a kind of file. Console programs written in C use those three files (stdin, stdout, and stderr) without a second thought. Unless you specify them differently, stdin is your keyboard and stdout is your monitor. The stderr stream normally goes to your monitor as well. That’s why, grasshoppa, when you sit at a Linux console for very long, you start to feel as if you are one with the force.
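If you’d like to see those three streams as actual files, here is a quick sketch. It assumes a typical Linux system where the entries /dev/stdin, /dev/stdout, and /dev/stderr exist, and it relies on the convention that the three streams are numbered 0, 1, and 2.

```shell
# On a typical Linux box the three streams appear as entries under /dev.
ls -l /dev/stdin /dev/stdout /dev/stderr

# "test -t N" (written here as [ -t N ]) succeeds when stream number N
# is attached to a terminal. Stream 0 is stdin.
if [ -t 0 ]; then
    echo "stdin is a terminal: your keyboard"
else
    echo "stdin is coming from somewhere else"
fi
```

Typed at an ordinary console prompt, the second command should report that stdin is your keyboard. Run it on the receiving end of a pipe and it will say otherwise.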
Do you remember structured analysis and design? Probably not, but let’s take a look at a typical dataflow diagram from the stone age of IT anyway, just for the sake of illustration. The circle in the chart below represents a process: a program running on Linux. The curved lines outside the circle represent data in motion. That’s another way of saying streams. Or files.
The curve with the arrow pointing into the circle is stdin, the standard input stream. The curved line with the arrowhead pointing away from the circle is stdout. The stdout stream contains the output from the program. The stderr stream (not shown) contains any errors that might have been generated.
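To convince yourself that stdout and stderr really are two separate streams, here is a minimal sketch, assuming a Bourne-style shell such as bash. The file names are made up for illustration, so try it in a scratch directory. It uses the “>” and “2>” notations, which send stdout and stderr (stream number 2) to files.

```shell
# Make one file that exists; missing.txt deliberately does not.
echo "hello" > exists.txt
rm -f missing.txt

# cat writes the contents of exists.txt to stdout and its complaint
# about missing.txt to stderr. ">" captures stdout in out.txt and
# "2>" captures stderr in err.txt. cat also signals the failure through
# its exit status, which is what triggers the echo after "||".
cat exists.txt missing.txt > out.txt 2> err.txt || echo "cat reported a problem"

echo "--- stdout held:"
cat out.txt
echo "--- stderr held:"
cat err.txt
```

The normal output and the error message end up in different files, even though both would have landed on the same monitor if left alone.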
Now let’s do away with abstractions like dataflow diagrams and look at something concrete: one of the first commands we learned. Like this one:
cat phones.txt | grep -i steve
That command combines two processes, not just one. That gives us one more thing to talk about. In your mind’s eye, label the circle in the chart above as the program cat. As a command line program, cat expects the important things it needs to know to be passed to it on the command line. Like the name of the file(s) it is supposed to work with, for example. That would be the “phones.txt” in our example. Those command line arguments are not stdin; cat falls back to reading stdin when it is given no file names.
If that were the end of the command, cat would assume that the console is the stdout and print the contents of phones.txt there. But that’s too easy. In this example, we have linked the stdout from cat to become the stdin of grep. That’s what the pipe operator does. It pipes output from one process to another process as its input.
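Here is that pipe in action, as a small sketch you can paste into a scratch directory. The phone list is invented sample data; substitute your own phones.txt if you have one.

```shell
# Build a tiny sample phone list (names and numbers are invented).
printf '%s\n' \
    'Steve Jones 555-0101' \
    'Mary Smith 555-0102' \
    'STEVE Brown 555-0103' > phones.txt

# On its own, cat sends the whole file to stdout: your screen.
cat phones.txt

# With the pipe, cat's stdout becomes grep's stdin, so only the lines
# matching "steve" (in any mix of case, thanks to -i) reach the screen.
cat phones.txt | grep -i steve
```

The first command shows all three lines. The second shows only the two Steves, because grep filtered what cat handed it.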
The “| grep -i steve” portion of our example shows that part of grep’s input comes from the pipe and part of it from the command line. The grep command normally gets the names of the file(s) to be searched from the command line, like this:
grep -i steve phones.txt
In our case, the pipe operator provides grep with the data to be searched. But the other arguments still need to be provided on the command line itself, like the “-i” option and the search term (“steve”). Corrected: The original version incorrectly stated “So in this case, stdin is split between two sources: the pipe and command line data.” In fact, command line arguments are completely distinct from stdin.
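The difference between arguments and stdin is easy to check for yourself. This is a minimal sketch using invented sample data; run it in a scratch directory, since it writes a phones.txt.

```shell
# A two-line sample phone list (invented data).
printf '%s\n' 'Steve Jones 555-0101' 'Mary Smith 555-0102' > phones.txt

# File name given as an argument: grep opens phones.txt itself.
grep -i steve phones.txt

# No file name: grep reads stdin instead, fed here by the pipe.
cat phones.txt | grep -i steve

# Both at once: because a file name is present, grep searches the file
# and never looks at the line arriving on stdin.
echo "Steve from the pipe" | grep -i steve phones.txt
```

All three commands print the same single line from phones.txt. In the last one, the text sent down the pipe never shows up, because grep was told on the command line where to look.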
Let me redirect your attention
There are other operators besides the pipe which can be used with files in cool ways. Redirection is one example. Let’s slightly change the example above to show how this works.
cat phones.txt | grep -i steve > steve.txt
All we’ve done is add “> steve.txt” to the end. That redirects the output from grep that would normally go to stdout (your monitor) to a file named “steve.txt.” Of course there are other operators as well. The “<” operator redirects stdin so that the program gets its data from a file instead of the keyboard. Using “>>” instead of just “>” allows you to add the output of a program to the end of a file, if it already exists, instead of replacing it. As usual, there is plenty more where that came from. This is only an introduction.
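The three redirection operators can be tried side by side. What follows is a sketch with invented sample data, assuming a Bourne-style shell such as bash; use a scratch directory, because it overwrites phones.txt and steve.txt.

```shell
# Sample phone list (invented data).
printf '%s\n' \
    'Steve Jones 555-0101' \
    'Mary Smith 555-0102' \
    'Steve Brown 555-0103' > phones.txt

# ">" creates steve.txt, or replaces it if it already exists.
cat phones.txt | grep -i steve > steve.txt

# ">>" adds to the end of steve.txt instead of replacing it.
echo "Steven Green 555-0104" >> steve.txt

# "<" hands a file to a program as its stdin. grep is given no file
# name here, so it reads phones.txt through stdin.
grep -i mary < phones.txt

# Show what ended up in steve.txt.
cat steve.txt
```

After this runs, steve.txt holds the two Steves found by grep plus the line appended with “>>”. Run the “>” line again and the appended line disappears, since “>” starts the file over.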
What’s next?
Don’t worry, we’re just getting started exploring this newly discovered environment. In the future we’ll learn more commands, more handy tips, and unlock the secrets of the Linux gurus. We’re talking serious geek here, mister. Secrets like compiling your own programs with the magic mantra of “./configure, make, and make install.” We’ll also look at compiling the kernel and writing your own shell scripts. You can’t be stopped now that you’ve learned all about files.