Author: Michael Stutz
The tools philosophy
One of the endearing qualities of Unix is its “tools” philosophy. Like the hand tools and power tools in a workshop, each of these tools is designed to perform a specific task, and do it well. For example, cat concatenates its input and spills it to the standard output, tr is a filter that translates characters of its input, grep searches for a regular expression in its input and outputs the lines that match, and who gives a listing of all the users that are on the system.
Then came pipes, a way to pass the output of one tool to the input of another. By combining the individual tools, passing the output of one into the input of the next, you could build powerful “strings” of commands whose function was not anticipated by any single tool.
Every Linux user who tries out a famous pipeline for the first time gets a live demonstration of the power of this philosophy, as with the pipeline for showing the number of users logged in to the system:
who | wc -l
Stringing tools together yields powerful results fast. In this example, just a few tools take the lines of output from some command and produce a list sorted by frequency, with each line prefixed by its number of occurrences:
some-command-with-lots-of-output | sort | uniq -c | sort -n -r
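As a small worked sketch of that pipeline, the snippet below wraps it in a shell function and feeds it a few sample lines. The function name count_by_frequency and the sample words are made up for illustration; only the sort and uniq stages come from the pipeline itself.

```sh
# Hypothetical wrapper around the frequency-counting pipeline.
count_by_frequency() {
    # sort groups identical lines, uniq -c counts each group,
    # and sort -n -r puts the largest counts first.
    sort | uniq -c | sort -n -r
}

# Sample input: "apple" appears three times, "banana" twice, "cherry" once.
printf 'apple\nbanana\napple\ncherry\napple\nbanana\n' | count_by_frequency
```

The most frequent line (apple, with a count of 3) comes out first. The amount of padding in front of each count varies between uniq implementations.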
The new tools
All of the software assembled in moreutils is GPLed, and was written by Joey Hess and others, including longtime Linux hacker Lars Wirzenius. There are nine tools in the package right now, and together they total 1,329 lines of source code, scarcely 150 lines per utility. Let’s take a look at them.
Check for valid Unicode
The isutf8 tool checks whether its input is valid UTF-8, the most popular Unicode encoding on UNIX systems:
isutf8 somefile
If the input contains invalid UTF-8, isutf8 will say so. If there is no error, it outputs nothing. Plain ASCII input also passes silently, because ASCII is a subset of UTF-8.
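On a system without moreutils, a rough stand-in for this kind of check can be sketched with iconv. This is not how isutf8 is implemented; the function name looks_like_utf8 is hypothetical, and the sketch assumes an iconv that exits with a non-zero status on undecodable input.

```sh
# Hypothetical stand-in for isutf8 built on iconv: succeeds (exit status 0)
# only if the whole file decodes as UTF-8.
looks_like_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

sample=$(mktemp)

printf 'caf\303\251\n' > "$sample"    # "café" encoded as UTF-8
looks_like_utf8 "$sample" && echo "valid UTF-8"

printf '\377\n' > "$sample"           # byte 0xFF never occurs in UTF-8
looks_like_utf8 "$sample" || echo "not valid UTF-8"

rm -f "$sample"
```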
Pick up standard input with sponge
The sponge tool takes its standard input and writes it to a given file, but it “soaks up” all of the standard input before writing anything, so you can write to the same file you are reading from without clobbering it.
This comes in handy when you want to extract certain lines from a file and then write them back to the file itself. The command below is an example of what not to do: the shell truncates myfile to set up the redirection before grep reads anything, so grep finds nothing and the file ends up empty:
grep foo myfile > myfile
To do what you intend (take all the lines of myfile containing “foo” and write those lines back to myfile, so that the file contains only those lines), use sponge in this way:
grep foo myfile | sponge myfile
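To see why soaking up the input first makes the difference, here is a minimal sketch of the same idea using only standard tools. The function name soak_to is hypothetical, and this is an illustration of the technique rather than sponge's actual implementation (it makes no attempt to preserve permissions, for example).

```sh
# Hypothetical sponge-like helper: read ALL of standard input into a
# temporary file first, and only then overwrite the target file.
soak_to() {
    target=$1
    tmp=$(mktemp) || return 1
    # cat returns only at end of input, i.e. after the upstream
    # command has finished reading the target file.
    cat > "$tmp" && cat "$tmp" > "$target"
    rm -f "$tmp"
}

# Usage mirrors the sponge example:
#   grep foo myfile | soak_to myfile
```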
Add timestamps with ts
The ts filter is a Perl script that prefixes each line of its standard input with a timestamp and a space character. This is good for logging program output:
some-program | ts > logfile
The default timestamp format is “%H:%M:%S”; you can, however, specify another format as an argument (see the man page for strftime to see what’s possible), or use it to prefix lines with any arbitrary text:
some-program | ts `hostname` > logfile
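The behavior described here can be approximated in plain shell, which may help make clear what the format argument does. The function below is a sketch under assumptions, not the ts source: the name stamp is hypothetical, and it hands the format to date rather than calling strftime directly.

```sh
# Hypothetical ts-like filter: prefix each input line with the current
# time (or with literal text, if the format contains no % sequences).
stamp() {
    fmt=${1:-%H:%M:%S}
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date +"$fmt")" "$line"
    done
}

# A fuller date-and-time format, using standard strftime sequences:
printf 'starting up\nshutting down\n' | stamp '%Y-%m-%d %H:%M:%S'
```

With the real tool, the equivalent call would be some-program | ts '%Y-%m-%d %H:%M:%S'.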
Edit directories with vidir
The vidir tool lets you edit directory and file names in your favorite text editor. It uses whatever editor you have set in the EDITOR or VISUAL environment variables; otherwise, as you’d expect, it uses vi.
With no arguments, vidir opens a listing of the current directory for editing, so you can delete or rename the files in it. As an argument you can give a directory name, a list of files, or a hyphen character, in which case vidir reads the file names from standard input and opens them for editing.
Edit pipelines with vipe
The vipe tool lets you insert an interactive text editor into a pipeline. Which editor it uses is determined the same way as with vidir.
vipe can be helpful when you want to edit some command output but also want to pass it on to other commands. Just be sure to write your changes before exiting the editor, or they won’t get passed on to the pipeline.
For example, if you want to mail the output of commands to your friend but want to preface the output with a comment, you can do it like so:
some-commands | vipe | mail pal@example.net
Merge files with combine
The combine tool is a Perl script that combines the lines from two files (or standard input), using Boolean operators, according to this table:
and   outputs lines contained in both files
or    outputs lines contained in either file
not   outputs lines contained in the first file but not the second
xor   outputs lines that are in either file but not in both
Give as arguments the first file, the operator, and the second file.
For example, here’s how you’d use it to output all of the lines that are in /tmp/passwd but not in /etc/passwd:
combine /tmp/passwd not /etc/passwd
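For a feel of what the and and not operators compute, here is a rough analog built on grep. The helper names and sample files are hypothetical, and details such as the handling of duplicate lines may differ from what combine does.

```sh
# Hypothetical helpers, not part of moreutils.
# -F: fixed strings, -x: match whole lines, -f FILE: read patterns from FILE.
lines_and() { grep -F -x -f "$2" "$1"; }       # lines of $1 also in $2
lines_not() { grep -F -x -v -f "$2" "$1"; }    # lines of $1 not in $2

dir=$(mktemp -d)
printf 'alice\nbob\ncarol\n' > "$dir/first"
printf 'bob\ndave\n' > "$dir/second"

lines_and "$dir/first" "$dir/second"    # prints: bob
lines_not "$dir/first" "$dir/second"    # prints: alice, then carol

rm -rf "$dir"
```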
Gather network interface information with ifdata
The ifdata tool gives parse-friendly output of all kinds of network interface data for a given interface, according to the following list of options:
-e    Reports interface existence via return code
-p    Print out the whole config of iface
-pe   Print out yes or no according to existence
-ph   Print out the hardware address
-pa   Print out the address
-pn   Print netmask
-pN   Print network address
-pb   Print broadcast
-pm   Print mtu
-pf   Print flags
-si   Print all statistics on input
-sip  Print # of in packets
-sib  Print # of in bytes
-sie  Print # of in errors
-sid  Print # of in drops
-sif  Print # of in fifo overruns
-sic  Print # of in compress
-sim  Print # of in multicast
-so   Print all statistics on output
-sop  Print # of out packets
-sob  Print # of out bytes
-soe  Print # of out errors
-sod  Print # of out drops
-sof  Print # of out fifo overruns
-sox  Print # of out collisions
-soc  Print # of out carrier loss
-som  Print # of out multicast
For instance, if you have a PPP connection, this command returns the network IP address:
ifdata -pN ppp0
The -pf option prints the flags for a given interface in a format that is much easier to parse than ifconfig’s, showing whether each flag is on or off. For example, ifdata -pf lo displays the flags for the loopback interface:
ifdata -pf lo
On Up
Off Broadcast
Off Debugging
On Loopback
Off Ppp
Off No-trailers
On Running
Off No-arp
Off Promiscuous
Off All-multicast
Off Load-master
Off Load-slave
Off Multicast
Off Port-select
Off Auto-detect
Off Dynaddr
Off Unknown-flags
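Because each flag arrives as a simple state and name pair, a script can test a single flag with one line of awk. The helper below is a sketch; the name flag_is_on is hypothetical, and the sample lines are copied from the ifdata -pf lo output so the snippet runs even where ifdata is not installed.

```sh
# Hypothetical helper: reads "On|Off Flag-name" pairs on standard input
# and succeeds only if the named flag is reported as On.
flag_is_on() {
    awk -v flag="$1" '$1 == "On" && $2 == flag { found = 1 } END { exit !found }'
}

# With moreutils installed you would pipe in the real output instead:
#   ifdata -pf lo | flag_is_on Loopback
printf 'On Up\nOff Broadcast\nOn Loopback\n' | flag_is_on Loopback \
    && echo "loopback flag is set"
```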
Tee to pipes with pee
The unfortunately named pee is a tee for pipes: where tee passes its standard input to both standard output and any given filenames, pee sends its standard input to any given commands:
who | pee "wc -l > lines" "wc -w > words" "wc -c > chars"
Unlike tee, pee doesn’t send to standard output, but you can effectively add that functionality by giving cat as one of the commands.
Run commands on compressed files with zrun
Finally, zrun takes a command line as an argument and uncompresses any compressed files named in it, which is useful when you want to run a command on a compressed file (.gz or .bz2) without uncompressing it yourself first. For example, to view a compressed image file with feh without having to uncompress it, you can try:
zrun feh image.bz2
Future plans
There’s work to be done: some of the tools in the moreutils package don’t support the standard options, and the documentation is still a little spotty. But it’s a start; Hess has pointed out that small tools tend to be forgotten, and that banding them together in a collection with a purpose is a good way to get them noticed and improved.
He’s also soliciting feedback on which tools might be added to the collection. Among those being considered is tmp, which would put standard input in a temporary file and pass that file to the given command; you’d use it to send standard input to a tool that can only take a file as an argument.
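The idea behind tmp can be sketched in a few lines of shell. The article does not describe the proposed tool's interface, so everything here is an assumption: the name with_tmp is hypothetical, and appending the temporary file as the last argument is just one plausible design.

```sh
# Hypothetical sketch of a tmp-like helper: save standard input to a
# temporary file, then run the given command with that file appended
# as its last argument.
with_tmp() {
    tmpfile=$(mktemp) || return 1
    cat > "$tmpfile"
    "$@" "$tmpfile"
    status=$?
    rm -f "$tmpfile"
    return "$status"
}

# wc is handed a file name here instead of reading standard input,
# so its output includes the name of the temporary file.
printf 'one\ntwo\nthree\n' | with_tmp wc -l
```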
Meanwhile, you might want to add these tools to your toolbox — they don’t take up much room, and you just never know when they might come in handy.