Squish, Bang, Zip… So You Want to Archive and Compress/Decompress Files?


Let’s continue now with our series of articles introducing new Linux Adventurers to the Linux shell, the command line, and command line editing.

Sometimes, there is a need to stuff a bunch of files into an archive and compress it. This is often the case when one wants to send a bunch of large files to someone else via email or host them on a server. This practice came about in the old days of computing when disk space was at a premium. Even though disk space is now huge and cheap, bandwidth can still be tight in some cases, so archiving and compressing files is still an efficient way to store them on your system or move them across the Internet.

Let’s say our old pal Joe has three files that he wants to shove into an archive for storage on his own system. His files are text1.txt, joemama.txt, and umeume.txt. Here’s how Joe would archive these three files using the tar command. Tar stands for tape archive. It’s been around since Noah ran Slackware on the Ark.

joe@mysystem:~$ tar -cvf joesstuff.tar text1.txt joemama.txt umeume.txt

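Here’s what the above command does… it takes the three files listed at the end of the command and stuffs them into an archive file called joesstuff.tar (-c = create, -v = verbose, -f = read from or write to the archive file named right after it). The three original files stay right where they were; tar only copies them into the archive.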
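Want to peek inside an archive without unpacking it? Here’s a minimal sketch, assuming a scratch directory made with mktemp and some throwaway stand-ins for Joe’s files (the workdir and listing names are just made up for illustration). It leaves off -v to keep the output quiet and uses tar’s -t option to list what’s inside:

```shell
# Work in a scratch directory so nothing real gets touched.
workdir=$(mktemp -d)
cd "$workdir"

# Throwaway stand-ins for Joe's three files.
echo "first file"  > text1.txt
echo "second file" > joemama.txt
echo "third file"  > umeume.txt

# -c = create, -f = the archive file to write.
tar -cf joesstuff.tar text1.txt joemama.txt umeume.txt

# -t = list the archive's contents without extracting anything.
listing=$(tar -tf joesstuff.tar)
echo "$listing"
```

Listing first is a handy habit before you extract someone else’s archive, since it shows exactly which file names are about to land in your directory.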

OK, so far? Alright then… what if Joe’s three files were pretty big and he wants to send them to his buddies Bill, Mary, and Tom via his online server? Well, in that case Joe would want to compress his new archive using a compression application. Remember the old WinZip program in Windows? Well, Linux has some similar applications to squish files into smaller packages. Two that we’ll talk about here are bzip2 and gzip.

So, Joe wants to compress his new archive file with bzip2 for starters. Here’s how he does it:

joe@mysystem:~$ bzip2 -v joesstuff.tar

Now when Joe lists (ls) the contents of the directory he’s working in, he will find that his original joesstuff.tar has been replaced by joesstuff.tar.bz2. Cool, huh? The size will be smaller, too, because the archive has been compressed. Joe could now upload the file to his online server to share with his friends.
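If you’d like to see the squish for yourself, here’s a rough sketch. It assumes bzip2 is installed and uses a made-up, very repetitive sample file (numbers.txt and sample.tar are illustration names only). The -k option tells bzip2 to keep the original archive around so the two sizes can be compared:

```shell
# Scratch directory and a repetitive sample file (names are made up).
workdir=$(mktemp -d)
cd "$workdir"
seq 1 5000 > numbers.txt
tar -cf sample.tar numbers.txt

# -k = keep sample.tar; without it, bzip2 replaces the original file.
bzip2 -k sample.tar

# Byte counts of the plain and compressed archives.
tar_size=$(( $(wc -c < sample.tar) ))
bz2_size=$(( $(wc -c < sample.tar.bz2) ))
echo "sample.tar:     $tar_size bytes"
echo "sample.tar.bz2: $bz2_size bytes"
```

How much smaller the file gets depends entirely on what’s in it. Plain text usually shrinks a lot; files that are already compressed (JPEGs, MP3s, and the like) barely shrink at all.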

Another compression method that Joe could use is gzip. Here’s how Joe would do that:

joe@mysystem:~$ gzip -v joesstuff.tar

If Joe had used gzip instead of bzip2 to compress his archive, it would now be called joesstuff.tar.gz.
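Here’s a small sketch of two other gzip options that come in handy, again using a made-up sample archive in a scratch directory. The -c option sends the compressed data to standard output, so the original file is left alone, and -l reports the compressed size, uncompressed size, and ratio:

```shell
# Scratch directory and a repetitive sample file (names are made up).
workdir=$(mktemp -d)
cd "$workdir"
seq 1 5000 > numbers.txt
tar -cf sample.tar numbers.txt

# -c = write to standard output; sample.tar itself is left untouched.
gzip -c sample.tar > sample.tar.gz

# -l = list compressed size, uncompressed size, and compression ratio.
gzip -l sample.tar.gz
```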

Now let’s say that Mary downloaded the joesstuff.tar.bz2 to her computer and wanted to actually see/read the files it contained. She would have to decompress the archive first. She would do this from her command line with this command:

mary&larry@home:~$ bunzip2 joesstuff.tar.bz2

This command would “unzip” (decompress) the package leaving Mary with Joe’s original archive --> joesstuff.tar. If Joe had put the archive on his server as a gzip compressed archive, Mary would use this command instead to unzip it:

mary&larry@home:~$ gunzip joesstuff.tar.gz
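Does decompressing really give back the exact same archive? Here’s a quick round-trip check, sketched with gzip and a made-up sample archive (reference.tar and roundtrip are illustration names). gunzip does the same job as gzip -d, just as bunzip2 does the same job as bzip2 -d:

```shell
# Scratch directory and a sample archive (names are made up).
workdir=$(mktemp -d)
cd "$workdir"
seq 1 5000 > numbers.txt
tar -cf sample.tar numbers.txt
cp sample.tar reference.tar     # keep a copy to compare against

gzip sample.tar                 # sample.tar becomes sample.tar.gz
gunzip sample.tar.gz            # and back again to sample.tar

# cmp -s is silent and succeeds only if the files match byte for byte.
if cmp -s sample.tar reference.tar; then
    roundtrip="identical"
else
    roundtrip="different"
fi
echo "$roundtrip"
```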

Either way, she’d still end up with Joe’s original archive --> joesstuff.tar. Now though, she will need to unpack the archive in order to read the individual files. She can unpack the tar archive this way:

mary&larry@home:~$ tar -xvf joesstuff.tar

Now, when Mary lists (ls) the contents of her working directory, she would see the original joesstuff.tar and the three files that she just extracted (-x) from the archive.

mary&larry@home:~$ ls

joesstuff.tar

text1.txt

joemama.txt

umeume.txt
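By the way, the two-step dance above can usually be done in one go. This is a sketch, assuming a tar that supports the -z (gzip) and -j (bzip2) options, as GNU tar does; the unpacked directory name is made up for illustration:

```shell
# Scratch directory with stand-ins for Joe's three files.
workdir=$(mktemp -d)
cd "$workdir"
echo "first file"  > text1.txt
echo "second file" > joemama.txt
echo "third file"  > umeume.txt

# -z = run the archive through gzip while creating it (-j would use bzip2).
tar -czf joesstuff.tar.gz text1.txt joemama.txt umeume.txt

# Decompress and unpack in one step; -C says which directory to extract into.
mkdir unpacked
tar -xzf joesstuff.tar.gz -C unpacked
ls unpacked
```

Extracting into its own directory with -C keeps the unpacked files from mixing with (or overwriting) whatever is already in the working directory.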

That’s it, folks. Those are the basics of archiving and compressing/decompressing files in Linux using the command line tools tar, bzip2, and gzip. Be sure to refer to the man pages for these commands to get some more good information. Remember how to get to the man pages? You can either read them online at places like linuxmanpages.com or you can access them right there on your own computer using the command line by entering:

you@your_computer:~$ man tar

TAR(1)                    BSD General Commands Manual                   TAR(1)

NAME
tar — The GNU version of the tar archiving utility

SYNOPSIS
tar [-] A --catenate --concatenate | c --create | d --diff --compare |
--delete | r --append | t --list | --test-label | u --update | x
--extract --get [options] [pathname ...]

DESCRIPTION
Tar stores and extracts files from a tape or disk archive.

The first argument to tar should be a function; either one of the letters
Acdrtux, or one of the long function names.  A function letter need not
be prefixed with "-", and may be combined with other single-letter
options.  A long function name must be prefixed with --…

I hope you’ve learned something from this lesson.

Later…

~Eric

*This article originally appeared on my Nocturnal Slacker v1.0 site at WordPress.com