Mounting archives with FUSE and archivemount

Author: Ben Martin

The archivemount FUSE filesystem lets you mount a possibly compressed tarball as a filesystem. Because FUSE exposes its filesystems through the Linux kernel, you can use any application to load and save files directly into such mounted archives. This lets you use your favourite text editor, image viewer, or music player on files that are still inside an archive file. Going one step further, because archivemount also supports write access for some archive formats, you can edit a text file directly from inside an archive too.

I couldn’t find any packages that let you easily install archivemount for mainstream distributions. Its distribution includes a single source file and a Makefile.

archivemount depends on libarchive for the heavy lifting. Packages of libarchive exist for Ubuntu Gutsy and openSUSE but not for Fedora. To compile libarchive you need to have uudecode installed; my version came with the sharutils package on Fedora 8. Once you have uudecode, you can build libarchive using the standard ./configure; make; sudo make install process.

With libarchive installed, either from source or from packages, simply invoke make to build archivemount itself. To install archivemount, copy its binary into /usr/local/bin and set permissions appropriately. A common setup on Linux distributions is to have a fuse group that a user must be a member of in order to mount a FUSE filesystem. It makes sense to have the archivemount command owned by this group as a reminder to users that they require that permission in order to use the tool. Setup is shown below:

# cp -av archivemount /usr/local/bin/
# chown root:fuse /usr/local/bin/archivemount
# chmod 550 /usr/local/bin/archivemount
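
Before trying to mount anything, it can be handy to confirm that the binary is reachable and that your account is in the right group. The following is a minimal sketch, assuming the group is literally named fuse as in the setup above; check_archivemount_access is just an illustrative name, not something shipped with archivemount.

```shell
#!/bin/sh
# Report whether the current user looks able to run archivemount.
# Assumption: access is controlled by a group named "fuse".
check_archivemount_access() {
    if ! command -v archivemount >/dev/null 2>&1; then
        echo "archivemount: not found on PATH"
        return 1
    fi
    # id -nG prints the user's group names separated by spaces.
    if id -nG | tr ' ' '\n' | grep -qx fuse; then
        echo "ok: $(id -un) is in the fuse group"
    else
        echo "warning: $(id -un) is not in the fuse group"
        return 1
    fi
}

check_archivemount_access || echo "fix the item above before mounting"
```

Group changes normally take effect only after you log in again, so a warning right after being added to the group is not necessarily an error.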

To show how you can use archivemount I’ll first create a trivial compressed tarball, then mount it with archivemount. You can then explore the directory structure of the contents of the tarball with the ls command, and access a file from the archive directly with cat.

$ mkdir -p /tmp/archivetest
$ cd /tmp/archivetest
$ date >datefile1
$ date >datefile2
$ mkdir subA
$ date >subA/foobar
$ cd /tmp
$ tar czvf archivetest.tar.gz archivetest
$ mkdir testing
$ archivemount archivetest.tar.gz testing
$ ls -l testing/archivetest/
-rw-r--r-- 0 root root 29 2008-04-02 21:04 datefile1
-rw-r--r-- 0 root root 29 2008-04-02 21:04 datefile2
drwxr-xr-x 0 root root  0 2008-04-02 21:04 subA
$ cat testing/archivetest/datefile2
Wed Apr 2 21:04:08 EST 2008

Next, I’ll create a new file in the archive and read its contents back again. Notice that the first use of the tar command directly on the tarball does not show that the newly created file is in the archive. This is because archivemount delays all write operations until the archive is unmounted. After issuing the fusermount -u command, the new file is added to the archive itself.

$ date > testing/archivetest/new-file1
$ cat testing/archivetest/new-file1
Wed Apr 2 21:12:07 EST 2008
$ tar tzvf archivetest.tar.gz
drwxr-xr-x root/root 0 2008-04-02 21:04 archivetest/
-rw-r--r-- root/root 29 2008-04-02 21:04 archivetest/datefile2
-rw-r--r-- root/root 29 2008-04-02 21:04 archivetest/datefile1
drwxr-xr-x root/root 0 2008-04-02 21:04 archivetest/subA/
-rw-r--r-- root/root 29 2008-04-02 21:04 archivetest/subA/foobar
$ fusermount -u testing
$ tar tzvf archivetest.tar.gz
drwxr-xr-x root/root 0 2008-04-02 21:04 archivetest/
-rw-r--r-- root/root 29 2008-04-02 21:04 archivetest/datefile2
-rw-r--r-- root/root 29 2008-04-02 21:04 archivetest/datefile1
drwxr-xr-x root/root 0 2008-04-02 21:04 archivetest/subA/
-rw-r--r-- root/root 29 2008-04-02 21:04 archivetest/subA/foobar
-rw-rw-r-- ben/ben 29 2008-04-02 21:12 archivetest/new-file1

When you unmount a FUSE filesystem, the unmount command can return before the FUSE filesystem has fully exited. This can lead to a situation where the FUSE filesystem might run into an error in some processing but not have a good place to report that error. The archivemount documentation warns that if there is an error writing changes to an archive during unmount then archivemount cannot be blamed for a loss of data. Things are not quite as grim as they sound though. I mounted a tar.gz archive to which I had only read access and attempted to create new files and write to existing ones. The operations failed immediately with a “Read-only filesystem” message.

In an effort to trick archivemount into losing data, I created an archive in a format that libarchive can read but not write. I created archivetest.zip with the original contents of the archivetest directory and mounted it. Creating a new file worked, and reading it back was fine. As expected from the warnings in the archivemount README file, I did not see any error message when I unmounted the zip file. However, attempting to view the manifest of the zip file with unzip -l failed. It turns out that my archivemount operations had rewritten archivetest.zip as an uncompressed POSIX tar archive while keeping the .zip name. Using tar tvf I saw that the manifest of this tar archive listed the original contents plus the new file that I had created. There was also an archivetest.zip.orig file, which was in zip format and held the contents of the zip archive as they were when I mounted it with archivemount.
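
Because the file name no longer tells you what is inside, one way to notice such a silent format change is to look at the file's magic bytes after unmounting. The sketch below is an illustration only: archive_kind is a hypothetical helper, and checking magic numbers is a generic technique rather than anything archivemount provides. It recognises only zip, gzip, and tar archives that carry the "ustar" marker.

```shell
#!/bin/sh
# Guess an archive's container format from its magic bytes.
archive_kind() {
    # First four bytes as hex, e.g. "504b0304" for a zip local file header.
    head4=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$head4" in
        504b0304|504b0506) echo zip;  return ;;
        1f8b*)             echo gzip; return ;;
    esac
    # POSIX and GNU tar headers store "ustar" at byte offset 257.
    magic=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null || true)
    if [ "$magic" = "ustar" ]; then
        echo tar
    else
        echo unknown
    fi
}

# Example: inspect the archive from the experiment above, if it exists.
if [ -f archivetest.zip ]; then
    echo "archivetest.zip is now: $(archive_kind archivetest.zip)"
fi
```

Comparing the result before mounting and after unmounting shows whether the container format changed underneath an unchanged file name.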

So it turns out to be fairly tricky to get archivemount to lose data. Writing to a read-only archive file failed cleanly without losing anything, and neither did modifying an archive in a format that libarchive could only read, though in the latter case you will have to contend with the archive format being silently changed. One other situation could potentially trip you up: Because archivemount creates a new archive at unmount time, you should make sure that you will not run out of disk space where the archives are stored.
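
A rough pre-unmount check for that last point might look like the sketch below. It assumes, as a guess rather than a documented rule, that having twice the archive's current size free on its filesystem is a reasonable margin, since the rewritten archive can be larger than the original (for example a zip file rewritten as an uncompressed tar). The name has_room_for_rewrite is hypothetical.

```shell
#!/bin/sh
# Succeed if the filesystem holding the archive has at least twice the
# archive's current size free. The factor of two is an assumed margin.
has_room_for_rewrite() {
    archive=$1
    dir=$(dirname "$archive")
    # Archive size rounded up to whole kilobytes.
    size_kb=$(( ( $(wc -c < "$archive") + 1023 ) / 1024 ))
    # With df -P, the fourth column of the second line is the available space.
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge $(( 2 * size_kb )) ]
}

# Example: warn before unmounting if space looks tight.
if [ -f archivetest.tar.gz ] && ! has_room_for_rewrite archivetest.tar.gz; then
    echo "warning: low disk space, the rewritten archive may not fit" >&2
fi
```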

To test archivemount’s performance, I used the bonnie++ filesystem benchmark version 1.03. Because archivemount holds off updating the actual archive until the filesystem is unmounted, you will get good performance when accessing and writing to a mounted archive. As shown below, when comparing the use of archivemount on an archive file stored in /tmp to direct access to a subdirectory in /tmp, the random seek rate for archivemount was roughly half that of direct access (4,457 versus 8,322 seeks per second), and you can expect about 70% of the performance of direct access when using archivemount for rewriting. The bonnie++ documentation explains that for the rewrite test, a chunk of data is read, dirtied, and written back to a file, and this requires a seek, so archivemount’s slower seek performance likely causes this benchmark to be slower as well.

$ cd /tmp
$ mkdir empty
$ ls -d empty | cpio -ov > empty.cpio
$ mkdir empty-mounted
$ archivemount empty.cpio empty-mounted
$ mkdir bonnie-test
$ /usr/sbin/bonnie++ -d /tmp/bonnie-test
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
v8tsrv           2G 14424  25 14726   4 13930   6 28502  49 52581  17  8322 123
$ /usr/sbin/bonnie++ -d /tmp/empty-mounted
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
v8tsrv           2G 12016  19 12918   7  9766   6 27543  40 52937   6  4457  24
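
The relative figures can be recomputed from the two bonnie++ reports above. This small sketch simply divides the archivemount result by the direct-access result for the rewrite and random seek columns; the ratio helper is an illustrative name.

```shell
#!/bin/sh
# Express the archivemount figure as a percentage of the direct-access figure.
ratio() {
    awk -v mounted="$1" -v direct="$2" \
        'BEGIN { printf "%.0f%%\n", 100 * mounted / direct }'
}

echo "rewrite: $(ratio 9766 13930)"      # prints "rewrite: 70%"
echo "random seeks: $(ratio 4457 8322)"  # prints "random seeks: 54%"
```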

When you want to pluck a few files out of a tarball, archivemount might be just the command for the job. Instead of expanding the archive into /tmp just to load a few files into Emacs, just mount the archive and run Emacs directly on the archivemount filesystem. As the bonnie++ benchmarks above show, an application using an archivemount filesystem does not necessarily suffer a performance hit.
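
The mount, use, and unmount cycle can be wrapped in a small shell function. This is only a sketch under a few assumptions: with_archive is a hypothetical name, the mount point is a throwaway directory from mktemp, and the command you pass receives that directory as its last argument. It uses nothing beyond the archivemount and fusermount -u invocations shown earlier.

```shell
#!/bin/sh
# Mount an archive on a temporary directory, run one command with that
# directory appended as its last argument, then unmount again.
with_archive() {
    archive=$1
    shift
    mnt=$(mktemp -d) || return 1
    if ! archivemount "$archive" "$mnt"; then
        rmdir "$mnt"
        return 1
    fi
    "$@" "$mnt"
    status=$?
    # Changes are written back to the archive only at unmount time.
    fusermount -u "$mnt" && rmdir "$mnt"
    return "$status"
}
```

For example, with_archive archivetest.tar.gz ls -l lists the top level of the archive, and with_archive archivetest.tar.gz emacs opens the mounted tree in Emacs. Keep in mind the earlier caveat that the unmount command can return before archivemount has finished rewriting the archive.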
