There are two Linux file systems that
continually prove to be confusing stumbling blocks to new Linux
users. These two directories, /proc and /dev, have no Windows
counterpart and are not easily understandable at first glance. They
are, however, powerful tools for understanding and using Linux.
This article is a walk-through of the
device (/dev) and process (/proc) file systems. It will explain what they are, how they
work, and how they are used in practice.
/dev: A file system of devices
Devices: In Linux, a device is any piece of equipment (or
code that emulates equipment) that provides methods for performing
input or output (IO). For example, a keyboard is an input device. A
hard disk is an input (read) and output (write) device. In Linux, most devices are
represented as files in the file system (network cards are an
exception). These special files are stored in a common place, /dev,
where they are easily accessible to processes that need to perform
IO-related tasks.
Devices roughly fall into two categories:
character devices and block devices. Character devices deal with IO
one character at a time. The most obvious example is a
keyboard, where every key generates a character on the device. The
mouse is another. Every motion or button click sends a character to
the /dev/mouse device.
Block devices read data in larger chunks. Data
storage devices, such as IDE hard drives (/dev/hd), SCSI hard drives
(/dev/sd), and CD-ROMs (/dev/cdrom) are block devices. IO
interactions with block devices transact with chunks of data, which
allows large quantities of data to be moved back and forth more
efficiently.
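To check which category a given device file falls into, the shell's test command offers -c (character device) and -b (block device). The sketch below is only an illustration; dev_type is a made-up helper name, not a standard utility.

```shell
# Classify a path as a character device, a block device, or neither.
# dev_type is a hypothetical helper name used only for this example.
dev_type() {
    if [ -c "$1" ]; then
        echo "character"
    elif [ -b "$1" ]; then
        echo "block"
    else
        echo "other"
    fi
}

dev_type /dev/null    # a character device on a typical Linux system
dev_type /tmp         # an ordinary directory, so neither kind
```

The same information appears as the first letter of each line printed by "ls -l /dev": c for character devices and b for block devices.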
Device names: Devices are often named after the equipment they
represent. Devices named /dev/fb represent the frame buffers for
graphics. /dev/hd devices represent IDE hard disks. In some cases,
symbolic links are employed to clarify what a device is, for example
/dev/mouse, the device representing a mouse, may be linked to a
serial, USB, or PS/2 device, depending on the physical hardware. The
symbolic link makes it easier for both humans and machines to figure
out which device is the mouse.
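To find out what a link such as /dev/mouse really points to, readlink can follow it. This is a small sketch: resolve_dev is a hypothetical helper, and /dev/mouse may not exist at all on a given machine, so the example checks first.

```shell
# Print the real device file behind a symbolic link such as /dev/mouse.
# resolve_dev is a hypothetical helper; readlink -f follows every link
# in the chain and prints the final path.
resolve_dev() {
    readlink -f "$1"
}

# /dev/mouse is not guaranteed to exist, so check before resolving it.
if [ -L /dev/mouse ]; then
    resolve_dev /dev/mouse
else
    echo "/dev/mouse is not a symbolic link on this system"
fi
```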
In some cases, there may be multiple devices of
the same type. For instance, a machine may have two ATAPI CD-ROM
drives. Two device files will be used, one for each drive:
/dev/cdrom0 will be the first CD-ROM, and /dev/cdrom1 will be the
second.
The naming gets a little more confusing in cases
like hard disks. A hard disk device name is composed of the type of
disk followed by the position of the disk, and then the disk
partition. The first hard disk may be called /dev/hda, with the “hd”
part indicating that it is an IDE hard disk, and the “a” part
indicating that it is the first hard disk. /dev/hdb would then refer
to the second hard disk. Each hard disk is divided into partitions.
The first partition on the first hard drive would be /dev/hda1, where
the one at the end indicates the location of the partition. Note that
where some devices (like /dev/cdrom0) may begin the index from 0,
devices with partitions typically begin from 1. So a listing of all
of the partitions on the two IDE hard drives on my computer might
look like this:
/dev/hda /dev/hda1 /dev/hda2 /dev/hda3 /dev/hda4 /dev/hdb /dev/hdb1 /dev/hdb2 /dev/hdb3
SCSI hard disks use /dev/sd instead of /dev/hd,
but otherwise the convention is the same. /dev/sda1 refers to the
first partition on the first SCSI hard disk.
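The naming convention is regular enough to take apart mechanically. The sketch below splits a name into its three parts; parse_disk_name is a hypothetical helper, and it assumes only the hd/sd scheme described above (one letter for the disk, digits for the partition).

```shell
# Split a name such as /dev/hdb3 into disk type, disk letter, and partition.
# parse_disk_name is a hypothetical helper that only understands the
# hd/sd convention described in the text.
parse_disk_name() {
    name=${1#/dev/}                              # strip the leading /dev/
    type=$(printf '%s' "$name" | cut -c1-2)      # hd or sd
    disk=$(printf '%s' "$name" | cut -c3)        # a, b, c, ...
    part=$(printf '%s' "$name" | cut -c4-)       # empty for the whole disk
    case "$type" in hd|sd) ;; *) echo "unrecognized: $1"; return 1 ;; esac
    case "$disk" in [a-z]) ;; *) echo "unrecognized: $1"; return 1 ;; esac
    case "$part" in *[!0-9]*) echo "unrecognized: $1"; return 1 ;; esac
    echo "type=$type disk=$disk partition=${part:-none}"
}

parse_disk_name /dev/hda     # the whole first IDE disk
parse_disk_name /dev/hdb3    # third partition on the second IDE disk
```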
Special Devices: There are a few special devices that come in
useful every once in a while: /dev/null, /dev/zero, /dev/full, and
/dev/random.
The null device, /dev/null, is sort of the “trash”
device. Put simply, things that go in never come out. Programs
often generate output that nobody needs to see. Shell scripts commonly
redirect such output to /dev/null so that the user does not have to
read what the utilities they call print. The example below
inserts a kernel module and sends the
output from modprobe to /dev/null.
$ modprobe cipher-twofish > /dev/null
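Note that a plain ">" only redirects standard output; error messages travel on standard error and still reach the terminal. A script that wants a command to be completely silent usually redirects both. The sketch below wraps that idiom in a hypothetical helper called quiet.

```shell
# Run a command with both standard output and standard error discarded.
# quiet is a hypothetical helper; the command's exit status is preserved,
# so the caller can still tell whether it worked.
quiet() {
    "$@" > /dev/null 2>&1
}

if quiet ls /; then
    echo "ls succeeded, and its output was thrown away"
fi
```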
Closely related to /dev/null is /dev/zero. Like
/dev/null, it can be used to dump unwanted data, but reads from
/dev/zero return an endless stream of null (\0) characters, whereas
reads from /dev/null immediately return end-of-file. For this reason,
/dev/zero is commonly used to create zero-filled files of a given size.
dd if=/dev/zero of=/my-file bs=1k count=100
The command above will create a file 100k
in size, full of null characters.
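It is easy to confirm that claim. The sketch below writes to a temporary file instead of /my-file, then checks the size and counts any bytes that are not null; make_zero_file and count_nonzero_bytes are hypothetical helper names.

```shell
# Create a file of null bytes: make_zero_file PATH SIZE_IN_KIB
# (hypothetical helper; bs=1024 is spelled out to avoid suffix differences)
make_zero_file() {
    dd if=/dev/zero of="$1" bs=1024 count="$2" 2>/dev/null
}

# Count the bytes in a file that are NOT null characters.
count_nonzero_bytes() {
    tr -d '\000' < "$1" | wc -c | tr -d ' '
}

f=$(mktemp)
make_zero_file "$f" 100
echo "size: $(wc -c < "$f" | tr -d ' ') bytes, non-null bytes: $(count_nonzero_bytes "$f")"
rm -f "$f"
```

For a 100 KiB file this should report 102400 bytes and 0 non-null bytes.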
/dev/full mimics a full device. Writing to
/dev/full will generate an error. The full device is useful when
testing how an application will act when it accesses a device that is
full.
$ cp test-file /dev/full
cp: writing `/dev/full': No space left on device
$ df -k /dev/full
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/full                    0         0         0   -
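The same trick works inside a script that needs to exercise its own error handling. The sketch below is one way to do it; try_write is a hypothetical helper, and the example skips /dev/full if the system does not provide it.

```shell
# Report whether a write to the given path succeeds.
# try_write is a hypothetical helper for exercising error handling.
# The data is piped through cat so that the failure comes from an
# ordinary program rather than a shell builtin.
try_write() {
    if printf 'test data\n' | cat > "$1" 2>/dev/null; then
        echo "write succeeded"
    else
        echo "write failed"
    fi
}

try_write /dev/null
if [ -c /dev/full ]; then
    try_write /dev/full
fi
```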
The random devices, /dev/random and /dev/urandom,
generate “random” data. Though the output of both may
appear to be completely random, /dev/random is actually more random
than /dev/urandom. /dev/random generates random characters based on
“environmental noise” that cannot be predicted. Since
there is only a limited supply of this random noise, the /dev/random
device is slow, and may pause in order to collect more data.
/dev/urandom uses the same pool of noise as /dev/random, but if it
runs out of random data, it generates pseudo-random data. This makes
it faster, but less secure.
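Because the random devices produce raw bytes rather than printable text, their output is usually piped through a formatter. The sketch below reads from /dev/urandom (so it will not pause) and prints the bytes in hexadecimal; random_hex is a hypothetical helper name.

```shell
# Print N random bytes from /dev/urandom as a hexadecimal string.
# random_hex is a hypothetical helper: random_hex N
random_hex() {
    dd if=/dev/urandom bs=1 count="$1" 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo    # finish the line
}

random_hex 16    # 32 hexadecimal digits, different on every run
```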
Old /dev File System: In the past, the /dev file system has been part of
the normal file system. It consisted of special files created once
(usually when the system was installed) and stored on a hard disk.
On systems with this setup, the /dev file system
needed to contain entries for any devices that might be connected to
that computer. Consequently, /dev was very large, having entries for
multiple hard drives, consoles, floppy drives, etc. Earlier, we saw
the list of hard drive partitions on hdb. Under the old /dev file system,
entries would exist for /dev/hdb1 through /dev/hdb11, whether or not
those partitions existed. In order to
figure out which device files actually mapped to partitions on the disk
(remember, I only have three partitions on my hdb drive), some
utility had to be used to determine which ones were valid.
The command “file -s /dev/hdb?” was one way to figure that sort of thing
out, printing something like this:
$ file -s /dev/hdb?
/dev/hdb1: Linux/i386 ext2 filesystem
/dev/hdb2: Linux/i386 ext2 filesystem
/dev/hdb3: Linux/i386 ext2 filesystem
/dev/hdb4: empty
/dev/hdb5: empty
/dev/hdb6: empty
/dev/hdb7: empty
/dev/hdb8: empty
/dev/hdb9: empty
If a given device file wasn’t already present, it
had to be created by mknod or another program (like MAKEDEV). Though
the “old way” worked, it was complex, and got tedious to
manage.
DevFS: In the 2.4 kernel tree, an alternative to the
cumbersome disk-based /dev file system was introduced. The
alternative, DevFS, incorporated new device handling code into the
kernel. In DevFS, the /dev file system is created during each boot-up
cycle and stored in RAM, instead of on the hard disk. Under this
model, there is no need to maintain a list of all possible devices:
when a new device is added to the hardware, the kernel just makes an
entry for it in /dev. In the occasional cases where devices need
special configuration to appear correctly in DevFS, there is a
configuration file, usually /etc/devfsd.conf.
/proc: A file system for processes
Processes: At any given time, Linux will have many processes
running at once. Some, such as window managers, email clients, and
Web browsers, will be visible to the end user. Others, like servers
and helper processes, are not immediately visible, but run
in the background, handling tasks that do not require the user’s
interaction. Running “ps -ef” in a shell will print a list of all the
currently running processes. It should look something like this:
$ ps -ef
UID        PID  PPID  C STIME TTY      TIME     CMD
root         1     0  0 11:08 ?        00:00:04 init
root         2     1  0 11:08 ?        00:00:00 [keventd]
root         3     0  0 11:08 ?        00:00:00 [ksoftirqd_CPU0]
root         4     0  0 11:08 ?        00:00:00 [kswapd]
root         5     0  0 11:08 ?        00:00:00 [bdflush]
root         6     0  0 11:08 ?        00:00:00 [kupdated]
root         8     1  0 11:08 ?        00:00:00 [kjournald]
root        86     1  0 11:08 ?        00:00:00 /sbin/devfsd /dev
root       165     1  0 11:09 ?        00:00:00 [kjournald]
root       168     1  0 11:09 ?        00:00:00 [khubd]
root       294     1  0 11:09 ?        00:00:00 [kapmd]
root       515     1  0 11:09 ?        00:00:00 metalog [MASTER]
root       521   515  0 11:09 ?        00:00:00 metalog [KERNEL]
root       531     1  0 11:09 ?        00:00:00 /sbin/dhcpcd eth0 /etc/X11/fs/config -droppriv -user xfs
root       572     1  0 11:09 ?        00:00:00 /usr/kde/2/bin/kdm
root       593   572  2 11:09 ?        00:04:27 /usr/X11R6/bin/X -auth /var/lib/kdm/authfiles/A:0-25pIgI
root       644     1  0 11:09 vc/1     00:00:00 /sbin/agetty 38400 tty1 linux
root      1045   572  0 12:16 ?        00:00:00 -:0
mbutcher  1062  1045  0 12:16 ?        00:00:00 /bin/sh /etc/X11/Sessions/kde-2.2.2
mbutcher  1091  1062  0 12:16 ?        00:00:00 /bin/bash --login /usr/kde/2/bin/startkde
mbutcher  1132     1  0 12:16 ?        00:00:00 kdeinit: Running...
mbutcher  1157  1132  0 12:16 ?        00:00:01 kdeinit: kwin
mbutcher  1159     1  0 12:16 ?        00:00:07 kdeinit: kdesktop
mbutcher  1168     1  0 12:16 ?        00:00:00 kdeinit: kwrited
mbutcher  1171  1168  0 12:16 pty/s0   00:00:00 /bin/cat
mbutcher  1173     1  0 12:16 ?        00:00:00 alarmd
mbutcher  1207  1132  0 12:23 ?        00:00:08 kdeinit: konsole -icon konsole -miniicon konsole
mbutcher  1219  1207  0 12:23 pty/s2   00:00:00 /bin/bash
mbutcher  1309  1260  0 13:48 pty/s3   00:00:01 vi dev-and-proc.html
root      1314  1220  0 14:03 pty/s2   00:00:00 ps -ef
Many of the tasks in the output of ps are background processes. Those
whose names are enclosed in square brackets are kernel processes. Only a few,
like the KDE processes and the entries toward the end, are processes
that I interact with directly.
In order to manage the system, the kernel must
keep track of every process running, including itself. Many
user-level applications, too, must be able to find out what processes
are running (“ps” is a good example; “top” is another). The /proc
file system is where the kernel exposes information about processes.
Like DevFS, /proc is stored in memory, rather than
on disk. If you look at the file /proc/mounts (which lists all of the
mounted file systems, much like the “mount” command), you should see a
line in it that looks like this:
proc /proc proc rw 0 0
/proc is controlled by the kernel and does not
have an underlying device. Because it contains mainly state information
controlled by the kernel, the most logical place to store the
information is in memory controlled by the kernel.
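Since /proc/mounts is plain text with one mount per line (device, mount point, type, options), ordinary text tools can query it. The sketch below looks up the file system type for a mount point; fs_type_of is a hypothetical helper, and its optional second argument is only there so it can be pointed at a saved copy of the table.

```shell
# Print the file system type mounted at a given mount point.
# fs_type_of is a hypothetical helper: fs_type_of MOUNTPOINT [MOUNTS_FILE]
# Field 2 of each line is the mount point and field 3 is the type.
fs_type_of() {
    awk -v mp="$1" '$2 == mp { print $3; exit }' "${2:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    fs_type_of /proc
fi
```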
Information about running processes: To keep track of processes, the kernel assigns each one a Process ID (PID) number. Running the command “ps -ef”, as
we did above, prints a list of all running processes ordered by PID
(shown in the second column). The /proc file system stores
information about each PID.
In /proc, many of the directory names are numbers.
These directories correspond to PID numbers. Inside of the
directories are files that provide important details about the state
and environment of a process. In the output of ps (above),
there was a line that read:
mbutcher 1219 1207 0 12:23 pty/s2 00:00:00 /bin/bash
This process is running the bash shell, and has PID 1219. The directory
/proc/1219 contains information about this process.
$ ls /proc/1219
cmdline cpu cwd environ exe fd maps mem root stat statm status
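Because every running process gets a numbered directory, listing those directories is a crude substitute for ps. A minimal sketch, assuming nothing beyond the layout just described; list_pids is a hypothetical helper, and its optional argument lets it scan a directory other than /proc.

```shell
# List the process IDs that have a directory under /proc.
# list_pids is a hypothetical helper: list_pids [ROOT]
list_pids() {
    root=${1:-/proc}
    for d in "$root"/[0-9]*; do
        [ -d "$d" ] || continue
        name=${d##*/}                 # keep only the last path component
        case "$name" in
            *[!0-9]*) ;;              # skip names that are not purely numeric
            *) echo "$name" ;;
        esac
    done
}

list_pids | head -n 5    # the shell sorts these as text, not as numbers
```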
The file “cmdline”
contains the command invoked to start the process.
The “environ” file contains the environment
variables for the process. “status” has status information on the
process, including the user (UID) and group (GID) identification for
the user executing the process, the ID of the parent process (PPID)
that started it, and the current state of the process, such as
“Sleeping” or “Running.”
$ cat status
Name: bash
State: S (sleeping)
Tgid: 1219
Pid: 1219
PPid: 1207
TracerPid: 0
Uid: 501 501 501 501
Gid: 501 501 501 501
FDSize: 256
Groups: 501 10 18
VmSize: 2400 kB
VmLck: 0 kB
VmRSS: 1272 kB
VmData: 124 kB
VmStk: 20 kB
VmExe: 544 kB
VmLib: 1604 kB
SigPnd: 0000000000000000
SigBlk: 0000000080010000
SigIgn: 8000000000384004
SigCgt: 000000004b813efb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
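The "Field: value" layout makes it simple to pull out a single item. The sketch below does so with awk; status_field is a hypothetical helper that takes the path of a status file and the name of the field.

```shell
# Print one field from a /proc/PID/status file.
# status_field is a hypothetical helper: status_field FILE FIELD
# A line matches only if it begins with "FIELD:", so asking for Pid
# does not accidentally return PPid.
status_field() {
    awk -v key="$2" '
        index($0, key ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "")    # drop the field name and padding
            print
            exit
        }
    ' "$1"
}

# $$ is the PID of the current shell.
if [ -r "/proc/$$/status" ]; then
    status_field "/proc/$$/status" State
fi
```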
Every process directory also has a few
symbolic links. “cwd” is a link to the current working directory for
that process. “exe” is a link to the executable program the process
is running, and “root” links to the directory the process sees as its
root directory (usually “/”). The directory “fd” contains one
symbolic link for each file descriptor that the process has open.
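These links can be read with readlink like any other symbolic link. The sketch below prints all three for one process; show_proc_links is a hypothetical helper, and links belonging to another user's process may not be readable without root privileges.

```shell
# Show where the cwd, exe, and root links of a process point.
# show_proc_links is a hypothetical helper: show_proc_links PID [ROOT]
# ROOT defaults to /proc and exists mainly to make testing easy.
show_proc_links() {
    dir=${2:-/proc}/$1
    for link in cwd exe root; do
        target=$(readlink "$dir/$link" 2>/dev/null) || target="(unreadable)"
        echo "$link -> $target"
    done
}

show_proc_links $$     # $$ is the PID of the current shell
```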
There are other files in the process directory
that provide information about everything from processor and memory
usage to the amount of time a process has been running. Most of these
files are documented in the kernel source under
“Documentation/filesystems/proc.txt” and in the proc man page
(“man proc”).
Kernel Information: In addition to storing information about specific
processes, the /proc file system contains a great deal of information
generated by the kernel itself to describe the state of the system.
The kernel and its modules
may generate files in /proc to provide information about
their current state. For instance, /proc/fb provides information
about the currently available frame buffer devices (frame buffers are
most often used to display a boot logo).
$ cat fb
0 VESA VGA
Note that the 0 refers to the frame buffer’s index, and corresponds
to the device /dev/fb0. If I had a second frame buffer, the proc entry would
also contain a line starting with a 1, corresponding to /dev/fb1. Often, proc data will
refer to (and explain) entries in /dev.
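That correspondence can be spelled out in a line of awk. This is only a sketch: fb_devices is a hypothetical helper, and it assumes each line of the listing has the form "index name", as in the output shown here.

```shell
# Translate each line of a /proc/fb-style listing into its /dev entry.
# fb_devices is a hypothetical helper: fb_devices [FILE]
fb_devices() {
    awk '{
        idx = $1            # the frame buffer index
        $1 = ""             # remove it, leaving the driver name
        sub(/^ +/, "")
        print "/dev/fb" idx " (" $0 ")"
    }' "${1:-/proc/fb}"
}

if [ -r /proc/fb ]; then
    fb_devices
fi
```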
Lots of information about hardware is stored in /proc. The file /proc/pci has
information about every PCI device detected on the
system. Running the command “lspci” ought to generate data that looks
similar to this file, as lspci reads the same PCI information from the
kernel, through /proc/bus/pci. /proc/bus contains directories for various bus
architectures (PCI, PCCard, USB), which in turn contain information
about the devices connected via those buses.
Various network information and statistics are
stored in /proc/net. Information about hard disks is stored in
/proc/ide and /proc/scsi, depending on the hard drive type.
/proc/devices lists all of the devices (divided into the “block” and
“character” categories) available on the system.
$ cat /proc/devices
Character devices:
  1 mem
  2 pty/m%d
  3 pty/s%d
  4 tts/%d
  5 cua/%d
  7 vcs
 10 misc
 14 sound
 29 fb
116 alsa
162 raw
180 usb
226 drm
254 pcmcia

Block devices:
  1 ramdisk
  2 fd
  3 ide0
 22 ide1
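The number in front of each name is the one the kernel has registered for that driver. A script can look one up by remembering which section it is in while scanning the file. In the sketch below, major_of is a hypothetical helper; the first argument is the section heading word ("Character" or "Block").

```shell
# Look up the number registered for a driver name in /proc/devices.
# major_of is a hypothetical helper: major_of KIND NAME [FILE]
# KIND is "Character" or "Block", matching the section headings.
major_of() {
    awk -v kind="$1" -v name="$2" '
        /devices:$/ { section = $1; next }    # remember the current section
        section == kind && $2 == name { print $1; exit }
    ' "${3:-/proc/devices}"
}

if [ -r /proc/devices ]; then
    major_of Character mem
fi
```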
There are many more files in /proc than can be
covered here. Each kernel may, in fact, have different entries
depending on what was built into the kernel, what hardware and
software is present, and what state the computer is currently in.
Some of these files are clearly meant for a machine to read, but
others are easy for a person to read as well. Most of these files are
documented in various places in the kernel documentation. A good
starting point in the kernel source is
Documentation/filesystems/proc.txt.
Interacting with processes via /proc: Some files in proc are not just for reading.
Writing to them may alter the state of the kernel. Looking to see
what’s in a file in /proc is usually harmless, but writing to files
without being sure of the outcome can be dangerous. Nevertheless,
sometimes writing to /proc is the only way to communicate with the
kernel.
For instance, in recent versions of the kernel,
there is the option to include a kernel-level high-performance Web
server (khttpd). Because starting a Web server by default can be a
security risk, khttpd requires explicit startup by writing a value
to a file in /proc.
echo 1 > /proc/sys/net/khttpd/start
When the kernel sees the contents of
/proc/sys/net/khttpd/start change from 0 (the default) to 1, it
starts the khttpd server.
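When changing a value like this from a script, it is worth recording what was there before. The sketch below shows one cautious pattern; set_tunable is a hypothetical helper, and the demonstration deliberately writes to a temporary file rather than to anything in /proc.

```shell
# Write a value to a tunable file, reporting the old and new contents.
# set_tunable is a hypothetical helper: set_tunable FILE VALUE
set_tunable() {
    if [ ! -w "$1" ]; then
        echo "cannot write to $1" >&2
        return 1
    fi
    old=$(cat "$1")
    echo "$2" > "$1"
    echo "$1: $old -> $(cat "$1")"
}

# Demonstration on a scratch file, so nothing in the kernel is changed.
t=$(mktemp)
echo 0 > "$t"
set_tunable "$t" 1
rm -f "$t"

# On a kernel built with khttpd, root could run (not executed here):
#   set_tunable /proc/sys/net/khttpd/start 1
```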
There are dozens of other configurable parameters
in /proc, some for tuning hardware, others for managing the
kernel internals. Almost all of them, though, are low level
and can crash or destabilize the system if set to the wrong values.
As a rule of thumb, /proc entries should not be changed unless
you know what you are doing.
Conclusion
/proc and /dev provide file-based interfaces to
the internals of Linux. They assist in determining the configuration
and state of various devices and processes on a system. They provide
capabilities necessary to make the operating system easy to upgrade,
analyze, debug, and run. Understanding and applying knowledge of these two file
systems is key to making the most of your Linux system.