Author: Shashank Sharma
diff, which creates a patchfile, and patch, which applies it. You can use both tools with text or HTML files.
User Level: Intermediate
As the name suggests, diff documents differences between two files. diff compares files line by line. Running the diff old_file new_file command displays the differences between the files on your screen. The -u switch creates output in the unified diff format, which displays each difference with a few unchanged context lines above and below the change. A unified diff file can help you determine where changes have been made.
To create a unified diff format patchfile, run the command diff -u old_file new_file > patchfile. Here is a quick example to illustrate how diff works. We have files named first.txt and second.txt with inventory lists of sports equipment:
File first.txt
1 ball
2 bats
3 nets
2 caps
5 clubs
1 golf ball

File second.txt
3 balls
2 bats
3 nets
2 caps
4 clubs
1 golf ball
When we compare these two files with diff -u first.txt second.txt > patchfile, the patchfile contains the following:
--- first.txt 2006-01-21 16:20:40.271039432 +0530
+++ second.txt 2006-01-21 16:21:00.538958240 +0530
@@ -1,6 +1,6 @@
-1 ball
+3 balls
 2 bats
 3 nets
 2 caps
-5 clubs
+4 clubs
 1 golf ball
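To reproduce the example, the following sketch recreates both inventory files in a scratch directory and generates the patchfile. The directory created with mktemp and the variable names are illustrative choices, not part of the original example; the timestamps in the header lines will reflect when you run it.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the two inventory lists from the article.
printf '%s\n' '1 ball' '2 bats' '3 nets' '2 caps' '5 clubs' '1 golf ball' > first.txt
printf '%s\n' '3 balls' '2 bats' '3 nets' '2 caps' '4 clubs' '1 golf ball' > second.txt

# diff exits with status 1 when the files differ; that is not an error.
diff -u first.txt second.txt > patchfile
status=$?

echo "diff exit status: $status"
cat patchfile
```

Because diff signals "files differ" through a non-zero exit status, a script that stops on any failing command needs to allow for it.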
The --- line shows the name of the first file, which has the original inventory list. The +++ line shows the name of the second file, which contains the updated inventory list. The @@ line is called the header, and the section below the header is called the hunk. In the header, -1,6 means the hunk covers six lines of the original file starting at line 1, and +1,6 gives the corresponding range in the new file. The hunk shows the actual changes between the two files. A large diff file will have several hunks, each with its own header.
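The inventory example is short enough to fit in a single hunk. As a rough illustration of a diff with several hunks, the sketch below uses a hypothetical 20-line file (old.txt, new.txt, and their contents are made up for this demonstration) and changes one line near the top and one near the bottom, far enough apart that their context lines do not overlap.

```shell
hunkdir=$(mktemp -d) && cd "$hunkdir" || exit 1

# Hypothetical 20-line file: "item 1" through "item 20".
i=1
while [ "$i" -le 20 ]; do
    echo "item $i"
    i=$((i + 1))
done > old.txt

# Change one line near the top and one near the bottom.
sed -e 's/^item 2$/item two/' -e 's/^item 18$/item eighteen/' old.txt > new.txt

diff -u old.txt new.txt > patchfile

# Each hunk begins with its own @@ header line.
grep '^@@' patchfile
hunks=$(grep -c '^@@' patchfile)
echo "number of hunks: $hunks"
```

Counting the @@ lines is a quick way to see how many separate regions of a file a patch touches.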
In the hunk, the lines that are not preceded by - or + symbols are the context lines. Lines starting with - indicate a line that was in the original file but not in the new file. Conversely, lines starting with + indicate a line that is in the new file but not in the original file. In our example, -1 ball means that the line was present in the original file but is absent from the new file. The line +4 clubs indicates a line that is in the new file but was not in the original file. A modified line therefore shows up as a - line followed by a + line.
To determine only whether two files differ, without listing the differences, use the -q switch. For example, the command diff -q first.txt second.txt will display the string Files first.txt and second.txt differ.
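A small sketch of the -q switch in use follows. The shortened two-line inventory files and the copy.txt file are illustrative stand-ins; the point is that diff -q prints its one-line notice only when the files differ, and that its exit status (0 for identical, 1 for different) can drive a script.

```shell
qdir=$(mktemp -d) && cd "$qdir" || exit 1
printf '%s\n' '1 ball' '2 bats' > first.txt
printf '%s\n' '3 balls' '2 bats' > second.txt
cp first.txt copy.txt

# Differing files: diff -q prints a one-line notice and exits with status 1.
message=$(diff -q first.txt second.txt)
differ_status=$?
echo "$message"

# Identical files: no output, exit status 0.
diff -q first.txt copy.txt
same_status=$?

# The exit status makes diff -q handy in scripts.
if ! diff -q first.txt second.txt > /dev/null; then
    echo "inventory changed"
fi
```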
Once you know the differences between two files, you can create a patchfile, which is applied using the patch tool.
Working with patch
In workgroups, many people work on the same software, documentation, and text files. If you want to apply changes to all copies of a file, you can use a patchfile and the patch command. For example, to bring the inventory list in first.txt up to date, we can apply the patchfile we created earlier with the command patch first.txt < patchfile.
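The round trip can be checked with a short sketch like the one below, which rebuilds the example files in a scratch directory (an illustrative setup), applies the patchfile, and then confirms that first.txt now matches second.txt.

```shell
applydir=$(mktemp -d) && cd "$applydir" || exit 1
printf '%s\n' '1 ball' '2 bats' '3 nets' '2 caps' '5 clubs' '1 golf ball' > first.txt
printf '%s\n' '3 balls' '2 bats' '3 nets' '2 caps' '4 clubs' '1 golf ball' > second.txt
diff -u first.txt second.txt > patchfile

# Apply the recorded changes to first.txt.
patch first.txt < patchfile

# After patching, the two files should be identical, so diff prints nothing.
diff first.txt second.txt && echo "first.txt is up to date"
```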
The filename is optional. A simple patch < patchfile command also works, because patch looks at the header lines of the patchfile to determine the name of the file to patch. This works in most cases because the filenames are the same on the machine where the patchfile is generated and on the machine where it is applied.
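As a sketch of that scenario, the example below uses two scratch directories to stand in for two machines (srcdir and destdir are hypothetical names). The receiving side has only its own copy of first.txt and the patchfile, and patch is run without a filename.

```shell
srcdir=$(mktemp -d)
destdir=$(mktemp -d)

# "Author" machine: create both versions and the patchfile.
cd "$srcdir" || exit 1
printf '%s\n' '1 ball' '2 bats' '3 nets' '2 caps' '5 clubs' '1 golf ball' > first.txt
printf '%s\n' '3 balls' '2 bats' '3 nets' '2 caps' '4 clubs' '1 golf ball' > second.txt
diff -u first.txt second.txt > patchfile

# "Recipient" machine: holds the old first.txt and receives the patchfile.
printf '%s\n' '1 ball' '2 bats' '3 nets' '2 caps' '5 clubs' '1 golf ball' > "$destdir/first.txt"
cp patchfile "$destdir"

# No filename given: patch reads the target name from the patchfile header.
cd "$destdir" || exit 1
patch < patchfile
cat first.txt
```

When files by both header names exist in the same directory, it is safer to name the target explicitly rather than rely on which one patch chooses.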
Sometimes, a file has been modified before a patch is applied. If a user has modified the file, so that line numbers no longer match, patch uses the context lines that diff -u recorded in the patchfile to locate the lines that need to be changed.
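The sketch below illustrates this with a hypothetical 12-line file (the file names and contents are invented for the demonstration). The patch is generated against old.txt, but it is applied to a copy that has gained two extra lines at the top, so every line number is off by two. patch should still find the right spot by matching the context lines, and it typically reports that the hunk was applied at an offset.

```shell
ctxdir=$(mktemp -d) && cd "$ctxdir" || exit 1

# Hypothetical 12-line list; the patch changes line 6 only.
i=1
while [ "$i" -le 12 ]; do
    echo "item $i"
    i=$((i + 1))
done > old.txt
sed 's/^item 6$/item six/' old.txt > new.txt
diff -u old.txt new.txt > patchfile

# A colleague's copy has two extra lines at the top.
{ echo 'local note A'; echo 'local note B'; cat old.txt; } > copy.txt

# patch locates the change by its surrounding context, not by line number alone.
patch copy.txt < patchfile
cat copy.txt
```

The local additions at the top of copy.txt are preserved, and only the line covered by the patch changes.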
If you wish to keep a copy of the original file, use the -b switch; by default, patch replaces the original file with the patched file.
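Here is a minimal sketch of the -b switch, assuming GNU patch, which by default saves the backup under the original name with .orig appended. Other patch implementations or option settings may name the backup differently.

```shell
bakdir=$(mktemp -d) && cd "$bakdir" || exit 1
printf '%s\n' '1 ball' '2 bats' '3 nets' '2 caps' '5 clubs' '1 golf ball' > first.txt
printf '%s\n' '3 balls' '2 bats' '3 nets' '2 caps' '4 clubs' '1 golf ball' > second.txt
diff -u first.txt second.txt > patchfile

# -b saves the pre-patch contents before first.txt is changed.
# Assumption: GNU patch, which names the backup first.txt.orig by default.
patch -b first.txt < patchfile

ls
echo "backup starts with:  $(head -n 1 first.txt.orig)"
echo "patched starts with: $(head -n 1 first.txt)"
```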
If you are unsure about whether to apply the changes, use the --dry-run switch, which reports what would happen if the patch were applied without actually changing the file.
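A dry run can be tried as follows. The sketch keeps a reference copy of the file (before.txt, an illustrative name) so that it can confirm nothing was modified; the exit status of patch indicates whether the patch would have applied cleanly.

```shell
drydir=$(mktemp -d) && cd "$drydir" || exit 1
printf '%s\n' '1 ball' '2 bats' '3 nets' '2 caps' '5 clubs' '1 golf ball' > first.txt
printf '%s\n' '3 balls' '2 bats' '3 nets' '2 caps' '4 clubs' '1 golf ball' > second.txt
diff -u first.txt second.txt > patchfile
cp first.txt before.txt

# Rehearse the patch; a zero exit status means it would apply cleanly.
patch --dry-run first.txt < patchfile
dry_status=$?
echo "dry-run exit status: $dry_status"

# first.txt should be exactly as it was before the rehearsal.
cmp -s first.txt before.txt && echo "first.txt unchanged"
```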
The man page is an excellent resource for reviewing the many options available with the patch command.
Conclusion
As is the case with all free and open source software, there are plenty of similar tools to choose from. For example, you can use vimdiff, a comparatively modern alternative to diff that uses vim to highlight the differences between two or three files, making comparison easy. You can also use diff3, a companion to diff that compares three files. cmp is another tool that can be used to compare two files. It works by inspecting the files byte by byte. For making quick changes across files in multiple locations, however, no project can do without diff and patch.
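For comparison, here is a brief sketch of cmp used as a yes-or-no check. It relies on the -s switch, which suppresses output so that only the exit status reports the result; the file names and shortened contents are illustrative.

```shell
cmpdir=$(mktemp -d) && cd "$cmpdir" || exit 1
printf '%s\n' '1 ball' '2 bats' > first.txt
printf '%s\n' '3 balls' '2 bats' > second.txt
cp first.txt copy.txt

# Exit status 0: the files are byte-for-byte identical.
if cmp -s first.txt copy.txt; then
    echo "first.txt and copy.txt are identical"
fi

# Exit status 1: the files differ somewhere.
cmp -s first.txt second.txt
cmp_status=$?
echo "cmp exit status for first.txt vs second.txt: $cmp_status"
```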
Shashank Sharma is studying for a degree in computer science. He specializes in writing about free and open source software for new users.