Faster Web page load times for mobile devices with Ziproxy

Author: Ben Martin

Ziproxy is a Web proxy server, but rather than caching content the way Web proxies like Squid do, it is designed to compress the content that it fetches from the Web before forwarding it to the Web client. It can be useful for serving mobile devices such as handheld Internet tablets that cannot take full advantage of high-resolution, high-quality images, or for browsers running over a mobile data plan where speed is low and bytes are expensive.

Ziproxy can compress images to lower quality (smaller) JPEG or JPEG 2000 files and can optimize HTML and CSS content for size before compressing it with gzip. While JPEG 2000 images require a lot of computational power to decode, they can also be substantially smaller than PNG or normal JPEG images. Many Web browsers automatically decompress content that is gzip-encoded, so the end user cannot tell that compression of HTML and CSS was used.

The Fedora, Ubuntu, and openSUSE repositories do not contain packages of Ziproxy. I’ll build from source using the latest version 2.5.2 of Ziproxy on a 64-bit Fedora 9 machine. If you would like to recompress images to the JPEG 2000 format, you must have the JasPer development packages installed prior to building Ziproxy. JasPer is available as a 1-Click install for openSUSE, is packaged for Ubuntu Hardy, and is available in the Fedora 9 repositories as jasper-devel.

The commands to build Ziproxy are shown below. I found that even with the JasPer development packages installed, I had to supply the --with-jasper option for Ziproxy to build in JPEG 2000 support.

    $ tar xjf /FromWeb/ziproxy-2.5.2.tar.bz2
    $ cd ./ziproxy-2.5.2/
    $ ./configure --with-jasper
    $ make
    $ sudo make install
    $ su -l
    # cp -av etc/ziproxy /etc
    # chown -R root.root /etc/ziproxy

Ziproxy comes with an init.d file that can be used to start the daemon, but there are a few issues with starting Ziproxy using this script. For one, the path for Ziproxy is hard-coded, and the script will not find the binary if it is installed in /usr/local. The most serious issue is that it will start Ziproxy as the root user. To remedy these problems, I created a new user to run the program, then edited the script, as you’ll see below. In the line calling printf, I removed the “g” from the original call to gprintf. Wrapping the invocation of Ziproxy with the su command is the major change.

    # useradd ziproxy
    # mkdir /var/log/ziproxy
    # chown ziproxy.ziproxy /var/log/ziproxy
    # cp /.../ziproxy-2.5.2/etc/init.d/ziproxy /etc/init.d
    # vi /etc/init.d/ziproxy
    ...
    PID_FILE=/var/tmp/ziproxy.pid
    ZIPROXY=/usr/local/bin/ziproxy
    ...
    printf "Starting %s: " "${PROGNAME}"
    ...
    su -c "${ZIPROXY} -d -c ${ZIPROXY_CONF} >${PID_FILE}" ziproxy
    ...
    # /etc/init.d/ziproxy start

You can run Ziproxy either through xinetd or as a standalone daemon. I’ll run it standalone, as this usually yields improved performance. Any client connection filtering that might be done via xinetd can be performed through packet filtering and the use of SSH port forwarding to securely access Ziproxy from the Internet.

Before starting Ziproxy, take a look at its configuration file in /etc/ziproxy/ziproxy.conf. The first options allow you to set the port and address that Ziproxy will bind to. The OnlyFrom option lets you set an IP address or a contiguous range of IP addresses (in the form begin.IP-end.IP) that are allowed to access the proxy. If OnlyFrom is not specified then any client that can connect to the address and port that Ziproxy is bound to will be served. If you are using Ziproxy to provide a Web proxy to mobile devices, one option is to set OnlyFrom=127.0.0.1 and use SSH port forwarding to connect to the Web proxy.

Many of the options in ziproxy.conf specify which image formats will be (re)compressed, which formats an image can be compressed into, and how aggressively to set image compression parameters. The right choices depend on the client. Consider two use cases: you are unlikely to want images compressed into JPEG 2000 format for a device that has minimal system RAM and a relatively slow CPU, such as an Internet tablet. On the other hand, you might want all images recompressed into JPEG 2000 if you plan to access the proxy from a relatively fast laptop over a slow link.

The image compression quality settings can change depending on how large (in pixels) an image is. You will most likely want better quality for the larger images on a Web page. As an added bonus, if you specify negative values, the image is first converted to gray-scale and then compressed with the given quality. You must specify four values: the quality settings to use for images of up to 5,000, 50,000, and 250,000 pixels, and for all images larger than that; for example, ImageQuality={17,20,23,25}. If you want to compress in JPEG 2000, use the JP2ImageQuality keyword instead.
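To make the four-tier rule concrete, here is a minimal shell sketch of how a quality value could be picked from an image's pixel count. It is only an illustration of the rule as described above, not Ziproxy's own code; the function name quality_for_pixels is hypothetical, and whether each boundary is inclusive is an assumption.

```shell
# Illustrative only: mirrors the four-tier rule described in the text.
# Whether a boundary such as exactly 5,000 pixels falls in the lower or
# upper tier is an assumption here.
IMAGE_QUALITY="17 20 23 25"   # same values as ImageQuality={17,20,23,25}

quality_for_pixels() {
    # $1 = image width, $2 = image height, both in pixels
    pixels=$(( $1 * $2 ))
    set -- $IMAGE_QUALITY     # $1..$4 now hold the four tier values
    if   [ "$pixels" -le 5000 ];   then printf '%s\n' "$1"
    elif [ "$pixels" -le 50000 ];  then printf '%s\n' "$2"
    elif [ "$pixels" -le 250000 ]; then printf '%s\n' "$3"
    else                                printf '%s\n' "$4"
    fi
}

quality_for_pixels 50 50       # a small icon, 2,500 pixels
quality_for_pixels 1000 1000   # a large photo, 1,000,000 pixels
```

With the values above, the icon falls in the first tier and the photo in the last. Replacing the list with negative numbers would correspond to the gray-scale mode the article describes.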

By default Ziproxy will try to recompress PNG, JPG, and GIF images. You can turn off recompression for an image format with boolean options like ProcessJPG. If you want to recompress to JPEG 2000 instead of normal JPEG where possible, set ProcessToJP2=true. Conversely, if you want to recompress any JPEG 2000 image to normal JPEG, set ForceOutputNoJP2=true. There’s also a small collection of options relating to how to compress JPEG 2000 images; see the example ziproxy.conf for details.

To control text compression, set Gzip=true or Gzip=false; the default is true. If your client does not handle gzip-compressed content, or you are accessing the proxy over SSH and prefer to use SSH compression, you might turn this off. The Compressible={"shockwave","foo"} option lets you tell Ziproxy to compress a nominated list of other data types as well.

A collection of options starting with the Process prefix let you specify if HTML, CSS, and JavaScript files should be modified en route. These options might make the content slightly smaller, but they are marked as experimental.

You can set a limit (in bytes) on the size of a file that Ziproxy will try to (re)compress using the MaxSize option.
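Pulling the options discussed so far together, a configuration fragment might look something like the sketch below. It writes to a scratch file rather than to /etc/ziproxy/ziproxy.conf. The option names come from the text above, but every value is illustrative: the MaxSize figure is arbitrary, and quoting the OnlyFrom address is an assumption made by analogy with the quoted strings in the Compressible example, so check the sample ziproxy.conf before copying anything.

```shell
# Write an illustrative fragment to a scratch file (not to /etc).
# All values are examples; the quoting of OnlyFrom and the MaxSize
# figure are assumptions, not taken from the Ziproxy documentation.
conf=$(mktemp)
cat > "$conf" <<'EOF'
OnlyFrom="127.0.0.1"
ImageQuality={17,20,23,25}
ProcessJPG=true
ProcessToJP2=false
Gzip=true
MaxSize=1048576
EOF

cat "$conf"
```

This combination suits a slow handheld client: access is limited to localhost (for use with SSH port forwarding), images are recompressed to ordinary JPEG rather than JPEG 2000, and text is gzip-compressed.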

Setting the ImageQuality values to all negative to force Ziproxy to convert all images to gray-scale provides a quick visual indication as to which images have been treated by Ziproxy on a Web page. Inspecting the log files during initial testing against linux.com and slashdot.org didn’t show much compression being performed by Ziproxy; in fact, the only things that were gray-scale were non-Flash adverts. Using the same configuration and visiting other sites like arstechnica.com, I found that most of the images on the Web page were gray-scale and there was a noticeable reduction in bytes transferred between Ziproxy and the Web browser.

When browsing Project Gutenberg, many of the CSS files that were transferred were compressed to 25% of their original size. While these were only 5KB and 20KB each in their original size, the transfer savings would be noticeable if you were on a slow link. The 30KB image of a mobile reader shown on the homepage also became 5KB. When downloading larger HTML files like the text of Dead Souls, using Ziproxy yielded a transfer that was 60% smaller than without it. The book search results page also went from 8KB to 3KB when viewed through Ziproxy. As a final test on the Gutenberg site, I downloaded a 343KB uncompressed version of The Gambler by Dostoyevsky, and Ziproxy transferred 89KB to the Web browser. By contrast, the .zip compressed version of the book on the site is 126KB in size. Of course, the transfer between Ziproxy and the Gutenberg Web site is much larger for the uncompressed download, but fewer bytes travel between Ziproxy and the Web browser than if you had fetched the .zip file.
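The figures above are easier to compare as percentages saved. The following sketch does the arithmetic in the shell; percent_saved is a hypothetical helper, and it rounds down because shell arithmetic is integer-only. The comparison for the .zip file assumes it holds the same 343KB text.

```shell
percent_saved() {
    # $1 = original size, $2 = transferred size (same unit)
    # Prints the integer percentage saved, rounded down.
    echo $(( ($1 - $2) * 100 / $1 ))
}

percent_saved 343 89    # The Gambler, uncompressed text through Ziproxy
percent_saved 343 126   # the site's own .zip download, for comparison
percent_saved 30 5      # homepage image of a mobile reader
percent_saved 8 3       # book search results page
```

This prints 74, 63, 83, and 62: the proxied text download saves about 74% of the bytes, against about 63% for the .zip file.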

On image-heavy sites like flickr.com you’ll see more substantial gains with recompression. Using gray-scale JPEGs with compression qualities between -20 and -15 you can see many images becoming less than 10% of their original size. Of course, with such aggressive settings, you will notice JPEG compression artifacts in the gray-scale images.

Further refinement

One method of using Ziproxy is to configure it to only allow connections from localhost. Clients would then use SSH port forwarding to access the proxy, connecting a port on the client machine to the machine that is running Ziproxy. Although SSH has the -C option to compress communications, you will achieve better compression, particularly of image files, by turning off SSH compression and relying on Ziproxy instead. Generic compression (ssh -C uses the same algorithm as gzip) cannot compete with lossy JPEG recompression for images.
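As a rough sketch of that setup, the snippet below builds the ssh command for such a tunnel with SSH compression switched off. The host name, login, and port numbers are hypothetical placeholders, and tunnel_cmd is just a helper that prints the command so you can inspect it before running it.

```shell
# Hypothetical values; substitute your own.
PROXY_HOST="user@proxy.example.com"  # SSH login on the machine running Ziproxy
PROXY_PORT=8080                      # port Ziproxy is bound to (assumption)
LOCAL_PORT=8080                      # port the browser will use on the client

tunnel_cmd() {
    # -N: run no remote command, only forward ports
    # -o Compression=no: leave compression to Ziproxy
    # -L: forward LOCAL_PORT on the client to Ziproxy on the remote localhost
    printf 'ssh -N -o Compression=no -L %s:127.0.0.1:%s %s\n' \
        "$LOCAL_PORT" "$PROXY_PORT" "$PROXY_HOST"
}

tunnel_cmd
```

Run the printed command on the client, then set the browser's HTTP proxy to localhost and the chosen local port. Because the connection reaches Ziproxy from the remote machine's own loopback address, it fits a configuration with OnlyFrom restricted to 127.0.0.1.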

Advertisements and safe-browsing checks can make up the lion’s share of bytes transferred when viewing a Web site. Even if you get rid of ads with software like Adblock Plus, Ziproxy can offer substantial reductions in the number of bytes transferred to your client. How effective Ziproxy is at compressing CSS and HTML content depends on how well the Web sites that you browse already take advantage of client-supported implicit compression. While gzip compression of HTML and CSS saves bytes in a manner that an end user cannot detect, the configuration options also let you decide how many artifacts you are willing to accept in your images in order to save even more bytes.
