Here’s something useful for the web developers out there. It’s a script I’ve been using for a while that makes it super-easy to losslessly compress entire folders of PNG and JPEG files for the web.
If you’re familiar with PNG optimization, then you know about programs like advpng, optipng and pngout that optimize and recompress PNG files, making them much smaller than the likes of Photoshop can. My shell script combines all three of these utilities to achieve minimum file size, while hiding their command-line syntax and adding the ability to recursively process entire directory trees.
And, it works with JPEGs, too! It uses jpegtran (included with libjpeg) and another small utility I've included to optimize and strip all metadata from JPEG files. Since my script searches directories recursively, all you need to do is type, say, imgopt myproj/htdocs and it'll take care of your entire website.
All compression is lossless, which means no pixels are changed and no information is lost, the files just get smaller — chances are your layout can shrink by as much as 50%, which is like getting free bandwidth, and it means your site will snap into place that much faster for users.
Read on for the details.
How to use: the quick and dirty
imgopt is a command-line bash shell script, so it’ll run on *nix systems like Mac OS X, Linux, BSD, and Windows running cygwin. I’ve been running it on OS X and Linux (both running bash 2.05), but please let me know if you run into any problems on your platform.
The current version is 0.1.2, released on 2009-04-02 (see changelog for details). This post has been updated to reflect the current version, but some comments (especially those regarding bugs) are about prior releases. New bug reports and feature requests in the comments are welcome!
Download imgopt-0.1.2.tar.gz and check out the included README. Basically, you just copy the imgopt script into your path, then download and install all the helper programs if you don't have them already.
Then you can use imgopt to optimize any combination of files, directories and wildcards in one fell swoop. Examples:
$ imgopt A30.jpg
A30.jpg reduced 40 bytes to 1327 bytes
$ imgopt under60.jpg files/*png
under60.jpg reduced 36559 bytes to 49510 bytes
files/A11.png reduced 130 bytes to 669 bytes
files/A256.png reduced 55 bytes to 2106 bytes
# It is totally fine to run imgopt on the same file more than once, since it checks
# and only overwrites files when they have been reduced in size.
$ imgopt *
A30.jpg unchanged at 1327 bytes
A60.jpg reduced 36 bytes to 2119 bytes
files/A11.png unchanged at 669 bytes
files/A256.png unchanged at 2106 bytes
under50.jpg reduced 48 bytes to 18499 bytes
under60.jpg unchanged at 49510 bytes
Really, that’s all you need to know. Keep reading only if you’re curious or have time to kill.
What it does and why you should care
If you’re a website builder, hopefully you already understand the value of keeping website graphics as small as possible — even though it’s 2009 and most of us have megabit connections at home, each byte saved multiplied by every website visitor adds up to a huge reduction in bandwidth and server load, and a faster experience for users. (See: rules for exceptional performance)
Hopefully, your image workflow looks like this. First, you know to sprite your images, especially those with shared color palettes, to minimize total file size and reduce HTTP request overhead. And you are a total expert at Photoshop or Illustrator’s “Save for Web” feature — you know when to save graphics as JPEG (continuous-tone images like photos) and when to use indexed-color PNG (line art, so probably most of the images in your website’s theme). You know a few of the manual color palette reduction tricks (like deleting colors with R/G/B values all > 250, which reduces palette size and increases edge sharpness of type), so your PNGs only have 13 colors when that’s all they need, and not 256 just because that’s the default setting of some wizard thingy used by children and former print designers.
My imgopt script steps in at this point and further optimizes all the files that Photoshop (or whatever your graphics program is) spits out. Unfortunately, typical web graphics production software doesn't compress image data as much as it could: it doesn't take the CPU time to really optimize the way the image data is ordered (because that would take too long), it inserts gamma information and empty metadata structures that are wasted on web layout graphics (yes, even in "Save for Web" mode), and, last but probably most important, it uses an inferior deflate (zip) compression implementation.
One warning: my script strips all metadata (i.e. EXIF and JFIF) and gamma information (which you should be removing for web use anyway; otherwise people running IE or Safari will complain about the color in your PNGs being "off"). So if you have image files with metadata, color profiles, masks or channels you want to keep, definitely make backup copies before optimizing!
I’ve commented the script and made it easy for you to go in and customize the helper programs I’m using and the parameters they’re passed. I’ve chosen options that minimize file size, so you could, for instance, make things run faster by choosing less compression, or you could opt to preserve some metadata, like photo credit information.
Here's what it's doing to PNG and JPEG files, respectively.
For PNGs, my script piggybacks on the excellent work done by optipng, advpng and pngout. Now, I could go into a long explanation about how one of these (pngout) is good because it has a proprietary deflate algorithm that's superior to zlib, while another is great at optimizing IDAT order (optipng), but the bottom line is that running all three of these utilities on your PNG files will produce smaller files than any one of them separately.
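The PNG pass can be sketched as a small shell function. This is an illustrative sketch, not the actual imgopt source: the function name is my own, the tool flags are the commonly documented maximum-compression ones, and each optimizer is simply skipped if it isn't installed.

```shell
# Sketch of the PNG pass: run optipng, advpng and pngout in sequence on a
# temp copy, then keep the result only if it actually got smaller.
# (Illustrative only -- not the actual imgopt source.)
optimize_png() {
  src=$1
  tmp=$(mktemp "${src}.XXXXXX") || return 1
  cp -p "$src" "$tmp"
  # Each optimizer is optional; skip any that aren't installed.
  command -v optipng >/dev/null 2>&1 && optipng -o7 "$tmp" >/dev/null 2>&1
  command -v advpng  >/dev/null 2>&1 && advpng -z -4 "$tmp" >/dev/null 2>&1
  command -v pngout  >/dev/null 2>&1 && pngout -q -y "$tmp" "$tmp" >/dev/null 2>&1
  old=$(wc -c < "$src" | tr -d ' ')
  new=$(wc -c < "$tmp" | tr -d ' ')
  # Only overwrite the original when the result is smaller.
  if [ "$new" -lt "$old" ]; then
    mv "$tmp" "$src"
    echo "$src reduced $((old - new)) bytes to $new bytes"
  else
    rm -f "$tmp"
    echo "$src unchanged at $old bytes"
  fi
}
```

The size check at the end is what makes it safe to re-run on already-optimized files: if nothing got smaller, the original is left alone.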
Very occasionally, you'll be able to get a few more bytes of compression on PNGs just by running files through imgopt twice. So, feel free to try this and make it part of your routine if you like. Since it's 2x the execution time to run the program twice, and the savings are very small, I elected not to have imgopt do this on its own (and it's so easy to just hit up arrow to run the command again). The reason a second pass occasionally works is related to the order the utilities are run in — sometimes a file, once deflated by pngout, can then be further optimized on a second pass through optipng — and just doing a second run of the whole script is the easiest way to take care of any and all interaction cases like these. (FYI, in no instance have I ever found a third pass to do any good.)
For JPEGs, the story is simpler: first, imgopt uses jpegtran (probably already on your system, since it's part of libjpeg) to optimize the Huffman tables and strip color profiles, EXIF data and other metadata. Second, it uses a small utility called jfifremove to remove the JFIF metadata (jpegtran leaves at least a minimal 18-byte JFIF segment in place).
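Here's roughly what that JPEG pass looks like in shell. Again, a sketch rather than the actual script: the function name is mine, the jpegtran flags are the standard documented ones, and jfifremove is assumed to be a stdin-to-stdout filter; if neither tool is available (or jpegtran fails), the sketch just reports the file unchanged.

```shell
# Sketch of the JPEG pass: jpegtran rebuilds the Huffman tables and drops
# metadata (-copy none), then jfifremove filters out the JFIF APP0 segment.
# (Illustrative only -- not the actual imgopt source.)
optimize_jpeg() {
  src=$1
  tmp=$(mktemp "${src}.XXXXXX") || return 1
  if command -v jpegtran >/dev/null 2>&1 &&
     jpegtran -copy none -optimize -outfile "$tmp" "$src" 2>/dev/null; then
    # jfifremove reads stdin and writes stdout; skip it if not installed.
    if command -v jfifremove >/dev/null 2>&1; then
      jfifremove < "$tmp" > "$tmp.j" 2>/dev/null && mv "$tmp.j" "$tmp"
    fi
  else
    cp -p "$src" "$tmp"   # no optimizer available; sizes will match below
  fi
  old=$(wc -c < "$src" | tr -d ' ')
  new=$(wc -c < "$tmp" | tr -d ' ')
  # Only overwrite the original when the result is smaller.
  if [ "$new" -lt "$old" ]; then
    mv "$tmp" "$src"
    echo "$src reduced $((old - new)) bytes to $new bytes"
  else
    rm -f "$tmp" "$tmp.j"
    echo "$src unchanged at $old bytes"
  fi
}
```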
Results will depend mostly on how much metadata there is to strip from your images. If you're running imgopt on files produced by Photoshop's "Save for Web," the savings may only be a few dozen bytes. Whereas running it on random photos uploaded by users, files produced by Photoshop's regular "Save" command, or files straight from your digital camera will easily save 10-30K or more.
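If you want to see the overall effect on a whole tree, a quick before-and-after byte count does the trick. The helper below is my own, not part of imgopt:

```shell
# Total size in bytes of all PNG and JPEG files under a directory.
# Run it before and after optimizing to measure what you saved.
total_bytes() {
  find "$1" -type f \( -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \) \
    -exec cat {} + | wc -c | tr -d ' '
}
```

Usage: run `total_bytes myproj/htdocs`, then `imgopt myproj/htdocs`, then `total_bytes myproj/htdocs` again and compare the two numbers.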
A word again about lossy compression vs. lossless. JPEG is a lossy format — most of the benefit of JPEG compression comes when you choose a compression level (of course it is more complicated than that and there are some techniques you can use to get better control than what Photoshop provides, but I digress), and ultimately that's what's going to make the biggest difference to the final sizes of your JPEG files. My script and jpegtran are lossless and do not decode or recompress the JPEG image; they just optimize the existing JPEG data. Of course, if you want imgopt to automatically recompress JPEGs at a different quality setting, you could easily add that feature to the script by inserting a line before jpegtran that calls something like ImageMagick's convert, but I'll leave that to you.
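Such a lossy pre-pass might look like the following. This is entirely hypothetical and not part of imgopt; the function name and the `-n` dry-run flag are my own, quality 60 is just an example value, and `convert` is ImageMagick's standard command-line tool.

```shell
# Hypothetical lossy pre-pass (NOT part of imgopt): re-encode with
# ImageMagick's convert at a chosen quality before the lossless pass.
# Unlike everything else discussed here, this changes pixels.
recompress_jpeg() {
  if [ "$1" = "-n" ]; then dry=1; shift; else dry=0; fi  # -n: dry run
  f=$1
  q=${2:-60}   # example default quality
  if [ "$dry" -eq 1 ]; then
    echo "convert $f -quality $q $f"   # just show what would be run
  else
    convert "$f" -quality "$q" "$f"
  fi
}
```

For example, `recompress_jpeg photo.jpg 75` would re-encode photo.jpg in place at quality 75; run jpegtran (or imgopt) on the result afterwards.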
A note about jfifremove: this is a very simple utility written in C that I found here, debugged, and compared against the JFIF standard. So, while I don't claim authorship, I am the maintainer of the version included with my script. I've tested it thoroughly only as it is used in imgopt — i.e., for deleting the JFIF segment from files that have already had all other optional headers stripped by jpegtran. I have not tested it for standalone use (i.e. on files containing color profiles, thumbnails, EXIF data, etc.), so I don't recommend using it outside my script unless you know what you're doing. (I suspect it might fail to strip the JFIF segment from a file where the EXIF APP1 precedes the JFIF APP0, but would at least do nothing destructive; I'm only speculating, though.)
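For the curious, the segment jfifremove targets is easy to spot: a JFIF file begins with the SOI marker (FF D8), then an APP0 marker (FF E0), a two-byte segment length, and the identifier "JFIF" followed by a NUL byte. Here's a quick shell check using od; the helper is my own, not part of the imgopt distribution.

```shell
# Returns success if the file begins with a JFIF APP0 segment:
# FF D8 (SOI), FF E0 (APP0), a 2-byte length, then "JFIF\0" at offset 6.
has_jfif() {
  sig=$(od -An -tx1 -N11 "$1" | tr -d ' \n')   # first 11 bytes as hex
  case $sig in
    ffd8ffe0????4a46494600) return 0 ;;        # 4a46494600 = "JFIF\0"
    *) return 1 ;;
  esac
}
```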
In action: How much can we optimize Reddit?
OK, let’s have some fun and optimize the graphics from a real website. I picked Reddit because it uses very few images, so this won’t take long. :)
Reddit is doing a pretty good job at performance optimization already, because they're not overusing images, and the images they do use are small with few colors. But, they aren't spriting their images or reducing colors effectively. Now, I'm not going to rebuild their website for them, but let's see what happened when I ran imgopt on their homepage's images, and also when I took another 5 minutes to do basic indexed color reductions on their PNGs. For the autogenerated thumbnail images, it would be unfair to do any manual reductions — but, it turns out they're making another big mistake by using PNG instead of JPEG for these files. So, for these images, the third column is just a batch convert to JPEG with quality=60 (actually a pretty high setting for small thumbnails) to show what would happen if they fixed this.
[Table: theme graphics — for each file, its original size in bytes, its size after imgopt only, and its size after color reduction plus imgopt]
[Table: autogenerated thumbnails — for each file, its original size in bytes, its size after imgopt only, and its size after JPEG conversion plus imgopt]
So, the site's main theme graphics shrank by 50% just using imgopt alone, and another 25% on top of that with just a few minutes' common-sense color reduction. The autogenerated PNG thumbnails shrank 20% — or, make them JPEGs and they're down 70%. Reddit is a pretty graphics-sparse site, so saving 42K per page visitor isn't too shabby. (Here's a tarball of the end results, in case anyone from Condé is reading.)
What about a site with a graphics-rich design? I gave the same experiment a whirl on a former top-20 blog that I used to be tech lead for. Naturally, when I was there, the theme images were sprited and optimized and added up to a total of maybe a dozen files and 20K — since then, there's been a redesign and it's now about 60 files and 150K (including some bad reminders of the dark ages, like sliced images and "blank.gif"). Result of a quick optimization on those files: I got them down to 50K, a third of the original size — so you know that with spriting and some sense you could get them down to 10-20 files and 30-40K pretty easily. I sent the result to a buddy, so hopefully the optimized versions will be making an appearance soon.
Still reading? Well, thanks for hanging in there. Now it’s time to go and make your own website go fast!