I gather from some of the comments and email that my earlier post about batch generation of PDFs from web pages wasn’t clear, so here’s my second try. The Software Carpentry course that I’m writing for the Python Software Foundation has about 35 web pages. I’d like to add something to my Makefile so that I can say “make pdf” and have each page turned into a PDF. (Currently, I do this by hand about once a week; it ain’t the ten minutes of tedium that bothers me, but what my students would think if they found out I’m doing a repetitive task by hand.)
So, option #1 is an HTML to PDF converter. There are several out there, but none of the open source ones I’ve been able to find will respect my stylesheets. Yes, I could investigate XSL-FO, but (a) I have 185 other issues to deal with, and (b) I’m morally opposed to duplicating style information.
Option #2: write a script to make Firefox do what I’ve been doing by hand, i.e., load, print, change printer to PDFCreator, “print”, click “OK” for the document title, specify where to save it, “OK”, repeat. A couple of people have said, “Oh, you can do that on Windows with the win32 module and COM,” but haven’t responded to my follow-up question, “Great—do you have some sample code I can tweak?” (Also, my build currently runs on both Windows and Linux, and I’d like to keep it that way if I can…)
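Whichever option wins, the Makefile side is probably just a small driver that works the same on Windows and Linux. Here is a minimal sketch in Python, under some loud assumptions: the pages are *.html files in one directory, and the converter is some command that takes an input path and an output path as its last two arguments. No such converter is named here, since finding one is the open question; the names stale_pages, build_pdfs and make_pdfs.py are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def stale_pages(src_dir, out_dir):
    """Yield (html, pdf) pairs whose PDF is missing or older than its HTML."""
    for html in sorted(Path(src_dir).glob("*.html")):
        pdf = Path(out_dir) / (html.stem + ".pdf")
        if not pdf.exists() or pdf.stat().st_mtime < html.stat().st_mtime:
            yield html, pdf


def build_pdfs(src_dir, out_dir, converter):
    """Run the converter once per stale page and return the PDFs it built.

    `converter` is a list of command-line words (a placeholder, not a real
    tool); the input and output paths are appended to it for each page.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    built = []
    for html, pdf in stale_pages(src_dir, out_dir):
        # check=True stops the build at the first page that fails to convert.
        subprocess.run(converter + [str(html), str(pdf)], check=True)
        built.append(pdf)
    return built


if __name__ == "__main__" and len(sys.argv) > 3:
    # Hypothetical usage: python make_pdfs.py SRC_DIR OUT_DIR CONVERTER [ARGS...]
    build_pdfs(sys.argv[1], sys.argv[2], sys.argv[3:])
```

A “pdf” target in the Makefile would then be a one-line call to this script. Comparing timestamps means a second run only touches the pages that have changed, which is the same bargain make itself offers.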
…it makes sense to go looking for nails. When you have a new way to organize information that works well in one context, it makes sense to see how well it will solve problems in others. This is a working sketch (in Perl) of how we might use del.icio.us-style tags to organize a filesystem. Clever.
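To make the idea concrete, here is a toy illustration in Python. It is not the Perl sketch mentioned above, and the class and method names are invented for this example. Instead of one place in a directory tree, each file carries any number of tags, and a lookup returns the files that carry all of the tags asked for.

```python
from collections import defaultdict


class TagIndex:
    """Map tags to file paths, so a file can sit under many labels at once."""

    def __init__(self):
        self._paths_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        """Attach one or more tags to a path."""
        for t in tags:
            self._paths_by_tag[t].add(path)

    def find(self, *tags):
        """Return the sorted paths that carry every one of the given tags."""
        if not tags:
            return []
        sets = [self._paths_by_tag.get(t, set()) for t in tags]
        return sorted(set.intersection(*sets))


index = TagIndex()
index.tag("notes/lec01.html", "course", "python")
index.tag("Makefile", "course", "build")
print(index.find("course", "python"))
```

The lookup is a set intersection, so adding a tag to a query can only narrow the result, much as descending into a subdirectory does, except that the order of the tags doesn’t matter.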
I put an alpha version of the notes for the Software Carpentry course on-line yesterday. Now, I’m looking for a way to convert them into a single PDF document, so that reviewers can download them in one shot. I’ve found some open source tools that don’t do style sheets, and some commercial tools that’ll do one page at a time, for money, but nothing so far that strikes me as being any better than “printing” with PDFCreator, one page at a time. Given that there are 31 pages, this is tedious enough that I’d like to find an alternative…
(Software Carpentry is the course on basic software development skills for scientists and engineers that I’m writing for the Python Software Foundation.)