Grabbing Zen and the Art of Motorcycle Maintenance

I just found “Zen and the Art of Motorcycle Maintenance” for free online, a book that has been on my wish list for quite some time (recommended by a good number of programmers…).

And since I like a printed version, here is a small script to get it into a printable format
(which I post here for educational purposes only):

require 'rubygems'
require 'open-uri'
require 'hpricot'

book = 'zen_and_the_art/zen_and_the_art'
pages = 32
text = ''
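1.upto(pages) { |i|
  # pages are numbered 01..32; the site's base URL belongs in front of #{book}
  doc = Hpricot(open("#{book}_#{i.to_s.rjust(2,'0')}.php").read)
  # strip navigation, images and links out of the content cell
  (doc/'table.body tr td[@height=40] div').remove
  (doc/'table.body tr td[@height=40] img').remove
  (doc/'table.body tr td[@height=40] p.body').remove
  (doc/'table.body tr td[@height=40] p a').remove
  part = (doc/'table.body tr td[@height=40]')
  text += part.inner_html
}
# write all collected pages into a single printable file
File.open('out.html','w') {|f|f.puts text}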
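
The script repeats the same cell selector for every element it removes. One possible way to cut that repetition, sketched here without Hpricot so it runs on its own, is to build the selector strings from a list. The names CELL, UNWANTED, unwanted_selectors and page_name are made up for this illustration and are not part of the original script:

```ruby
# Selector of the table cell that holds the book text (taken from the script above).
CELL = 'table.body tr td[@height=40]'
# Child elements that the script removes from that cell.
UNWANTED = ['div', 'img', 'p.body', 'p a']

# Build the full selector for every unwanted child of the content cell.
def unwanted_selectors(cell = CELL, children = UNWANTED)
  children.map { |child| "#{cell} #{child}" }
end

# Build the zero-padded page file name, e.g. "..._03.php" for page 3.
def page_name(book, number)
  "#{book}_#{number.to_s.rjust(2, '0')}.php"
end

# With Hpricot loaded and a parsed doc, the loop body could then shrink to:
#   unwanted_selectors.each { |selector| (doc/selector).remove }
```

This only moves the repetition into data; whether Hpricot allows removing elements through a previously selected cell is a separate question.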

2 thoughts on “Grabbing Zen and the Art of Motorcycle Maintenance”

  1. Normally I am quite happy with Hpricot, but that I cannot do this:

    item = (doc/'table.body tr td[@height=40]')

    was a little bit frustrating/un-DRY…
