links for 2008-09-29

  • Selenium is an awesome scraper. RWget is a simple scraper that performs the same function as Wget or lynx --dump, that is, it gets a remote HTML file. However, instead of retrieving the source of that file, RWget retrieves its DOM as rendered in Firefox (or another browser).

    I can also do a "rendered diff": get the innerHTML of the BODY tag from Firefox, then compare it with the innerHTML of the same page as stored on the server. Based on a presentation by Kord of Splunk at Ajax World 2008.

  • (tags: project:nerd)
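
    For the curious, here is roughly what a "rendered diff" might look like in Python. This is only a sketch, not RWget's actual code: it assumes the Selenium Python bindings and a working Firefox driver are installed, and the function names fetch_rendered_body and rendered_diff are made up for illustration. Getting the BODY markup of the page as stored on the server is left to the caller.

    ```python
    import difflib


    def fetch_rendered_body(url):
        """Return the BODY innerHTML after Firefox has rendered the page.

        Hypothetical helper: assumes Selenium and a Firefox driver are available.
        """
        # Imported lazily so rendered_diff works without Selenium installed.
        from selenium import webdriver

        driver = webdriver.Firefox()
        try:
            driver.get(url)
            # Ask the browser for the live DOM, not the bytes sent by the server.
            return driver.execute_script("return document.body.innerHTML;")
        finally:
            driver.quit()


    def rendered_diff(source_html, rendered_html):
        """Return a unified diff (list of lines) between served and rendered markup."""
        return list(difflib.unified_diff(
            source_html.splitlines(),
            rendered_html.splitlines(),
            fromfile="source",
            tofile="rendered",
            lineterm="",
        ))
    ```

    Calling rendered_diff(server_body, fetch_rendered_body(url)) would then list whatever the browser (or the page's JavaScript) added, removed, or rewrote. A line-based diff is the simplest choice here; markup that the browser merely re-serializes differently will also show up as changes.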