Comments on: Web Scraping, HTML/XML Parsing, and Firebug’s Copy XPath Feature

By: Cgbdesign

Cgbdesign — Tue, 06 Jun 2017 15:56:20 +0000

Very helpful information about basic data mining using Firebug. I found exactly the information I needed. Thank you.

By: Sara

Sara — Tue, 27 Oct 2009 19:54:09 +0000

Nice post on web scrapers, simple and to the point :). I use Python for simple HTML web scrapers, but for larger projects I have used the extractingdata.com web scraper, which builds custom web scrapers and data extraction programs simply and quickly.

By: Sen Hu

Sen Hu — Sat, 27 Dec 2008 19:49:16 +0000

I use biterscripting for web scraping or harvesting. There is a good script example at http://biterscripting.com/Download/FS_WebPageToText.txt .

Sen

By: Chinh Do

Chinh Do — Wed, 03 Sep 2008 22:58:17 +0000

Philippe: Thanks for the info on XPather! I gave it a quick run over and it looks nice.

One caveat: it doesn’t work with XML documents (the generated XPath expression is always “/html”). You can get around this by changing your file extension to “htm” to get Firefox to treat your document as HTML. The other issue, not a major one in my opinion, is that the generated XPath expressions always contain “TBODY” elements underneath each “TABLE” element, whether or not the TBODY tags are actually there in the source. It’s easy enough to manually edit out these extra TBODY tags, but it would be nice if you didn’t have to do that. I’ll send a bug report to the author to see if this can be fixed in the next version.
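
For anyone hitting the same TBODY issue, here is a minimal Python sketch of the manual clean-up described above. The sample markup, the `strip_tbody` helper, and the example path are all made up for illustration; the path is written relative to the root element because the standard library's ElementTree only supports a limited XPath subset.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample page: the source contains no <tbody> tags.
SOURCE = """<html><body>
<table>
<tr><td>Name</td><td>Price</td></tr>
<tr><td>Widget</td><td>9.99</td></tr>
</table>
</body></html>"""


def strip_tbody(xpath):
    """Remove every 'tbody' step (with an optional [n] index) from an XPath string."""
    return re.sub(r"/tbody(\[\d+\])?(?=/|$)", "", xpath, flags=re.IGNORECASE)


# A path of the kind a browser tool might report, including the extra tbody step.
copied = "./body/table/tbody/tr[2]/td[2]"
cleaned = strip_tbody(copied)

root = ET.fromstring(SOURCE)
print(root.findall(copied))     # nothing matches: the source has no tbody element
print(root.find(cleaned).text)  # the cleaned path reaches the cell
```

This only makes sense when you parse the raw source yourself. If your parser adds TBODY elements the way the browser does, the original expression should be left alone.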

By: Philippe Lhoste

Philippe Lhoste — Wed, 03 Sep 2008 11:40:45 +0000

Interesting, I overlooked this feature.
Now I use another extension, XPather, which does precisely what you wish to have: inclusion of ID (and class) references in the XPath, plus some other handy features.
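
To show why ID and class references are useful, here is a small Python sketch comparing a purely positional path with one anchored on attributes. The markup and both expressions are invented for the example and are not actual XPather output.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with one identifiable container.
PAGE = """<html><body>
<div><p>Banner</p></div>
<div id="content"><p class="intro">Hello</p><p>More</p></div>
</body></html>"""

root = ET.fromstring(PAGE)

# Positional path: breaks as soon as another <div> is added before the target.
positional = "./body/div[2]/p[1]"

# Attribute-anchored path: keeps working as long as the id and class stay the same.
anchored = ".//div[@id='content']/p[@class='intro']"

print(root.find(positional).text)
print(root.find(anchored).text)
```

Both expressions select the same paragraph here, but only the anchored one survives layout changes elsewhere on the page.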

By: Dew Drop - August 30, 2008 | Alvin Ashcraft's Morning Dew

Dew Drop - August 30, 2008 | Alvin Ashcraft's Morning Dew — Sat, 30 Aug 2008 23:24:10 +0000

[…] Web Scraping, HTML/XML Parsing, and Firebug’s Copy XPath Feature (Chinh Do) […]