Comments on: Web Scraping, HTML/XML Parsing, and Firebug’s Copy XPath Feature https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/ Chinh's semi-random thoughts on software development, gadgets, and other things. Tue, 06 Jun 2017 15:56:20 +0000

By: Cgbdesign https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-689133 Tue, 06 Jun 2017 15:56:20 +0000 https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-689133 Very helpful information about basic data mining using Firebug. I found the exact information that I needed. Thank you.

By: Sara https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-62072 Tue, 27 Oct 2009 19:54:09 +0000 https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-62072 Nice post on web scrapers, simple and to the point :). I use Python for simple HTML web scrapers, but for larger projects I have used the extractingdata.com web scraper, which builds custom web scrapers and data extraction programs simply and quickly.

By: Sen Hu https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-6599 Sat, 27 Dec 2008 19:49:16 +0000 https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-6599 I use biterscripting for web scraping or harvesting. There is a good script example at http://biterscripting.com/Download/FS_WebPageToText.txt .

Sen

By: Chinh Do https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-4819 Wed, 03 Sep 2008 22:58:17 +0000 https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-4819 Philippe: Thanks for the info on XPather! I gave it a quick run over and it looks nice.

One caveat: it doesn’t work with XML documents (the generated XPath expression is always “/html”). You can get around this by changing your file extension to “htm” to get Firefox to treat your document as HTML. The other issue, not a major one in my opinion, is that the generated XPath expressions always contain “TBODY” elements underneath each “TABLE” element, whether or not the TBODY tags are actually in the source. It’s easy enough to manually edit out these extra TBODY tags, but it would be nice not to have to do that. I’ll send a bug report to the author to see if this can be fixed in the next version.
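
For anyone who would rather not edit out the TBODY steps by hand, here is a minimal sketch of doing the cleanup in code. It is only an illustration: the helper name `strip_tbody`, the sample XPath, and the sample markup are all made up for this example, and it assumes the generated expression uses plain `/tbody` or `/tbody[n]` steps.

```python
import re
import xml.etree.ElementTree as ET


def strip_tbody(xpath: str) -> str:
    """Drop every tbody step (with an optional [n] index) from an XPath string."""
    return re.sub(r"/tbody(\[\d+\])?(?=/|$)", "", xpath, flags=re.IGNORECASE)


# Hypothetical expression of the kind a browser tool might generate.
copied = "/html/body/table/tbody/tr[2]/td[1]"
cleaned = strip_tbody(copied)
print(cleaned)  # /html/body/table/tr[2]/td[1]

# Hypothetical source markup that has no TBODY tags at all.
source = (
    "<html><body><table>"
    "<tr><td>a</td></tr>"
    "<tr><td>b</td><td>c</td></tr>"
    "</table></body></html>"
)
root = ET.fromstring(source)

# ElementTree paths are relative to the root element, so replace the
# leading "/html" with "." before searching.
with_tbody = root.find("." + copied[len("/html"):])
without_tbody = root.find("." + cleaned[len("/html"):])

print(with_tbody)          # None: the source has no tbody element
print(without_tbody.text)  # b
```

The regular expression only handles simple positional steps; an expression with other predicates on the TBODY step would need a real XPath parser.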

By: Philippe Lhoste https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-4793 Wed, 03 Sep 2008 11:40:45 +0000 https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-4793 Interesting, I overlooked this feature.
Now, I use another extension, XPather, which does precisely what you wish for: it includes ID (and class) references in the XPath, plus some other handy features.
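
To show why ID and class references in the XPath are worth having, here is a rough sketch comparing a purely positional path with an ID-anchored one. The page markup, the `content` id, and the `prices` class are invented for the example; this is not XPather's actual output, just the general idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the data of interest sits in the second div.
page = """<html><body>
<div><p>banner</p></div>
<div id="content"><table class="prices"><tr><td>42</td></tr></table></div>
</body></html>"""
root = ET.fromstring(page)

# Positional path: depends on the div being second in the body.
positional = "./body/div[2]/table/tr/td"
# Anchored path: depends only on the id and class attributes.
by_id = ".//div[@id='content']/table[@class='prices']/tr/td"

print(root.find(positional).text)  # 42
print(root.find(by_id).text)       # 42

# Simulate a layout change: a new div is added at the top of the page.
root.find("body").insert(0, ET.Element("div"))

after_positional = root.find(positional)
after_by_id = root.find(by_id)

print(after_positional)   # None: div[2] is now the banner div
print(after_by_id.text)   # 42
```

The positional expression breaks as soon as the page layout shifts, while the anchored one keeps working as long as the id and class stay the same.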

By: Dew Drop - August 30, 2008 | Alvin Ashcraft's Morning Dew https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-4624 Sat, 30 Aug 2008 23:24:10 +0000 https://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-4624 […] Web Scraping, HTML/XML Parsing, and Firebug’s Copy XPath Feature (Chinh Do) […]
