<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Web Scraping, HTML/XML Parsing, and Firebug&#8217;s Copy XPath Feature</title>
	<atom:link href="http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/</link>
	<description>Chinh's not quite random thoughts on software development, .NET, gadgets, and other things.</description>
	<lastBuildDate>Tue, 09 Mar 2010 13:15:12 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Sara</title>
		<link>http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-62072</link>
		<dc:creator>Sara</dc:creator>
		<pubDate>Tue, 27 Oct 2009 19:54:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-62072</guid>
		<description>Nice post on web scrapers, simple and to the point :). I use Python for simple HTML web scrapers, but for larger projects I have used the extractingdata.com &lt;a href=&quot;http://www.extractingdata.com/web%20scraper.htm&quot; rel=&quot;nofollow&quot;&gt;web scraper&lt;/a&gt;, which builds custom web scrapers and data extraction programs quickly and simply.</description>
		<content:encoded><![CDATA[<p>Nice post on web scrapers, simple and to the point <img src='http://www.chinhdo.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />. I use Python for simple HTML web scrapers, but for larger projects I have used the extractingdata.com <a href="http://www.extractingdata.com/web%20scraper.htm" rel="nofollow">web scraper</a>, which builds custom web scrapers and data extraction programs quickly and simply.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sen Hu</title>
		<link>http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-6599</link>
		<dc:creator>Sen Hu</dc:creator>
		<pubDate>Sat, 27 Dec 2008 19:49:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-6599</guid>
		<description>I use biterscripting for web scraping or harvesting. There is a good script example at http://biterscripting.com/Download/FS_WebPageToText.txt .

Sen</description>
		<content:encoded><![CDATA[<p>I use biterscripting for web scraping or harvesting. There is a good script example at <a href="http://biterscripting.com/Download/FS_WebPageToText.txt" rel="nofollow">http://biterscripting.com/Download/FS_WebPageToText.txt</a> .</p>
<p>Sen</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chinh Do</title>
		<link>http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-4819</link>
		<dc:creator>Chinh Do</dc:creator>
		<pubDate>Wed, 03 Sep 2008 22:58:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-4819</guid>
		<description>Philippe: Thanks for the info on XPather! I gave it a quick run over and it looks nice.

One caveat: it doesn&#039;t work with XML documents (the generated XPath expression is always &quot;/html&quot;). You can get around this by changing your file extension to &quot;htm&quot; so that Firefox treats your document as HTML. The other issue, not a major one in my opinion, is that the generated XPath expressions always contain &quot;TBODY&quot; elements underneath each &quot;TABLE&quot; element, whether or not the TBODY tags are actually present in the source. It&#039;s easy enough to manually edit out these extra TBODY tags, but it would be nice not to have to. I&#039;ll send a bug report to the author to see if this can be fixed in the next version.</description>
		<content:encoded><![CDATA[<p>Philippe: Thanks for the info on XPather! I gave it a quick run over and it looks nice.</p>
<p>One caveat: it doesn&#8217;t work with XML documents (the generated XPath expression is always &#8220;/html&#8221;). You can get around this by changing your file extension to &#8220;htm&#8221; so that Firefox treats your document as HTML. The other issue, not a major one in my opinion, is that the generated XPath expressions always contain &#8220;TBODY&#8221; elements underneath each &#8220;TABLE&#8221; element, whether or not the TBODY tags are actually present in the source. It&#8217;s easy enough to manually edit out these extra TBODY tags, but it would be nice not to have to. I&#8217;ll send a bug report to the author to see if this can be fixed in the next version.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Philippe Lhoste</title>
		<link>http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-4793</link>
		<dc:creator>Philippe Lhoste</dc:creator>
		<pubDate>Wed, 03 Sep 2008 11:40:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-4793</guid>
		<description>Interesting, I overlooked this feature.
Now, I use another extension, XPather, which does precisely what you wish for: it includes ID (and class) references in the XPath, plus some other handy functionality.</description>
		<content:encoded><![CDATA[<p>Interesting, I overlooked this feature.<br />
Now, I use another extension, XPather, which does precisely what you wish for: it includes ID (and class) references in the XPath, plus some other handy functionality.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dew Drop - August 30, 2008 &#124; Alvin Ashcraft's Morning Dew</title>
		<link>http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/comment-page-1/#comment-4624</link>
		<dc:creator>Dew Drop - August 30, 2008 &#124; Alvin Ashcraft's Morning Dew</dc:creator>
		<pubDate>Sat, 30 Aug 2008 23:24:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.chinhdo.com/20080829/web-scraping-htmlxml-parsing-and-firebugs-copy-xpath-feature/#comment-4624</guid>
		<description>[...] Web Scraping, HTML/XML Parsing, and Firebug&#8217;s Copy XPath Feature (Chinh Do) [...]</description>
		<content:encoded><![CDATA[<p>[...] Web Scraping, HTML/XML Parsing, and Firebug&#8217;s Copy XPath Feature (Chinh Do) [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
