<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Stymied</title>
	<atom:link href="http://third-bit.com/blog/archives/3307.html/feed" rel="self" type="application/rss+xml" />
	<link>http://third-bit.com/blog/archives/3307.html</link>
	<description>Data is ones and zeroes &#124; Software is ones and zeroes and hard work.</description>
	<lastBuildDate>Thu, 24 May 2012 15:30:31 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<item>
		<title>By: Jim Graham</title>
		<link>http://third-bit.com/blog/archives/3307.html#comment-3304</link>
		<dc:creator>Jim Graham</dc:creator>
		<pubDate>Mon, 21 Dec 2009 21:50:15 +0000</pubDate>
		<guid isPermaLink="false">http://pyre.third-bit.com/blog/?p=3307#comment-3304</guid>
		<description>The problem with any script that attempts to scrape Google Scholar is that Scholar detects spikes in activity from your IP (for some value of &#039;spikes&#039;) and will send you a Forbidden until you revalidate via CAPTCHA as a human. Scraping and scripts are against the Google ToS.

AFAIK, this is because Google gives Scholar far less bandwidth than the main search engine and defends it much more strictly.</description>
		<content:encoded><![CDATA[<p>The problem with any script that attempts to scrape Google Scholar is that Scholar detects spikes in activity from your IP (for some value of &#8216;spikes&#8217;) and will send you a Forbidden until you revalidate via CAPTCHA as a human. Scraping and scripts are against the Google ToS.</p>
<p>AFAIK, this is because Google gives Scholar far less bandwidth than the main search engine and defends it much more strictly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nathan</title>
		<link>http://third-bit.com/blog/archives/3307.html#comment-3303</link>
		<dc:creator>Nathan</dc:creator>
		<pubDate>Sun, 20 Dec 2009 18:39:42 +0000</pubDate>
		<guid isPermaLink="false">http://pyre.third-bit.com/blog/?p=3307#comment-3303</guid>
		<description>Have you looked at Mendeley? It&#039;s a pretty nice solution for managing a library of academic papers. One of its features is integration with Google Scholar. I&#039;ve found it works pretty well, but it does require some manual intervention.

http://www.mendeley.com/</description>
		<content:encoded><![CDATA[<p>Have you looked at Mendeley? It&#8217;s a pretty nice solution for managing a library of academic papers. One of its features is integration with Google Scholar. I&#8217;ve found it works pretty well, but it does require some manual intervention.</p>
<p><a href="http://www.mendeley.com/" rel="nofollow">http://www.mendeley.com/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tony Wiliams</title>
		<link>http://third-bit.com/blog/archives/3307.html#comment-3302</link>
		<dc:creator>Tony Wiliams</dc:creator>
		<pubDate>Mon, 14 Dec 2009 05:29:22 +0000</pubDate>
		<guid isPermaLink="false">http://pyre.third-bit.com/blog/?p=3307#comment-3302</guid>
		<description>Have you considered reverse engineering one of the Mycroft project Firefox search plugins? They include many searches of Google Scholar that use various proxies.

Surely you could build a Python script with the hints from one of these.

// Tony</description>
		<content:encoded><![CDATA[<p>Have you considered reverse engineering one of the Mycroft project Firefox search plugins? They include many searches of Google Scholar that use various proxies.</p>
<p>Surely you could build a Python script with the hints from one of these.</p>
<p>// Tony</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Uldis Bojars</title>
		<link>http://third-bit.com/blog/archives/3307.html#comment-3301</link>
		<dc:creator>Uldis Bojars</dc:creator>
		<pubDate>Mon, 14 Dec 2009 03:16:48 +0000</pubDate>
		<guid isPermaLink="false">http://pyre.third-bit.com/blog/?p=3307#comment-3301</guid>
		<description>Maybe this Python script can help you a bit: http://gist.github.com/255743

It searches for a given phrase on Bibsonomy and reports the entries found. If it finds very few matches (as you mention in the blog post), try making the search phrase shorter or simpler.</description>
		<content:encoded><![CDATA[<p>Maybe this Python script can help you a bit: <a href="http://gist.github.com/255743" rel="nofollow">http://gist.github.com/255743</a></p>
<p>It searches for a given phrase on Bibsonomy and reports the entries found. If it finds very few matches (as you mention in the blog post), try making the search phrase shorter or simpler.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aldo Chan</title>
		<link>http://third-bit.com/blog/archives/3307.html#comment-3300</link>
		<dc:creator>Aldo Chan</dc:creator>
		<pubDate>Mon, 14 Dec 2009 02:46:31 +0000</pubDate>
		<guid isPermaLink="false">http://pyre.third-bit.com/blog/?p=3307#comment-3300</guid>
		<description>OK, about fooling Google with the User-Agent setting: if you&#039;ve tested your script against a local web server, you&#039;ll notice that the User-Agent header is still set to urllib-blah-blah. You&#039;ll need to subclass urllib.URLopener and set the class attribute version to your desired header value.
Like this: http://paste.pocoo.org/show/156877/</description>
		<content:encoded><![CDATA[<p>OK, about fooling Google with the User-Agent setting: if you&#8217;ve tested your script against a local web server, you&#8217;ll notice that the User-Agent header is still set to urllib-blah-blah. You&#8217;ll need to subclass urllib.URLopener and set the class attribute version to your desired header value.<br />
Like this: <a href="http://paste.pocoo.org/show/156877/" rel="nofollow">http://paste.pocoo.org/show/156877/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric O. LEBIGOT (EOL)</title>
		<link>http://third-bit.com/blog/archives/3307.html#comment-3299</link>
		<dc:creator>Eric O. LEBIGOT (EOL)</dc:creator>
		<pubDate>Sun, 13 Dec 2009 21:30:32 +0000</pubDate>
		<guid isPermaLink="false">http://pyre.third-bit.com/blog/?p=3307#comment-3299</guid>
		<description>Sorry to hear that your attempts have not produced much so far… That&#039;s an interesting project!

With regard to cb2bib, I&#039;d like to point out that it is available via Fink (at least in the unstable branch).</description>
		<content:encoded><![CDATA[<p>Sorry to hear that your attempts have not produced much so far… That&#8217;s an interesting project!</p>
<p>With regard to cb2bib, I&#8217;d like to point out that it is available via Fink (at least in the unstable branch).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

