Quick Link Prospecting with Scraper Extension

2 years ago

Using XPath to select content in an XML document to scrape for SEO is nothing new, but SEOs have traditionally done it within Google Docs for simplicity and ease of use. That approach has some limitations, though, including a limit of 50 ImportXML calls per spreadsheet and the fact that it’s not done in-line while browsing. I’ve been playing around with a Google Chrome extension called Scraper, which allows you to scrape content in-line while browsing.

Let’s walk through some examples of how this is awesome.

Prospecting Guest Posts

Let’s say we’re trying to quickly find guest post opportunities for a food-related site.

To do this, I search for food inurl:"write for us" and show 100 results per page (you have to turn off Google Instant for this).
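If you run a lot of these queries, building the search URL in a script saves some clicking. Here is a minimal Python sketch. The function name google_search_url is made up for illustration, and it assumes Google's "num" URL parameter still controls results per page, which may not hold in every Google interface.

```python
from urllib.parse import urlencode


def google_search_url(query: str, results_per_page: int = 100) -> str:
    """Build a Google search URL for a prospecting query (hypothetical helper)."""
    # "q" carries the query text. "num" is assumed to set results per page.
    params = {"q": query, "num": results_per_page}
    return "https://www.google.com/search?" + urlencode(params)


url = google_search_url('food inurl:"write for us"')
print(url)
```

The same helper works for any of the other search operators used later in this post; only the query string changes.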

Step 1 – Advanced Search Query


Step 2 – Select and Right-Click a Listing

Select “Scrape Similar” in the menu, and the extension will work out the XPath to the selected content, then extract it along with the repeated elements similar to it.
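To get a feel for what "similar" can mean, here is a rough Python sketch of the idea: take the tag path from the page root down to the selected element, then collect every element that sits at the same tag path. This is only an illustration under my own assumptions, not the extension's actual algorithm. The sample markup and the helper names tag_path and scrape_similar are invented.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample markup standing in for a results page.
PAGE = """
<html><body>
  <div id="results">
    <div class="result"><h3><a href="http://example.com/a">Write for us: A</a></h3></div>
    <div class="result"><h3><a href="http://example.com/b">Write for us: B</a></h3></div>
    <div class="result"><h3><a href="http://example.com/c">Write for us: C</a></h3></div>
  </div>
  <div id="footer"><a href="http://example.com/about">About</a></div>
</body></html>
"""


def tag_path(root, target):
    """Return the tag names leading from just below root down to target."""
    parents = {child: parent for parent in root.iter() for child in parent}
    tags = []
    node = target
    while node is not root:
        tags.append(node.tag)
        node = parents[node]
    return list(reversed(tags))


def scrape_similar(root, target):
    """Find every element that shares target's tag path (hypothetical helper)."""
    path = "./" + "/".join(tag_path(root, target))
    return root.findall(path)


root = ET.fromstring(PAGE)
selected = root.find(".//h3/a")  # pretend the user right-clicked the first result
similar = scrape_similar(root, selected)
links = [(a.text, a.get("href")) for a in similar]
print(links)
```

The footer link is left out because its tag path (body/div/a) differs from the result links' path (body/div/div/h3/a). Real pages are rarely this tidy, which is why editing the XPath by hand is sometimes needed.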


Step 3 – View Output

At this stage, you can edit the XPath and remove fields of data that have been extracted. You can also define presets for frequently used XPath expressions. The extension does a fair job of selecting the content correctly, but depending on the markup of the page, you may need to edit the XPath to select the right text. In this example, it selected everything perfectly without correction.


Step 4 – Export to Google Docs, FTW

From here, the extension sends the data directly into Google Docs, where you can mash it up with the SEOmoz API or other data.
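If you collect rows outside the extension (for example with a script), a CSV file is an easy way to get them into a spreadsheet. The sketch below is a separate, do-it-yourself route rather than what the extension's export does. The helper name rows_to_csv, the column names, and the sample rows are all made up.

```python
import csv
import io


def rows_to_csv(rows, header=("title", "url")):
    """Serialize scraped (title, url) rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)  # fields containing commas are quoted automatically
    return buffer.getvalue()


# Invented sample rows standing in for scraped search results.
rows = [
    ("Write for Us | Example Food Blog", "http://example.com/write-for-us"),
    ("Guest Posts, Recipes and More", "http://example.org/guest-posts"),
]
csv_text = rows_to_csv(rows)
print(csv_text)
```

Save the text to a .csv file and import it into a spreadsheet to pick up where the extension's export would leave you.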


Example Output:

7 More Examples

#1 An Alltop Scraper

You can scrape the curated blog lists at Alltop, such as this huge list of marketing blogs.

It looks like this, in a matter of seconds.

#2 Scrape WordPress Blog Post Comments

Let’s say I want to quickly contact everyone who left a comment on a post on Outspoken Media’s blog, such as my link building personas post.

I had to make a quick edit to the XPath so it didn’t select the comment anchor URL.

//div[2]/dl/dt/span/a[@class='url']

Run this on your guest posts or the comments being left on your competitor’s site.

(You’ll likely have to customize it per blog if the extension doesn’t get it automatically.)
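To see why the class predicate matters, here is a small Python sketch that runs the same XPath against invented comment markup using the standard library's ElementTree. The markup is my assumption about what such a comment list might look like, not the real blog's HTML. ElementTree supports only a subset of XPath and needs a leading "." for a document-wide search.

```python
import xml.etree.ElementTree as ET

# Invented markup: each comment has an author link and a comment anchor link.
COMMENTS_PAGE = """
<html><body>
  <div id="post"><p>Post body</p></div>
  <div id="comments">
    <dl>
      <dt>
        <span><a class="url" href="http://example.com/alice">Alice</a></span>
        <a href="#comment-1">permalink</a>
      </dt>
      <dt>
        <span><a class="url" href="http://example.org/bob">Bob</a></span>
        <a href="#comment-2">permalink</a>
      </dt>
    </dl>
  </div>
</body></html>
"""

root = ET.fromstring(COMMENTS_PAGE)

# Same expression as above, prefixed with "." for ElementTree.
authors = root.findall(".//div[2]/dl/dt/span/a[@class='url']")
commenters = [(a.text, a.get("href")) for a in authors]
print(commenters)
```

Only the links carrying class "url" are returned, so the "#comment-N" anchor links stay out of the results.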

#3 Blog Directory Scraper

Need a quick list of 102 gaming blogs? Just head over to the BOTW Blog Directory.

A little edit to the XPath: //div[2]/ul/li/a[1]/@href

And in a few seconds, you have a spreadsheet of URLs for 102 gaming blogs.
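The trailing /@href in that XPath returns the link targets rather than the link elements. Python's built-in ElementTree does not support attribute steps, so the sketch below splits the attribute off and reads it from each matched element. The helper name select_attribute and the directory markup are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented directory markup: each listing has a blog link, then a review link.
DIRECTORY_PAGE = """
<html><body>
  <div id="nav"><ul><li><a href="/home">Home</a></li></ul></div>
  <div id="listings">
    <ul>
      <li><a href="http://example.com/blog-one">Blog One</a> <a href="/review/1">review</a></li>
      <li><a href="http://example.com/blog-two">Blog Two</a> <a href="/review/2">review</a></li>
    </ul>
  </div>
</body></html>
"""


def select_attribute(root, xpath):
    """Evaluate an XPath ending in /@attr with ElementTree (hypothetical helper)."""
    element_path, _, attribute = xpath.rpartition("/@")
    if element_path.startswith("//"):
        element_path = "." + element_path  # ElementTree needs a relative path
    matches = root.findall(element_path)
    return [el.get(attribute) for el in matches if el.get(attribute) is not None]


root = ET.fromstring(DIRECTORY_PAGE)
blog_urls = select_attribute(root, "//div[2]/ul/li/a[1]/@href")
print(blog_urls)
```

The a[1] step keeps only the first link in each list item, so the review links are skipped, and div[2] skips the navigation block.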

#4 Link Placement and Buys

This is similar to the guest post search, but try these queries in Google and scrape the results.

inurl:edu alumni discount code
inurl:sponsor intitle:sponsors seattle


#5 Followerwonk Scraper

A quick search for zombie on Followerwonk.

A little XPath: //div[3]/table/tbody/tr/td/a

And…

#6 Tumblr Submit Scraper

Looking to launch content on Tumblr?

site:tumblr.com inurl:submit zombie OR zombies

#7 A Google Plus Profile Scraper

Looking for food bloggers on Google Plus?

Right Tool, Right Job

This doesn’t replace the benefit of doing some XPath within Google Docs, since you can do scripting and iterate on imports. However, I really like this tool so far. It can do a lot very easily and very quickly.

It does have some bugs, and I’ve had to restart my browser a few times when it stopped working.

If you have any other ideas on how it could be used, be sure to drop a comment below.