As a longtime SEO, I get all giddy when a tool truly lives up to its promise and delivers.
I’ve been using spider tools for years. They’re an essential part of your toolkit and handle a ton of must-have analysis. I’m talking foundational analysis, items like:
- Crawl your entire site’s HTML infrastructure
- Analyze internal and external page counts and link points
- Catalog all page links and server response codes (find dead 404 pages, etc.)
- Consolidate & export all your on-page SEO meta attributes (Titles, Descriptions, …)
- List out all link types by images or text
- Catalog and evaluate all links by anchor text
- Export by relative path URLs
- File sizes by URL
- Check response times by URL, etc.
- W3C validation for usability and to avoid low quality signals
Those features are great, but for an enterprise / industrial strength SEO responsible for really big websites and huge content efforts, many tools fall flat on their face. I’ve worked on some of the largest content sites there are, with massive traffic (think Top 15 of all sites in the US per Comscore), and the needs are different from those of a small scale or private website audit. Still, among commercial solutions (rather than an in-house custom built tool) there are some great tools you can use. There are also some that get a big FAIL in my book.
Having used these tools for some time, here are some triggers you want to look for:
- Uncapped Crawl Depth – a lot of tools crawl fine but require you to specify a crawl depth. I’m talking to you, Website Auditor (SEO PowerSuite). That’s not good. There are times we may want to limit crawl depth, as an OPTION, but not as a requirement. You want the ability to crawl uncapped, to any folder/page hop depth. The whole idea is to crawl all links found on your site, starting with your root and then following those links on all pages found (rinse, repeat…) until it has exhausted all link points.
- Ability to Save Mission Data – this is really important. You want to be able to save the mission data in its raw form, not just the exported mission results. This means you can recut the data later or share the mission itself (an initialization file), so multiple people can work on the same result set. If multiple team members, developers, SEOs, QA folks, or your agency are involved, this can be a big time saver. Don’t overlook it or you’ll be hating yourself later. Make your life easier.
- Ability to Export Expanded Mission Data – you want to be able to export all mission detail. For example, if your mission shows 500 pages returning a 404 status code (page not found), you want to export not only which pages are giving a 404, but also all the link points (the other pages linking to each dead page). That’s needed to analyze where the fix has to be made. Some tools don’t support this. They let you see that data (sometimes called “inlinks”) but only within the application, so when you export to a .csv or Excel file it’s not there. Makes my blood pressure rise…
- Robust Crawler Controls – you’re going to want the ability to override some user agent settings or crawl settings. For example, if you’re pre-testing a site that has a robots.txt file blocking crawlers, or pages with a meta noindex tag, you want the ability to override or ignore those settings so the crawler still runs.
- Exclude Functions – this can be a big time saver, and at a minimum it spares you from manually filtering out information you want to ignore later. For example, you can tell the spider to ignore images, not follow external links (unnecessary bandwidth, and I don’t always care about external responses), exclude certain page types (like if you have sections of your site coded in .php or .jsp, or an area you don’t care about), exclude certain paths, and more.
- Control Crawl Speed – this lets you slow the crawl down. Some smaller sites or dev environments with limited scalability may need you to throttle your speed so you don’t tax the system resources. Think of it like the friendly crawl controls some SEO rank tracking tools use.
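To make that wish list a bit more concrete, here is a minimal Python sketch of the crawl loop such a spider runs. It is not how any of the tools reviewed here are actually built. The `crawl` function, its parameters (`max_depth`, `exclude`, `delay`), the caller-supplied `fetch` callback, and the toy `SITE` data are all hypothetical names made up for illustration. It shows depth as an option rather than a requirement, exclusion rules, a crawl delay, and inlink tracking so that every 404 can be reported along with the pages linking to it.

```python
import re
import time
from collections import deque


def crawl(start, fetch, max_depth=None, exclude=(), delay=0.0):
    """Breadth-first crawl starting at `start`.

    fetch(url) must return (status_code, list_of_links); it is supplied
    by the caller, so this sketch never touches the network itself.
    max_depth=None means uncapped. `exclude` holds regex strings for
    URLs to skip. Returns (status, inlinks).
    """
    patterns = [re.compile(p) for p in exclude]
    status = {}   # url -> response code
    inlinks = {}  # url -> set of pages linking to it
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if delay:
            time.sleep(delay)  # throttle so small servers are not hammered
        code, links = fetch(url)
        status[url] = code
        if max_depth is not None and depth >= max_depth:
            continue  # depth cap reached: record the page, ignore its links
        for link in links:
            if any(p.search(link) for p in patterns):
                continue  # excluded by rule
            inlinks.setdefault(link, set()).add(url)
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return status, inlinks


# Toy site standing in for real HTTP responses (made-up data).
SITE = {
    "/": (200, ["/a", "/b", "/img/logo.png"]),
    "/a": (200, ["/b", "/dead"]),
    "/b": (200, ["/dead", "/"]),
    "/dead": (404, []),
}


def fake_fetch(url):
    return SITE.get(url, (404, []))


status, inlinks = crawl("/", fake_fetch, exclude=[r"\.png$"])
broken = {u: sorted(inlinks.get(u, ())) for u, c in status.items() if c == 404}
print(broken)
```

Keeping `inlinks` alongside `status` is the design choice that matters here: it is what allows an export to say where each broken link lives, not just that it exists.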
So, on to the big 3… There are a ton of others, including some very limited online-only tools. Some that I’ve looked at and rejected include:
Rejected Site Crawl Tool List:
- SEOMoz Site Crawl Tool – I was a little disappointed by the Moz tool. I’m a huge SEOMoz advocate, Rand’s team always does a rock solid job, and Open Site Explorer is one of my all time favorite tools. But their crawl tool (available to Pro and Pro+ members only) was a disappointment. It’s capped at 10,000 results. In my test crawls of a site I manage with over 30,000 proven discoverable pages, Moz only found about 4,000. They didn’t even provide their proprietary Page Authority scores for each crawled page like I was assuming they would. I’m sure they’ll improve this in the future. Until then, it’s a complementary tool at best.
- Website Auditor by SEO PowerSuite or Link Assistant – This tool is kind of a “Big Hat, No Cattle”. While I like PowerSuite’s Rank Tracker and SpyGlass tools, this one is a disappointment. They’ve put more effort into making the reports look good (slick .pdfs) than into their crawl tool. Results took forever to run, probably the slowest of all, and you must specify the depth of crawl as a requirement (it won’t run as an open ended query). That sucks. So while their core is adequate, it’s not industrial, fast, or flexible enough…
The Big 3 – SEO Spider tools
- Xenu Link Sleuth – the original, probably the most widely distributed
- Screaming Frog – from the UK. They’ve certainly got the coolest product name… !
- A1 Website Analyzer – from Microsys. Boring name, but how does it perform?
Xenu
What’s not to like about Xenu? It’s FREE after all, with no limits or expiration.
Xenu ran without difficulty or interruption. You can start and pause missions, use Xenu to create XML Sitemaps, and of course export mission results. On the downside, Xenu has the most lackluster user interface of all, and their list of customizable crawl options is really limited. Reminds me of the car salesman saying you can get the car in any color you want as long as it’s white…
Xenu data exports into csv or tab-delimited formats as you would expect, but I find the setup options and feature set too limiting for large scale SEO. You can save off missions, but the export dataset is subpar compared to the other options.
Verdict: Xenu is capable. I’ve used it for years, it’s still reliable, and as a free tool it outperforms a lot of other tools and just about all online scan tools. But you can do better and need more…
Screaming Frog
This is a UK company, and the ordering process to buy your license is a bit clunky. They have no automated instant digital key, so you have to wait for a manual order review across the pond… I ordered mine on a Friday morning, Denver time… Which was probably the longest possible 1 business day window, given it was about 8pm London time on a Friday.
Look beyond the wicked cool name (I know I shouldn’t care, but I like it) and you’ll find a really robust interface. Frog has a great set of features. You can use it for free for 30 days. Buying it is priced in Euros, and I had to pay around $125 US. It’s only available to buy via PayPal, which sucks if you’re a corporate client using a corporate credit card. Kind of pathetic.
The tool itself is great – I loved it. Some features (Spider crawl options) are limited until you buy a license, and until you do you can only crawl 500 URIs.
The best thing about Frog is the application interface. They hit a home run here. It’s the best I’ve seen, and it’s without a doubt the easiest to use and scroll within results. I quickly was able to sort results by Status Code, find 404 errors, and then click on a result page. The bottom of the page then has the “Inlinks” area allowing me to easily see all the link points directing to the broken link. Very easy. See the anchor text, know all the connecting points, crush issues. Problem solved. Slick and easy to use.
Frog also lists length of data elements (descriptions, titles, etc.), gives a breakdown of external and inbound linking pages to a specific page, and more.
The biggest trip ups / problems? First, there is no way to save a mission. If you close the application you lose it. Game over, you have to re-run the crawl. You can export the results to an Excel file, but you lose the mission itself from the application window and the interface I just finished praising. That’s a huge problem, and it’s ridiculously easy for them to fix. Inexcusable in the meantime though: there’s no way to save a mission and share it between colleagues, clients, etc. Also, the export features have no customization, so you can export the raw results, but some details (like the ability to expose all linking pages to a single page) can’t be exported. You need the interface for that.
A1 Website Analyzer
Such a painfully sanitized and boring name. But the tool itself is spot on.
For starters, A1 Website Analyzer runs circles around both Screaming Frog and Xenu as far as customization options to set up your crawl. From controlling crawl speed to incredibly detailed exclusion rules you can customize, you have the ability to take absolute control of even the most intricate of site IA structures.
A1 also allows for expansive export functions, including the ability to export the full data set with all linking pages. Say you’re auditing 301 redirects and want every internal link to point straight at a page returning a 200, to avoid any link juice degradation through redirects (hint). You’ll see not only all the 301s found across your entire site IA but also every single page that references those 301s, all in the exported data you’re pulling into Excel to share with your Ops and Development teams or to fix yourself.
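As a rough illustration of what that kind of expanded export contains, here is a small Python sketch that writes one CSV row per redirecting URL and referring page. This is not A1’s actual export format or code. The function name `export_redirect_inlinks`, the column headers, and the sample `status` / `inlinks` dictionaries are hypothetical stand-ins for whatever your crawler collected.

```python
import csv
import io


def export_redirect_inlinks(status, inlinks, out, codes=(301,)):
    """Write one CSV row per (redirecting URL, page linking to it).

    status maps url -> response code; inlinks maps url -> set of
    referring pages. Both shapes are assumptions for this sketch.
    Returns the number of data rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["redirecting_url", "status", "linked_from"])
    count = 0
    for url in sorted(status):
        if status[url] in codes:
            for source in sorted(inlinks.get(url, ())):
                writer.writerow([url, status[url], source])
                count += 1
    return count


# Made-up crawl results: "/old" redirects and is linked from two pages.
status = {"/": 200, "/old": 301, "/new": 200}
inlinks = {"/old": {"/", "/new"}, "/new": {"/"}}

buf = io.StringIO()
rows = export_redirect_inlinks(status, inlinks, buf)
print(buf.getvalue())
```

Each row in the output is a link someone has to update, which is exactly the list you want to hand to your development team.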
You can save the missions too, so you can zip the file (it actually zips it for you) and then share that internally with other team members and they’ll have the .ini file to open up the mission in their own application and have everything. This is great when you’re working remote, have multiple involved parties, or just want the campaign itself and not just the results.
The sole “I’m giving it a C grade” demerit for A1 is the interface. I preferred the visual layout and the way Screaming Frog lists the site URLs, as I don’t always like navigating through an interface of + folder / path expansions like I’m in Windows Explorer. But that’s really just a preference. It works fine, and sometimes it makes sense. Others may love it. I think they could do better, but it’s good as it is: visually better than Xenu to me, but a step behind Frog from an interface layout perspective.
Final Verdict & Thoughts
A1 wins. Feature wise it just can’t be beat, and it’s the only tool that covers all the bases.
Step up from Xenu. A1 gets my vote, and I’ve used it liberally to assess and find site information architecture issues. Save time. Save money.
Kudos to the Microsys guys. They’re the first to get close to satisfying all my wishes from a commercial tool.




The other two publicly available tools which I think ought to be included on any list are:
- GSiteCrawler – its URL rewriting rules and the ability to edit your rules halfway through a run are my favourite features
- Microsoft’s IIS SEO Toolkit (Vista and later only, so not as well known)
Every tool has pros and cons – unfortunately that means that you really need to use multiple tools to do a thorough job.