Overview

In this lesson, we’ll talk about configuring the Screaming Frog Spider to Include and Exclude content based on rules you set.

Within the Configuration section of Screaming Frog, we have the option to instruct the Spider to include or exclude specific URLs and directories as we see fit. Using regex (regular expressions), you can tell the Spider where to go and where not to go.

Directory Example

Adding the following to “Include” would tell the Spider to crawl only URLs within the what-we-do directory. Conversely, adding it to “Exclude” would omit these pages from the reports.

http://www.webgumption.com/what-we-do/.*
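Before pasting a pattern like this into Screaming Frog, it can help to check locally which URLs it would match. The following is a minimal sketch in Python, not part of Screaming Frog itself. It assumes the pattern has to match the whole URL (which is why the pattern ends in .*), and it uses Python's re module as a stand-in for the tool's own regex engine. The sample URLs and the helper name matches_directory are made up for illustration.

```python
import re

# Pattern taken from the lesson. The unescaped dots match any character,
# so they also match the literal dots in the hostname.
DIRECTORY_PATTERN = re.compile(r"http://www.webgumption.com/what-we-do/.*")


def matches_directory(url: str) -> bool:
    """Return True if the entire URL matches the directory pattern."""
    return DIRECTORY_PATTERN.fullmatch(url) is not None


# Hypothetical URLs, used only to exercise the pattern.
sample_urls = [
    "http://www.webgumption.com/what-we-do/",
    "http://www.webgumption.com/what-we-do/seo/",
    "http://www.webgumption.com/what-we-do",
    "http://www.webgumption.com/contact/",
]

for url in sample_urls:
    print(url, matches_directory(url))
```

Under these assumptions, the first two URLs match and the last two do not. Note that the third URL fails only because it lacks the trailing slash the pattern requires.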

URL Contains Example

Adding the following to “Include” would tell the Spider to crawl only URLs containing the word “search”. Conversely, adding it to “Exclude” would omit these pages from the reports.

.*search.*
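The effect of putting the same pattern under “Include” versus “Exclude” can be sketched as a simple filter over a list of URLs. This is only an illustration in Python under the assumption that a rule applies when the pattern matches the whole URL. It models the matching, not the Spider's actual crawl behavior, and the function apply_rule and the sample URLs are hypothetical.

```python
import re

# Pattern taken from the lesson: any URL with "search" somewhere in it.
SEARCH_PATTERN = re.compile(r".*search.*")


def apply_rule(urls, pattern, mode):
    """Keep matching URLs for mode 'include'; drop them for mode 'exclude'."""
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    keep_on_match = mode == "include"
    return [
        url
        for url in urls
        if (pattern.fullmatch(url) is not None) == keep_on_match
    ]


# Hypothetical URLs, used only to exercise the pattern.
urls = [
    "http://www.webgumption.com/search/?q=seo",
    "http://www.webgumption.com/research/",
    "http://www.webgumption.com/contact/",
]

print(apply_rule(urls, SEARCH_PATTERN, "include"))
print(apply_rule(urls, SEARCH_PATTERN, "exclude"))
```

One thing this makes visible: the pattern matches the substring "search", so a URL such as /research/ is caught as well. If only a /search/ path is intended, a tighter pattern such as .*/search/.* may be a better fit.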

This feature is easiest to use after a quick course on regex, but some additional useful examples can be found here.

Usage Tips

  • For large sites, crawl small sections at a time