A Beginner’s Guide on How to Use Screaming Frog

  • How to perform a site crawl on Screaming Frog
  • Using Screaming Frog to analyse key areas of your website
  • How to create custom configurations in Screaming Frog
  • Customising robots.txt files and locating orphan pages

 

If you are new to search engine optimisation (SEO) and are looking for tools to help search engine bots crawl your website, then this post is for you!

We regularly use Screaming Frog at Soar Online to support our technical SEO efforts.

Why?

Screaming Frog is a website crawler that allows you to perform a full technical audit and an in-depth check of your website. Thanks to its versatility, you can also use it for other tasks, including:

  • Competitor analysis
  • Outreach
  • Verifying schema markup

Many SEO professionals use Screaming Frog. However, the tool can be a challenge if you’re new to it. We’re going to explore a few of our favourite Screaming Frog features and detail how you could use them to get the most out of this SEO Spider tool.

But first, let’s break down the user interface.

Getting Started with Screaming Frog

After installing Screaming Frog, we advise familiarising yourself with the menu, options and settings to help you navigate the platform.

File

The first control element is the “File” option, which will house your last six crawls and allow you to set default settings for the software.

Configuration

The “Configuration” control element allows you to set and adjust custom settings, such as excluding specific URLs from the crawl and integrating Google Analytics or Google Search Console into crawls.

Bulk Export

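As the name suggests, this option allows you to export crawl data in bulk, such as the inlinks or outlinks for every URL in the crawl, rather than exporting one URL at a time.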

Reports

The “Reports” control element creates downloadable crawl overviews and data reports.

Sitemaps

You can construct a website sitemap in this control element. Screaming Frog offers various sitemap options, so it’s great for large websites with a complicated site structure.

Visualisations

Screaming Frog has two visualisation types – a directory tree visualisation and crawl visualisation.

While visualisations don’t offer the best way to diagnose issues, they can help provide perspective and highlight underlying patterns in data that may be difficult to identify using traditional reporting methods.

How to Crawl Your Site

When performing a crawl in Screaming Frog, the software defaults to Spider mode and conducts the audit according to the configurations and filters you have set.

You perform an audit by entering the URL into the address bar near the top of the software and clicking “Start”. Alternatively, you could change the “Mode” to “List” and upload your sitemap, which will instruct the platform to crawl the links contained within this blueprint of the website.
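
If you would rather prepare the URL list for List mode yourself, the addresses in a sitemap can be pulled out with a few lines of Python. This is only a minimal sketch, assuming a standard XML sitemap; the sample sitemap and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap content; in practice you would read your own sitemap.xml.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
  <url><loc> https://example.com/contact/ </loc></url>
</urlset>"""

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the URLs listed in the <loc> elements of an XML sitemap."""
    root = ET.fromstring(xml_text)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


# One URL per line, ready to paste or save as a text file.
print("\n".join(sitemap_urls(SAMPLE_SITEMAP)))
```

The output is one URL per line, which is the kind of plain list a crawler can take as input.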

Now that we’ve covered how to get started with Screaming Frog, let’s delve into how you can use the software to support SEO tasks.

Backlink Analysis

You can use Screaming Frog to identify low-quality backlinks pointing to your website, which you might need to disavow.

Knowing the quality of your backlinks is extremely important, especially if you’re comparing your website performance to competitors or assessing how a new client’s website is faring.

As of the Screaming Frog 8.0 update, you can integrate other analytical tools such as Ahrefs, Google Analytics, and Majestic with the SEO Spider software. All you’ll need is your API key to connect the accounts.

We’re going to use Majestic, an SEO tool with powerful backlink tracking, as an example of how to use these integrations in conjunction with Screaming Frog.

After connecting the two accounts and performing a site audit, Screaming Frog will return its usual data along with link metrics gathered from Majestic, which will show in the “Link Metrics” tab.

If you’re analysing your own data, you can use backlink metrics to assess your performance against competitors, paying particular attention to the differences in engagement levels across your top-performing content.

You can also look deeper into your internal and external links to see how they’re being used, where they link to and if page authority is being passed down to lower-level pages in your sitemap.

Analysing Images

If your images aren’t correctly optimised, this could lead to a slow page loading speed when users try to access your web pages.

Use Screaming Frog to determine your image sizes and identify any that may slow response times. You can also use it to review ALT text and spot images that fail to display.

To find data on your images, go to the “Images” tab, where you can filter by size or other factors to find the ones that may be causing issues on your site and optimise them accordingly.
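
Once image data has been exported, the same checks can be repeated outside the tool. The sketch below is an illustration only: the column names ("Address", "Size (Bytes)", "Alt Text"), the sample rows and the 100 KB threshold are all assumptions, so check them against the headers in your own export.

```python
import csv
import io

# Hypothetical export; the column names are assumptions, not a guaranteed format.
SAMPLE_EXPORT = """Address,Size (Bytes),Alt Text
https://example.com/img/hero.jpg,512000,Team photo
https://example.com/img/logo.png,18000,
https://example.com/img/banner.jpg,230000,
"""

SIZE_LIMIT_BYTES = 100 * 1024  # assumed threshold of 100 KB


def flag_images(csv_text, size_limit=SIZE_LIMIT_BYTES):
    """Return (oversized, missing_alt) lists of image URLs."""
    oversized, missing_alt = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["Size (Bytes)"]) > size_limit:
            oversized.append(row["Address"])
        if not row["Alt Text"].strip():
            missing_alt.append(row["Address"])
    return oversized, missing_alt


oversized, missing_alt = flag_images(SAMPLE_EXPORT)
print("Over size limit:", oversized)
print("Missing ALT text:", missing_alt)
```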

Custom Extractions with Regex, CSS and XPath

Screaming Frog scrapes essential information about your website by default. However, if you’re looking for something specific, there are two features you can use to conduct an advanced site crawl and analysis: Custom Searches and Custom Extractions.

You can find the Custom Search feature in the Configuration element in the menu. It allows you to search for a chosen line of text within the source code of your web pages. For example, suppose you own an eCommerce website; this feature could help you identify which of your products are “Out of Stock” and decide whether each web page is still needed or should be removed.
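
Conceptually, a custom search is a text match against each page's source. The following Python sketch shows that idea only and is not how Screaming Frog implements it; the URLs and HTML snippets are hypothetical.

```python
# Hypothetical page sources keyed by URL; a real crawl would fetch these.
PAGES = {
    "https://example.com/products/kettle":
        "<html><body><p class='stock'>In Stock</p></body></html>",
    "https://example.com/products/toaster":
        "<html><body><p class='stock'>Out of Stock</p></body></html>",
}


def pages_containing(pages, phrase):
    """Return the URLs whose source contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [url for url, html in pages.items() if needle in html.lower()]


print(pages_containing(PAGES, "Out of Stock"))
```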

On the other hand, the Custom Extractions feature, which is located under the same Configuration element, collects data from the HTML source using one of three methods:

XPath

XPath, an abbreviation of XML Path Language, is a query language for selecting elements of a web page, meaning you can extract any information contained in a div, span, p or heading tag.

Google Chrome also makes it easy to copy an XPath. Right-click on the element’s code within the Inspect tool and go to Copy > Copy XPath. While you might need to tweak the syntax, you can paste the expression into Screaming Frog to perform the extraction.
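
To see what an XPath expression selects, you can try it on a small fragment first. This sketch uses Python's built-in ElementTree module, which supports only a subset of XPath and needs well-formed markup, so treat it as an approximation of what a full XPath engine would do. The page fragment and the "price" id are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
SAMPLE_PAGE = """<html>
  <body>
    <h1>Copper Kettle</h1>
    <div class="product">
      <span id="price">24.99</span>
    </div>
  </body>
</html>"""


def extract_text(page, xpath):
    """Return the text of every element matched by the XPath expression."""
    root = ET.fromstring(page)
    return [el.text.strip() for el in root.findall(xpath) if el.text]


# ElementTree paths are relative, so they start with "." rather than "//".
print(extract_text(SAMPLE_PAGE, ".//h1"))
print(extract_text(SAMPLE_PAGE, ".//span[@id='price']"))
```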

CSS Path

You can also scrape data from your site using a CSS path, which uses patterns to select elements, with the option of adding an attribute field. This is probably the quickest option for extracting data out of the three methods.

Regex

Regex, short for regular expression, is a sequence of characters that defines a search pattern. You can use Regex to extract schema markup in JSON-LD format and tracking scripts; however, as it is pretty complex, it’s best suited to more advanced users.
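
As an illustration of the JSON-LD case, here is a minimal Python sketch. The pattern is one possible way to match JSON-LD script blocks, not a pattern taken from Screaming Frog, and the sample HTML is made up.

```python
import json
import re

# Hypothetical page source containing one JSON-LD block.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Ltd"}
</script>
</head><body></body></html>"""

# Capture everything between the opening and closing JSON-LD script tags.
JSON_LD_PATTERN = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def extract_json_ld(html):
    """Return the parsed JSON-LD objects found in script tags."""
    return [json.loads(block) for block in JSON_LD_PATTERN.findall(html)]


for item in extract_json_ld(SAMPLE_HTML):
    print(item["@type"], item["name"])
```

The parentheses in the pattern form the capture group, which is the part of the match that gets extracted.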

Creating Custom Configurations

Screaming Frog’s 8.0 update now includes a Custom Configurations feature. 

If you want to scrape specific information, you might need to set custom configurations to perform an audit in your preferred way. Before the latest update, you would have to add your custom configuration settings each time you wanted to switch between crawls on different sites.

Now you can save your custom configuration profiles in Screaming Frog. All you have to do is go to File > Configuration > Save As.

You can save an unlimited number of these and even share them with other users – useful if other people in your team need to access these settings.

Customising Robots.txt Files

A robots.txt file is a text file that web admins create to instruct search bots how to crawl a site, specifically which URLs they should and should not crawl.

The robots.txt file plays a significant role in the overall management of a website. For example, failure to properly manage disallow entries, which is the rule that tells crawl bots not to visit a specific URL, could prevent critical sections of the site from being crawled.

Screaming Frog allows you to run a site audit and ignore the robots.txt file, to help you identify whether key content is being blocked from crawls and take appropriate action. It’s also a helpful feature to utilise when setting up a new site or performing a site migration.

To set this option, go to Configuration > Robots.txt > Settings.

You can also create and test your own rules against the website’s current robots.txt file in the “Custom” tab of the same menu, to determine how changes to the file could affect your site.

Running a site audit with the new rule in place will also give you an idea of how search engines would crawl your web pages.
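
The same kind of check can be sketched with Python's standard `urllib.robotparser` module, which reads a set of rules and reports whether a given URL may be fetched. The draft rules and URLs below are hypothetical, and Python's parser may not match every search engine's interpretation of robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules to test before changing the live robots.txt.
DRAFT_RULES = """User-agent: *
Disallow: /checkout/
Disallow: /blog/
"""

URLS_TO_CHECK = [
    "https://example.com/",
    "https://example.com/blog/seo-guide",
    "https://example.com/checkout/basket",
]


def blocked_urls(rules, urls, user_agent="Googlebot"):
    """Return the URLs that the given rules would stop the bot from crawling."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


print(blocked_urls(DRAFT_RULES, URLS_TO_CHECK))
```

Here the draft rule for /blog/ would block the blog article, which is the sort of accidental disallow this check is meant to catch.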

Sourcing Orphan Pages in Sitemap

An orphan page is a page that search bots cannot find via your internal linking structure, meaning users will also have difficulty accessing these web pages.

Orphan pages can occur for several reasons, including:

  • Old pages unlinked but left as published
  • Site architecture issues
  • CMS creating additional URLs as part of page templates
  • Pages that no longer exist but are being linked to via another website

While a small number of orphan pages isn’t a huge problem, it’s important to create a solid internal linking structure to help Google understand and rank your website better.

This is especially important for your top-level pages, as the more internal links a page receives, the more important it appears to Google. To discover orphan pages in Screaming Frog, however, the crawler needs additional URL sources, such as sitemaps or other web tools like Google Search Console and Google Analytics.

Once you’re set, crawl your website and your sitemap separately, then head to the “Internal” tab and filter the URLs by HTML for export. Export the URLs found in the site crawl and those found in the sitemap as separate files, import each file into its own tab of a Google Sheets document and remove any duplicates. URLs that appear in the sitemap list but not in the site crawl list are your likely orphan pages.

Alternatively, if you have configured and run a “Crawl Analysis”, the right-hand pane will show an overview of URLs that require attention post-crawl. This includes Orphan URLs, which you can view by filtering under the Sitemap, Search Console and Analytics tabs.
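
If you prefer a script to a spreadsheet, the comparison comes down to finding the URLs that are listed in the sitemap but were never reached by the crawl. Below is a minimal Python sketch of that comparison; the two URL lists are hypothetical stand-ins for your exported files.

```python
def normalise(url):
    """Trim whitespace and any trailing slash so equivalent URLs compare equal."""
    return url.strip().rstrip("/")


def find_orphan_candidates(sitemap_urls, crawled_urls):
    """Return the URLs present in the sitemap but missing from the site crawl."""
    crawled = {normalise(url) for url in crawled_urls}
    in_sitemap = {normalise(url) for url in sitemap_urls}
    return sorted(in_sitemap - crawled)


# Hypothetical exports: one list from the sitemap, one from the site crawl.
SITEMAP_URLS = [
    "https://example.com/",
    "https://example.com/about/",
    "https://example.com/old-landing-page/",
]
CRAWLED_URLS = [
    "https://example.com/",
    "https://example.com/about",
]

print(find_orphan_candidates(SITEMAP_URLS, CRAWLED_URLS))
```

Anything the script returns is only a candidate; it is worth checking each URL by hand before redirecting or removing it.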
