
The Siteimprove Crawler – Bot Information

Modified on: Fri, 12 Jun, 2026 at 3:27 PM

Summary

The Siteimprove crawler is a cloud-based bot that scans websites to monitor quality, SEO, accessibility, and policy compliance. It identifies itself through specific user agent strings (such as SiteCheck-sitecrawl) and respects robots.txt rules while crawling domains configured within customer accounts.

Overview

The Siteimprove crawler is a core component of the Siteimprove platform. It systematically analyzes website content to provide insights into performance, accessibility, and compliance. This article explains how the crawler identifies itself, how it accesses websites, and how it can be controlled or blocked if needed.

Environment / Applicability

  • Siteimprove platform
  • Website crawling and monitoring
  • SEO, Accessibility, QA, and Policy modules

What is the Siteimprove Crawler

Siteimprove is a Software-as-a-Service (SaaS) company that creates cloud-based tools and services for website governance and optimization. 

The Siteimprove crawler analyzes and monitors websites for quality assurance, SEO, and accessibility purposes, and keeps website content in line with brand guidelines and organizational policies.

Identifying the Siteimprove Crawler

The full user agent string of the Siteimprove crawler can vary between sites, but it commonly contains the following identifier:

  • SiteCheck-sitecrawl by Siteimprove.com 

Related Siteimprove services that request resources on domains crawled by Siteimprove identify themselves as follows:

  • Link Checker: LinkCheck by Siteimprove.com
  • Image sizer: Image size by Siteimprove.com
  • Probing: Probe by Siteimprove.com

See the full list of the IP addresses, user agent strings, and tokens we use.
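
The identifiers above can be matched against the user agent strings in your server logs. The following is a minimal sketch, not an official Siteimprove tool: the function name, the case-insensitive substring match, and the sample user agent string are assumptions made for illustration. Only the four identifiers come from this article.

```python
# Identifiers listed in this article. The full user agent string may vary,
# so the sketch looks for each identifier as a substring.
SITEIMPROVE_IDENTIFIERS = {
    "SiteCheck-sitecrawl by Siteimprove.com": "Crawler",
    "LinkCheck by Siteimprove.com": "Link Checker",
    "Image size by Siteimprove.com": "Image sizer",
    "Probe by Siteimprove.com": "Probing",
}


def classify_siteimprove_agent(user_agent):
    """Return the Siteimprove service name for a user agent string, or None.

    Case-insensitive matching is an assumption of this sketch.
    """
    lowered = user_agent.lower()
    for identifier, service in SITEIMPROVE_IDENTIFIERS.items():
        if identifier.lower() in lowered:
            return service
    return None


# Hypothetical user agent string, for illustration only.
sample = "Mozilla/5.0 (compatible; SiteCheck-sitecrawl by Siteimprove.com)"
print(classify_siteimprove_agent(sample))
```

Because any client can send any user agent string, a match here is only an indication. Comparing the request's source address against the published IP address list gives a stronger check.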

How the Siteimprove crawler accesses your site

By default, the Siteimprove crawler respects the rules in your robots.txt file, including any crawl delay set there.
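
To preview how a robots.txt file with a crawl delay reads to a compliant crawler, you can evaluate it with Python's standard urllib.robotparser module. This is a sketch under stated assumptions: the robots.txt content, the 5-second delay, the /private/ path, and the example.com URLs are hypothetical, and Python's parser may not interpret every rule exactly as the Siteimprove crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the delay and path are examples only.
ROBOTS_TXT = """\
User-agent: SiteimproveBot
Crawl-delay: 5
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay (in seconds) requested for this user agent token.
delay = parser.crawl_delay("SiteimproveBot")

# Whether the token may fetch a given URL under these rules.
allowed_public = parser.can_fetch("SiteimproveBot", "https://www.example.com/news/")
allowed_private = parser.can_fetch("SiteimproveBot", "https://www.example.com/private/report")

print(delay, allowed_public, allowed_private)
```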

The frequency of requests sent by the Siteimprove crawler can vary across domains, depending on the specific crawl settings requested by our customers. 

Blocking Siteimprove’s crawler from visiting your site

The Siteimprove crawler only crawls domains specifically configured to be crawled within Siteimprove accounts. With the exception of a few internal accounts, these accounts are owned and configured by Siteimprove customers. 

If your domain is crawled by Siteimprove, it is very likely that your organization has a contract with Siteimprove and relies on these crawls for insights into improving your website.

If you still want to prevent the Siteimprove crawler from crawling your domain, you can disallow the crawler's user agent token in your robots.txt file:

  • SiteimproveBot-Crawler
     or simply
  • SiteimproveBot

Blocking SiteimproveBot affects Siteimprove's services that use the crawled data, including Quality Assurance, SEO, Accessibility, and Policy. 
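
A robots.txt file that blocks the crawler site-wide might look like the content embedded in the sketch below, which also checks the rules with Python's standard urllib.robotparser module. The "User-agent: *" group, the ExampleBot name, and the example.com URLs are hypothetical. Python's parser matches tokens by substring, which is why a rule for SiteimproveBot also covers SiteimproveBot-Crawler here. That the Siteimprove crawler matches the shorter token the same way is inferred from this article, not verified.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the Siteimprove token, allow everyone else.
ROBOTS_TXT = """\
User-agent: SiteimproveBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The Siteimprove token is disallowed for every path.
crawler_allowed = parser.can_fetch("SiteimproveBot-Crawler", "https://www.example.com/")

# Other crawlers (ExampleBot is a made-up name) fall through to the "*" group.
others_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/")

print(crawler_allowed, others_allowed)
```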

Contact us

If you believe Siteimprove should not be crawling your domain, contact Siteimprove technical support.
