Crawl errors, explained
Overview
At Siteimprove, we help you keep your websites healthy by regularly crawling them and checking for issues. This helps ensure your content stays up to date, accessible, and optimized. In some cases, a site crawl may fail and no pages are found. When this happens, Siteimprove reports a crawl error. This article explains why crawl errors occur, what they mean, and how to identify the specific type of crawl error you are experiencing.
What this means
A crawl error occurs when Siteimprove’s crawler cannot access or process pages on your website. This can happen for several reasons, such as:
- The site or page no longer exists
- Server or network settings block the crawler
- Authentication or login requirements prevent access
- Siteimprove settings restrict what can be crawled
Because crawl failures can have different causes, Siteimprove groups crawl errors into specific categories to help you understand what is preventing a successful crawl.
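One of the causes above, robots.txt rules, can be checked without waiting for a recrawl. The sketch below uses Python's standard-library `urllib.robotparser` against a hypothetical robots.txt; the `SiteimproveBot` user-agent string and the rules shown are illustrative placeholders, so substitute your site's actual robots.txt content and the crawler's documented user agent.

```python
from urllib import robotparser

# Hypothetical robots.txt that blocks a crawler named "SiteimproveBot"
# while allowing all other user agents. Replace with your site's real rules.
robots_txt = """
User-agent: SiteimproveBot
Disallow: /

User-agent: *
Disallow:
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(robots_txt)

# The blocked crawler cannot fetch any page; other agents can.
print(rp.can_fetch("SiteimproveBot", "https://example.com/about"))  # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/about"))    # True
```

If `can_fetch` returns `False` for the crawler's user agent, the robots.txt rules are the likely cause of the crawl error, and adjusting them (or adding an explicit `Allow` group for the crawler) should restore access.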
Who is impacted by this
Crawl errors may impact several groups, depending on their role and responsibilities:
- Website owners and content managers
When pages cannot be crawled, Siteimprove cannot analyze them for accessibility, SEO, or content quality issues, which may result in incomplete insights about site health.
- Developers and IT teams
Crawl errors often relate to server configuration, network restrictions, authentication requirements, or robots.txt rules that developers and IT teams manage.
- Accessibility, SEO, and compliance teams
If pages are unavailable to the crawler, they cannot be evaluated for accessibility, search performance, or regulatory compliance, limiting visibility into potential risks.
- Organizations relying on monitoring and reporting
Crawl failures can affect dashboards, reports, and trend data, making it harder to track improvements or identify issues across the site.
To resolve a crawl error, review the crawl error details in Siteimprove to identify the error category. Then refer to the corresponding crawl error article below for additional context and guidance.
Related articles
- Crawl Errors: Excluded URLs, Redirects, and HTTP 400 Issues
- Crawl Errors: Index URL blocked by exclude or remove rules
- Crawl Errors: Index URL Points to an Invalid or Non-Crawlable Page
- Crawl Errors: Crawler Blocked by Authentication or Login Requirements
- Crawl Errors: Crawler Blocked by Server, Network, or robots.txt settings
- Crawl errors: Crawl Blocked due to Server Overload or Instability
- Crawl Error: Unidentified Error