Why is my website crawl taking a long time?
Several things can prevent a website crawl from finishing.
Typical reasons and possible solutions are provided in the table below.
Reason for Issue | Possible Solutions |
---|---|
Your website is responding slowly, so our crawler has automatically slowed down to avoid overloading it. | Exclude any areas of the website that you do not want to be checked. See: How to add and remove content from a crawl |
There are a large number of PDFs on the site. | Processing a PDF takes about as long as processing a web page, so a site with many PDFs behaves like a site with many web pages. Siteimprove needs time to check the PDFs and provide you with check results. If the PDF checks are not a priority (e.g. archived data), you can use settings to mark them as external to reduce the crawl time. See: Site Content Settings. |
The login credentials provided to Siteimprove for a website behind a login have expired. | Provide Siteimprove Technical Support with the new login credentials for the website. |
The crawler has found a new section on the website and needs to be reconfigured accordingly, e.g. a new forum or a web shop with duplicate pages. | Exclude any areas of the website that you do not want to be checked. See: How to add and remove content from a crawl. You can also open a ticket to ask Technical Support to review the crawler settings and configure the crawler to reduce duplicate pages or exclude unwanted areas of the website from the crawl. |
The IP address for the crawler has been blocked by your firewall. | Ensure that Siteimprove Crawler IP addresses can access the website. |
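
For the firewall case in the last row, one way to check an allowlist is to compare the crawler addresses against the ranges your firewall permits. The following is a minimal sketch only: the function name `blocked_addresses` is hypothetical, and the addresses are placeholders from the reserved documentation ranges, not real Siteimprove crawler IPs. Replace them with the published crawler addresses and your own firewall rules.

```python
import ipaddress


def blocked_addresses(crawler_ips, allowed_ranges):
    """Return the crawler IPs that no allowed CIDR range covers."""
    networks = [ipaddress.ip_network(r) for r in allowed_ranges]
    return [
        ip for ip in crawler_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Placeholder values (reserved documentation ranges), NOT real crawler IPs.
CRAWLER_IPS = ["192.0.2.10", "198.51.100.7"]
# Assumed example of the ranges a firewall currently allows.
FIREWALL_ALLOWLIST = ["192.0.2.0/24"]

# Any address listed here would need to be added to the firewall allowlist.
print(blocked_addresses(CRAWLER_IPS, FIREWALL_ALLOWLIST))
```

An empty list means every crawler address in the input is covered by at least one allowed range. This only checks the list you supply; it does not test whether the firewall actually lets the traffic through.
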
Contact Siteimprove Technical Support if you continue to have issues with a website crawl not ending.