EWC Crawler

The EWC bot is a web crawler that gathers information about websites. A website is analyzed by our crawler when a user enters its URL on our website. We always limit the number of requests per domain so that we do not overload your servers. The data we gather helps identify common issues on public websites, and the results are used by website owners to improve their sites' SEO, usability, accessibility, and performance.

General Info

How we use Robots.txt

The robots.txt file is loaded before any requests are executed and is parsed with Google's robots.txt validator. Its contents are cached for a short period of time. By default, the crawler crawls at a slow rate for the freemium versions of our services. When no (valid) robots.txt is available on the domain, we continue crawling your website.
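As an illustration, the kind of robots.txt check a crawler performs before fetching a page can be sketched with Python's standard `urllib.robotparser` (a minimal sketch, not the actual EWC implementation, which uses Google's parser):

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse a robots.txt body and decide whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical robots.txt content for illustration.
robots = """User-agent: excellentwebcheck-bot
Disallow: /private/
"""

print(is_allowed(robots, "excellentwebcheck-bot", "https://example.com/"))           # True
print(is_allowed(robots, "excellentwebcheck-bot", "https://example.com/private/x"))  # False
```

In a real crawler, the parsed result would be cached per domain so robots.txt is not re-fetched for every request.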

Since our service relies heavily on Chromium to make requests, we measure concurrent requests only at the page level. Assets, such as JavaScript and CSS files, are requested in parallel.
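This distinction can be sketched as follows: a semaphore caps the number of pages fetched at once, while each page's assets are fetched in parallel without counting against that cap. This is a minimal sketch with a hypothetical concurrency limit, not the actual EWC implementation:

```python
import asyncio

PAGE_CONCURRENCY = 2  # hypothetical cap; the real EWC limits are not published

async def fetch_asset(url: str) -> str:
    # Placeholder for a real asset download (JS, CSS, images, ...).
    await asyncio.sleep(0)
    return url

async def crawl(pages: dict[str, list[str]]) -> dict[str, list[str]]:
    # Only whole pages count against the concurrency limit.
    page_slots = asyncio.Semaphore(PAGE_CONCURRENCY)

    async def crawl_page(url: str, assets: list[str]) -> list[str]:
        async with page_slots:
            # The assets of one page are requested in parallel, uncounted.
            return list(await asyncio.gather(*(fetch_asset(a) for a in assets)))

    results = await asyncio.gather(
        *(crawl_page(url, assets) for url, assets in pages.items())
    )
    return dict(zip(pages, results))
```

This mirrors how a Chromium-based crawler behaves in practice: one page load triggers many asset requests at once, so throttling only makes sense per page.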

External websites

For some of our services, we only need to know whether a URL is serving content. This is, for example, the case with our Broken link checker. When a user enters their domain in the broken link checker, we check outgoing links to other websites by performing a HEAD request. A server should respond to a HEAD request with a valid HTTP status code and no response body. If the server does not support HEAD requests, we expect it to respond with a status code indicating that.

If a server does not support HEAD requests, we perform an additional GET request. Some servers that don't support HEAD requests simply drop the request; when we detect a dropped HEAD request (one that timed out on our servers), we fall back to a GET request as well.
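This fallback logic can be sketched with Python's standard library. This is a sketch, not the actual EWC implementation: 405 and 501 are the usual status codes for an unsupported method, and the timeout value is illustrative.

```python
import socket
import urllib.error
import urllib.request

def check_url(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url, trying HEAD first and
    falling back to GET when HEAD is unsupported or dropped."""
    head = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(head, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 405/501 signal that HEAD is not supported; retry with GET.
        if err.code not in (405, 501):
            return err.code
    except (urllib.error.URLError, socket.timeout):
        # Dropped or timed-out HEAD request; retry with GET.
        pass
    get = urllib.request.Request(url, method="GET")
    with urllib.request.urlopen(get, timeout=timeout) as resp:
        return resp.status
```

A link is then considered broken when the final status code indicates an error (for example 404) or when both requests fail.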


The EWC bot follows redirects. However, we enforce a strict limit on the number of redirects we follow, so keep redirect chains on your website short to avoid hitting this limit.
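A redirect loop with a hard cap might look like the following sketch. The actual EWC redirect limit is not published; the value 5 here is arbitrary, and the `_NoRedirect` handler simply disables automatic redirects so the loop can count hops itself.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

MAX_REDIRECTS = 5  # illustrative; the real EWC limit is not documented

class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # make urlopen raise on 3xx instead of following

def fetch_with_limit(url: str) -> int:
    """Follow redirects manually, failing once MAX_REDIRECTS is exceeded."""
    opener = urllib.request.build_opener(_NoRedirect())
    for _ in range(MAX_REDIRECTS + 1):
        try:
            with opener.open(url) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            if err.code in (301, 302, 303, 307, 308) and "Location" in err.headers:
                # Resolve relative Location headers against the current URL.
                url = urljoin(url, err.headers["Location"])
                continue
            return err.code
    raise RuntimeError("redirect limit exceeded")
```

With a cap like this, a redirect loop (or an overly long chain) fails fast instead of tying up the crawler.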

How to block the EWC bot?

If you want to disallow the EWC bot from crawling your website completely, add the following to your robots.txt:

User-agent: excellentwebcheck-bot
Disallow: /

Issues with EWC bot

If you encounter any problems with the EWC bot, please send an email to support@solureal.com.

The EWC Crawler is built with great care and shouldn't overload any website or cause any problems. It may be that some other bot is using our user agent, so please include the IP address of the bot in your report so we can verify whether it is really the EWC crawler causing the issues.