Bot tracking without the server-log project nobody wants to own.
CrawlerLogs exists because server logs are critical in theory and unavailable in practice. We skip the log pipeline and capture the smallest possible crawler event instead.
Why teams still cannot answer a simple question
Most teams cannot say when Googlebot, Bingbot, GPTBot, or other crawlers last visited their site because the data is trapped in infrastructure. Hosting providers hide it, platforms sample it, engineering teams are understandably cautious about sharing raw request data, and the files usually contain far more than a marketing or SEO team should be passing around casually.
That means the real problem is not whether logs exist. The real problem is whether you can get a clean, narrow crawler-specific signal without taking on a privacy, security, and workflow mess first.
Two collection paths, one clear winner
Good when infrastructure access is a dead end
If you are not on Cloudflare or cannot modify request handling, the JS tracker gets you started quickly. It is the lower-friction fallback for teams that need visibility now.
<script src="https://js.crawlerlogs.com/latest.js" data-cl-token="YOUR_API_KEY"></script>
Best when your domain is on Cloudflare
The Worker runs at the edge, sees every request that hits the domain, and extracts only the fields required for crawler intelligence. It is the strongest collection path and the one to lead with.
export default {
  async fetch(request, env, ctx) {
    // Forward only the four crawler fields. waitUntil lets the ingest call
    // finish in the background so the visitor's request is never blocked.
    ctx.waitUntil(fetch("https://ingest.crawlerlogs.com/ingest", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Api-Key": "YOUR_API_KEY"
      },
      body: JSON.stringify({
        ip: request.headers.get("cf-connecting-ip"),
        ua: request.headers.get("user-agent"),
        url: request.url,
        ts: Date.now()
      })
    }));
    // Pass the original request through to the origin unchanged.
    return fetch(request);
  }
};
We only keep four things
The privacy model is simple: do not ingest full logs when all you need is crawler intelligence. Each event is reduced to four fields (connecting IP, user agent, requested URL, and timestamp), which is enough to classify visits, verify real bots, and understand page-level crawl behavior without dragging unnecessary request data into the workflow.
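As a sketch of what that reduction looks like, here is an illustrative helper (not part of the product) that keeps only the four fields from the Worker example and drops everything else:

```javascript
// Illustrative helper: reduce a request's headers and URL to the four
// fields CrawlerLogs keeps. Any other request data is discarded here.
function toCrawlerEvent(headers, url, now = Date.now()) {
  return {
    ip: headers.get("cf-connecting-ip"), // connecting IP, for bot verification
    ua: headers.get("user-agent"),       // user agent, for classification
    url,                                 // page the crawler requested
    ts: now                              // event timestamp in milliseconds
  };
}

// Example with a minimal headers stub (a Map has the same .get() shape):
const headers = new Map([
  ["cf-connecting-ip", "66.249.66.1"],
  ["user-agent", "Googlebot/2.1"]
]);
const event = toCrawlerEvent(headers, "https://example.com/pricing", 1700000000000);
```

The resulting object is exactly what the Worker posts to the ingest endpoint; nothing about cookies, referrers, or logged-in users ever leaves the edge.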
What the workflow looks like
Deploy the Worker or JS tag
Choose the strongest path your stack allows. Worker first on Cloudflare, JS when that is not available.
Capture narrow crawler events
Each request is reduced to the four fields needed for bot intelligence, not a full request log.
Turn visits into visibility
See which bots visited, what pages they touched, how patterns changed, and whether the traffic looks real.
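A first pass at "which bots visited" can be sketched as a user-agent match against known crawler tokens. The token list below is a small illustrative subset, not the product's actual ruleset, and UA strings can be spoofed, which is why a verification step (for example, a reverse-DNS check on the IP) is still needed to decide whether the traffic looks real:

```javascript
// Illustrative first-pass classifier: map a user-agent string to a
// crawler name via known UA tokens. This only identifies what a visit
// *claims* to be; IP verification decides whether the bot is genuine.
const KNOWN_CRAWLERS = [
  { name: "Googlebot", token: "Googlebot" },
  { name: "Bingbot",   token: "bingbot" },
  { name: "GPTBot",    token: "GPTBot" }
];

function classifyUserAgent(ua) {
  if (!ua) return null;
  const hit = KNOWN_CRAWLERS.find(c => ua.includes(c.token));
  return hit ? hit.name : null;
}
```

For example, `classifyUserAgent("Mozilla/5.0 (compatible; bingbot/2.0)")` returns `"Bingbot"`, while an ordinary browser UA returns `null` and is excluded from crawler reports.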
