Our Data Journey

We want it to be completely clear what happens to every piece of data Clerion touches. We comply with all relevant privacy laws and practise data minimisation throughout: we process and store only what is essential and useful, with privacy first.

This page describes in precise detail what happens from the moment you embed the Clerion script on your website to the moment that data surfaces as an insight on your dashboard. Everything below matches the code running in production.

EU-first infrastructure

🇪🇺 Clerion was built on EU infrastructure from day one. Our Node.js backend runs on Railway in the Netherlands (eu-west region). Our PostgreSQL database is hosted by Supabase in Frankfurt (AWS eu-central-1). We didn't bolt on an EU region later; this is where Clerion has run from the start.

This means that regardless of where your website visitors are located, their data is processed on EU infrastructure. See our Schrems II compliance page for the full analysis of what this means in practice.

Step 1: The script loads in the visitor's browser

You embed Clerion by adding a single script tag to your website. When a visitor loads your page, their browser requests and executes our JavaScript SDK (clerion-analytics.js).

The SDK is lightweight and cookieless. Before it does anything else, it runs two privacy checks, in this exact order:

Privacy signal check (GPC and DNT)

_checkPrivacySignals() {
 if (navigator.globalPrivacyControl === true) return true; // GPC: opt-out
 if (navigator.doNotTrack === '1') return true;            // DNT: opt-out
 if (window.doNotTrack === '1') return true;
 return false;
}

If the visitor's browser asserts Global Privacy Control (GPC) or Do Not Track (DNT), tracking stops here. No events are queued. No requests are sent to our servers. This check runs before anything else. GPC compliance is legally required under CCPA and Colorado CPA; we honour it for every visitor, wherever they are located.

Consent check

If no privacy signal is detected, the SDK reads the clerion_consent cookie to determine whether the visitor has given or withheld consent. The SDK is configured with requireConsent: true by default.

If consent has been granted: full tracking proceeds, including the optional persistent visitor ID.
If consent has been withheld: no persistent identifiers are created. Session-only tracking (without cross-visit linking) can proceed if allowWithoutConsent: true is configured.
If no preference is recorded yet: the consent banner component is rendered and tracking is paused pending the visitor's choice.
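The privacy signal check and the three consent outcomes above can be sketched as one decision function. This is an illustrative sketch, not the SDK's code: the name decideTrackingMode, its argument shape, and the returned mode strings are assumptions made for this example. Only allowWithoutConsent is an option named on this page.

```javascript
// Hypothetical helper: the name, argument shape, and mode strings are illustrative.
// consent is true (granted), false (withheld), or null (no preference recorded yet).
function decideTrackingMode({ privacySignal, consent, allowWithoutConsent = false }) {
  // A GPC or DNT signal always wins: nothing is queued or sent.
  if (privacySignal) return 'none';
  // Consent granted: full tracking, including the optional persistent visitor ID.
  if (consent === true) return 'full';
  // Consent withheld: at most session-only tracking, and only if configured.
  if (consent === false) return allowWithoutConsent ? 'session-only' : 'none';
  // No preference yet: show the banner and wait for the visitor's choice.
  return 'paused';
}

console.log(decideTrackingMode({ privacySignal: true, consent: true })); // 'none'
console.log(decideTrackingMode({ privacySignal: false, consent: null })); // 'paused'
```

Putting the privacy signal first means a stored consent value can never override a browser-level opt-out.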

Step 2: Session identity (browser-side)

If tracking proceeds, the SDK establishes session context using only privacy-safe browser storage.

Session ID: `sessionStorage` only

const sessionId = `session_${Date.now()}_${Math.random().toString(36).substring(2, 15)}`;
sessionStorage.setItem('clerion_session', JSON.stringify({ sessionId, lastActivity: Date.now() }));

The session ID is stored in sessionStorage, a browser storage mechanism that is scoped to a single tab and cleared automatically when that tab closes. It is not accessible to other tabs or windows, cannot be read by pages on other origins (scripts running on the page itself share its origin and can read it, as with any same-origin storage), and disappears the moment the tab is closed. Sessions expire after 30 minutes of inactivity.
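The 30-minute inactivity rule could be implemented along the following lines. This is a sketch under stated assumptions rather than the SDK's code: getOrCreateSession is a hypothetical name, and the storage object is injected (it would be sessionStorage in a browser) so the logic can run anywhere.

```javascript
const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes of inactivity

// Hypothetical helper. `storage` is anything with getItem/setItem
// (window.sessionStorage in a browser); `now` is injectable for testing.
function getOrCreateSession(storage, now = Date.now()) {
  const raw = storage.getItem('clerion_session');
  if (raw) {
    const existing = JSON.parse(raw);
    if (now - existing.lastActivity < SESSION_TIMEOUT_MS) {
      existing.lastActivity = now; // still active: restart the inactivity timer
      storage.setItem('clerion_session', JSON.stringify(existing));
      return existing;
    }
  }
  // No stored session, or the stored one has timed out: start a fresh one.
  const sessionId = `session_${now}_${Math.random().toString(36).substring(2, 15)}`;
  const session = { sessionId, lastActivity: now };
  storage.setItem('clerion_session', JSON.stringify(session));
  return session;
}
```

In a browser this would be called as getOrCreateSession(window.sessionStorage) each time an event is recorded.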

Persistent visitor ID: `localStorage`, only with consent

_getOrCreateVisitorId() {
 if (this.consentStatus !== true) return null; // Hard gate: no consent, no ID
 // ...create and store visitor_${timestamp}_${random} in localStorage
}

The persistent visitor ID is consent-gated at the code level, not just by policy. If consentStatus !== true, the function returns null immediately. No ID is created or stored. Consent revocation calls localStorage.removeItem('clerion_user_id'): the ID is actually removed from the browser.
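A fuller sketch of the consent gate and revocation might look like this. The storage key clerion_user_id and the visitor_${timestamp}_${random} shape come from the description above; everything else (the function names, the injected storage object, the exact random suffix) is an assumption for illustration.

```javascript
// Hypothetical sketch. `storage` stands in for window.localStorage.
const VISITOR_KEY = 'clerion_user_id';

function getOrCreateVisitorId(storage, consentStatus, now = Date.now()) {
  if (consentStatus !== true) return null; // hard gate: no consent, no ID
  let id = storage.getItem(VISITOR_KEY);
  if (!id) {
    // Assumed format for the random part; only the overall shape is documented.
    id = `visitor_${now}_${Math.random().toString(36).substring(2, 15)}`;
    storage.setItem(VISITOR_KEY, id);
  }
  return id;
}

function revokeConsent(storage) {
  storage.removeItem(VISITOR_KEY); // the ID is deleted, not merely ignored
}
```

Because the gate is the first statement, there is no code path that writes to storage before consent has been checked.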

Device information: for aggregate classification only

The SDK reads device properties from the visitor's browser: screen dimensions, viewport size, device type (mobile/tablet/desktop), screen orientation, and pixel ratio. This data is used exclusively for aggregate reporting, for example to show what percentage of your visitors use mobile devices. It is never stored alongside any identifier that could link it to a specific individual.

Step 3: Events are batched and sent to our server

Tracking events (page views, scroll depth, outbound link clicks, form submissions, etc.) are queued in memory and sent as a batch to our server via the navigator.sendBeacon API. This is a non-blocking, fire-and-forget transmission that does not delay page unload.

The payload looks like this:

{
 "events": [
   {
     "eventType": "page_view",
     "sessionId": "session_1746987654321_k4x2mj9",
     "websiteId": "site_xyz456",
     "metadata": {
       "path": "/pricing",
       "referrer": "https://google.com",
       "device": {
         "screenWidth": 1440,
         "screenHeight": 900,
         "deviceType": "desktop",
         "orientation": "landscape-primary",
         "pixelRatio": 2
       }
     }
   }
 ]
}

Notice that the payload contains no IP address. The SDK does not extract, store, or forward it. The IP address reaches our infrastructure only as the source address of the network connection that carries the HTTP request (which is unavoidable); the SDK itself never touches it.
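The batching described in this step can be sketched as a small in-memory queue. This is a hedged illustration, not the SDK's implementation: the EventQueue class, the /track endpoint, and the batch size of 20 are all assumptions. The transport is injected so the same logic works with navigator.sendBeacon in a browser.

```javascript
// Hypothetical sketch: class name, endpoint, and batch size are assumptions.
class EventQueue {
  constructor(send, endpoint = '/track', maxBatch = 20) {
    this.send = send;         // (url, body) => boolean, e.g. navigator.sendBeacon
    this.endpoint = endpoint;
    this.maxBatch = maxBatch;
    this.events = [];
  }

  push(event) {
    this.events.push(event);
    if (this.events.length >= this.maxBatch) this.flush();
  }

  // Sends everything queued so far as one { events: [...] } payload.
  flush() {
    if (this.events.length === 0) return false;
    const batch = this.events;
    this.events = [];
    return this.send(this.endpoint, JSON.stringify({ events: batch }));
  }
}

// Browser usage (not executed here):
//   const queue = new EventQueue((url, body) => navigator.sendBeacon(url, body));
//   document.addEventListener('visibilitychange', () => {
//     if (document.visibilityState === 'hidden') queue.flush();
//   });
```

Flushing when the page becomes hidden is a common pattern with sendBeacon, since the browser queues the request without blocking navigation.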

Step 4: Our server receives the request

The tracking request reaches our Node.js backend in the Netherlands. Before the event is processed, it passes through our middleware stack:

Request ID assignment: every request is tagged with a unique ID for log correlation.
Body parsing and size limits: tracking payloads are capped at 100 KB.
Tracking key validation: the request must carry a valid API key that maps to a known website. Requests with missing or invalid keys are rejected immediately.
Rate limiting: per-IP request counts are maintained in memory to prevent abuse. These counts are not persisted to the database and are not associated with any stored analytics data.
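As an illustration of the in-memory rate limiting mentioned in the last item, here is a minimal fixed-window counter. It is a sketch only: the class name, the limit of 100 requests, and the 60-second window are placeholder assumptions, and the production limiter may use a different algorithm.

```javascript
// Hypothetical fixed-window limiter; the limit and window are placeholder values.
class InMemoryRateLimiter {
  constructor(limit = 100, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.hits = new Map(); // key -> { windowStart, count }; lives only in process memory
  }

  // Returns true if the request identified by `key` is allowed at time `now`.
  allow(key, now = Date.now()) {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 }); // new window
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }

  // Drops expired windows so the map does not grow without bound.
  prune(now = Date.now()) {
    for (const [key, entry] of this.hits) {
      if (now - entry.windowStart >= this.windowMs) this.hits.delete(key);
    }
  }
}
```

Because the counters live in a Map inside the process, they vanish on restart and never touch the analytics database.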

Trusted proxy chain

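We rely exclusively on req.ip as resolved by Express through the trusted proxy chain (app.set('trust proxy', 1)). Our application code never reads X-Forwarded-For, CF-Connecting-IP, or any other forwarding header directly, because a client can put arbitrary values in them. With a hop count of 1, Express accepts only the address reported by the single proxy directly in front of our server and ignores anything the client supplied further up the chain. The IP address we process is the address our infrastructure layer has verified.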
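To illustrate why a hop count protects against spoofed values, here is a simplified model of how a proxy-aware server can select the client address when a fixed number of proxies is trusted. It sketches the general technique and is not Express's source code; resolveClientIp and its parameters are hypothetical names.

```javascript
// Simplified illustration of hop-count trust; not Express's actual implementation.
function resolveClientIp(socketAddress, xForwardedFor, trustedHops) {
  const forwarded = (xForwardedFor || '')
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean);
  // Nearest address first: the socket peer, then forwarded entries right to left.
  const chain = [socketAddress, ...forwarded.reverse()];
  // The first `trustedHops` addresses are trusted proxies; the next one is the client.
  const index = Math.min(trustedHops, chain.length - 1);
  return chain[index];
}

// The client forged "1.2.3.4"; the trusted proxy appended the real peer address.
console.log(resolveClientIp('10.0.0.5', '1.2.3.4, 203.0.113.7', 1)); // '203.0.113.7'
```

Entries to the left of the one added by the trusted proxy are never consulted, so a forged value has no effect on the result.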

Step 5: The IP address is pseudonymised and then discarded

This is the most privacy-critical step in the data journey. Here is exactly what happens to the visitor's IP address:

IP hashing

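const crypto = require('node:crypto');

// Keyed one-way hash of the visitor IP; the key exists only in the server environment.
function hashIp(ip) {
 return crypto.createHmac('sha256', process.env.IP_HASH_SECRET)
   .update(ip)
   .digest('hex');
}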

The raw IP address is passed through HMAC-SHA256 with a server-side secret (IP_HASH_SECRET). The result is a fixed-length hex string (64 characters) produced by a one-way cryptographic transformation. The original IP cannot be recovered from the hash without knowing the secret. The secret is held exclusively on our EU infrastructure and is never transmitted, logged, or stored alongside the hashed data.
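The properties described above can be checked with a short, self-contained example. The helper hashIpWith and both secrets are placeholders invented for this demonstration; the IP is a documentation-range address.

```javascript
const crypto = require('node:crypto');

// Same construction as hashIp, with the secret passed in for demonstration.
function hashIpWith(secret, ip) {
  return crypto.createHmac('sha256', secret).update(ip).digest('hex');
}

const a = hashIpWith('example-secret-A', '203.0.113.7');
const b = hashIpWith('example-secret-A', '203.0.113.7');
const c = hashIpWith('example-secret-B', '203.0.113.7');

console.log(a.length); // 64 hex characters (256 bits)
console.log(a === b);  // true: the same secret and IP always give the same hash
console.log(a === c);  // false: a different secret gives an unrelated hash
```

The keyed construction is what separates this from a plain SHA-256 of the IP: without the secret, an attacker cannot precompute hashes for the whole IPv4 address space and look the value up.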

Geo lookup

Simultaneously, the raw IP is sent to IPLocate's EU-only API endpoint to resolve the visitor's approximate location:

GET https://eu-api.iplocate.io/api/lookup/{ip}?apikey=...

We use IPLocate's EU endpoint exclusively: requests never leave EU infrastructure. IPLocate operates under a signed Data Processing Agreement with EU Standard Contractual Clauses. The lookup has a 1,500 ms timeout with automatic abort. Results are cached in-memory for 24 hours to minimise external calls.

The lookup returns: country, region, city, timezone, latitude, longitude.
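The timeout and cache behaviour described above could be structured roughly as follows. This is a sketch under several assumptions: lookupGeo and lookupFn are hypothetical names, the HTTP call is injected rather than shown, the cache key is supplied by the caller because this page does not state what the production cache is keyed on, and returning null on failure is an assumed fallback.

```javascript
const GEO_TIMEOUT_MS = 1500;                  // abort the lookup after 1,500 ms
const GEO_CACHE_TTL_MS = 24 * 60 * 60 * 1000; // keep results for 24 hours
const geoCache = new Map();                   // cacheKey -> { value, expiresAt }

// Hypothetical sketch. `lookupFn(ip, signal)` performs the HTTP request
// (for example with fetch) and resolves to the geo object.
async function lookupGeo(ip, cacheKey, lookupFn, now = Date.now()) {
  const hit = geoCache.get(cacheKey);
  if (hit && hit.expiresAt > now) return hit.value; // fresh cache entry: no external call

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), GEO_TIMEOUT_MS);
  try {
    const value = await lookupFn(ip, controller.signal);
    geoCache.set(cacheKey, { value, expiresAt: now + GEO_CACHE_TTL_MS });
    return value;
  } catch (err) {
    return null; // assumed fallback: on timeout or error, continue without geo data
  } finally {
    clearTimeout(timer);
  }
}
```

Passing the AbortController's signal to the request lets the timer cancel a slow lookup instead of leaving it running in the background.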

Raw IP discarded

After the hash is computed and the geo lookup is complete, the raw IP address is gone. It is:

Never written to our database: the analytics_events table has no column for raw IP addresses.
Never written to a log file: we do not retain HTTP access logs that would contain visitor IPs.
Never transmitted to any third party other than IPLocate: IPLocate receives it for the lookup and does not retain it beyond the duration of the API call; no other party receives it at all.

The entire operation (hash computation, geo lookup, raw IP discard) happens in memory.

Step 6: The event is written to the database

Once the IP has been pseudonymised and the geo data resolved, the event is inserted into the analytics_events table in our Supabase PostgreSQL database in Frankfurt.

Here is an example of a row exactly as it is written to the database:

{
 "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
 "store_id": "store_abc123",
 "website_id": "site_xyz456",
 "session_id": "session_1746987654321_k4x2mj9",
 "user_id": null,
 "event_type": "page_view",
 "timestamp": "2026-05-11T14:00:00Z",
 "metadata": {
   "path": "/pricing",
   "referrer": "https://google.com",
   "device": {
     "screenWidth": 1440,
     "screenHeight": 900,
     "deviceType": "desktop"
   }
 },
 "geo_data": {
   "country": "GB",
   "region": "England",
   "city": "London",
   "timezone": "Europe/London"
 },
 "ip_hash": "a3f9c2e1d8b74f6c..."
}

A few things to note about this row:

user_id is null. This is the default. It is only populated if the visitor has given explicit consent and the SDK has created a persistent visitor ID.
ip_hash is the HMAC-SHA256 hash of the raw IP. The raw IP is not present anywhere in this row or in any other table.
geo_data contains city-level location derived from the IP. The IP itself is gone.
metadata contains the page path, referrer, and device classification, but no personal identifiers.
timestamp is the server-assigned UTC time. We do not store sub-second precision.
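One way to assemble such a row is sketched below. It is illustrative only: buildEventRow and its argument shape are assumptions, and the metadata is passed through unchanged for brevity. The function has no parameter for the raw IP, so the raw address cannot end up in the row.

```javascript
// Hypothetical helper: the function name and argument shape are assumptions.
function buildEventRow({ id, storeId, event, ipHash, geo, visitorId = null, now = new Date() }) {
  // Truncate to whole seconds so no sub-second precision is stored.
  const timestamp = new Date(Math.floor(now.getTime() / 1000) * 1000)
    .toISOString()
    .replace('.000Z', 'Z');
  return {
    id,
    store_id: storeId,
    website_id: event.websiteId,
    session_id: event.sessionId,
    user_id: visitorId, // null unless a consented visitor ID exists
    event_type: event.eventType,
    timestamp,
    metadata: event.metadata,
    // Keep only the fields shown in the example row; coordinates are dropped.
    geo_data: geo
      ? { country: geo.country, region: geo.region, city: geo.city, timezone: geo.timezone }
      : null,
    ip_hash: ipHash,
  };
}
```

Calling it with now set to 2026-05-11T14:00:00.789Z yields the timestamp 2026-05-11T14:00:00Z.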

Under GDPR, the ip_hash is pseudonymous personal data: it could theoretically be reversed by someone who had both the hash and the secret. The secret never leaves our EU infrastructure, so in practice the hash cannot be reversed. Under CCPA, the hash satisfies the de-identification requirements in §1798.140(m).

Step 7: The AI layer and what Claude sees

Clerion's AI features (Command Center summaries, Signal Feed, audience insights, error diagnosis) are powered by Anthropic's Claude model. Claude does not receive individual event rows. It does not see IP hashes, session IDs, visitor IDs, or any data that relates to an individual visitor.

What Claude receives is a pre-aggregated statistical context built from Supabase queries: counts, percentages, ranked lists, and time-series summaries. An example of the kind of data Claude analyses:

{
 "total_pageviews": 12450,
 "unique_sessions": 3821,
 "top_pages": [
   { "path": "/pricing", "views": 2340, "bounce_rate": 0.42 },
   { "path": "/", "views": 4120, "bounce_rate": 0.31 }
 ],
 "traffic_sources": {
   "organic_search": 0.48,
   "direct": 0.29,
   "referral": 0.23
 },
 "device_breakdown": {
   "desktop": 0.61,
   "mobile": 0.35,
   "tablet": 0.04
 }
}

The aggregation and statistical summaries happen on our server before anything is sent to Anthropic. This means:

Anthropic never receives a visitor IP address, hash, or any persistent identifier.
Anthropic never receives individual event records.
Anthropic's DPA explicitly prohibits using customer data for model training or product improvement.
Sending these aggregates to Anthropic does not constitute "selling" or "sharing" personal data under CCPA or GDPR, because no personal data is transmitted.
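The server-side aggregation step can be illustrated with a small function that turns event rows into counts and ratios. This is a sketch, not the production query layer: buildAiContext is a hypothetical name, it works on in-memory rows shaped like the Step 6 example rather than on Supabase queries, and it omits bounce_rate and traffic_sources because this page does not describe how they are derived.

```javascript
// Hypothetical aggregation over rows shaped like the example in Step 6.
function buildAiContext(rows) {
  const pageViews = rows.filter((r) => r.event_type === 'page_view');
  const sessions = new Set(rows.map((r) => r.session_id)); // used only for counting
  const viewsByPath = new Map();
  const deviceCounts = new Map();

  for (const r of pageViews) {
    const path = r.metadata.path;
    const device = r.metadata.device.deviceType;
    viewsByPath.set(path, (viewsByPath.get(path) || 0) + 1);
    deviceCounts.set(device, (deviceCounts.get(device) || 0) + 1);
  }

  const top_pages = [...viewsByPath]
    .map(([path, views]) => ({ path, views }))
    .sort((x, y) => y.views - x.views);

  const device_breakdown = {};
  for (const [device, count] of deviceCounts) {
    device_breakdown[device] = count / pageViews.length;
  }

  // Only counts and ratios are returned: no session IDs, hashes, or event rows.
  return {
    total_pageviews: pageViews.length,
    unique_sessions: sessions.size,
    top_pages,
    device_breakdown,
  };
}
```

The returned object is the only thing that would be serialised into a prompt, which is how the row-level fields stay on the server.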

Step 8: Data retention and deletion

We operate an automated daily retention purge job that deletes analytics data outside each plan's retention window. The job runs 5 minutes after server startup and every 24 hours thereafter.

Plan	Retention window
Free	30 days
Solo	90 days
Starter	90 days
Growth	13 months
Business	13 months
Agency	13 months

The code that implements this purge cites GDPR Article 5(1)(e) (storage limitation): personal data must not be kept longer than necessary for the purpose for which it was collected.
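The retention windows and schedule above could be expressed in code along these lines. It is a sketch with assumed names (RETENTION, retentionCutoff, scheduleDailyPurge); in particular, treating "13 months" as calendar months in UTC is an assumption, and the actual delete query is left to the injected runPurge function.

```javascript
// Retention windows from the table above.
const RETENTION = {
  Free: { days: 30 },
  Solo: { days: 90 },
  Starter: { days: 90 },
  Growth: { months: 13 },
  Business: { months: 13 },
  Agency: { months: 13 },
};

// Returns the cutoff date: rows with a timestamp before it are eligible for deletion.
function retentionCutoff(plan, now = new Date()) {
  const rule = RETENTION[plan];
  if (!rule) throw new Error(`Unknown plan: ${plan}`);
  const cutoff = new Date(now.getTime());
  if (rule.days) cutoff.setUTCDate(cutoff.getUTCDate() - rule.days);
  // Assumption: months are calendar months, counted in UTC.
  if (rule.months) cutoff.setUTCMonth(cutoff.getUTCMonth() - rule.months);
  return cutoff;
}

// First run 5 minutes after startup, then every 24 hours.
function scheduleDailyPurge(runPurge) {
  setTimeout(() => {
    runPurge();
    setInterval(runPurge, 24 * 60 * 60 * 1000);
  }, 5 * 60 * 1000);
}
```

For example, on 11 May 2026 the Free plan cutoff is 11 April 2026 and the Growth plan cutoff is 11 April 2025.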

Operator-initiated deletion: Website operators can delete all analytics data for their site at any time from the dashboard. Account closure triggers deletion of all associated data. We do not retain data after a relationship ends.

Consumer requests: California residents (and others with applicable rights) who believe Clerion holds personal data about them can contact us at hello@getclerion.com. Given the architecture described above (no raw IPs, HMAC-hashed pseudonyms, no cross-site linking), identifying which specific row in our database corresponds to a specific individual is not feasible in practice. But we will engage with any such request in good faith.

Summary: what we store and what we don't

Data	Stored?	Notes
Raw IP address	No	Discarded after hashing and geo lookup
IP hash (HMAC-SHA256)	Yes	Irreversible without server-side secret
Country, region, city	Yes	Derived from IP; used for aggregate geo reporting
Session ID	Yes	Timestamp plus random string; cleared from browser on tab close
Persistent visitor ID	Only with consent	Never created for GPC/DNT visitors
Page path	Yes	Not personal information
Referrer hostname	Yes	Not personal information
Device classification	Yes	Aggregate; not linked to identified individual
Raw User-Agent string	No	Device type derived and discarded
Cookies	None	Clerion sets no analytics cookies
Name, email, phone, address	Never	Not collected at any point

Compliance links

The architecture described on this page underpins our compliance with every major privacy law applicable to web analytics. You can read our detailed analysis for each on the corresponding compliance page.

If you have questions about how we process data that are not answered here, contact us at hello@getclerion.com.