Optional enrichment
ClickStream's core product is Signals: traffic lanes, bot and answer-engine classification, behavior scores, and site-side decisions. Optional enrichment is a paid add-on for sites that are allowed to attach visitor details to a ClickStream ID after the visitor produces a matchable signal.
Privacy model up front. The normal Signals install does not require enrichment. When a site enables enrichment, raw PII is encrypted before storage and reveal is password-re-authenticated, rate-limited, and audit-logged. Email and phone matching signals are normalized and hashed before they are used for enrichment joins.
First-party identity signals
| Signal | Source | Hashing | Primary use |
|---|---|---|---|
| hem | tracker.identify(email) | SHA-256 of lowercased, trimmed email | Internal primary key |
| hemMd5 | same | MD5 of lowercased, trimmed email | Partner-compatible enrichment lookup format |
| hashedPhone | tracker.identify({ phone }) | SHA-256 of E.164 normalized phone | Secondary primary key |
| customerId | tracker.identify({ customerId }) | Raw (not PII; your internal app id) | Application/backend join |
| accountId | tracker.identify({ accountId }) | Raw account identifier | Account-level analysis |
| gclid / fbclid / msclkid / ttclid / twclid / sccid / epik / irclickid / dclid / li_fat_id | URL query params on landing | Raw | Inbound click parameter capture |
| googleId / facebookId / linkedinId / appleId | Social login callback | Raw (OAuth subject) | Cross-device bridge via social login |
| maid + maidType | Mobile SDK bridge | Raw UUID (IDFA / GAID) | Mobile-web unification |
| clickstreamId / referringClickstreamId | Cross-site journey tracking | Raw | Network-tier journey stitching |
The first-party _cs_uid cookie is the anchor. Every identifier above hangs off that cookie until tracker.identify() or a captured form signal connects it to a known profile.
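The click parameters in the table arrive as ordinary query-string values on the landing URL. The following is a minimal sketch of how such capture could work, not the SDK's actual implementation. The helper name `captureClickIds` is hypothetical, and it relies only on the standard `URL` API.

```typescript
// Hypothetical helper, not part of the ClickStream SDK: collects the click
// parameters listed in the signal table from a landing-page URL.
const CLICK_ID_PARAMS = [
  "gclid", "fbclid", "msclkid", "ttclid", "twclid",
  "sccid", "epik", "irclickid", "dclid", "li_fat_id",
] as const;

type ClickIdParam = (typeof CLICK_ID_PARAMS)[number];

function captureClickIds(landingUrl: string): Partial<Record<ClickIdParam, string>> {
  const params = new URL(landingUrl).searchParams;
  const captured: Partial<Record<ClickIdParam, string>> = {};
  for (const name of CLICK_ID_PARAMS) {
    const value = params.get(name);
    // Keep the raw value; skip parameters that are absent or empty.
    if (value) captured[name] = value;
  }
  return captured;
}

// Unrelated parameters such as utm_source are ignored.
const ids = captureClickIds("https://shop.example.com/?gclid=abc123&utm_source=ads&fbclid=xyz");
console.log(ids.gclid, ids.fbclid);
```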
The API you actually write
Identify
await tracker.identify('user@example.com');
Accepts a raw email. The SDK:
- Normalizes (lowercase, trim).
- Hashes (SHA-256 for internal use, MD5 for optional enrichment compatibility).
- Sends the identify event to your first-party tracking domain.
- Encrypts raw email server-side when raw identity capture is enabled for the site, so reveal and enrichment eligibility can be audited.
- Persists the _cs_hem / _cs_hem_md5 local-storage slots so subsequent sessions on the same browser re-emit the hashes without asking.
Raw email is never sent directly to an enrichment provider, never appears in public Signals responses, and is never shown in Einstein without the audited reveal flow.
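The normalize-then-hash steps can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: the function name `hashEmail` is hypothetical, hex encoding of the digests is assumed, and the sketch uses Node's `crypto` module so it runs standalone (a browser SDK would need a different hashing primitive).

```typescript
import { createHash } from "node:crypto";

// Illustrative only: mirrors the documented normalization (lowercase, trim)
// and the two documented digests. Hex output is an assumption.
function hashEmail(rawEmail: string): { hem: string; hemMd5: string } {
  const normalized = rawEmail.trim().toLowerCase();
  return {
    hem: createHash("sha256").update(normalized, "utf8").digest("hex"),
    hemMd5: createHash("md5").update(normalized, "utf8").digest("hex"),
  };
}

// Differently cased or padded input yields the same hashes after normalization.
const padded = hashEmail("  User@Example.com ");
const clean = hashEmail("user@example.com");
console.log(padded.hem === clean.hem, padded.hemMd5 === clean.hemMd5); // true true
```

Normalizing before hashing is what makes the hashes usable as join keys: the same address typed with different casing or stray whitespace still maps to one value.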
Identify with additional signals
await tracker.identify({
  email: 'user@example.com',
  phone: '+14155551234',
  customerId: 'cust_abc123',
  accountId: 'acc_xyz789',
});
Phone is normalized to E.164 (+CCCC...) before hashing. If the input isn't parseable as E.164, the SDK emits a warning and skips the phone hash without blocking the rest of the identify call.
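A rough sketch of the warn-and-skip behavior, assuming a hypothetical `hashPhone` helper. Stripping spaces, dots, hyphens, and parentheses before validation is an assumption about what "normalized" means here. The shape check follows the E.164 format: a plus sign followed by at most 15 digits, the first of which is non-zero.

```typescript
import { createHash } from "node:crypto";

// E.164 shape: "+" then 2 to 15 digits, first digit 1-9.
const E164 = /^\+[1-9]\d{1,14}$/;

// Hypothetical helper: returns the SHA-256 hex hash, or null when the input
// cannot be read as E.164. Separator stripping is an assumption.
function hashPhone(rawPhone: string): string | null {
  const candidate = rawPhone.replace(/[\s().-]/g, "");
  if (!E164.test(candidate)) {
    console.warn("phone is not parseable as E.164; skipping phone hash");
    return null; // the caller carries on with the remaining identify fields
  }
  return createHash("sha256").update(candidate, "utf8").digest("hex");
}
```

Under these assumptions, `hashPhone('+1 (415) 555-1234')` and `hashPhone('+14155551234')` return the same hash, while `hashPhone('415-555-1234')` returns `null` because the country-code prefix is missing.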
Revoke identity
// Revoke the marketing category (identity storage):
await tracker.setConsent({ marketing: false });
// Or revoke every category except necessary:
await tracker.rejectAllConsent();
Revoking the marketing consent category clears every identity slot from local storage, cookies, and in-memory state, and emits a _consent_transition event for the audit trail. Server-side person records are removed separately through the dashboard's visitor delete action — see Privacy under "Visitor rights — DSAR / export / deletion".
What ClickStream stitches
Given these signals arriving from different sessions, ClickStream links records using deterministic matchers such as hem, hashedPhone, and customerId. Every merge is auditable, and enrichment remains controlled by the site's plan, toggle, consent posture, and compliance profile.
Merge rules, in priority order:
- Exact hem match: same SHA-256 email hash across devices → same person record.
- Exact hashedPhone match: same SHA-256 phone hash → same person.
- Exact customerId match: same application id → same person.
- Social login bridge: same googleId / facebookId / linkedinId / appleId → same person.
- Fingerprint + IP household: same fingerprint_hash + same /24 IP subnet within a 48-hour window → same person, probabilistic confidence label.
Conflicts (same _cs_uid claiming two different hem values) generate a collision audit row and are NOT auto-merged. The dashboard surfaces collisions on the Identity Quality tab; operators resolve them manually.
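The priority order and the collision rule can be sketched as a small resolver. Everything here except the matcher order and the no-auto-merge rule is hypothetical: the type names, the `resolve` function, and the single `socialId` field standing in for the four social identifiers. The probabilistic fingerprint rule is left out for brevity.

```typescript
// Hypothetical types; only the matcher order and collision rule come from the text.
interface Signals {
  csUid: string;
  hem?: string;
  hashedPhone?: string;
  customerId?: string;
  socialId?: string; // stands in for googleId / facebookId / linkedinId / appleId
}

interface Person extends Signals {
  personId: string;
}

type MatchResult =
  | { kind: "merge"; personId: string; matchedOn: string }
  | { kind: "collision"; personId: string }
  | { kind: "new" };

const DETERMINISTIC_KEYS = ["hem", "hashedPhone", "customerId", "socialId"] as const;

function resolve(incoming: Signals, people: Person[]): MatchResult {
  // Collision: the same _cs_uid already claims a different hem. Never auto-merge.
  const sameCookie = people.find((p) => p.csUid === incoming.csUid);
  if (sameCookie?.hem && incoming.hem && sameCookie.hem !== incoming.hem) {
    return { kind: "collision", personId: sameCookie.personId };
  }
  // Walk the deterministic matchers in priority order; the first hit wins.
  for (const key of DETERMINISTIC_KEYS) {
    const value = incoming[key];
    if (!value) continue;
    const hit = people.find((p) => p[key] === value);
    if (hit) return { kind: "merge", personId: hit.personId, matchedOn: key };
  }
  return { kind: "new" };
}
```

Checking for collisions before any matcher runs keeps a conflicting identify call from being merged through a lower-priority key such as customerId.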
Using the resolved identity
Via the Signals API
import { getVisitorOrNull } from '@clickstreamhq/signals';
const visitor = await getVisitorOrNull();
if (visitor?.identity.hasIdentifiedThisSession) {
  showLoggedInView();
}
The VisitorContext.identity block carries:
- status: 'anonymous' or 'signal_identified'.
- clickstreamId: the first-party _cs_uid value.
- isReturning: whether this person has visited before.
- hasIdentifiedThisSession: whether tracker.identify() fired this session.
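For page-side decisions, the identity fields described above can be modeled as a small type. The interface name `IdentityBlock`, the `View` labels, and the `chooseView` function are hypothetical; only the four field names and the two status values come from this page.

```typescript
// Field names and status values are from the documentation; the rest is illustrative.
interface IdentityBlock {
  status: "anonymous" | "signal_identified";
  clickstreamId: string;
  isReturning: boolean;
  hasIdentifiedThisSession: boolean;
}

type View = "logged-in" | "welcome-back" | "first-visit";

// Picks a view from the identity block; undefined covers the case where
// no visitor context is available.
function chooseView(identity: IdentityBlock | undefined): View {
  if (identity?.hasIdentifiedThisSession) return "logged-in";
  if (identity?.isReturning) return "welcome-back";
  return "first-visit";
}
```

Accepting `undefined` mirrors the `getVisitorOrNull()` pattern: the page still renders a sensible default when no visitor context comes back.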
Full cross-session profile detail and enrichment reveal are paid dashboard features. Hobby still receives the first-party Signals snapshot needed for page-side decisions.
Via the dashboard
Visitors → Your Site shows every visitor with their sessions, Signals labels, and journey map. Paid enrichment surfaces only when your site has enabled the add-on and ClickStream has a permitted match.
Identity enrichment
ClickStream can enrich identified visitors through DataShopper (DataMoon) when a site has enrichment enabled and the visitor produces a matchable signal. Email hashes are stored in Analytics Engine and the enriched profile store, joined via per-tenant HMAC, and used only under the site's consent and compliance profile.
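To illustrate why a per-tenant HMAC keeps joins tenant-scoped, here is a sketch under loud assumptions: the page only says "per-tenant HMAC", so the choice of HMAC-SHA256, the hex encoding, the use of the hem as the message, and the name `tenantJoinKey` are all guesses made for illustration.

```typescript
import { createHmac } from "node:crypto";

// Assumption: join key = HMAC-SHA256(tenant secret, hem), hex-encoded.
// The real algorithm, message, and encoding are not specified in this page.
function tenantJoinKey(tenantSecret: string, hem: string): string {
  return createHmac("sha256", tenantSecret).update(hem, "utf8").digest("hex");
}

// The same email hash produces unrelated join keys under different tenant secrets.
const keyA = tenantJoinKey("tenant-a-secret", "example-hem-value");
const keyB = tenantJoinKey("tenant-b-secret", "example-hem-value");
console.log(keyA === keyB); // false
```

Because the key depends on the tenant secret, a join key from one site cannot be matched against another site's records without that site's secret.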
The hem (SHA-256) hash is the durable first-party identity key. The hemMd5 hash is retained for the approved enrichment workflow. Raw email and phone values are encrypted before storage, are never shown by default, and require an audited reveal action.
Enrichment remains scoped to the site and plan controls configured in Einstein. If enrichment is disabled for a site, ClickStream still records first-party identity signals, but it does not request or attach enrichment data.
Retention
- Enriched person records: retained for 90 days after last signal update, then auto-scrubbed (email, phone hashes nulled; session data preserved anonymized).
- Hashed identifiers in Analytics Engine: retained per your tenant's event retention (default: indefinite; configurable per plan).
- Encrypted raw values (form fills, IP addresses): retained per your tenant's retentionDays setting. Auto-purge is opt-in.
- Reveal audit log: every /decrypt call is logged with the operator identity + IP + timestamp. Audit rows are retained 7 years by default (SOC2-style compliance window).
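The 90-day rule for enriched person records amounts to a simple age check. The sketch below assumes a hypothetical `isDueForScrub` helper and treats the boundary as inclusive (due at exactly 90 days); the page does not say which side of the boundary the scrub job uses.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const ENRICHED_RETENTION_DAYS = 90; // from the retention rules above

// Hypothetical helper: true once 90 full days have elapsed since the last
// signal update. The inclusive boundary is an assumption.
function isDueForScrub(lastSignalUpdate: Date, now: Date): boolean {
  const ageMs = now.getTime() - lastSignalUpdate.getTime();
  return ageMs >= ENRICHED_RETENTION_DAYS * DAY_MS;
}
```

Each new signal resets the clock, since the window is measured from the last signal update rather than from record creation.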
See also
- Privacy & compliance: consent model + GDPR / CCPA / HIPAA mode
- Security: encryption at rest, per-tenant key isolation, /decrypt gate
- Event schema: identify (full event payload)