ADR: Subdomain Tenant Split via Explicit Domain Registration
Status
Accepted — already applied for landing.cactusinbound.com.
Date
2026-03-26
Context
Matera needs two separate website agents with two separate knowledge bases for two subdomains: matera.eu (main site) and info.matera.eu (information portal). Each subdomain should have its own chatbot persona, its own RAG content, and its own conversation history.
The existing ADR (2026-03-12, "Website Boundaries and Host Routing") proposes a general-purpose model with Client → Website → Host Rule concepts, cookie scope contracts, analytics join rules, and a transitional schema. That model is designed for arbitrary combinations of shared and isolated subdomains across any customer. It is more than what this use case requires.
This ADR proposes a minimal path that relies on infrastructure already in place.
What already exists
The current architecture supports per-subdomain tenant isolation without code changes:
Frontend domain resolution (domainMatching.ts):
- selectDomainMatch() tries an exact host match first, then falls back to the normalized root domain.
- The matched domain row's domain value is sent as siteName in every API request (ConfigProvider.tsx).
- If info.matera.eu exists in the public.domains table, the frontend sends siteName: "info.matera.eu". If it does not exist, the frontend falls back to matera.eu.
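The exact-match-first strategy can be sketched as follows. This is an illustrative Python rendering, not the TypeScript in domainMatching.ts: the function names, the set-based registry, and the "last two labels" normalization are assumptions made for the example.

```python
from typing import Optional


def normalize_domain(host: str) -> str:
    """Assumed normalization: keep the last two labels of the hostname."""
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def select_domain_match(host: str, registered: set[str]) -> Optional[str]:
    """Return the registered domain to use as siteName, or None if no row matches."""
    host = host.lower()
    # 1. An exact host match wins, so a registered subdomain gets its own entry.
    if host in registered:
        return host
    # 2. Otherwise fall back to the normalized root domain, if it is registered.
    root = normalize_domain(host)
    return root if root in registered else None
```

With both matera.eu and info.matera.eu registered, info.matera.eu resolves to itself, while an unregistered host such as blog.matera.eu still resolves to matera.eu.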
Backend tenant derivation (rag_instance_manager.py):
- get_tenant_context(site_name) derives tenant IDs by sanitizing the siteName string. No database lookup is involved.
- matera.eu → matera_eu, info.matera.eu → info_matera_eu. These are already different tenant IDs.
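A minimal sketch of such a derivation is shown below. The exact sanitization rule used by get_tenant_context() is not given in this ADR; the rule here (lowercase, replace every run of non-alphanumeric characters with an underscore) is an assumption chosen because it reproduces the two mappings above. The function name derive_tenant_id is hypothetical.

```python
import re


def derive_tenant_id(site_name: str) -> str:
    """Derive a tenant ID from siteName without any database lookup.

    Assumed rule: lowercase, then collapse each run of characters outside
    [a-z0-9] into a single underscore.
    """
    return re.sub(r"[^a-z0-9]+", "_", site_name.lower()).strip("_")
```

Because the tenant ID is a pure function of siteName, two different siteName values yield two different tenant scopes as soon as the frontend starts sending them.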
Storage isolation (MongoDB, Neo4j):
- All documents and graph nodes carry a tenantId property. Storage classes filter by the tenant ID set in a per-request contextvar.
- Different siteName values automatically produce different tenant scopes. No separate databases or instances need to be provisioned.
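The contextvar-based filtering pattern can be illustrated with an in-memory store. This is a sketch of the pattern only: InMemoryDocStore, current_tenant, and run_as are hypothetical names, and the real storage classes talk to MongoDB and Neo4j rather than a Python list.

```python
from contextvars import ContextVar

# Set once per request; reading it without a value raises LookupError,
# so a storage call outside a tenant scope fails instead of leaking data.
current_tenant: ContextVar[str] = ContextVar("current_tenant")


class InMemoryDocStore:
    def __init__(self) -> None:
        self._docs: list[dict] = []

    def insert(self, text: str) -> None:
        # Every document is stamped with the tenant of the current request.
        self._docs.append({"tenantId": current_tenant.get(), "text": text})

    def find_all(self) -> list[str]:
        # Every read is filtered by the tenant of the current request.
        tenant = current_tenant.get()
        return [d["text"] for d in self._docs if d["tenantId"] == tenant]


def run_as(tenant_id: str, fn, *args):
    """Run fn with the tenant contextvar set, restoring it afterwards."""
    token = current_tenant.set(tenant_id)
    try:
        return fn(*args)
    finally:
        current_tenant.reset(token)
```

Both tenants share one store, yet a read under info_matera_eu never sees documents written under matera_eu.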
Origin validation (security.py):
- validate_origin_tenant() checks that the Origin header matches the siteName. When siteName is info.matera.eu and the request originates from info.matera.eu, validation passes.
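The case described above can be sketched as a strict host comparison. The function name origin_matches_site is hypothetical, and the sketch covers only the exact-match case stated here; whether validate_origin_tenant() also accepts subdomain origins for a root siteName is not specified in this ADR.

```python
from urllib.parse import urlsplit


def origin_matches_site(origin: str, site_name: str) -> bool:
    """Return True when the Origin header's host equals siteName."""
    # urlsplit lowercases the hostname and returns None for values
    # such as "null" that carry no host.
    host = urlsplit(origin).hostname
    return host is not None and host == site_name.lower()
```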
Config resolution (config_factory.py):
- resolve_config_for_domain() does an exact lookup on the public.domains table. A domain row for info.matera.eu gets its own config overrides via config.client_configs.
Analytics tracking (posthog-provider.ts, cross-domain-tracker.ts, session_events table):
- The widget sends rw_domain (exact hostname) and rw_root_domain (normalized) as global properties on every PostHog event.
- The session_events table stores events with a site_domain column set to the exact matched domain.
- Backoffice analytics RPCs filter conversations and events by site_domain = p_domain. Conversation-based metrics are already domain-scoped.
- However, rose_client_id and rose_last_active_session cookies are set at root-domain scope (domain=.matera.eu), so visitor identity and cross-subdomain form attribution are shared across all subdomains of the same root.
Decision
1. Register each subdomain as an explicit domain entry
For each subdomain that needs its own knowledge base, add a row to public.domains using the existing create_domain() SQL function:
SELECT create_domain('info.matera.eu', 'matera-info', 'Matera Info', '#brand-color');
SELECT create_domain('matera.eu', 'matera', 'Matera', '#brand-color');
Each domain row gets its own client_id, its own config overrides, and its own tenant scope in MongoDB/Neo4j.
2. No code changes required for tenant isolation
The existing frontend resolution, backend tenant derivation, storage isolation, origin validation, and config resolution all work as-is. The only action is operational: insert domain rows and ingest content.
The backoffice analytics pages should add informational copy to clarify what is domain-scoped vs root-domain-shared (see "Analytics behavior" in Consequences). This is a UX improvement, not a prerequisite for the subdomain split itself.
3. Subdomains without explicit entries keep current behavior
Any subdomain of matera.eu that does not have its own row in public.domains continues to fall back to the matera.eu root domain entry. This is the existing normalized match strategy in domainMatching.ts. No customer behavior changes.
4. Per-subdomain configuration
Each domain entry can have independent config overrides in config.client_configs:
- Identity (company name, website URL)
- Appearance (brand color, logo)
- Chat behavior (suggested questions, greeting, model)
- Any other config slug
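As a rough illustration of per-domain overrides, the sketch below layers a domain's overrides on top of shared defaults. The slug names, values, and the resolve_config helper are all hypothetical; the real overrides live in config.client_configs and are resolved by resolve_config_for_domain().

```python
# Hypothetical defaults and slugs, for illustration only.
DEFAULTS = {"greeting": "Hello!", "brand_color": "#000000"}

# Hypothetical per-domain overrides keyed by the exact domain value.
OVERRIDES = {
    "matera.eu": {"greeting": "Welcome to Matera"},
    "info.matera.eu": {
        "greeting": "Welcome to the Matera information portal",
        "brand_color": "#336699",
    },
}


def resolve_config(domain: str) -> dict:
    """Exact lookup: overrides for this domain replace defaults slug by slug."""
    return {**DEFAULTS, **OVERRIDES.get(domain, {})}
```

Each domain only needs to store the slugs it changes; everything else is inherited from the defaults.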
5. Knowledge base content is ingested per tenant
RAG content ingestion uses siteName as the tenant key. Content ingested for info.matera.eu is stored under tenant info_matera_eu and is only retrievable by requests with that siteName.
Consequences
Positive
- Zero code changes. Entire rollout is operational (domain registration + content ingestion).
- Preserves existing behavior for all other customers and for unregistered subdomains.
- Each subdomain gets full tenant isolation: separate knowledge base, separate conversations, separate config.
- Domain rows added now map 1:1 to website entities if the full Website Boundaries model is adopted later.
Analytics behavior
Not all analytics dimensions are scoped the same way. Operators should understand what is domain-specific and what is shared.
Domain-scoped (fully isolated per subdomain):
- Conversations: The site_domain column on the conversations table carries the exact matched domain. Backoffice analytics RPCs (get_conversations_over_time, get_conversation_funnel, get_dynamic_question_stats, get_conversation_sources, get_page_conversation_stats) all filter by WHERE site_domain = p_domain. Selecting info.matera.eu in the backoffice shows only conversations that originated on that subdomain.
- Session events: The session_events table stores each event with the exact site_domain from the widget's rw_domain PostHog property. Events on info.matera.eu are tagged site_domain = 'info.matera.eu'.
- PostHog event properties: Every event carries both rw_domain (exact hostname, e.g. info.matera.eu) and rw_root_domain (normalized, e.g. matera.eu), so PostHog-side filtering by subdomain is possible.
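The effect of the exact-equality filter can be reproduced with a small in-memory table. This sketch uses SQLite and a simplified, assumed schema; the real RPCs run against the production database and return richer aggregates. The count_conversations helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the conversations table (assumed schema).
conn.execute(
    "CREATE TABLE conversations (id INTEGER PRIMARY KEY, site_domain TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO conversations (site_domain) VALUES (?)",
    [("matera.eu",), ("info.matera.eu",), ("info.matera.eu",)],
)


def count_conversations(p_domain: str) -> int:
    """Count conversations for one domain using the same equality filter."""
    row = conn.execute(
        "SELECT COUNT(*) FROM conversations WHERE site_domain = ?", (p_domain,)
    ).fetchone()
    return row[0]
```

Because the filter is an equality test rather than a suffix match, selecting matera.eu does not include conversations from info.matera.eu.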
Root-domain-scoped (shared across subdomains):
- Visitor identity (rose_client_id): The rose_client_id cookie is set at domain=.matera.eu (root scope). A visitor who browses both matera.eu and info.matera.eu has the same rose_client_id on both. Since rose_client_id equals PostHog's distinct_id, PostHog merges activity from both subdomains into one person profile.
- Cross-subdomain form attribution: The rose_last_active_session cookie is also set at root scope. If a visitor chats on info.matera.eu then submits a form on matera.eu, the form submission is attributed to the info.matera.eu conversation session via cookie fallback. This is the intended behavior for single-domain setups but becomes surprising when subdomains are split.
- PostHog person profiles: Because distinct_id is shared, PostHog dashboards that group by person (e.g. unique visitors, return visitors) will count a cross-subdomain visitor as one person, not two.
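The sharing follows from how browsers match a cookie's Domain attribute against the request host. The sketch below implements the hostname case of the domain-match rule from RFC 6265 (IP addresses are not handled); the function name is chosen for illustration.

```python
def cookie_domain_matches(host: str, cookie_domain: str) -> bool:
    """Return True if a cookie with this Domain attribute is sent to host."""
    host = host.lower()
    # A leading dot in the Domain attribute is ignored.
    domain = cookie_domain.lower().lstrip(".")
    # Match the domain itself or any host that ends with "." + domain.
    return host == domain or host.endswith("." + domain)
```

A cookie set with domain=.matera.eu therefore matches both matera.eu and info.matera.eu, which is why rose_client_id is identical on the two sites.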
Backoffice UX recommendation:
Operators should not assume more isolation than actually exists. Recommended copy:
- Conversations page: Small note — Scoped to selected domain.
- Analytics page: Banner — Conversation metrics are specific to the selected domain. Traffic and visitor identity may include activity from sibling subdomains of the same root domain.
- Visitor/funnel cards: Tooltip — Visitor identity is shared at the root-domain level. A visitor on multiple subdomains counts as one visitor.
Negative
- No cookie isolation between subdomains. Rose cookies (rose_client_id, rose_last_active_session, rose_last_active_session_date) are set at .matera.eu (root domain scope), so visitor identity and session attribution are shared across all subdomains. This is acceptable when the same client owns all subdomains and does not need cross-subdomain privacy boundaries.
- No website_id grouping key. If cross-subdomain analytics aggregation is needed later (e.g., "show me all Matera conversations"), there is no built-in way to group these domain entries. The client_id foreign key provides a partial grouping, but only if both domains share the same client.
- The normalizeDomain() bug (multi-level TLDs like .co.uk normalize incorrectly) still affects fallback resolution. This does not affect Matera (.eu is a single-level TLD) but should be fixed independently.
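One common way this class of bug arises is a normalization that keeps only the last two labels of the hostname. The sketch below shows that failure mode; it is an assumption about the shape of the bug, not a copy of normalizeDomain(), whose implementation is not shown in this ADR.

```python
def naive_normalize(host: str) -> str:
    """Keep the last two labels; correct only for single-level TLDs."""
    return ".".join(host.lower().split(".")[-2:])


# Single-level TLD: the registrable domain is recovered correctly.
eu_result = naive_normalize("info.matera.eu")      # "matera.eu"

# Multi-level TLD: the result is the public suffix, not a registrable domain.
uk_result = naive_normalize("shop.example.co.uk")  # "co.uk"
```

Fixes for this kind of problem typically consult the Public Suffix List instead of counting labels.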
Limitations: when this approach is not enough
- Cross-subdomain cookie isolation: If two subdomains under the same root must not share tracking cookies or visitor identity, the cookie scope contract from the Website Boundaries ADR is needed.
- Shared knowledge base across multiple hosts: If two hostnames need to share one knowledge base while a third is isolated, the many-to-one Host → Website relationship from the Website Boundaries ADR is needed.
- Backoffice website management UX: This approach requires manual SQL to add domains. If operators need a self-service UI for managing subdomain boundaries, the full Website + Host Rule model provides the right abstraction.
- Per-subdomain unique visitor counts: Because rose_client_id is shared at root-domain scope, there is no way to count unique visitors per subdomain independently. A visitor on both subdomains is one visitor. Achieving per-subdomain visitor isolation requires the cookie scope contract from the Website Boundaries ADR.
Relationship to the Website Boundaries ADR
This ADR is a pragmatic subset. It covers the immediate Matera use case without introducing new architectural concepts.
The Website Boundaries ADR (2026-03-12) remains the architecture target for:
- general multi-website support across any customer
- cookie and analytics isolation contracts
- backoffice domain management UX
- the website_id persistence key
The two ADRs are compatible. Domain rows created under this minimal approach will map directly to website entities when the full model is implemented.
Alternatives Considered
1. Implement the full Website Boundaries ADR first
Why not chosen:
- Requires schema changes, cookie scope logic, analytics join rules, and backend origin validation changes.
- The Matera use case does not need cookie isolation or analytics join contracts.
- Delays the rollout for architectural work that serves future use cases, not the current one.
2. Use a configuration flag instead of separate domain entries
Why not chosen:
- The existing domain resolution already supports exact-match-first. Adding a flag would duplicate logic that already works.
- A flag does not provide tenant isolation: the knowledge base separation comes from different siteName values, which come from different domain rows.