How to Set Up Audit Logging on a Modern Enterprise CMS

An auditor asks a single question: "Who changed this published page, and when?" On a legacy digital experience platform (DXP), the honest answer is often a shrug, a database diff, or a week of forensic log-stitching across an application server, a workflow engine, and a separate authoring instance. By the time someone reconstructs the timeline, the audit window has closed and the finding stands. For a regulated enterprise, that gap is not a nuisance; it is the difference between passing a SOC 2 review and explaining a control deficiency to a board.

Audit logging is the most boring control in your governance program right up until the moment you need it, and then it is the only thing that matters. The problem is that most content platforms treat the log as an afterthought: bolted on, partial, and hard to query. Sanity, the Content Operating System for the enterprise, treats content as queryable structured data and exposes governance primitives (Audit logs, Roles & Permissions, and SSO) as first-class surfaces rather than add-ons.

This guide walks through how to set up audit logging on a modern enterprise CMS: what events to capture, how to scope access, how to retain and export records, and how to make the log answer real auditor questions instead of generating noise.

Start with the question your auditor will actually ask

Audit logging fails most often because teams instrument everything and answer nothing. They turn on verbose logging, accumulate gigabytes of events, and then discover that when an auditor or an incident responder asks a specific question, the log cannot answer it. The fix is to design backward from the questions you are obligated to answer, then capture exactly the events that let you answer them.

For a content platform, the recurring questions cluster into a few categories. Who created, edited, published, or unpublished a given document, and when? Who changed a user's role or permissions? Who exported or deleted a dataset? Who logged in, from where, and did they use SSO? Each of these maps to a concrete event type you need to capture with an actor, a timestamp, a target, and enough context to reconstruct intent.
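One way to make this backward design concrete is a small coverage check: list the questions, list the event types each one depends on, and see which questions the current instrumentation cannot answer. The sketch below is illustrative only; the event-type names such as `document.publish` and `dataset.export` are hypothetical labels, not identifiers from any particular platform.

```python
# Hypothetical event-type names; replace them with whatever your platform emits.
REQUIRED_EVENTS = {
    "Who changed this published page, and when?": {
        "document.create", "document.edit", "document.publish", "document.unpublish",
    },
    "Who changed a user's role or permissions?": {"role.change"},
    "Who exported or deleted a dataset?": {"dataset.export", "dataset.delete"},
    "Who logged in, from where, and via SSO?": {"auth.login"},
}


def unanswerable(captured: set) -> dict:
    """Map each question the log cannot fully answer to the event types it lacks."""
    gaps = {}
    for question, needed in REQUIRED_EVENTS.items():
        missing = needed - captured
        if missing:
            gaps[question] = missing
    return gaps


# Example: a deployment that records edits, publishes, logins, and role changes only.
captured_today = {"document.edit", "document.publish", "auth.login", "role.change"}
gaps = unanswerable(captured_today)
for question, missing in gaps.items():
    print(f"{question} -> missing {sorted(missing)}")
```

A check like this can run whenever logging configuration changes, so a dropped event type shows up as an unanswerable question rather than as a surprise during the audit.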

The enterprise stakes here are governance, not curiosity. A SOC 2 Type II audit examines whether your controls operated effectively over a period, not just whether they exist on paper. That means your log has to demonstrate continuous coverage, not a snapshot. The EU's regulatory posture on content and AI adds a second axis: if automated processes touch published content, you need to show which change was human and which was machine, and who approved it.

Sanity's approach treats content as structured data in Content Lake, so authoring and publishing events are recorded against documents you can already query. Audit logs capture the governance actions (role changes, access events, and configuration changes) that sit above the content itself. Designing your event taxonomy first means the log you build maps directly to the control narrative you will hand an auditor, instead of a firehose you have to defend after the fact.

Capture the right events: content lifecycle, access, and configuration

A defensible audit trail covers three distinct layers, and conflating them is a common mistake. The first layer is the content lifecycle: create, edit, publish, unpublish, and delete events on documents. The second is access and identity: logins, SSO assertions, failed authentication attempts, session events, and role or permission changes. The third is configuration: dataset creation and deletion, API token issuance and revocation, webhook and Functions changes, and anything that alters the platform's own behavior.
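The three layers can be kept distinct in code with an explicit classification. This is a minimal sketch under assumed names: the prefixes (`document.`, `auth.`, `token.`, and so on) are hypothetical and would need to match your own event taxonomy.

```python
from enum import Enum


class Layer(Enum):
    CONTENT = "content lifecycle"
    ACCESS = "access and identity"
    CONFIG = "configuration"


# Hypothetical event-type prefixes, one layer each.
LAYER_BY_PREFIX = {
    "document.": Layer.CONTENT,
    "auth.": Layer.ACCESS,
    "session.": Layer.ACCESS,
    "role.": Layer.ACCESS,
    "dataset.": Layer.CONFIG,
    "token.": Layer.CONFIG,
    "webhook.": Layer.CONFIG,
    "function.": Layer.CONFIG,
}


def layer_of(event_type: str) -> Layer:
    """Return the layer for an event type, refusing anything unclassified."""
    for prefix, layer in LAYER_BY_PREFIX.items():
        if event_type.startswith(prefix):
            return layer
    raise ValueError(f"unclassified event type: {event_type}")


print(layer_of("document.publish").value)
print(layer_of("role.change").value)
print(layer_of("token.revoke").value)
```

Raising on an unknown event type is a deliberate choice here: a new kind of event has to be assigned to a layer before it can enter the trail, which keeps the taxonomy from drifting.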

Legacy DXPs tend to log these layers in different places. Authoring activity lives in the CMS database, authentication lives in an identity provider or an application server, and configuration changes live in deployment tooling or are not logged at all. When an incident spans all three, which they almost always do, the responder has to correlate timestamps across systems that do not share a clock or an actor ID. That correlation work is where investigations stall.

A modern enterprise CMS consolidates the layers. In Sanity, content lifecycle events flow through Content Lake and can be reconstructed from document history, governance events surface through Audit logs, and identity is anchored by SSO so every actor resolves to a single corporate identity rather than a local account. Roles & Permissions defines who can do what, and the log records when those definitions change, which is itself one of the highest-risk events in any system.

The practical rule: log the verb, the actor, the target, the timestamp, and the source. A row that reads 'user@corp.com published document abc123 at 14:02 UTC via SSO session X' answers an auditor in one line. A row that reads 'change detected' answers nothing and forces the forensic work you were trying to avoid. Capture intent-bearing events, not raw diffs, and you build a log that reads like a narrative instead of a haystack.
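As a rough illustration of that rule, the record below carries exactly those five fields and renders them as the one-line narrative. The class name `AuditEvent`, its field names, and the example date are assumptions made for this sketch, not a schema from any product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    verb: str            # what happened, e.g. "published"
    actor: str           # who did it, resolved to a corporate identity
    target: str          # what it was done to
    timestamp: datetime  # when it happened; must be timezone-aware
    source: str          # how the actor arrived, e.g. an SSO session

    def __post_init__(self):
        # Reject naive timestamps so every row can be placed on one clock.
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")

    def as_line(self) -> str:
        ts = self.timestamp.astimezone(timezone.utc).strftime("%H:%M UTC")
        return f"{self.actor} {self.verb} {self.target} at {ts} via {self.source}"


event = AuditEvent(
    verb="published",
    actor="user@corp.com",
    target="document abc123",
    timestamp=datetime(2024, 1, 15, 14, 2, tzinfo=timezone.utc),  # illustrative date
    source="SSO session X",
)
print(event.as_line())
```

Making the record immutable and rejecting timestamps without a timezone are small choices, but they remove two common reasons a row cannot be trusted later.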

Scope access so the log is trustworthy and least privilege holds

An audit log is only evidence if the people who could tamper with it are not the same people it is watching. This is the separation-of-duties principle, and it is where governance design earns its keep. If an administrator can edit content, grant themselves new permissions, and also delete the log entries that recorded those actions, the log proves nothing in a dispute.
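The risk described above can be checked mechanically by scanning for users who hold a combination of permissions that defeats separation of duties. The following is a sketch only; the permission names (`content.edit`, `roles.manage`, `auditlog.delete`) are hypothetical and stand in for whatever your platform's role model exposes.

```python
# Hypothetical permission names. Each set is a combination no single user should hold.
TOXIC_COMBINATIONS = [
    {"content.edit", "roles.manage"},
    {"roles.manage", "auditlog.delete"},
    {"content.edit", "auditlog.delete"},
]


def sod_violations(permissions_by_user: dict) -> dict:
    """Return, per user, every toxic combination fully contained in their permissions."""
    violations = {}
    for user, perms in permissions_by_user.items():
        held = [combo for combo in TOXIC_COMBINATIONS if combo <= perms]
        if held:
            violations[user] = held
    return violations


permissions = {
    "alice@corp.com": {"content.edit", "content.publish"},
    "bob@corp.com": {"roles.manage", "content.edit"},
    "carol@corp.com": {"auditlog.read"},
}
for user, combos in sod_violations(permissions).items():
    print(user, [sorted(c) for c in combos])
```

Run against an export of effective permissions, a check like this gives the auditor a dated artifact showing that no one held a conflicting combination on that day.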

The control structure has two parts. First, least-privilege role design: editors get publishing rights on the datasets they own and nothing else; administrators who manage roles do not also need standing content-edit rights; the people who can read audit logs are a small, named group distinct from day-to-day operators. Second, immutability or strong tamper-evidence on the log itself, so that even a privileged actor cannot quietly rewrite history.

This is harder than it sounds on a self-hosted DXP, because the database administrator typically has god-mode over everything, including the audit tables. Outsourcing the substrate changes the calculus. When Sanity operates Content Lake as a multi-tenant, multi-region store, your own staff do not hold root on the database that records their actions, which is a meaningful separation-of-duties improvement over an on-premises install where the same team runs the app and the database.
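Where you do run the log store yourself, tamper-evidence is often approximated with a hash chain, in which each entry's hash covers the previous entry's hash. The sketch below shows the general technique using Python's standard `hashlib`; it is not a description of how any vendor stores its audit data, and the record fields are made up for illustration.

```python
import copy
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # fixed starting value for the chain


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev, record)})


def first_broken_index(log: list) -> Optional[int]:
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return i
        prev = entry["hash"]
    return None


log = []
append(log, {"actor": "admin@corp.com", "verb": "role.change", "target": "user-42"})
append(log, {"actor": "admin@corp.com", "verb": "document.publish", "target": "abc123"})
append(log, {"actor": "editor@corp.com", "verb": "document.edit", "target": "abc123"})

tampered = copy.deepcopy(log)
tampered[1]["record"]["target"] = "xyz999"  # a quiet rewrite of history

print(first_broken_index(log))
print(first_broken_index(tampered))
```

A chain only detects edits if the latest hash is also recorded somewhere the privileged actor cannot reach, such as a separate system; otherwise the whole chain can be recomputed after the change.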

In Sanity, Roles & Permissions lets you define granular access scoped per dataset and per workspace, SSO ties every session to a corporate identity you can deprovision centrally, and Audit logs record the governance actions across that surface. The combination means an auditor can verify both that least privilege was configured and that it held over the review period, which is exactly the 'operated effectively over time' standard a SOC 2 Type II report is built around.

Retention, export, and integration with your SIEM

Capturing events is half the job; keeping them, exporting them, and routing them to the systems where your security team actually works is the other half. Retention is a compliance requirement before it is an engineering one. Different frameworks and contracts impose different minimums, and your retention policy needs to satisfy the longest obligation you are subject to, not the most convenient one.
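The "longest obligation wins" rule is simple arithmetic, sketched below. The obligation names and day counts are placeholders invented for the example, not real legal or contractual minimums; your own values come from counsel and your contracts.

```python
from datetime import date, timedelta

# Placeholder obligations in days. These numbers are illustrative only.
OBLIGATIONS_DAYS = {
    "internal security policy": 365,
    "customer contract A": 730,
    "sector regulation": 2555,
}


def required_retention_days(obligations: dict) -> int:
    """The policy must satisfy the longest obligation, not the average or the shortest."""
    return max(obligations.values())


def may_purge(event_date: date, today: date, obligations: dict) -> bool:
    """An event may be purged only once it is older than the longest obligation."""
    return today - event_date > timedelta(days=required_retention_days(obligations))


today = date(2024, 1, 1)
print(required_retention_days(OBLIGATIONS_DAYS))
print(may_purge(date(2020, 1, 1), today, OBLIGATIONS_DAYS))
print(may_purge(date(2015, 1, 1), today, OBLIGATIONS_DAYS))
```

Keeping the obligations in one named table also documents why the retention period is what it is, which is the first thing an auditor will ask about the number.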

The export path matters because enterprise security operations do not live inside a CMS. They live in a security information and event management (SIEM) platform such as Splunk, Microsoft Sentinel, or Datadog, where logs from every system are correlated, alerted on, and retained under a single policy. A content platform that traps its audit data behind a UI with no programmatic export becomes a blind spot in your monitoring. The goal is to stream or pull CMS audit events into the same pipeline as everything else, so a suspicious content publish can be correlated with a suspicious login on another system.
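Most log pipelines accept newline-delimited JSON, so one plausible export step is to normalize each event and write it as a single JSON line. The sketch below assumes events arrive as dictionaries with `timestamp`, `actor`, `action`, and `target` keys; those field names, the `source_system` tag, and the lowercasing of actor IDs are assumptions for illustration, not a SIEM or CMS schema.

```python
import json
from datetime import datetime, timedelta, timezone


def to_ndjson(events: list) -> str:
    """Serialize events as newline-delimited JSON with UTC timestamps."""
    lines = []
    for event in events:
        normalized = {
            # Convert every timestamp to UTC so systems share one clock.
            "timestamp": event["timestamp"].astimezone(timezone.utc).isoformat(),
            "source_system": "cms",  # hypothetical tag to identify the origin
            # Lowercase the actor so the same identity matches across systems.
            "actor": event["actor"].lower(),
            "action": event["action"],
            "target": event["target"],
        }
        lines.append(json.dumps(normalized, sort_keys=True))
    return "".join(line + "\n" for line in lines)


events = [
    {
        "timestamp": datetime(2024, 1, 15, 15, 2, tzinfo=timezone(timedelta(hours=1))),
        "actor": "User@Corp.com",
        "action": "document.publish",
        "target": "abc123",
    },
    {
        "timestamp": datetime(2024, 1, 15, 14, 5, tzinfo=timezone.utc),
        "actor": "admin@corp.com",
        "action": "role.change",
        "target": "user-42",
    },
]
output = to_ndjson(events)
print(output, end="")
```

Normalizing the clock and the actor identifier at export time is what makes the later correlation step a join rather than an investigation.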

Sanity exposes Audit logs that can be retrieved programmatically, and Functions can react to events as they happen, which lets you forward governance events into your SIEM or trigger a compliance check the moment a sensitive action occurs. Because content lives in Content Lake as structured data reachable over the Live Content API, you can also reconcile the audit trail against the current state of content rather than trusting the log in isolation.

The reframe for buyers: do not evaluate audit logging as a feature checkbox. Evaluate it as a pipeline. Can you get the events out, in a parseable format, into the system of record where your auditors and responders already look, retained for as long as your obligations demand? A platform that answers yes turns content governance from an island into part of your enterprise control plane.

Govern automated and AI-driven changes alongside human ones

The newest gap in most audit programs is the one nobody designed for: automated and AI-assisted content changes. When a script, a scheduled job, or an AI enrichment step edits published content, the audit question changes shape. It is no longer just 'who did this' but 'was this a human or a machine, what triggered it, and who authorized the automation to act.' A log that cannot distinguish a human editor from an automated process leaves you unable to answer the question regulators are increasingly asking.

This matters for enterprise risk on two fronts. Compliance frameworks and emerging regulation expect you to attribute content changes accurately, and the EU's regulatory direction on AI pushes toward demonstrable human oversight of automated outputs. If an AI step rewrites a product description or a legal disclaimer, you need a record that shows the automated change, the human review, and the approval before publication, all as distinct, attributable events.

The operational answer is to treat automation as a first-class, named actor with its own scoped permissions, and to route AI-touched content through an explicit review step rather than letting it publish silently. In Sanity, Functions can run governed automation, with actions attributed to the function rather than smeared across a shared admin account, and Content Releases let you stage a batch of changes, human or machine-originated, and ship them as a reviewed unit instead of a stream of unaccountable edits. Audit logs then record the staging, the review, and the publish as discrete steps.

The principle holds regardless of vendor: instrument your automations as identifiable actors, gate machine-generated content behind human approval, and make sure the log captures both the automated action and the human decision. That is how you keep AI inside the editorial loop instead of turning it into an unattributable source of risk.
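A vendor-neutral sketch of that gate follows. It assumes each change records what kind of actor made it and who, if anyone, approved it; the `Change` class, the `actor_kind` values, and the `fn:` naming for automations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    change_id: str
    actor: str                        # a named identity, human or automation
    actor_kind: str                   # "human" or "automation"
    approved_by: Optional[str] = None


def can_publish(change: Change, human_reviewers: set) -> bool:
    """Allow human changes; require a named human approver for everything else."""
    if change.actor_kind == "human":
        return True
    # Anything not explicitly human is treated as machine-originated (fail closed).
    return change.approved_by is not None and change.approved_by in human_reviewers


reviewers = {"legal@corp.com", "editor-in-chief@corp.com"}

human_edit = Change("c1", "editor@corp.com", "human")
unreviewed_ai = Change("c2", "fn:enrich-descriptions", "automation")
reviewed_ai = Change("c3", "fn:enrich-descriptions", "automation", approved_by="legal@corp.com")
self_approved = Change("c4", "fn:enrich-descriptions", "automation", approved_by="fn:enrich-descriptions")

for change in (human_edit, unreviewed_ai, reviewed_ai, self_approved):
    print(change.change_id, can_publish(change, reviewers))
```

Because the approver must come from a named set of humans, an automation cannot approve its own output, and both the machine action and the human decision remain as separate, attributable facts.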

Validate the trail: run the audit before the auditor does

The final step is the one most teams skip until it is too late: testing whether your audit logging actually answers the questions it was built for. A control that has never been exercised is a control you are hoping works, and hope is not an audit posture. Before your SOC 2 window opens, run a tabletop exercise that simulates the questions a real review or a real incident will produce.

Pick concrete scenarios and trace them end to end. A page was defaced; can you identify the actor, the session, and the source IP, and show whether least-privilege controls should have prevented it? An employee left the company; can you prove their access was revoked, on what date, and that they made no changes afterward? An AI process published something incorrect; can you show the automated edit, whether a human review took place, and the approval chain? If any of these traces dead-ends, you have found a gap while it is still cheap to fix.
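The departed-employee scenario reduces to a query over exported events: find the revocation, then look for anything the same actor did afterward. The sketch below assumes events are dictionaries with `timestamp`, `actor`, `action`, and `target` keys and that revocation is logged as a hypothetical `access.revoke` action; adjust both to your own export.

```python
from datetime import datetime, timezone


def offboarding_trace(events: list, actor: str) -> dict:
    """Report when an actor's access was revoked and any activity after that point."""
    revocations = [
        e for e in events if e["action"] == "access.revoke" and e["target"] == actor
    ]
    if not revocations:
        # No revocation on record: this is the gap the exercise is meant to find.
        return {"revoked_at": None, "later_activity": []}
    revoked_at = min(e["timestamp"] for e in revocations)
    later = [e for e in events if e["actor"] == actor and e["timestamp"] > revoked_at]
    return {"revoked_at": revoked_at, "later_activity": later}


def utc(day: int, hour: int) -> datetime:
    return datetime(2024, 3, day, hour, tzinfo=timezone.utc)  # illustrative dates


events = [
    {"timestamp": utc(1, 9), "actor": "leaver@corp.com", "action": "document.edit", "target": "abc123"},
    {"timestamp": utc(4, 17), "actor": "admin@corp.com", "action": "access.revoke", "target": "leaver@corp.com"},
    {"timestamp": utc(5, 8), "actor": "leaver@corp.com", "action": "document.publish", "target": "abc123"},
]

trace = offboarding_trace(events, "leaver@corp.com")
print(trace["revoked_at"])
print([e["action"] for e in trace["later_activity"]])
```

In this sample data the trace surfaces a publish one day after revocation, which is exactly the kind of finding you want to discover in a tabletop exercise rather than in the audit itself.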

This validation also tightens your control narrative, the written description of how the control works that an auditor reads alongside the evidence. The narrative should name the specific surfaces: SSO for identity, Roles & Permissions for least privilege, Audit logs for governance events, document history in Content Lake for content lifecycle, and your SIEM for retention and correlation. When the narrative names real mechanisms and the test demonstrates they operate, the audit becomes a confirmation rather than a discovery.

The enterprise payoff compounds. Each cycle of test, find a gap, fix it, and re-test moves you from reactive forensics toward a governance program where the log is trusted, the access is least-privilege, and the answer to 'who changed this' takes one query instead of one week.

Audit logging and governance: modern Content Operating System vs legacy DXPs

Feature	Sanity	Adobe Experience Manager	Sitecore (XM/XP/XM Cloud)	OpenText TeamSite
Governance event logging	Audit logs capture access, role, and configuration changes as first-class, queryable events distinct from content history.	Audit logging available but typically spread across the author instance, OSGi logs, and Adobe IMS; correlation often needs Cloud Manager or external tooling.	Audit capabilities exist, especially in XP, but coverage and surfacing vary by deployment model and version across XM, XP, and XM Cloud.	Mature workflow and audit features built for regulated archiving, though configuration is heavy and tied to the on-premises stack.
Separation of duties on the log	Sanity operates Content Lake, so your staff do not hold database root over the tables recording their own actions, strengthening separation of duties.	Self-managed or Adobe-managed; on self-managed installs the same admins running the app can reach underlying stores.	Depends heavily on hosting; self-hosted XM/XP gives operators broad access to the SQL databases behind the audit data.	On-premises by design, so internal administrators typically hold privileged access to the systems being audited.
Content lifecycle attribution	Every create, edit, and publish is attributable per document in Content Lake, with SSO resolving each actor to one corporate identity.	Strong versioning and author tracking within AEM, anchored to AEM user accounts and workflow steps.	Versioning and history are solid; attribution quality depends on identity configuration and module setup.	Detailed versioning and check-in/check-out history, a long-standing strength for compliance-driven publishing.
Governing automated and AI changes	Functions run as named, scoped actors and Content Releases stage machine and human edits for review before publish, recorded as discrete events.	Automation via workflows and OSGi services; attributing AI-driven edits as distinct actors generally requires custom instrumentation.	Custom pipelines and connectors can automate changes; distinct AI-actor attribution is a build, not a default.	Strong scripted-workflow heritage; AI-specific attribution is not a native concept and needs bespoke work.
Export to SIEM	Audit logs retrievable programmatically and Functions can forward governance events live into Splunk, Sentinel, or Datadog.	Log forwarding is achievable via Cloud Manager log streaming or agents, with effort to normalize across sources.	Integration possible through standard log shippers; consistency depends on the hosting and version mix.	Enterprise logging integrations exist but are oriented to the traditional on-premises operations model.
Multi-brand, multi-market scope	Studio Workspaces model multiple brands and markets in one place, so governance and audit scope follow a single, consistent structure.	Multi-site supported through MSM and Sites, powerful but operationally heavy to govern at scale.	Multi-site and multi-tenant supported, with complexity rising across the product editions.	Multi-site publishing supported within its archiving and compliance model, configured per deployment.
Compliance posture	SOC 2 Type II and GDPR with regional hosting, data residency options, and a published sub-processor list.	Broad Adobe compliance program including SOC 2 and GDPR support across Adobe Cloud services.	Compliance certifications vary by product and hosting model; Sitecore Cloud offerings carry their own attestations.	Long compliance and records-retention heritage suited to highly regulated and government sectors.