The Next AI Arms Race Isn’t About Models — It’s About Data That Fights Back
Most AI "security" conversations still retrofit Web-2, perimeter-first thinking onto model-centric stacks. That approach cannot scale to a world of agentic systems, federated data, and adversaries who exploit the weakest dependency in sprawling supply chains. The next competitive frontier is data that defends itself: policy-aware, provenance-rich, privacy-preserving, and explainable by construction.
Share this post

The Next AI Arms Race Isn't About Models — It's About Data That Fights Back
Open Code Mission Quarterly Executive Briefing on Data Sovereignty in the Age of AI (Q4). Author: Graham dePenros | Executive Chairman, Open Code Mission (OCM)
Executive Summary
Most AI "security" conversations still retrofit Web-2, perimeter-first thinking onto model-centric stacks. That approach cannot scale to a world of agentic systems, federated data, and adversaries who exploit the weakest dependency in sprawling supply chains. The next competitive frontier is data that defends itself: policy-aware, provenance-rich, privacy-preserving, and explainable by construction.
This briefing argues for re-architecting around self-protecting data: information objects that carry, enforce, and audit their own usage policies wherever they go. The goal is to shift trust inside the data rather than around it. We set out the operating principles, technical enablers, economics, change plan, and regulatory alignment. We also outline, at a high level and without IP exposure, how Open Code Mission (OCM) is executing this vision through a memory- and provenance-centric AI appliance architecture.
Five takeaways for boards and executive teams:
- Perimeters are necessary but insufficient. Agentic AI, multi-cloud, and partner ecosystems demand portable trust attached to the data itself.
- Move from "model performance at any cost" to "explainable performance." Provenance, auditability, and interpretable decisions must be first-class requirements, not post-hoc overlays.
- Treat data as an autonomous actor. Embed access control, consent, lineage, and risk signals into the object and protocol layers, not just the application tier.
- Anchor on standards & PETs (Privacy-Enhancing Technologies). Differential privacy, confidential computing, verifiable credentials (DIDs/VCs), content authenticity (C2PA), and policy languages (e.g., ODRL/XACML) turn principle into practice.
- Measure what matters. Optimise for cost per trustworthy decision, time-to-evidence, and audit completeness, not raw benchmark lift.
1. Market Context: Beyond the Model-Centric Bubble
In recent quarters, commentary and capital have crowded around model benchmarks, parameter counts, and "frontier" rhetoric. Meanwhile, real risk sits elsewhere: how data is gathered, governed, moved, combined, and actioned. When models automate decisions about credit, care, critical infrastructure, or conflict, explainability, provenance, and enforceable usage constraints decide whether value is realised or revoked.
Regulatory and enterprise practice are converging on this reality:
- The EU AI Act classifies high-risk uses and demands traceability and oversight.
- NIST AI RMF 1.0 reframes trustworthiness as a lifecycle property: governable, measurable, and auditable.
- ISO/IEC 42001 sets expectations for AI management systems.
- The IEEE P7000 series raises the bar for ethically aligned design.
Across these, a single idea repeats: trust must be engineered, evidenced, and enforced.
2. Problem Statement: Automating Yesterday's Weaknesses
Retrofitting "AI security startups" on top of legacy architectures delivers diminishing returns. We still see:
- Perimeter bias: Controls assume static boundaries; data and agents are boundaryless.
- Overlay thinking: Explainability, provenance, and consent are bolted on, not embedded.
- Opaque data pipelines: Training/test data hygiene is under-incentivised and under-evidenced.
- Reactive assurance: Evidence is assembled after the fact, not produced at decision time.
Result: More automation, same fragility. We are scaling decision-making faster than we are scaling verifiability.
Figure 1: From Perimeters to Portable Trust. Concept diagram contrasting perimeter-first security (locks placed around data) with self-protecting data (data carrying its own shield and portable trust arcs).
3. Strategic Thesis: Data That Fights Back
Re-define data as self-sovereign, policy-aware, and context-preserving. Instead of a passive object guarded by external tools, data becomes an active participant that:
- Expresses allowed purposes, recipients, jurisdictions, and retention.
- Enforces those policies cryptographically and through runtime checks.
- Audits every access, transform, and downstream decision.
- Explains provenance and influence on model outcomes ("every fact has a receipt").
This is not a marketing slogan; it's an architectural shift.
4. Design Principles for Self-Protecting Data
- Provenance-first: Every datum must have verifiable lineage (source, consent, transformations, quality).
- Policy-carrying objects: Usage rights travel with the data (sticky policies), not just in the app (see the sketch after this list).
- Least disclosure: Default to data minimization and purpose binding; share proofs where possible, not payloads.
- Explainability by construction: Decisions must be replayable with human-readable rationales and machine-readable traces.
- Portability of trust: Trust controls survive API boundaries, clouds, and agents.
- Composable assurance: Evidence is produced as a by-product of normal operation (no audit theater).
- Fail-safe autonomy: When signals conflict, data objects decline or de-identify by default.
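To make these principles concrete, here is a minimal, illustrative Python sketch (with hypothetical field names) of a policy-carrying data object: the policy travels with the payload, every access attempt is audited, and a request outside the policy fails safe by returning only a de-identified view. A production system would express the policy in ODRL/XACML and enforce it cryptographically; this sketch only shows the shape of the behaviour.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class StickyPolicy:
    allowed_purposes: set        # e.g. {"clinical_triage"} — hypothetical values
    allowed_jurisdictions: set   # e.g. {"UK", "EU"}
    retain_until: datetime

@dataclass
class SelfProtectingRecord:
    payload: dict
    policy: StickyPolicy
    audit_log: list = field(default_factory=list)

    def access(self, purpose: str, jurisdiction: str) -> dict:
        """Return the payload only if the sticky policy allows; otherwise fail safe."""
        now = datetime.now(timezone.utc)
        allowed = (
            purpose in self.policy.allowed_purposes
            and jurisdiction in self.policy.allowed_jurisdictions
            and now <= self.policy.retain_until
        )
        # Every access attempt is audited, whether granted or declined.
        self.audit_log.append({
            "at": now.isoformat(), "purpose": purpose,
            "jurisdiction": jurisdiction, "granted": allowed,
        })
        if allowed:
            return self.payload
        # Fail-safe autonomy: decline by returning only non-identifying fields.
        return {k: v for k, v in self.payload.items() if k in {"category", "region"}}
```

The key design choice is that enforcement and audit live on the object itself, so they survive whichever application, cloud, or agent happens to hold the data.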
Figure 2: Self-Protecting Data Lifecycle. Five stages: Ingest → Transform → Infer → Explain → Audit.
5. Technical Enablers (Without IP Exposure)
- Cryptographic identity & consent: DIDs/VCs for subjects, devices, and datasets; cryptographic proofs of consent and purpose.
- Sticky policies & rights expression: ODRL, XACML, and policy engines that sit close to data planes; OPA/Rego for consistent enforcement.
- Provenance & content authenticity: W3C PROV, supply-chain attestations (e.g., in-toto), and media/content credentials (e.g., C2PA).
- Privacy Enhancing Technologies (PETs): Differential Privacy (Dwork), Secure Multi-Party Computation, Homomorphic Encryption, and Federated Learning to compute on data without unwanted disclosure.
- Confidential computing: Enclave-based execution (e.g., TEE/SGX-class) to enforce policy inside hardware-assured environments.
- Data contracts & quality gates: Declarative schemas and data quality SLAs (Data Mesh patterns) that block inputs that possess inadequate provenance.
- LLM guardrails & red-teaming: Formalised jailbreak and prompt-injection testing, retrieval isolation, output filtering, and assurance cases linking controls to risks.
- Runtime observability: Decision logs, feature attributions, and model cards that bind to provenance and policy records.
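As an illustration of "every fact has a receipt", the following Python sketch builds a hash-chained, signed provenance receipt for a single transformation step using only the standard library. The field names and the HMAC signing key are hypothetical; a real deployment would use W3C PROV vocabularies, asymmetric signatures or content credentials (C2PA), and a managed key service rather than an in-process secret.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

SIGNING_KEY = b"replace-with-a-managed-key"  # illustrative only; use an HSM/KMS in practice

def provenance_receipt(prev_receipt_hash: str, source: str, transform: str,
                       consent_ref: str, data_bytes: bytes) -> dict:
    """Build a signed, hash-chained receipt for one transformation step."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,                      # where the datum came from
        "transform": transform,                # what was done to it
        "consent_ref": consent_ref,            # pointer to the consent artefact
        "content_hash": hashlib.sha256(data_bytes).hexdigest(),
        "prev": prev_receipt_hash,             # chains receipts into a lineage
    }
    canonical = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest()
    return record

def verify_receipt(record: dict) -> bool:
    """Check that a receipt has not been altered since it was signed."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    canonical = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])
```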
Open Code Mission (OCM) Note (non-IP): The Open Code Mission AI appliance-level approach - OS Mission V1.0 - treats memory, provenance, and audit as co-equal to inference. Data objects are bound to cryptographic identity and policy artefacts; agents and pipelines are instrumented to produce explainable traces continuously. Specific implementations are proprietary; the principles above describe the intent.
6. Economics: The Business Case for Explainable Performance
Table 1. From Benchmark Lift to Trust-Adjusted Value. Compares legacy and trust-centric AI metrics across Model KPI, Risk, Opex, Time-to-market, and CAC/LTV, with a note on why each matters.
Two more levers matter to CFOs:
- Time-to-evidence (TTE): Days/hours to produce regulator-grade logs from a decision.
- Cost per trustworthy decision (CPTD): Total run + assurance cost divided by audited decisions delivered.
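A minimal sketch of how these two levers can be computed; the formulas simply follow the definitions above, and the quarterly figures are hypothetical.

```python
def cost_per_trustworthy_decision(run_cost: float, assurance_cost: float,
                                  audited_decisions: int) -> float:
    """CPTD: total run + assurance cost divided by audited decisions delivered."""
    if audited_decisions == 0:
        return float("inf")  # no audited decisions means nothing to amortise against
    return (run_cost + assurance_cost) / audited_decisions

def time_to_evidence(decision_time, evidence_ready_time):
    """TTE: elapsed time from a decision to regulator-grade evidence being available."""
    return evidence_ready_time - decision_time

# Hypothetical quarter: 180k run cost, 45k assurance cost, 1.5M audited decisions
print(round(cost_per_trustworthy_decision(180_000, 45_000, 1_500_000), 4))  # 0.15 per decision
```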
7. Risk & Compliance Alignment
- EU AI Act: High-risk systems must be transparent, traceable, and monitored; data governance is explicit.
- NIST AI RMF 1.0: Govern, Map, Measure, Manage; aligns with provenance-first controls and measurable risk treatment.
- ISO/IEC 42001 & 27001: AI and security management systems for repeatable, audit-friendly practice.
- UNESCO & OECD frameworks: ethical alignment, human rights, and accountability.
Self-protecting data shortens the gap between policy and practice, turning legal requirements into software artefacts and runtime evidence.
8. Operating Model: From Controls to Capability
Table 2. Capability Map (12–18 months). Roadmap from Q0–Q2 (Build) to Q3–Q4 (Scale) across provenance, policy, PETs, explainability, compute, and red-teaming.
Open Code Mission (OCM) Note (Non-IP): Our appliance model packages these capabilities into a single operational surface, so teams can move from PoC to assured production without accumulating governance debt, whatever technical debt they still carry; the approach is also designed to retire that legacy debt quickly.
9. Implementation Blueprint (First 365 Days)
Phase 1 (0–90 days): Foundation
- Establish a data constitution: purpose binding, minimisation, and audit expectations.
- Deploy a provenance store and bind it to critical pipelines.
- Stand up a policy engine and attach to at least one high-value flow.
- Define assurance KPIs (TTE, CPTD, audit completeness).
Phase 2 (90–180 days): PETs & Explainability
- Pilot differential privacy (DP) for analytics and federated learning with partners.
- Require model cards and feature attribution for high-risk models.
- Begin agent-safety hardening (prompt-injection tests, tool isolation, output filters).
Phase 3 (180–365 days): Scale & Certify
- Expand sticky policies to cross-organisational data flows.
- Move sensitive workloads to confidential compute defaults.
- Operationalise a continuous red-team/assurance pipeline.
- Prepare for third-party attestation and regulatory simulation.
10. Case Vignettes (Illustrative)
- Healthcare: Clinical triage explanations bind to patient consent proofs. Audit time drops from weeks to hours; regulator inquiry addressed with replayable evidence.
- Financial services: Decision records embed lineage of KYC data; model retraining blocked when provenance gaps exceed thresholds. Model risk posture improves; time-to-approval for new products shortens.
- Industrial/IoT: Edge agents execute within enclaves; telemetry is hashed and signed; policy-aware data refuses export outside permitted jurisdictions.
DP — Differential Privacy
Purpose: Protects individual data points by injecting statistical noise so that results reveal patterns without exposing any single record.
Use Case: When data must be analyzed across sensitive datasets but raw values cannot be revealed.
Key Trade-off: Accuracy vs. privacy—more noise means stronger privacy but weaker precision.
Example: Releasing aggregate insights on medical trials or mobility trends without identifying individuals.
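A minimal sketch of the Laplace mechanism, the canonical DP construction for counting queries: noise scaled to sensitivity/epsilon is added before release. The figures are hypothetical, and a production system would also track the cumulative privacy budget across queries.

```python
import numpy as np

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Laplace mechanism: add noise with scale = sensitivity / epsilon to a count query."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Smaller epsilon -> more noise -> stronger privacy but weaker precision (the trade-off above)
exact = 4213                              # hypothetical: patients meeting a trial criterion
print(dp_count(exact, epsilon=1.0))       # close to 4213
print(dp_count(exact, epsilon=0.05))      # noticeably noisier
```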
FL — Federated Learning
Purpose: Enables model training across multiple devices or organizations without centralizing raw data.
Use Case: When data can stay local, but learned parameters (gradients, weights) can be shared securely for joint model improvement.
Key Trade-off: Communication cost and model drift vs. privacy gain.
Example: Hospitals collaboratively train a diagnostic model without sharing patient data.
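A toy federated-averaging (FedAvg-style) sketch with two hypothetical sites training a linear model: raw data stays local, and only parameters are shared and averaged. Real deployments add secure aggregation, drift monitoring, and communication budgets.

```python
import numpy as np

def local_update(w, X, y, lr=0.1):
    """One local gradient step on mean-squared error; raw X and y never leave the site."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def federated_average(local_ws, sample_counts):
    """FedAvg: combine site parameters weighted by how much data each site holds."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(local_ws, sample_counts))

rng = np.random.default_rng(0)
true_w = np.array([0.5, -1.0, 2.0])
Xs = [rng.normal(size=(200, 3)), rng.normal(size=(80, 3))]      # two hypothetical hospitals
sites = [(X, X @ true_w + rng.normal(scale=0.1, size=len(X))) for X in Xs]

global_w = np.zeros(3)
for _ in range(50):                      # each round: local steps, then parameter averaging
    updates = [local_update(global_w.copy(), X, y) for X, y in sites]
    global_w = federated_average(updates, [len(y) for _, y in sites])
print(np.round(global_w, 3))             # approaches true_w without any site sharing raw records
```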
HE — Homomorphic Encryption
Purpose: Allows computation to be performed directly on encrypted data without decrypting it.
Use Case: When full privacy must be preserved during computation, even from the processor or cloud provider.
Key Trade-off: Very high computational cost, limited operation types (though improving).
Example: A bank performs encrypted risk analysis on client portfolios stored in the cloud.
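Practical homomorphic encryption is best done with dedicated libraries implementing schemes such as BFV or CKKS, but the core idea can be shown with a toy: textbook RSA is multiplicatively homomorphic, so two ciphertexts can be multiplied without ever decrypting the inputs. The parameters below are deliberately tiny and insecure; this is a sketch of the property, not a usable scheme.

```python
# Toy textbook-RSA parameters (deliberately tiny and insecure; illustration only)
p, q = 61, 53
n = p * q                    # modulus 3233
e, d = 17, 2753              # public/private exponents with e*d ≡ 1 mod (p-1)(q-1)

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 9
product_ct = (enc(a) * enc(b)) % n   # the processor only ever multiplies ciphertexts
print(dec(product_ct))               # 63: computed on encrypted data, revealed only to the key holder
```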
MPC — Secure Multi-Party Computation
Purpose: Enables multiple parties to jointly compute a function without revealing their private inputs to one another.
Use Case: When datasets cannot be centralized but must contribute to a shared computation or analytic.
Key Trade-off: Coordination and latency overheads.
Example: Pharmaceutical firms jointly analyze trial results without exposing proprietary data.
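A minimal additive secret-sharing sketch, one of the simplest MPC building blocks: each party splits its private value into random shares, parties exchange shares, and only the joint sum is ever reconstructed. The values are hypothetical; practical MPC protocols add authenticated shares, malicious-security checks, and network coordination.

```python
import random

PRIME = 2_147_483_647  # field modulus; shares and sums are taken mod this prime

def share(secret: int, n_parties: int) -> list:
    """Split a secret into additive shares that individually reveal nothing."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

# Three firms each hold a private trial count; no firm reveals its own number
private_inputs = [1200, 950, 430]            # hypothetical values
all_shares = [share(x, 3) for x in private_inputs]

# Party i sums the i-th share of every input; the partial sums are then combined
partial_sums = [sum(s[i] for s in all_shares) % PRIME for i in range(3)]
joint_total = sum(partial_sums) % PRIME
print(joint_total)                           # 2580: the joint sum, with no private input exposed
```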
In short:
- DP protects individual records.
- FL keeps data local but shares insights.
- HE keeps data encrypted during computation.
- MPC lets parties compute together privately.
Each is a Privacy-Enhancing Technology (PET) suited to different balances of trust, compute intensity, and data sovereignty.
Figure 3: PETs Decision Tree. Flowchart showing the paths that lead to DP, FL, HE, or MPC depending on data-sharing and computation needs (DP: Differential Privacy; FL: Federated Learning; HE: Homomorphic Encryption; MPC: Secure Multi-Party Computation).
11. OCM's Approach (High-Level, No IP Bleed)
OCM's mission is to make memory, provenance, and assurance co-equal with inference:
- Memory-centric orchestration: Systems remember what was known, when, and why a decision was made.
- Verifiable provenance: Every datum has a receipt; transformations are logged; consent and purpose are cryptographically tied to usage.
- Policy-aware data paths: Usage controls travel with data and are enforced at runtime across clouds, agents, and partners.
- Explainability-by-default: Decisions yield human-readable rationales linked to machine-verifiable traces.
- Operational packaging: Delivered as an appliance to accelerate time-to-assurance, reduce integration risk, and keep governance in lockstep with scale.
12. Metrics & Governance
Table 3. Trust Metrics & Targets. Lists trust KPIs with definitions and targets: TTE, CPTD, Audit Completeness, PETs Coverage, and Red-Team Gate Pass.
13. Board Questions to Ask This Quarter
- Which decisions in our portfolio require explainability by design?
- Can we produce decision-time evidence within 24 hours?
- Where do usage policies travel with data, and where are they static?
- Which PETs are in production (not pilots)?
- How do we measure trust-adjusted ROI beyond raw model lift?
14. Conclusion: From Perimeters to Portable Trust
The next arms race is not a bigger model; it's a better contract between data, decisions, and duty of care. Organisations that embed trust inside the data - through provenance, policy, privacy, and proof - will ship faster, face fewer regulatory shocks, and command a trust premium in the market. Those that keep bolting controls onto Web-2 assumptions will automate risk, not reduce it.
Build for explainable performance. Make your data fight back.
References
I have tried to avoid paywalled content. The exceptions are the Dwork (2008) chapter (behind a Springer paywall) and ISO/IEC 42001 (purchase required), both listed below.
C2PA (2025) Coalition for Content Provenance and Authenticity (C2PA). Available at: https://c2pa.org (Accessed: 23 October 2025).
Dwork, C. (2008) 'Differential privacy: A survey of results', in Theory and Applications of Models of Computation (TAMC 2008), Lecture Notes in Computer Science, vol. 4978, Springer, Berlin, pp. 1–19. Available at: https://link.springer.com/chapter/10.1007/978-3-540-79228-4_1 (Accessed: 23 October 2025).
European Commission (2024) Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act), Official Journal of the European Union. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32024R1689 (Accessed: 23 October 2025).
IEEE Standards Association (2022–2024) IEEE P7000 Series: Standards for Ethically Aligned Design. Available at: https://standards.ieee.org/standard/P7000.html (Accessed: 23 October 2025).
ISO/IEC (2023) ISO/IEC 42001: Artificial intelligence — Management system. International Organization for Standardization. Available at: https://www.iso.org/standard/83797.html (Accessed: 23 October 2025).
ISO/IEC (2022) ISO/IEC 27001: Information security, cybersecurity and privacy protection — Information security management systems (ISMS). International Organization for Standardization. Available at: https://www.iso.org/standard/27001.html (Accessed: 23 October 2025).
NIST (2023) Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1. Gaithersburg, MD: National Institute of Standards and Technology. Available at: https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf (Accessed: 23 October 2025).
OASIS (2013) eXtensible Access Control Markup Language (XACML) Version 3.0 — Core Specification. OASIS Standard. Available at: https://docs.oasis-open.org/xacml/3.0/xacml-3.0-core-spec-os-en.html (Accessed: 23 October 2025).
W3C (2018) ODRL Information Model 2.2. World Wide Web Consortium Recommendation. Available at: https://www.w3.org/TR/odrl-model/ (Accessed: 23 October 2025).
Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., Raji, I.D. and Gebru, T. (2019) 'Model Cards for Model Reporting', Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19), pp. 220–229. Available at: https://doi.org/10.1145/3287560.3287596 (Accessed: 23 October 2025).
UK Information Commissioner’s Office (2023) Privacy-Enhancing Technologies Guidance. London: Information Commissioner’s Office. Available at: https://ico.org.uk/media/for-organisations/documents/4550793/privacy-enhancing-technologies-guidance.pdf (Accessed: 23 October 2025).
W3C (2013) PROV-O: The PROV Ontology. World Wide Web Consortium Recommendation. Available at: https://www.w3.org/TR/prov-o/ (Accessed: 23 October 2025).
W3C (2022) Decentralized Identifiers (DIDs) v1.0. World Wide Web Consortium Recommendation. Available at: https://www.w3.org/TR/did-core/ (Accessed: 23 October 2025).
Zuboff, S. (2019) The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. New York: PublicAffairs. Available at: https://www.publicaffairsbooks.com/titles/shoshana-zuboff/the-age-of-surveillance-capitalism/9781610395694/ (Accessed: 23 October 2025).
Acknowledgements
With appreciation to the Open Code Mission team; to all teams advancing responsible AI; and to the standards bodies, privacy researchers, red-teamers, and practitioners who insist that evidence must accompany intelligence.
Thanks as well to the OCM colleagues and partners who demonstrate daily that trust can be engineered without trading away performance.

