Part of the Deep Dive: Data Governance Webinar Series
Open source AI is often praised for transparency and scale—but too often overlooks one key question: who decides how data is governed?
This talk dives into how Tribal Nations across the U.S. are asserting sovereignty through technical infrastructure, community consent protocols, and legal frameworks—offering a vision for federated AI systems that center accountability and cultural relevance.
We’ll explore case studies including:
1. The Native BioData Consortium’s tribally governed biobank
2. The DataBack movement’s decentralized genomic sovereignty model
3. The implementation of CARE principles in AI pipelines
4. Jurisdiction-aware interoperability in public sector data sharing (e.g., BIA and NIH initiatives)
Attendees will leave with a set of reproducible patterns for embedding ethics into infrastructure, as well as insights into how co-creation, local governance, and sustainability can make our AI systems more just and open.
Video Transcript
Speaker: Sal Kimmich
Introduction
This talk explores patterns for building federated, consent-based AI systems through examples from genomics and tribal data sovereignty.
Key Topics Covered
1. Defining Sovereignty
Five Historical Dimensions:
- Authority: Who holds the right to govern
- Territory/Domain: Originally land, sea, air—now includes data, algorithms, and cyberspace
- Autonomy: Self-governance capabilities
- Recognition: Consent within systems
- Revocation: The right to opt in and opt out
Types of Sovereignty:
- Westphalian (1648): Nation-states recognizing each other’s borders
- Legal: Supreme authority given to specific legal bodies
- Popular: Derived from consent of the people
- Relational: Indigenous nations’ inherent right to self-governance
- Digital/Cyber: Control over digital infrastructure and data flows
2. Tribal Data Sovereignty Case Study
US Tribal Nations Context:
- 574 federally recognized tribes
- Legal status: “Domestic dependent nations”
- Limited sovereignty with recognized self-governance
- Direct federal relationships (bypassing states)
Tribal Cybersecurity Grant Program (2023):
- Sovereign-to-sovereign relationship with federal government (DHS, CISA, FEMA)
- Unique regulatory space—more autonomous than cities/counties
- Direct federal support with ongoing compliance requirements
3. Genomics as a Sovereignty Issue
The Consent Problem:
- Western model: Individual informed consent
- Indigenous model: Communal decision-making
- Conflict between the two models creates data sovereignty violations
Havasupai Case (lawsuit filed 2004, settled 2010): Arizona State University researchers collected blood samples for diabetes research, then repurposed the data without re-consent for:
- Schizophrenia rates
- Family-level genomic mapping
- Human migration studies
This violated communal consent and created exploitation risks for an already vulnerable population.
Response: Native BioData Consortium (NBDC)
- First Indigenous-led nonprofit biobank in the US
- Run by Indigenous scientists and tribal members
- Operates within tribal jurisdictional bounds
- Keeps data physically isolated (“closed data system”)
4. Four Key Patterns for Sovereign Data Infrastructure
Pattern 1: Community-Governed Infrastructure
- Permissioned governance
- Physical data restrictions when necessary
- Modular permissions and auditable access logs
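One way to picture Pattern 1 is a permission check that records every decision in a tamper-evident log. The sketch below is illustrative only: the role names, the `PERMISSIONS` table, and the `AuditLog` class are assumptions made for this example, not anything described in the talk or used by NBDC.

```python
import hashlib
import json

# Hypothetical permission table: role -> set of allowed actions.
PERMISSIONS = {
    "tribal_data_steward": {"read", "approve_export"},
    "external_researcher": {"read_summary"},
}


class AuditLog:
    """Append-only log; each entry commits to the hash of the previous one."""

    def __init__(self):
        self.entries = []

    def append(self, actor, role, action, allowed):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        record = {"actor": actor, "role": role, "action": action,
                  "allowed": allowed, "prev": prev_hash}
        payload = json.dumps(record, sort_keys=True).encode()
        record["hash"] = hashlib.sha256(payload).hexdigest()
        self.entries.append(record)

    def verify(self):
        """Recompute every hash; return False if any entry was altered."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode()
            if entry["prev"] != prev_hash:
                return False
            if hashlib.sha256(payload).hexdigest() != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


def request_access(log, actor, role, action):
    """Check the permission table and record the decision either way."""
    allowed = action in PERMISSIONS.get(role, set())
    log.append(actor, role, action, allowed)
    return allowed
```

Denied requests are logged as well as granted ones, so the governing community can audit who attempted what. Because each entry includes the previous entry's hash, editing an old record causes `verify()` to fail.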
Pattern 2: Federated AI Pipelines
- Decentralized learning with no raw data sharing
- Data stays in community pods
- Communities contribute to shared AI models through federated learning (e.g., TensorFlow Federated, PySyft)
- Key insight: Openness ≠ centralized availability
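The idea behind Pattern 2 can be sketched with a toy federated averaging loop in plain Python. This is a simplified illustration, not the TensorFlow Federated or PySyft API: the linear model, the learning rate, and the two example "pods" are all assumptions chosen to keep the example small.

```python
def local_update(weights, data, lr=0.1):
    """One pass of gradient descent for y = w*x + b on a pod's private data."""
    w, b = weights
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return (w, b)


def federated_average(updates, sizes):
    """Weight each pod's parameters by its number of local examples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(pods, rounds=200):
    """Coordinator loop: only parameters cross the pod boundary, never rows."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in pods]
        sizes = [len(data) for data in pods]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Two hypothetical community pods, each holding points from y = 2x + 1.
pods = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (0.5, 2.0)],
]
w, b = train(pods)
```

With this toy data the averaged parameters approach w = 2 and b = 1, although the coordinator only ever sees the pairs returned by `local_update`. A production system would add protections this sketch omits, such as secure aggregation or differential privacy, since model updates can still leak information.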
Pattern 3: Ethical Metadata Layers
- Data ethics files with consent tokens
- FAIR principles (Findable, Accessible, Interoperable, Reusable)
- CARE principles (Collective benefit, Authority to control, Responsibility, Ethics)
- Consent tokens queryable via APIs
- Policy embedded in infrastructure
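A consent token can be thought of as a small, queryable record that travels with a dataset. The sketch below assumes a hypothetical `ConsentToken` shape and an in-memory `ConsentRegistry` standing in for an API backend. None of the field names come from the talk or from a published standard.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ConsentToken:
    """Hypothetical consent record attached to a dataset's ethics file."""
    dataset_id: str
    granting_authority: str          # CARE: Authority to control
    permitted_purposes: set = field(default_factory=set)
    expires: date = date.max
    revoked: bool = False

    def permits(self, purpose, on):
        """True only if the purpose was granted and consent is still live."""
        if self.revoked or on > self.expires:
            return False
        return purpose in self.permitted_purposes


class ConsentRegistry:
    """In-memory stand-in for a consent API backend."""

    def __init__(self):
        self._tokens = {}

    def register(self, token):
        self._tokens[token.dataset_id] = token

    def revoke(self, dataset_id):
        self._tokens[dataset_id].revoked = True

    def check(self, dataset_id, purpose, on):
        token = self._tokens.get(dataset_id)
        # Default deny: no token means no consent.
        return token is not None and token.permits(purpose, on)
```

Under this model, a token granted for `"diabetes_research"` would not authorize a `"migration_study"` query on the same samples, which is the kind of repurposing at issue in the Havasupai case. Revocation takes effect on the next `check` call.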
Pattern 4: Jurisdiction-Aware APIs
- FPIC flags (Free, Prior, Informed Consent)
- Policy-aware endpoints
- Transparency for developers about data origins
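A policy-aware endpoint might look like the handler sketched below, which refuses to serve data unless all three FPIC flags are set and always returns provenance so developers can see where data came from. The catalog entries, field names, and status codes are assumptions made for illustration, not a schema from the talk.

```python
# Hypothetical catalog; field names are illustrative, not a published schema.
CATALOG = {
    "biobank-summary": {
        "jurisdiction": "tribal",
        "governing_body": "Example Tribal Review Board",
        "fpic": {"free": True, "prior": True, "informed": True},
        "payload": {"participants": 120},
    },
    "legacy-import": {
        "jurisdiction": "tribal",
        "governing_body": "Example Tribal Review Board",
        "fpic": {"free": True, "prior": False, "informed": True},
        "payload": {"participants": 45},
    },
}


def get_dataset(dataset_id, caller_jurisdictions):
    """Return (status, body) the way an HTTP handler might."""
    entry = CATALOG.get(dataset_id)
    if entry is None:
        return 404, {"error": "unknown dataset"}

    # Provenance is returned on success and on refusal alike.
    provenance = {k: entry[k] for k in ("jurisdiction", "governing_body", "fpic")}

    if not all(entry["fpic"].values()):
        return 403, {"error": "FPIC incomplete", "provenance": provenance}
    if entry["jurisdiction"] not in caller_jurisdictions:
        return 403, {"error": "no agreement for this jurisdiction",
                     "provenance": provenance}
    return 200, {"data": entry["payload"], "provenance": provenance}
```

For example, `get_dataset("legacy-import", {"tribal"})` is refused because the `prior` flag is false, and the response still tells the caller which governing body to contact.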
5. Zero Trust Data Infrastructure
Core Concepts:
- Trust Boundaries: Lines where trust levels change between components
- Verifiable Trust: Replacing institutional trust with cryptographic proof
- Confidential Computing: Protecting data during use (Intel SGX, AMD SEV, ARM CCA)
Nation-to-Nation Infrastructure:
- Requires secure isolation of workloads
- Workload identity constrained in time and execution
- Example: Italy-Switzerland-France sovereign cloud collaboration
Requirements for Sovereign Cloud:
- Data sovereignty (data protected in use)
- Operational sovereignty (transparency into platform operations)
- Technical sovereignty (continuous auditing with extended supervision)
6. Recommendations
For Data Commons:
- Enforceable residency guarantees
- Operator separation
- Trusted execution environments
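These three recommendations can be expressed as a deployment check that runs before a dataset is hosted. The sketch below is a minimal example under stated assumptions: the `Deployment` fields and the rule that the infrastructure operator must differ from the key holder are one hypothetical reading of "operator separation", not a definition from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of where and how a dataset is hosted."""
    dataset_id: str
    storage_region: str
    infrastructure_operator: str
    key_holder: str
    runs_in_tee: bool


def policy_violations(dep, allowed_regions):
    """Return a list of human-readable violations; empty means compliant."""
    problems = []
    if dep.storage_region not in allowed_regions:
        problems.append("residency: data stored outside approved regions")
    if dep.infrastructure_operator == dep.key_holder:
        problems.append("operator separation: host also holds the keys")
    if not dep.runs_in_tee:
        problems.append("confidentiality: workload not in a trusted execution environment")
    return problems
```

Returning every violation, rather than stopping at the first, gives a governing body the full picture before it approves or rejects a hosting arrangement.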
For Open Source:
- Transparency as the essential pillar
- Zero trust boundaries for sovereign entity engagement
- Consent as an architecture problem (APIs, modular design, smart defaults)
Conclusion
Building sovereign-by-design systems requires embedding governance into infrastructure, not treating it as an add-on. The future of secure, federated systems depends on moving from operational trust to cryptographically verifiable trust while respecting diverse sovereignty models.
Resources Mentioned:
- Book: Code, Chips and Control (pre-order available)
- Indigenous Data Sovereignty Network Summit (2026)
