About SCAMPER
SCAMPER aims to overcome the limitations and vulnerabilities of federated identity management and self-sovereign identity (SSI) by building upon a privacy-preserving multifactor distributed authentication solution, while guaranteeing adherence to EU legislation. The solution fully aligns with the GDPR and eIDAS regulations and employs distributed storage mechanisms to provide a reliable user authentication service. The main aim in this regard is to provide good usability, revocability, and accuracy and guarantee the highest privacy preservation standards.
1. Introduction and Goals
This document describes the software architecture for the Secure Data Vault (SDV) component of the SCAMPER project (Work Package 4).
The SDV serves as the decentralised personal cloud vault that is controlled by the person’s edge wallet. Unlike traditional cloud storage solutions, the SDV is designed according to the principles of self-sovereign identity (SSI) giving the user full control and ownership over their data. The SDV is designed to minimise risk in case of an untrusted provider of the SDV, prioritising data portability and cryptographic access control.
This architecture is inspired by the SOLID project and includes additional measures to ensure privacy and security for storing Personal Data. The outcome is the design of a Secure Digital Data Vault that is compatible with SSI systems, offering a decentralized storage solution where users maintain full control over their digital identities and sensitive information. The design will consider utility and adoptability while adhering to strict requirements regarding privacy and security. This outcome will not only support privacy-preserving digital identity management but will also pave the way for advancements in secure, user-owned data storage solutions.
In summary, we aim to achieve this objective through a Secure Data Vault that aligns with the latest privacy standards, empowers users to control their sensitive information autonomously, and supports compatibility with SSI.
1.1. Requirements Overview
The architecture is driven by the specific constraints of the SCAMPER Work Package 4 (WP4).
| ID | Description |
|---|---|
| REQ-WP4-1 | Client-Side Encryption |
| REQ-WP4-2 | Capability-Based Access Control |
| REQ-WP4-3 | Separation of Identity and Storage |
| REQ-WP4-4 | Multi-Party Encrypted Sharing |
| REQ-WP4-5 | Cryptographic User Control |
| REQ-WP4-6 | User Unlinkability |
1.2. Quality Goals
The Quality Goals for the Secure Data Vault focus on the resilience, security and privacy of the storage mechanism itself.
| Goal | Motivation |
|---|---|
| Confidentiality | The storage provider is treated as an untrusted party and a potential adversary, so that risk is minimized in that scenario. A malicious administrator at the SDV provider with database permissions can read or write encrypted data, but should not be able to gain access to any private data of the user. They should only be able to disrupt services, not break confidentiality. |
| Integrity | Users should not have to trust their SDV provider that data in the SDV has not been tampered with. Instead, only the user can grant cryptographic access to data. As long as their private key has not been compromised, they can trust that only third parties to whom they gave access can read or write data. |
| Performance | The Vault is used during real-time verifications. Availability and latency are paramount in making the SDV commercially viable. |
| Portability | The architecture must avoid vendor lock-in for the storage layer. A user needs to have full control over where their data is stored and what data is stored. It should be easy to migrate from one SDV provider to another. |
1.3. Stakeholders
| Role/Name | Example | Expectations |
|---|---|---|
| Data Owner | End User | Expects their data to be accessible only to them and to third parties they granted access. Expects easy data migration and access revocation. Expects an easy-to-use edge wallet implementation that aligns with the current state of the art. |
| Service Provider | Bank, Employer, Hospital, … | Expects a standard JSON-RPC API to fetch and manipulate data. Expects high availability: if the User grants them access, the Vault should serve the file even if the User is offline. |
| Storage Host | SCAMPER Registry | Expects to host data with no liability. They want an architecture that mathematically proves they cannot see the data, protecting them from GDPR data processing obligations. Expects easy setup and deployment. |
| DID Registry | Public Blockchain | Expects data owners and service providers to follow best practices in using the existing DID registries. |
| VC Issuer | Government / Employer | Expects to be unlinkable to VCs issued to data owners when service providers request to validate those VCs. |
2. Architecture Constraints
The Secure Data Vault and Edge Wallet architecture is bound by strict requirements regarding biometric security, legal compliance (GDPR), and technical interoperability. These constraints limit the design choices but ensure the final system is secure, compliant, and usable in industrial contexts.
2.1. Technical Constraints
These constraints concern the hardware, software, and cryptographic primitives we must use.
| Constraint | Explanation |
|---|---|
| TC-1: Biometric Key Derivation | The system is built upon key derivation from biometrics. The SDV design assumes that the user can create a cryptographic master key from their biometrics and can also recover that key in case of Edge Wallet loss. The Edge Wallet will thus be constrained to devices that support the specific biometric input used for key derivation. |
| TC-2: Mobile-First Edge Computing | All sensitive cryptographic operations must occur on the Edge Wallet (likely a Smartphone). The architecture cannot rely on server-side processing for cleartext data. Target devices must be able to process cryptographic operations efficiently and securely. To manage sensitive cryptographic keys, Edge Wallets need to have a secure hardware module (e.g., Secure Enclave on iOS, TrustZone on Android). |
| TC-3: W3C Standardization | The architecture must adhere to W3C standards as much as possible to ensure interoperability. This will drive architectural decisions, prioritising open source libraries and protocols that are widely adopted in the SSI ecosystem. The core standards include: |
| TC-4: Post-Quantum Readiness | Given the long lifespan of identity data, the cryptographic layer must be designed using Post-Quantum Cryptography (PQC) algorithms (e.g., Dilithium, Kyber) without compromising compatibility with existing systems. |
2.2. Organizational Constraints
These constraints involve the project team, stakeholders, and development process.
| Constraint | Explanation |
|---|---|
| OC-1: Open Source License | The core components (Wallet SDK, Vault Connector, UCAN Validator, …) must be released under a permissive Open Source license (e.g., Apache 2.0 or MIT) to follow the principles of self-sovereign identity and encourage community adoption. |
| OC-2: No Vendor Lock-in | The architecture must not depend on proprietary cloud features (e.g., AWS KMS, Azure AD). The "Secure Data Vault" must be deployable on any generic object store. |
| OC-3: Auditability | All access to the Data Vault must leave a tamper-evident audit trail. However, this audit log itself must preserve privacy. |
3. Context and Scope
The Secure Data Vault (SDV) acts as the user’s sovereign agent in the cloud. It is a dual-zone storage system:
- Public Zone (Cleartext): Stores non-sensitive, unlinkable technical data to facilitate discovery and decentralized connection establishment.
- Private Zone (Encrypted): Stores sensitive Personal Data, Verifiable Credentials (VCs), or other user data, encrypted entirely at the edge.
The SDV operates within a hostile environment (the SDV provider itself is assumed to be untrusted for the sake of designing a secure system) but must serve data to trusted parties via standard protocols.
3.1. Business Context
The Business Context describes the information flow between the SDV and its stakeholders. The core principle is User-Centric Control with Privacy by Design.
| Communication Partner | Inputs (Sent to SDV) | Outputs (Received from SDV) |
|---|---|---|
| Data Owner (User Wallet) | Encrypted Blobs (VCs, Private Notes); Public Data (DID Docs, Public Keys); UCAN Tokens (Delegation of rights) | Synchronization (Fetching latest state); Audit Logs (Who accessed my vault?); Outstanding permissions (Signed UCANs) |
| Service Provider (Data User) | Access Token (UCAN for authorization); Read Request (Specific resource path) | Requested Data (Encrypted file or specific VP). Note: The Verifier must possess the decryption key (shared via |
| Service Provider (e.g. VC issuer) | Personal Data of Data Owner (as encrypted blobs); Verifiable Credentials of Owner (as encrypted blobs) | - |
| Data Provider (Discovery / Resolver) | Discovery Request (Lookup via HTTP GET) | Public Technical Data (DID Document, Service Endpoint). Constraint: This data must be unlinkable to the physical person (REQ-WP4-6). |
| Storage Host (Infrastructure Provider) | Hosting Resources for Secure Data Vault | Usage Metrics (Storage quota, Bandwidth). Liability: The Host cannot see Private Data content. |
3.2. Technical Context
The SDV is exposed as a web-accessible storage service (similar to a Solid Pod or S3 Bucket) but secured via UCAN (User Controlled Authorization Networks) instead of traditional OAuth/Accounts.
It differentiates technical interfaces based on the sensitivity of the data zone.
| Interface | Channel / Protocol | Description |
|---|---|---|
| Public Read API | | Serves data from the |
| Protected Data API | | Accesses the User’s private storage via signed UCAN 1.0 invocation tokens. |
| Replication / Backup | | Interface for migrating the entire vault to a new provider. |
| Interaction | Technical Implementation |
|---|---|
| Publishing Identity | User → SDV: JSON-RPC invocation |
| Backing up Credentials | User → SDV: JSON-RPC invocation |
| Sharing with Bank | Bank → SDV: JSON-RPC invocation |
3.3. Deployment Context
TODO: in development…
4. Solution Strategy
The solution strategy for the Secure Data Vault (SDV) rests on three fundamental pillars: Edge-Side Encryption, Capability-Based Authorization, and Dual-Zone Storage.
These strategies ensure that the Vault remains a passive storage utility while the intelligence and security controls reside entirely with the user.
4.1. Strategic Decisions
The following table maps the project’s key goals to the specific architectural strategies chosen to achieve them.
| Goal | Strategy | Details |
|---|---|---|
| Confidentiality | Client-Side Encryption | The SDV stores data as opaque binary blobs. PQC Encryption is used with keys derived from the user’s biometric wallet ( |
| User Control | UCAN Authorization | We replace traditional Access Control Lists (ACLs) with User Controlled Authorization Networks (UCAN). Access rights are embedded in cryptographically verified tokens (JWTs) issued by the user, not in permission tables stored in the server database. The user can grant access to a third party (Verifier) without the SDV’s intervention. |
| Discoverability vs. Privacy | Dual-Zone Storage | We separate data storage based on sensitivity but allow Public Data to index Private Data. The Public Zone stores unencrypted data that cannot be used to identify the user. The Private Zone stores encrypted Credentials and Files. Public data can be used to index private data by linking these nodes without revealing the identity of the user to the vault. |
| Portability | Decoupled Identity & Storage | The system uses DIDs (Decentralized Identifiers) as the primary user key, rather than an account/password managed by the provider. The user interacts via |
| Unlinkability | Derived Identity | The system derives new DIDs (Decentralized Identifiers) to share with third parties, rather than sharing the user’s primary key. As a result, the user’s activity cannot be tracked across different providers/vaults. |
4.2. Technology Decisions
The following technology stack was selected to implement these strategies, prioritizing open standards and browser/mobile compatibility.
| Component | Technology | Rationale |
|---|---|---|
| Identity | | Used for private relationships between Users and Verifiers. It leaves no public footprint on any blockchain, ensuring maximum privacy (unlinkability). A peer DID is essentially a custom keypair used by both the User and the Verifier for a private relation, and public keys / DID documents are shared through third-party channels (e.g., QR codes on first relation establishment). |
| Authorization | UCAN (User Controlled Authorization Networks) | Unlike OAuth2, UCANs support offline delegation. A user can generate a token for a Verifier (e.g., a Bank) directly from their phone without needing the Vault to be online at that exact moment. This makes migration of data easier, as access control configuration does not need to be migrated. |
| Encryption | [Decision Pending] | The specific cryptographic primitive is currently under evaluation. Most likely, we will combine a post-quantum cryptographic protocol with a classical protocol to ensure PQ readiness and mitigate the risk of a novel algorithm being compromised. The most common model for this is X25519 and ML-KEM (Kyber 768) combined into a hybrid KEM: X25519MLKEM768. We intend to follow such industry standards. |
| Storage Backend | Graph Database (RDF / Triple Store) | Aligns with Solid specifications and enables rich queries. Instead of a flat file system, data is stored as Linked Data. This allows users to query metadata without the server needing to understand the encrypted content itself. Metadata is stored in the graph (Public Zone), payload is the encrypted blob (Private Zone). |
| API Surface | JSON-RPC 2.0 over HTTPS | UCAN 1.0 invocation tokens are self-contained, signed RPC calls where |
4.3. Top-Level Decomposition Pattern
We follow the "Smart Client, Dumb Server" architectural pattern. The SDV acts as a high-availability "Digital Locker": it enforces who can open a locker (via a UCAN check) but has no visibility into what is inside, and it offers mechanisms to facilitate finding data. The Client (Wallet) handles all business logic, data validation, encryption, and key management. Additionally, some of the heavier computation could be offloaded to the SDV by using a Trusted Execution Environment (TEE) on the server: after attesting the SDV TEE, a user wallet could delegate that computation to the server. We will rely on this only in rare cases, as it places more trust in the SDV provider.
5. Building Block View
The Building Block View shows the static decomposition of the system. It explains the code structure and the high-level functional modules.
5.1. Level 1: Overall System
At the highest level, the Secure Data Vault (SDV) is a standalone server application that exposes HTTP interfaces to the outside world. It interacts with three primary external actors: the Data Owner (Wallet), the Verifier, and Public Entities.
Motivation:
The system is decomposed into a single deployable unit (the SDV) to ensure portability. The internal complexity is hidden behind a standard JSON-RPC API. The storage layer is abstracted so the SDV can run on local disks, cloud object stores, or decentralized networks.
Contained Building Blocks:
| Name | Responsibility |
|---|---|
| Secure Data Vault (SDV) | The core system. Handles authentication, request routing, validation of UCANs, and data persistence. |
| Edge Wallet | The primary client. Handles all encryption, key management, and UCAN issuance. |
| Physical Storage | The underlying infrastructure where bytes are saved (e.g., AWS S3, local filesystem, or a hosted Graph Database). |
Important Interfaces:
- Public API: Open access for did:web Documents and Biometric Helper Data (read-only for the public).
- Protected API: Authenticated access (via UCAN) for encrypted User Data.
5.2. Level 2: Internal Structure of the SDV
Here we zoom into the Secure Data Vault to see its internal software architecture.
The SDV follows a Layered Architecture:
1. Interface Layer: Handles HTTP and Routing.
2. Security Layer: Validates Authorization (UCAN).
3. Domain Layer: Logic for Linked Data and Blob management.
4. Persistence Layer: Translates domain objects to database calls.
5.2.1. API Gateway (Interface Layer)
The entry point for all incoming traffic. It handles JSON-RPC dispatch: routing JSON-RPC 2.0 method calls to the correct domain handler based on the cmd field of the UCAN invocation token in the request body. It distinguishes between Public (unauthenticated HTTPS GET) and Private (JSON-RPC invocation) requests. While not implementing application-specific security controls, the API gateway presents the first line of defense against common attacks such as DoS, injection, and replay attacks. It also handles HTTP/2 and TLS termination.
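The dispatch step can be sketched as follows. This is a minimal illustration, assuming the invocation token has already been decoded into a dictionary and validated; the handler functions and the `HANDLERS` table are hypothetical names, not part of the SDV specification. The only externally defined detail is the JSON-RPC 2.0 error code -32601 ("Method not found").

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


# Hypothetical domain handlers; the real ones are not specified in this document.
def read_doc(args: Dict[str, Any]) -> Dict[str, Any]:
    return {"endpoint": args["endpoint"], "status": "read"}


def create_doc(args: Dict[str, Any]) -> Dict[str, Any]:
    return {"endpoint": args["endpoint"], "status": "created"}


# Routing table keyed by the cmd field of the invocation token.
HANDLERS: Dict[str, Handler] = {
    "/doc/read": read_doc,
    "/doc/create": create_doc,
}


def dispatch(invocation: Dict[str, Any], request_id: Any = None) -> Dict[str, Any]:
    """Route a decoded invocation to its handler and wrap the result as JSON-RPC 2.0."""
    handler = HANDLERS.get(invocation.get("cmd"))
    if handler is None:
        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {"code": -32601, "message": "Method not found"},
        }
    result = handler(invocation.get("args", {}))
    return {"jsonrpc": "2.0", "id": request_id, "result": result}
```

Unknown commands are rejected before any domain code runs, which keeps the gateway free of application-specific logic.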
5.2.2. UCAN Validator (Security Layer)
The security and permission engine at the heart of the vault. Since the SDV is "blind" to the data, it relies entirely on Delegated Authorization. The UCAN invocation token arrives as the JSON-RPC request body; there is no separate Authorization header. The validator verifies the EdDSA signature of the invocation token and of every delegation token in its prf chain. For each request it validates:
- The time bounds (exp, nbf): Is the invocation still valid?
- The nonce: Has this exact invocation already been processed (replay protection)?
- The revocation list: Was any token in the delegation chain revoked by the Data Owner?
- The cmd: Does the invocation’s command match the pattern permitted by the delegation’s cmd field?
- The pol constraints: Do the invocation’s args (e.g. endpoint) satisfy all policy restrictions in the delegation chain?
- The sub: Does the invocation’s subject (sub) match the vault namespace being accessed?
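Several of the checks listed above can be sketched in plain Python. The sketch below covers the time bounds, nonce, revocation, and subject checks only; signature verification and the cmd/pol checks are left out. It assumes the token is already decoded into a dictionary, and it assumes each token in the delegation chain has some stable identifier (`chain_token_ids`, a hypothetical name) against which revocations are recorded.

```python
import time
from typing import Any, Dict, Iterable, Optional, Set, Tuple


class ValidationError(Exception):
    """Raised when an invocation fails one of the per-request checks."""


def check_invocation(
    invocation: Dict[str, Any],
    vault_subject: str,
    seen_nonces: Set[Tuple[str, str]],
    revoked_token_ids: Set[str],
    chain_token_ids: Iterable[str],
    now: Optional[float] = None,
) -> None:
    now = time.time() if now is None else now

    # Time bounds: reject expired or not-yet-valid invocations.
    exp = invocation.get("exp")
    if exp is not None and now >= exp:
        raise ValidationError("invocation expired")
    nbf = invocation.get("nbf")
    if nbf is not None and now < nbf:
        raise ValidationError("invocation not yet valid")

    # Replay protection: a nonce may be used only once per issuer.
    nonce_key = (invocation["iss"], invocation["nonce"])
    if nonce_key in seen_nonces:
        raise ValidationError("nonce already used")

    # Revocation: no token in the delegation chain may be revoked.
    if revoked_token_ids.intersection(chain_token_ids):
        raise ValidationError("delegation chain contains a revoked token")

    # Subject: the invocation must target the vault being accessed.
    if invocation["sub"] != vault_subject:
        raise ValidationError("subject does not match vault namespace")

    # Record the nonce only after every check has passed.
    seen_nonces.add(nonce_key)
```

Recording the nonce last means a rejected request does not consume it.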
5.2.3. Private User Data Store (Domain Layer)
This component manages the structured data and metadata of the vault for private, encrypted data. It is the primary domain layer component that manages the private data of Data Owners and the potential relations between data elements. Where the data is public or not fully encrypted, the data store uses known data vocabularies and ontologies from Solid or the broader ecosystem. However, in many cases the data should be kept private from the server and thus becomes an encrypted data blob. To manage large chunks of encrypted data (e.g., personal videos), the Private User Data Store relies on the Blob Manager for scalable and performant blob retrieval and streaming.
To make the most of the underlying linked data store for encrypted data, this component relies on blind indexing: retrieval stays performant while the server gains no insight into the data it stores.
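One common way to build a blind index is a keyed hash computed on the client. The sketch below is an assumption about how this could look, not the project's chosen scheme: the wallet derives an index token with HMAC-SHA256 under a key the server never sees, and the server stores and matches tokens only. The function name, the normalization rules, and the field-name prefix are all illustrative.

```python
import hashlib
import hmac
import unicodedata


def blind_index(index_key: bytes, field: str, value: str) -> str:
    """Compute a deterministic index token for (field, value) under a client-held key."""
    # Normalize so that equal values always map to the same token.
    normalized = unicodedata.normalize("NFKC", value).strip().lower()
    # Bind the token to the field name so equal values in different fields differ.
    message = field.encode("utf-8") + b"\x00" + normalized.encode("utf-8")
    return hmac.new(index_key, message, hashlib.sha256).hexdigest()


# Usage: the wallet sends only the token; the server looks it up as an opaque string.
key = bytes(32)  # placeholder key for illustration only
server_index = {blind_index(key, "city", "Antwerp"): ["blob-123"]}
matches = server_index.get(blind_index(key, "city", " antwerp "), [])
```

Exact-match lookups work this way; range or substring queries would need a different construction. Deterministic tokens also reveal when two entries hold the same value, which should be weighed per field.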
5.2.4. Public Data Store (Domain Layer)
The Public Data Store handles data that can be publicly queried by participants in the ecosystem. This can include non-sensitive information about Data Owners or data elements they wish to share with the public, or non-sensitive relations between data elements that do not reveal any of the underlying data. An example of this is the Biometric Helper Data used for fuzzy matching, which on its own does not reveal any identity information about the Data Owner. Since the amount of public data is expected to be relatively small, we do not foresee a need for scalable blob storage for this part of the architecture.
5.2.5. Blob Manager (Domain Layer)
Handles the storage of raw, large, encrypted binary files. Unlike the Graph Engine (which deals with metadata or data with encrypted 'fields'), the Blob Manager deals with the actual content (Verifiable Credentials, PDFs, Images). It treats every file as an opaque stream of bytes.
Whether the data is stored as blobs or as relational graph data is a choice made by users of the system. To illustrate this, take a Verifiable Credential that proves a Data Owner’s name, address, and age. This Verifiable Credential contains clearly private data items that should always be stored encrypted, but it also contains metadata that could be stored unencrypted. Users of the system could decide to store the whole VC as an encrypted data blob, where the existence of individual fields and even metadata about the issuer is obfuscated. Alternatively, a user could decide to store only the final values of name, address, and age as encrypted data and not to obfuscate the structure and issuer metadata of the VC. Which choice is desirable depends entirely on the use case and the type of data. As much data as possible should be stored encrypted, but for some use cases, leaving the metadata or data structure unencrypted can support more advanced functionality such as faster querying and data lookup, or statistical calculations.
5.2.6. Storage Adapters (Persistence Layer)
This is an abstraction layer that decouples the application from the physical storage technologies or providers. It manages either Blob data storage (File systems, Cloud storage buckets, Data lakes, …) or Graph data storage (Apache Jena, Neo4j, Blazegraph, Amazon Neptune, …) with specific adapters. The goal of this layer is to be technology-agnostic such that the SDV can run on different physical stacks or data storage technologies. New adapters can be added for different technologies to achieve maximal portability.
The Blob Storage Adapter is relatively simple: it only needs to store arbitrary data streams under a unique, unlinkable identifier. The Graph Storage Adapter abstracts graph data technologies so that the SDV can store RDF data and run SPARQL-like queries.
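As an illustration of the adapter idea, the sketch below defines a small blob adapter interface and an in-memory implementation. The class and method names are hypothetical; the only property taken from the text is that blobs are opaque bytes stored under a unique identifier that carries no information about the owner or content, modelled here with a random token.

```python
import secrets
from abc import ABC, abstractmethod
from typing import Dict


class BlobStorageAdapter(ABC):
    """Technology-agnostic interface the domain layer would program against."""

    @abstractmethod
    def put(self, data: bytes) -> str:
        """Store opaque bytes and return a fresh identifier."""

    @abstractmethod
    def get(self, blob_id: str) -> bytes:
        """Return the bytes stored under blob_id, or raise KeyError."""

    @abstractmethod
    def delete(self, blob_id: str) -> None:
        """Remove the blob stored under blob_id, or raise KeyError."""


class InMemoryBlobAdapter(BlobStorageAdapter):
    """Reference adapter for tests; a filesystem or object-store adapter would mirror it."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Random identifier: unlinkable to the owner, the content, or other blobs.
        blob_id = secrets.token_urlsafe(32)
        self._blobs[blob_id] = bytes(data)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]

    def delete(self, blob_id: str) -> None:
        del self._blobs[blob_id]
```

Because identifiers are random rather than content-derived, two uploads of the same ciphertext do not reveal that they are equal.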
5.3. Level 3: Details of Key Blocks
5.3.1. UCAN Validator
The UCAN Validator is complex enough to warrant a closer look. It does not just check a password; it validates a Chain of Trust.
Responsibility:
1. Signature Checker: Ensures the invocation token and each delegation in the prf chain were signed by the key they claim.
2. Root DID Resolver: Obtains the public key of the issuer from the iss DID (no external registry needed for did:key / did:peer) to verify the EdDSA signature.
3. Command Matcher: Verifies that the invocation’s cmd (e.g. /doc/read) is permitted by the delegation’s cmd pattern (e.g. /doc/* or /*) and that args satisfy the delegation’s pol constraints (e.g. endpoint allowlist).
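The Command Matcher step could look roughly like the sketch below. It rests on two assumptions that the document does not spell out: a delegation pattern ending in `/*` grants every command under that prefix, and a policy entry of the form `["any", ".endpoint", [...]]` is read as an allowlist for `args.endpoint`. This is a simplified reading for illustration, not an implementation of the full UCAN policy language.

```python
from typing import Any, Dict, List


def command_permitted(invoked: str, pattern: str) -> bool:
    """Return True if the invoked command falls under the delegated pattern."""
    if pattern == invoked:
        return True
    if pattern.endswith("/*"):
        # "/doc/*" -> "/doc/", so "/docs/read" does not match by accident.
        return invoked.startswith(pattern[:-1])
    return False


def policy_satisfied(args: Dict[str, Any], pol: List[list]) -> bool:
    """Check every policy entry; unknown operators fail closed."""
    for entry in pol:
        if len(entry) != 3:
            return False
        operator, selector, allowed = entry
        if operator != "any":
            return False
        # ".endpoint" selects args["endpoint"].
        if args.get(selector.lstrip(".")) not in allowed:
            return False
    return True
```

An empty `pol` list places no restriction on `args`, which matches the self-signed root delegation.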
6. Runtime View
The runtime view describes the concrete behavior of the system. The SCAMPER Secure Data Vault relies on a "Smart Client" approach, meaning most complexity (Encryption, Key Management, Authorization) happens on the User’s device before data ever reaches the server.
The following scenarios cover the lifecycle of a user’s data vault, from creation to sharing and revocation. The architecture describes the following concrete runtime scenarios in step-by-step sequence diagrams:
- User signup & Pod creation: Happens the first time a new user interacts with the system and sets up their data vault linked to their personal Wallet.
- Storing personal data: A user stores private data on the SDV that can later be shared with third parties or retrieved by the user.
- Sharing data: A user shares data with a service provider by issuing UCAN tokens.
- Retrieving data: A service provider reads the items in the user’s vault for which they have been granted permission.
- Writing data: A service provider writes or creates data in the user’s vault.
- Revoking access: A user revokes access for a service provider with whom they previously shared data.
- Synchronization: The user’s Wallet synchronizes with the Secure Data Vault to cache the user’s information and data at the edge.
6.1. Scenario 1: User Initialization & Pod Claiming
This flow describes "Cryptographic Signup." Unlike traditional web apps, there is no central admin to approve an account. The user asserts their existence by generating keys and claiming a namespace/pod.
The SDV does not check a database table for users. It mathematically validates that the UCAN signature matches the DID for which the pod is created. If successful, the SDV generates the basic necessities to bootstrap a user pod, such as storage identifiers or table entries in the graph database.
The Root UCAN is self-signed. It is the genesis delegation token that proves the user controls the private keys associated with the pod. The root delegation will look like this:
{
  "iss": "did:peer:zROOT_VAULT_KEY...",
  "aud": "did:peer:zROOT_VAULT_KEY...",
  "sub": "did:peer:zROOT_VAULT_KEY...",
  "cmd": "/*",
  "exp": 9999999999,
  "nonce": "a1b2c3d4",
  "pol": [],
  "prf": []
}
For every actual request to the SDV the User Wallet creates a per-request invocation token. The invocation token is the JSON-RPC call body and references the root delegation in its prf array:
{
  "iss": "did:peer:zROOT_VAULT_KEY...",
  "sub": "did:peer:zROOT_VAULT_KEY...",
  "exp": 9999999999,
  "nonce": "a1b2c3d4e5f6g7h8",
  "cmd": "/vault/init",
  "args": {
    "peer_did": "did:peer:zROOT_VAULT_KEY..."
  },
  "prf": [
    "eyJhbGciOiJFZERTQS...<Root Delegation UCAN above>"
  ]
}
The primary security concern for the SDV in this scenario is a DoS attack in which many fake pods are requested for non-existent users. While this attack vector cannot be fully avoided, its impact can be mitigated with classical IP-based rate limiting and routine cleanup of empty vaults.
6.2. Scenario 2: Storing Personal Data (Envelope Encryption)
This scenario details how the Vault stores data it cannot read. We use Envelope Encryption (Hybrid Encryption): data is encrypted with a symmetric key (DEK), and that key is encrypted for the user with the public key related to their DID.
A symmetric cipher (e.g., ChaCha20) should be used for the actual file content. Encrypted data is wrapped in a meta structure that contains information about the encryption and about the wrapping of the decryption key. This meta structure looks like this:
{
  "dataEncryption": [
    {
      "did": "did:peer:zkey1",
      "dek": "ENCRYPTED_DEK_BYTES"
    }
  ],
  "ciphertext": "BASE64_ENCODED_BYTES"
}
The invocation token sent as the JSON-RPC body places this encrypted structure in args.payload:
{
  "iss": "did:peer:zROOT_VAULT_KEY...",
  "sub": "did:peer:zROOT_VAULT_KEY...",
  "exp": 9999999999,
  "nonce": "a1b2c3d4e5f6g7h8",
  "cmd": "/doc/create",
  "args": {
    "endpoint": "/blog/posts",
    "headers": {
      "content-type": "application/octet-stream"
    },
    "payload": {
      "blob": "0xabcd..."
    }
  },
  "prf": [
    "eyJhbGciOiJFZERTQS...<Root Delegation UCAN>"
  ]
}
6.3. Scenario 3: Sharing Data with a Service Provider
To share data without revealing the user’s master key, the User Wallet performs a "Key Rewrapping" operation. It adds a new header entry encrypted for the Service Provider with whom the user wishes to share data. This is triggered when the user agrees to share data with the Service Provider.
To improve privacy, each data sharing relation between a Data Owner and a Service Provider should work with Peer DIDs. When a Data Owner and a Service Provider first initialize their 'relation', they generate new Peer DIDs derived from their Root DID. Each party generates one new DID and shares the public DID document with the other party via channels other than the SDV (for example, through QR codes on the Service Provider’s onboarding website).
To avoid a range of attacks, it is important that the sharing of Peer DID documents happens through secure out-of-band channels. For example, it is important to use a secure TLS communication channel to avoid well-known attacks when sharing public keys (such as man-in-the-middle attacks).
For recovery purposes, we recommend a deterministic scheme to derive Peer DIDs from a Root DID, although this cannot be enforced by the SDV. A user needs to share their Peer DIDs with the SDV so that the SDV can build aliases of a vault. Put simply, without this step, the user would leak their actual Root DID through the URI used for the vault. The SDV needs to be able to route JSON-RPC invocations for both the Root DID and the Peer DID as vault subjects, resolving through the Alias Registry.
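One possible deterministic scheme, sketched here under assumptions, is to derive a per-relation seed from secret root key material with HKDF (RFC 5869) over SHA-256. The salt string, the relation label format, and the function names are illustrative; the document does not prescribe a derivation scheme. The 32-byte output is only a seed that a wallet could feed into key generation, and encoding the resulting key as a Peer DID is out of scope.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand as specified in RFC 5869, with SHA-256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long for HKDF")
    # Extract: an empty salt defaults to a string of zero bytes.
    prk = hmac.new(salt or bytes(hash_len), ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_relation_seed(root_secret: bytes, relation_label: str) -> bytes:
    """Derive a 32-byte seed for one Data Owner / Service Provider relation.

    root_secret never leaves the wallet; relation_label is a hypothetical
    stable identifier for the relation, for example the Service Provider's DID.
    """
    return hkdf_sha256(
        ikm=root_secret,
        salt=b"scamper-peer-did-v1",  # illustrative domain-separation constant
        info=relation_label.encode("utf-8"),
    )
```

Because the derivation is deterministic, a wallet restored from the root secret can recompute every relation seed, while seeds for different relations cannot be linked to each other without that secret.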
The large data blob is not touched. Only the small key (DEK) is re-encrypted. The /doc/share call atomically registers the SP’s Peer DID as an alias in the vault’s Alias Registry and appends the new wrapped DEK entry to the document header. The file now has two valid headers: both the User and the Service Provider can open the envelope.
Note that from the Service Provider’s point of view, they interact directly with the user’s newly generated Peer DID. This ensures they cannot link the DID they see with the DIDs that other Service Providers see when interacting with the same user.
In practice, a Service Provider would generate a QR code on their onboarding page that contains their Peer DID and the access they request. The user wallet performs many actions in the background, but in the end only one more interaction with the Service Provider is needed: at the end of the flow, the User Wallet calls the Service Provider back with a UCAN token that contains all the information they need to access the data on the SDV.
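The effect of a share operation on the stored envelope can be sketched as below. The sketch assumes the wallet has already wrapped the DEK for the recipient and passes the result in; `share_document`, `alias_did`, and the dictionary used as Alias Registry are hypothetical names. It illustrates only that the ciphertext stays untouched while one header entry and one alias are added together.

```python
import copy
from typing import Any, Dict


def share_document(
    envelope: Dict[str, Any],
    alias_registry: Dict[str, str],
    vault_id: str,
    alias_did: str,
    recipient_did: str,
    wrapped_dek_for_recipient: str,
) -> Dict[str, Any]:
    """Return an updated envelope with one extra wrapped DEK and register the alias."""
    # Validate first so that a failure leaves both envelope and registry unchanged.
    if any(entry["did"] == recipient_did for entry in envelope["dataEncryption"]):
        raise ValueError("recipient already has a wrapped key for this document")
    existing = alias_registry.get(alias_did)
    if existing is not None and existing != vault_id:
        raise ValueError("alias already registered for another vault")

    updated = copy.deepcopy(envelope)
    updated["dataEncryption"].append(
        {"did": recipient_did, "dek": wrapped_dek_for_recipient}
    )
    alias_registry[alias_did] = vault_id
    return updated
```

The ciphertext field is carried over unchanged, so the cost of sharing does not depend on the size of the data.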
6.4. Scenario 4: Service Provider Retrieving Data
The Service Provider uses the UCAN and their own Private Key to read the data.
The SDV validates the UCAN chain to ensure the Service Provider is authorized. The SDV also selects the wrapped key that matches the Peer DID of the requesting Service Provider in order not to expose additional encrypted data. Decryption happens entirely on the Service Provider’s side.
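The key selection step might be sketched as follows, assuming the envelope layout from Scenario 2 and a requester DID already authenticated by the UCAN Validator. The function name is hypothetical.

```python
from typing import Any, Dict


def envelope_for_requester(envelope: Dict[str, Any], requester_did: str) -> Dict[str, Any]:
    """Return the envelope with only the wrapped DEK addressed to the requester."""
    matches = [
        entry for entry in envelope["dataEncryption"] if entry["did"] == requester_did
    ]
    if not matches:
        raise PermissionError("no wrapped key for this requester")
    # Other recipients' wrapped keys and DIDs are withheld from the response.
    return {
        "dataEncryption": [dict(matches[0])],
        "ciphertext": envelope["ciphertext"],
    }
```

Besides limiting exposure of other wrapped keys, this also keeps the requester from learning which other DIDs have access to the same document.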
The User Wallet issues a delegation token to the Service Provider. The pol field restricts which endpoints the SP may access:
{
  "iss": "did:peer:zAliceVerifier1Key...",
  "aud": "did:key:zVerifier1Key...",
  "sub": "did:peer:zAliceVerifier1Key...",
  "cmd": "/doc/read",
  "exp": 1735689600,
  "nonce": "ffffffffffffffff",
  "pol": [
    ["any", ".endpoint", ["/shared-resource-1", "/shared-resource-2"]]
  ],
  "prf": []
}
The Service Provider then creates a per-request invocation token (the JSON-RPC call body) referencing this delegation in prf:
{
  "iss": "did:key:zVerifier1Key...",
  "sub": "did:peer:zAliceVerifier1Key...",
  "exp": 1735689900,
  "nonce": "b1b2c3d4e5f6g7h8",
  "cmd": "/doc/read",
  "args": {
    "endpoint": "/shared-resource-1"
  },
  "prf": [
    "eyJhbGciOiJFZERTQS...<Delegation UCAN above>"
  ]
}
6.5. Scenario 5: Service Provider Writing Data
The Service Provider can also get access to a User’s data vault to write data. Writing can mean either editing existing data items or creating new ones. This is useful for use cases where a Service Provider produces data that actually belongs to an end user, such as the medical file of a patient. When editing existing data, the encrypted data blob is fully replaced, but the DEK and its wrapped keys are not touched. When adding new data items, the same procedure is followed as when a User creates data, but with two wrapped keys of the DEK from the start.
Note that the Secure Data Vault also stores old versions of the same data item where possible. This prevents Service Providers from being able to "destroy" existing User Data. Secure Data Vault providers can configure how many versions are stored or how old versions may become before being cleaned up. Such retention guarantees can be part of the SLA that Secure Data Vault providers commercialize.
The primary difference between the Service Provider creating data on behalf of the user and a user creating their own data is the number of DEK wrappings. In this case the Service Provider submits the data with two wrapped keys already in place. It is important that the SDV validates these two wrapped keys: one must be linked to the DID of the requestor and the other must be a known Peer DID alias of the User. Additionally, more than two wrapped keys should be forbidden, as the Service Provider is not allowed to preselect parties to share the data with. The user always needs to be involved in deciding with whom their data is shared.
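The wrapped-key rules for Service Provider writes translate directly into a small check. The sketch below assumes the envelope layout from Scenario 2 and a set of the User's known Peer DID aliases taken from the Alias Registry; the function and parameter names are hypothetical.

```python
from typing import Any, Dict, Set


def validate_sp_wrappings(
    envelope: Dict[str, Any],
    requester_did: str,
    user_alias_dids: Set[str],
) -> None:
    """Raise ValueError unless the envelope carries exactly the two expected wrapped keys."""
    dids = [entry["did"] for entry in envelope.get("dataEncryption", [])]
    # Exactly two wrappings: fewer locks someone out, more shares without consent.
    if len(dids) != 2:
        raise ValueError("expected exactly two wrapped keys")
    if len(set(dids)) != 2:
        raise ValueError("wrapped keys must be addressed to two different DIDs")
    if requester_did not in dids:
        raise ValueError("one wrapped key must belong to the requesting Service Provider")
    other = next(d for d in dids if d != requester_did)
    if other not in user_alias_dids:
        raise ValueError("one wrapped key must belong to a known Peer DID alias of the User")
```

Note that the SDV can only check who the wrapped keys are addressed to; it cannot verify that they actually wrap the DEK used for the ciphertext.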
6.6. Scenario 6: Revoking Access (Key Rotation)
Revocation in a cryptographic system requires Key Rotation. Revoking the issued UCAN is always the first step in revocation and will already provide enough protection in almost all cases. However, to remain robust against hacks or bugs on the SDV, the architecture also proposes a full key rotation and re-encryption, so that the revoked Service Provider cannot regain access through accidentally leaked ciphertexts.
A big distinction between a typical write update and an access revocation is that not only the data blob is replaced but all the wrapped key headers as well. At the same time, old versions are not kept (or should not be kept long), as this would defeat the purpose of the mitigation against accidental leaks. Even if the Service Provider kept a copy of the old encrypted blob, they cannot decrypt the new updates. If they try to access the current file on the SDV, they will fail because their key no longer works (even in the rare case that their UCAN would still be accepted).
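On the SDV side, a rotation request replaces the full set of wrapped keys. Section 6.8 requires the submitted set to be a strict subset of the original one. A minimal sketch of that check is shown below; the function name is hypothetical, and the additional requirement that the new set is non-empty is an assumption of this example.

```python
def validate_rotation_recipients(original_dids, submitted_dids):
    """Check the recipient set submitted with a key-rotation update.

    The new set must drop at least one party and must not introduce any DID
    that was not already a recipient. Requiring a non-empty set is an
    assumption of this sketch.
    """
    original = set(original_dids)
    submitted = set(submitted_dids)
    # "<" on sets is the strict-subset test.
    return bool(submitted) and submitted < original
```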
For simple data items, like an address or bank account number, the re-encryption is not a performance issue and can easily happen on the device of the end user. However, when the shared data that needs to be revoked is big, like large video files, this full download, decryption, re-encryption, and upload can become expensive or, in some cases, even impossible due to the constraints of the end user’s device. To solve this issue, the SDV can use Trusted Execution Environments to perform the bulk of the rotation on the Secure Data Vault. In essence, the end user delegates their work to the SDV. The flow would then look like the diagram below.
The scenario omits UCAN validation for every interaction with the SDV for brevity.
A critical part of this process is the validation of the Secure Data Vault TEE key when the user hands over the decrypted DEK. The user’s device needs to validate the full chain of trust of the TEE attestation to make sure that it is not giving the decryption keys of their data to an unauthorized third party or a system administrator of the SDV provider. Different technologies exist for Trusted Execution Environments, such as Intel SGX, Intel TDX, AMD SEV, ARM CCA, and AWS Nitro Enclaves. The attestation document differs per technology but follows a similar approach: it combines a known root of trust (such as Intel’s well-known public keys or AWS’s Nitro Hypervisor public keys), enclave measurements, and attested data. The root of trust is well known to any user of the TEE, but in this case the enclave measurements should also be known beforehand to the End User Wallet. The enclave measurements essentially prove which code is running in the enclave. The End User can only trust code that is open-source, validated by independent researchers, and has known enclave measurements. This prevents an SDV Provider from putting backdoors in the enclaves they run on behalf of users.
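The wallet-side decision can be sketched as below. This is an illustration under stated assumptions: the attestation is taken as an already-parsed dictionary, the vendor-specific chain verification is represented only by a boolean input, and the allowlist entry is a placeholder rather than a real measurement.

```python
import hmac

# Hypothetical allowlist shipped with the wallet: measurements of audited,
# open-source enclave builds. The value below is a placeholder.
KNOWN_MEASUREMENTS = {
    "sdv-rotation-enclave-v1": bytes.fromhex("ab" * 48),
}


def may_release_dek(attestation, chain_is_valid):
    """Decide whether the wallet may hand a decrypted DEK to the enclave.

    `attestation` is an already-parsed document with a "measurement" entry.
    `chain_is_valid` is the outcome of verifying the signature chain up to
    the vendor root of trust, which is vendor-specific and not shown here.
    """
    if not chain_is_valid:
        return False
    # The running code must match a known build; compare in constant time.
    return any(
        hmac.compare_digest(attestation["measurement"], known)
        for known in KNOWN_MEASUREMENTS.values()
    )
```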
6.7. Scenario 7: Synchronization (Cache vs. Cloud)
This scenario occurs during the Authentication & Presentation flow. Before presenting credentials to a verifier, the Wallet must ensure its local cache is up to date with the Cloud Vault ("Source of Truth"). This is important because the user may have been offline or may use multiple devices, and an Issuer with write access might have updated a VC in the background.
The /vault/sync call is bidirectional: in a single request, the Wallet uploads any pending local mutations and fetches server-side changes, using since_hash as the common reference point. The Wallet always syncs before generating a proof to ensure it isn’t using revoked credentials. If the Vault is unreachable, the Wallet falls back to the local cache, accepting the risk of staleness.
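The sync-then-fallback behaviour can be sketched as follows. The cache layout, the response fields (`head_hash`, `changes`), and the `sync` callable that wraps /vault/sync are assumptions of this example, not part of the specified API.

```python
def credentials_for_proof(cache, pending, sync):
    """Sync with the vault before building a proof; fall back to the cache.

    `cache` is a dict with "since_hash" and "items". `pending` is a list of
    local mutations. `sync` is a hypothetical callable wrapping /vault/sync
    that returns {"head_hash": ..., "changes": {...}} or raises
    ConnectionError. Returns (items, is_fresh).
    """
    try:
        result = sync(since_hash=cache["since_hash"], mutations=list(pending))
    except ConnectionError:
        # Vault unreachable: use the local cache and flag possible staleness.
        return cache["items"], False
    cache["items"].update(result["changes"])   # apply server-side changes
    cache["since_hash"] = result["head_hash"]  # new common reference point
    pending.clear()                            # local mutations were uploaded
    return cache["items"], True
```

Returning the freshness flag lets the caller decide whether to warn the user that a proof was built from a possibly stale cache.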
6.8. Deep dive into UCAN permissions
In the different scenarios, different types of UCAN permissions need to be validated. This section summarizes the full validation chain for UCAN 1.0 invocation tokens and puts extra attention on what each command entails. Each request to the SDV is a JSON-RPC call whose body is a signed invocation token. Validation proceeds as follows:
- The public key of the iss DID in the invocation token is resolved (self-contained for did:key/did:peer, no external registry lookup required).
- The EdDSA signature of the invocation token is verified against the resolved public key.
- The exp (and optional nbf) time bounds are validated against the current time.
- The nonce of the invocation is checked against recently processed nonces to prevent replay attacks.
- The sub field of the invocation is checked to confirm it matches the vault namespace being accessed.
- Each delegation token in the prf chain is validated: its signature, time bounds, and that iss of the inner delegation matches the aud of the outer delegation, until the root self-signed delegation is reached.
- The invocation’s cmd is matched against the delegation’s cmd pattern (e.g. /doc/read satisfies /doc/ or /).
- The invocation’s args are checked against the delegation’s pol constraints (e.g. args.endpoint must be in the allowed endpoint list).
- Special restrictions apply per command:
  - cmd: /doc/read: The SDV returns the Ciphertext and only the wrapped DEK for the DID matching the invocation’s iss. No other wrapped keys are exposed.
  - cmd: /doc/update: When content.data_encryption is absent, only the Ciphertext is replaced and the wrapped key headers are not modified. When content.data_encryption is present (key rotation after revocation), the submitted wrapped key set must be a strict subset of the original set; new Peer DIDs may not be introduced.
  - cmd: /doc/create: A new Ciphertext with at most 2 wrapped keys is accepted. At least one wrapped key must correspond to a known User Peer DID or Root DID. A second wrapped key must match the DID of the requester.
  - cmd: /doc/delete: No additional restrictions on the payload.
  - cmd: /doc/share: The Ciphertext must not be modified; only a single new dek_entry may be appended and the associated peer_did is registered as an alias in the same atomic operation.
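The generic part of this chain (everything except the per-command restrictions) can be sketched as below. It is a simplified illustration, not a conformant UCAN 1.0 validator. Tokens are assumed to be already decoded into dictionaries, signature checking is delegated to a hypothetical `verify_sig` callable, and `pol` is reduced to a hypothetical `{"endpoints": [...]}` allowlist instead of the full UCAN policy language.

```python
import time


def cmd_allows(delegated, invoked):
    """True if `invoked` equals `delegated` or is nested below it."""
    if delegated == "/":
        return True
    base = delegated.rstrip("/")
    return invoked == base or invoked.startswith(base + "/")


def validate_invocation(inv, delegations, seen_nonces, vault_did,
                        verify_sig, now=None):
    """Apply the checks in the order listed above; True if all pass.

    `delegations` is ordered from the delegation issued to the invoker back
    to the root. `verify_sig(token)` is a hypothetical callable that resolves
    the token's `iss` DID and checks the EdDSA signature.
    """
    now = time.time() if now is None else now

    def in_time(token):
        return token.get("nbf", 0) <= now < token["exp"]

    if not verify_sig(inv) or not in_time(inv):
        return False
    if inv["nonce"] in seen_nonces or inv["sub"] != vault_did:
        return False

    expected_aud = inv["iss"]  # the invoker must be the audience of the proof
    for d in delegations:
        if not verify_sig(d) or not in_time(d):
            return False
        if d["aud"] != expected_aud or d["sub"] != vault_did:
            return False
        if not cmd_allows(d["cmd"], inv["cmd"]):
            return False
        allowed = d.get("pol", {}).get("endpoints")  # simplified policy shape
        if allowed is not None and inv["args"].get("endpoint") not in allowed:
            return False
        expected_aud = d["iss"]  # walk one step closer to the root

    # The chain must end at a delegation issued by the vault namespace itself.
    if not delegations or delegations[-1]["iss"] != vault_did:
        return False
    seen_nonces.add(inv["nonce"])  # remember the nonce only after success
    return True
```

Recording the nonce only after all checks pass keeps invalid requests from filling the replay cache. Whether the SDV should behave this way is a design choice that this sketch merely assumes.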
7. Deployment View
The Deployment View describes the technical infrastructure used to execute the system. It maps the SCAMPER software building blocks to physical hardware or containerized environments.
7.1. Infrastructure Level 1: The Solid-Based Vault
The SCAMPER Secure Data Vault (SDV) is built upon a W3C Solid implementation (e.g., Community Solid Server). It is deployed as a cloud-agnostic, containerized architecture that strictly separates metadata (LDP-RS) from raw binary files (LDP-NR). To support unlinkable identities, a custom DID resolution and UCAN authorization pipeline sits in front of the native Solid routing.
- Motivation: By adopting a W3C Solid foundation, the Vault inherits industry-standard data routing (Linked Data Platform). However, we replace the centralized WebID/OIDC paradigm with SCAMPER’s decentralized DID/UCAN model. Because we use Pairwise DIDs for privacy, we must inject a custom Namespace Resolution Layer (DID Alias Resolver) to seamlessly map multiple public-facing service endpoints to a single, deduplicated internal storage structure without exposing the user’s root identity.
- Quality and Performance Features:
  - Scalability: The SDV Application Container is entirely stateless. The UCAN Validator requires no session memory. We could deploy multiple instances behind the Reverse Proxy to handle high traffic without coordination issues.
  - Storage Optimization: Large files are pushed directly to cheap Object Storage. The Graph Database is kept small and fast, containing only URIs, blind indexes, and relationships.
  - Cloud Agnosticism: By targeting Docker, an S3-compatible API (MinIO), and a standard SPARQL endpoint, the entire Vault can be deployed on any cloud provider or self-hosted environment.
- Mapping of Building Blocks to Infrastructure:
| Building Block | Deployment Target | Rationale |
|---|---|---|
| SCAMPER UCAN Middleware | Solid Server Pipeline | Intercepts requests before Solid routing. Handles DID verification and UCAN capability validation against the public Service DID requested. |
| DID Alias Resolver | Solid Server Pipeline | Acts as an |
| LDP Router | Solid Server Core | The native Solid router. Inspects HTTP |
| LDP-RS Persistence | Graph Database Node | Stores RDF Sources (Linked Data, Verifiable Credentials, Alias Mappings). Accessible via SPARQL. |
| LDP-NR Persistence | MinIO / Local FS | Stores Non-RDF Sources (Images, Encrypted PDFs) using self-hosted S3-compatible APIs. |
7.2. Infrastructure Level 2: LDP-RS vs LDP-NR Storage Routing
To keep the Graph Database fast for blind-index queries and relation traversals, the Solid backend strictly routes data based on its content type:
- The Metadata Path (LDP-RS): Any payload submitted as application/ld+json or text/turtle is routed by the Solid server to the Graph Database. This handles the lightweight, highly connected structural data of the Vault.
- The Binary Path (LDP-NR): Any unstructured payload (e.g., a 5MB encrypted application/pdf or standard JWE blob) bypasses the graph entirely. The Solid server streams these bytes directly to MinIO (or the local file system). The server then automatically generates a tiny RDF metadata node in the Graph Database acting as the "pointer" to the MinIO object, ensuring the graph remains aware of the file without hosting its weight.
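The routing rule and the pointer node can be sketched as follows. The target labels and the pointer field names are illustrative assumptions; they are not a vocabulary defined by SCAMPER or by the Solid server.

```python
RDF_CONTENT_TYPES = {"application/ld+json", "text/turtle"}


def route_payload(content_type):
    """Return the persistence target for a payload: 'graph' or 'object'."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "graph" if media_type in RDF_CONTENT_TYPES else "object"


def pointer_node(resource_uri, object_key, content_type, size_bytes):
    """Build the small metadata record that points at a stored binary.

    The field names are hypothetical and chosen for readability only.
    """
    return {
        "@id": resource_uri,
        "objectKey": object_key,
        "contentType": content_type,
        "size": size_bytes,
    }
```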
8. Cross-Cutting Concepts
This section describes the overarching design principles, rules, and standardized solutions that apply across multiple building blocks within the SCAMPER Secure Data Vault (SDV) and Edge Wallet ecosystem.
8.1. Privacy-Preserving Logging and Monitoring
Standard web server logging (e.g., Nginx default access logs) is highly dangerous in a zero-trust architecture, as it routinely captures URIs, IPs, and identifiers. To maintain the Unlinkability requirement of the SDV, all logging must be strictly sanitized.
- No Plaintext Identifiers: did:peer values, IP addresses, and specific resource paths MUST NOT be written to application logs in plaintext.
- Cryptographic Hashing in Logs: When tracing a request through the system (e.g., tracking a failed UCAN validation), the system must log a one-way cryptographic hash (e.g., SHA-256) of the DID or URI. This allows Vault administrators to debug traffic flows and trace errors without ever knowing who is making the request.
- Scrubbed Payloads: The LDP Router and API Gateway must automatically redact HTTP request/response bodies from all error logs.
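A minimal sketch of hashed log references, using only the Python standard library, is shown below. The logger name, function names, and message format are assumptions of this example.

```python
import hashlib
import logging

logger = logging.getLogger("sdv")  # hypothetical logger name


def log_ref(identifier):
    """Return a one-way SHA-256 reference for a DID, URI, or IP address."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def log_ucan_failure(did, resource_path, reason):
    """Log a failed validation without writing any plaintext identifier."""
    logger.warning(
        "ucan validation failed: did=%s path=%s reason=%s",
        log_ref(did), log_ref(resource_path), reason,
    )
```

The same input always yields the same reference, so an administrator can follow one request across log lines. Note that an unkeyed hash of a low-entropy value such as an IPv4 address can be reversed by enumeration; a keyed hash (HMAC with a secret held outside the logs) is a possible hardening that this sketch does not show.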
8.2. "Dumb Server, Smart Client" Cryptography
To ensure absolute digital sovereignty, the cryptographic boundary is strictly enforced between the Edge Wallet and the SDV.
- Edge-Side Key Generation: All private keys, including the Master Biometric Seed and deterministically derived Pairwise keys (HKDF), exist exclusively on the User’s Edge Wallet.
- No Server-Side Decryption: The Vault application layer is mathematically incapable of decrypting the user’s LDP-NR file blobs or LDP-RS blind indexes. The Vault only performs public-key signature verification (via the UCAN Invocation Validator). The invocation token arrives as the JSON-RPC request body; there is no separate Authorization header.
- Self-Certifying Revocation: State changes regarding access control (revocation) are not trusted by default. The Vault only accepts state changes that are cryptographically signed by the owning identity (e.g., the /access/revoke command).
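The edge-side derivation and blind indexing can be sketched with the Python standard library alone. HKDF is implemented here as specified in RFC 5869 with SHA-256; the salt, the `pairwise:` info label, and the function names are assumptions of this example, and the way SCAMPER turns a derived seed into a DID key pair is not shown.

```python
import hashlib
import hmac


def hkdf_sha256(ikm, salt, info, length=32):
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def pairwise_key_seed(master_seed, service_id):
    """Derive a reproducible 32-byte seed for one Service Provider."""
    # Salt and info label are placeholders chosen for this sketch.
    return hkdf_sha256(master_seed, b"scamper-demo-salt",
                       b"pairwise:" + service_id.encode("utf-8"))


def blind_index(index_key, term):
    """Map a search term to a deterministic HMAC tag the Vault can match."""
    return hmac.new(index_key, term.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

Because both functions are deterministic, a recovered Master Seed reproduces the same pairwise seeds and the same index tags on a new device, while the Vault only ever sees the tags.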
8.3. Unified Error Handling (Anti-Probing)
Error responses from the Vault must be carefully designed to prevent malicious actors from mapping the Vault’s internal structure or discovering Alias routes.
- Opaque JSON-RPC Error Codes: If a Verifier submits a validly formatted invocation token but lacks the capability, or if the requested endpoint does not exist, the Vault must return a generic JSON-RPC error (e.g., code -32001 with a fixed message "vault error") without elaborating on the specific failure. The JSON-RPC data field must never expose whether a resource exists but is inaccessible versus simply not existing, as distinguishing the two allows an attacker to probe vault contents.
- Sanitized Error Messages: Internal stack traces, database query failures (e.g., SPARQL timeouts), or storage adapter exceptions must be caught by a global exception handler and translated into generic JSON-RPC error codes before being returned to the Verifier or Wallet.
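A global handler that collapses every failure into the same opaque response can be sketched as follows. The function names are hypothetical; the error code and message are the example values given above.

```python
VAULT_ERROR = {"code": -32001, "message": "vault error"}


def error_response(request_id):
    """Build the single JSON-RPC 2.0 error used for every failure."""
    return {"jsonrpc": "2.0", "id": request_id, "error": dict(VAULT_ERROR)}


def handle(request_id, operation):
    """Run `operation`; collapse any failure into the opaque error."""
    try:
        return {"jsonrpc": "2.0", "id": request_id, "result": operation()}
    except Exception:
        # Missing resource, missing capability, SPARQL timeout: same answer.
        return error_response(request_id)
```

The response for a non-existent resource is byte-for-byte identical to the response for a forbidden one, which is the property the anti-probing rule asks for. Response timing can still differ between the two cases and is not addressed by this sketch.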
8.4. Data Persistence & Routing Rules (Solid LDP)
The system enforces a strict bifurcation of data storage to maintain performance and privacy.
| Data Profile | Content-Type Examples | Persistence Target | Encryption Standard |
|---|---|---|---|
| Metadata / Graph (LDP-RS) | | RDF Triple Store | Blind Indexed (HMAC) or Public Plaintext (Helper Data) |
| Binary / Files (LDP-NR) | | Object Storage (MinIO) | Envelope Encrypted (JWE + DEK) |
- The Pointer Rule: Every LDP-NR binary stored in Object Storage must have a corresponding, lightweight LDP-RS node in the Graph Database acting as a pointer. This ensures the Solid routing logic can resolve the file’s metadata without downloading the raw binary.
8.5. Idempotency and Offline-First Synchronization
Because the Edge Wallet operates on a mobile device, network connectivity is assumed to be intermittent.
- Idempotent Vault Operations: All /doc/update, /doc/delete, and /doc/share JSON-RPC invocations to the SDV must be idempotent. If the Edge Wallet loses connection mid-upload and retries the exact same invocation (same nonce), the Vault must handle it gracefully without creating duplicate entries or throwing structural errors. Note: the nonce replay check prevents the SDV from processing a stale retransmission as a fresh invocation; the client must issue a new invocation with a fresh nonce for each genuinely new operation.
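One way to reconcile idempotent retries with the nonce replay check is to store the result of each processed invocation under its nonce and return that stored result on an identical retry. The sketch below shows this approach. It is an assumption of this example rather than a documented SDV mechanism, and the class and parameter names are hypothetical.

```python
class IdempotentExecutor:
    """Replay the stored result when an invocation is retried unchanged."""

    def __init__(self):
        self._results = {}  # nonce -> (invocation fingerprint, result)

    def execute(self, nonce, fingerprint, operation):
        """Run `operation` at most once per nonce.

        `fingerprint` identifies the exact invocation, for example a hash of
        the signed token. A retry with the same nonce and fingerprint gets
        the stored result; the same nonce with other content is refused.
        """
        if nonce in self._results:
            stored_fingerprint, result = self._results[nonce]
            if stored_fingerprint != fingerprint:
                raise ValueError("nonce reused for a different invocation")
            return result
        result = operation()
        self._results[nonce] = (fingerprint, result)
        return result
```

A production version would also bound the size of the stored results, for example by expiring entries once the invocation's exp has passed.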
9. Architecture Decisions
This section documents the most critical architectural decisions that shape the SCAMPER Secure Data Vault (SDV) and Edge Wallet. These decisions focus on balancing strict zero-knowledge privacy with system performance and storage efficiency.
9.1. ADR 1: UCAN 1.0 over OAuth2 / OIDC for Authorization
- Context: Traditional web architectures rely on centralized Authorization Servers (OAuth2) and Identity Providers (OIDC). SCAMPER requires a decentralized, offline-first approach where the Vault acts as a "dumb" resource server without knowing the user’s root identity. UCAN 1.0 introduces a formal separation between delegation tokens (granting capability from one DID to another) and invocation tokens (exercising a capability in a specific request), replacing the single-token model of earlier versions.
- Decision: We use UCAN 1.0. The Edge Wallet generates and signs delegation tokens locally and creates a fresh, per-request invocation token for each JSON-RPC call. Invocation tokens are the JSON-RPC request body, so no separate Authorization header is needed. The nonce field in each invocation provides per-request replay protection without requiring server-side session state.
- Consequences:
  - Positive: Complete elimination of centralized identity bottlenecks. The Vault requires zero session state and minimal database tables to verify access. The pol field in delegation tokens enables fine-grained, declarative access policies (e.g., endpoint allowlists) without custom server logic.
  - Negative: Shifts immense cryptographic complexity to the Edge Wallet. The user must manage their own key material, requiring robust recovery mechanisms (e.g., Biometric/SFE seed recovery). Clients must implement UCAN 1.0 token generation and proof-chain construction.
9.2. ADR 2: Solid LDP-RS / LDP-NR Storage Split
- Context: The Vault needs to support highly optimized Linked Data queries (SPARQL) to resolve verifiable credentials and helper data, while also storing heavy binary files (e.g., 5MB PDFs, medical scans).
- Decision: We strictly adhere to the W3C Solid protocol’s data bifurcation. RDF Sources (LDP-RS) are routed to a dedicated Graph Database (e.g., Oxigraph/Jena), while Non-RDF Sources (LDP-NR) are routed to an S3-compatible Object Store (e.g., MinIO).
- Consequences:
  - Positive: Prevents memory exhaustion in the Graph Database. Lets the Vault scale to heavy binary payloads without degrading query performance, because binaries never pass through the graph.
  - Negative: Introduces architectural complexity. The Solid router must manage "pointers" (metadata nodes in the graph that reference the S3 blobs) to keep the systems synchronized.
9.3. ADR 3: DID Alias Routing for Storage Deduplication
- Context: To maintain Unlinkability, the Edge Wallet generates a unique did:peer for every Service Provider (Bank, University). Storing a separate physical copy of a 5MB file for every did:peer that has access would cause massive storage bloat.
- Decision: We implemented a Namespace Resolution Layer (Alias Resolver) in the Vault’s application pipeline. The Wallet registers public Service DIDs as aliases that route to a single, hidden Storage DID.
- Consequences:
  - Positive: Massive storage efficiency. Files are stored and encrypted only once (using Envelope Encryption), and access control is enforced at the alias level. Simplifies UCAN issuance for the Wallet.
  - Negative: We accept a localized correlation risk. While the Service Providers cannot collude, the Vault Administrator could theoretically look at the Alias Registry and deduce that multiple Service DIDs point to the same Storage DID. We accept this trade-off for operational practicality.
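The alias lookup can be sketched as a simple mapping. This is an in-memory illustration with hypothetical names; the real registry is persisted in the Vault and registration is itself gated by a UCAN check that is not shown here.

```python
class AliasResolver:
    """Map public-facing Service DIDs to one hidden Storage DID."""

    def __init__(self):
        self._aliases = {}  # service DID -> storage DID

    def register(self, service_did, storage_did):
        """Bind an alias; refuse to silently rebind an existing one."""
        existing = self._aliases.get(service_did)
        if existing is not None and existing != storage_did:
            raise ValueError("alias already bound to another storage DID")
        self._aliases[service_did] = storage_did

    def resolve(self, service_did, endpoint):
        """Return the internal storage path, or None for an unknown alias."""
        storage_did = self._aliases.get(service_did)
        if storage_did is None:
            return None
        return f"{storage_did}{endpoint}"
```

Two Service Providers that address the same endpoint through different aliases resolve to the same internal path, so the file is stored once. The mapping itself is exactly the data a Vault Administrator could use for correlation, as noted in the negative consequence above.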
9.4. ADR 4: Cryptographic Append-Only Revocation Logs
- Context: In a local-first architecture, the Edge Wallet cannot act as an always-online server to issue short-lived tokens. We must issue long-lived UCANs, which necessitates a robust revocation mechanism that survives server migrations.
- Decision: Instead of using a standard database table for a Certificate Revocation List (CRL), the Vault uses an Append-Only Cryptographic Log. Revocations are processed as /access/revoke commands signed by the issuing DID.
- Consequences:
  - Positive: The revocation state is self-certifying. The Vault can be migrated to new infrastructure without needing to establish trust in the old server’s database; the new server simply recalculates the cryptographic signatures on the log.
  - Negative: The Vault must perform signature verification on the revocation log every time a token is presented, slightly increasing compute overhead per request. The revocation logs are crucial during data migration. Long-lived tokens pose risks in situations where verifiers lose control over their keys without the user knowing.
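A self-certifying log of this kind can be sketched as below. The hash chaining between entries is an assumption of this example (the text only states that the log is append-only and signed), the entry fields are hypothetical, and signature checking is represented by a `verify_sig` callable.

```python
import hashlib
import json


def entry_hash(prev_hash, entry):
    """Hash an entry together with its predecessor's hash."""
    body = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


class RevocationLog:
    GENESIS = "0" * 64

    def __init__(self):
        self.records = []  # list of (entry, chained hash)

    def append(self, entry):
        """Add a revocation entry, linked to the previous record."""
        prev = self.records[-1][1] if self.records else self.GENESIS
        self.records.append((entry, entry_hash(prev, entry)))

    def verify(self, verify_sig):
        """Recompute the chain and check each entry's signature."""
        prev = self.GENESIS
        for entry, stored in self.records:
            if entry_hash(prev, entry) != stored or not verify_sig(entry):
                return False
            prev = stored
        return True

    def is_revoked(self, ucan_id):
        """True if any entry revokes the given token identifier."""
        return any(entry["revoke"] == ucan_id for entry, _ in self.records)
```

A new server that receives the log can call `verify` once after migration and cache the outcome, rather than re-verifying on every request. Whether such caching is acceptable is a trade-off this sketch leaves open.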
9.5. ADR 5: JSON-RPC over REST for the Protected Data API
- Context: UCAN 1.0 invocation tokens are self-contained, signed RPC calls: the cmd field is the method name and args holds the parameters. Mapping this model onto REST requires an artificial translation between HTTP verbs (GET/PUT/POST/PATCH/DELETE) and UCAN commands, adding complexity without benefit. JSON-RPC 2.0 aligns the transport directly with the UCAN invocation model.
- Decision: The SDV exposes its protected data API as JSON-RPC 2.0 over HTTPS. Each request body is a UCAN 1.0 invocation token. The cmd field resolves to the JSON-RPC method; args provides the parameters. The Public Read API (unauthenticated discovery endpoints) remains plain HTTPS GET for maximum compatibility.
- Consequences:
  - Positive: The transport is a natural fit for the UCAN invocation model. cmd and args are self-describing and version-stable; no mapping from HTTP verb semantics is needed. JSON-RPC error codes replace ambiguous HTTP status codes, and the data field can carry structured error details without leaking vault internals.
  - Negative: Loses HTTP caching and CDN edge-layer optimizations for read requests (since all protected calls are POST bodies). Clients require a JSON-RPC library and cannot rely on simple curl-style HTTP tooling for authenticated vault access.
10. Quality Requirements
This section defines the core quality attributes of the SCAMPER Secure Data Vault (SDV) and Edge Wallet. These requirements act as the benchmark for the system’s success, focusing on privacy, sovereign security, and edge-centric usability.
10.1. Quality Tree
The quality tree provides a high-level overview of the system’s driving quality attributes:
- Privacy (Unlinkability)
  - Service Provider Isolation: Verifiers cannot collude to track a user’s resources that they do not have access to.
  - Zero-Knowledge Storage: The SDV cannot read the user’s files or search queries.
- Security (Data Sovereignty)
  - Cryptographic State: All access control and revocation must be mathematically verifiable.
  - Edge Authority: The server cannot issue or modify capabilities without the Edge Wallet’s private keys.
- Performance
  - Storage Efficiency: Deduplication of heavy binary files (LDP-NR).
  - Query Speed: Millisecond resolution of graph traversals (LDP-RS).
- Usability (Resilience)
  - Device Recovery: Users can recover their full identity and vault access without centralized backups.
- Portability (Vault Provider Independence)
  - Completeness: Users can migrate their vault data between different providers without losing any data or control.
  - Ease of Migration: The migration process does not require complex actions from the user or trust in the old provider’s infrastructure.
10.2. Quality Scenarios
These concrete scenarios translate the abstract quality attributes into highly specific, testable conditions for the engineering and QA teams.
10.2.1. Scenario 1: Privacy (The Collusion Test)
- Stimulus: The Bank and the University attempt to merge their data to build a master profile of Alice.
- Response: Because the Edge Wallet deterministically generated unique Pairwise DIDs (did:peer) for each institution, and issued separate UCANs pointing to isolated Alias paths, the institutions have no matching identifiers. (They could still use Personal Data to match documents if they had access to the underlying encrypted data; mitigating that risk is out of scope for the SDV.)
- Measure: An external audit verifies that no overlapping identifiers exist between the two Service Providers' requests.
10.2.2. Scenario 2: Security (The Rogue Admin Test)
- Stimulus: A malicious administrator at the cloud hosting provider gains direct database access to the Vault’s Graph Database and Object Storage.
- Response: The administrator sees only opaque UUIDs in MinIO and blinded HMAC hashes in Oxigraph. They can destroy or scramble data, but they cannot forge any meaningful data without the Edge Wallet’s private keys.
- Measure: The API Gateway’s UCAN Validator immediately rejects the third party’s request because the Vault lacks the Edge Wallet’s private key required to sign a valid delegation chain (prf). Access is mathematically denied.
10.2.3. Scenario 3: Performance (The Heavy Load Test)
- Stimulus: 1,000 Verifiers simultaneously request 5MB encrypted medical scans (LDP-NR) using valid UCANs.
- Response: The Solid LDP Router intercepts the requests, verifies the UCANs, and streams the binaries directly from the S3-compatible MinIO storage.
- Measure: The RDF Graph Database experiences no memory bloat during the file transfers. CPU usage on the SDV Application Container scales linearly with the signature verification load, and the system maintains a certain success rate under load.
10.2.4. Scenario 4: Usability (The Device Loss Protocol)
- Stimulus: A user loses their smartphone and buys a completely new, blank device.
- Response: The user scans their face/biometrics using Secure Fuzzy Extraction (SFE). The Edge Wallet reconstructs the Master Biometric Seed. The Wallet then uses HKDF to re-derive the Vault Storage DID and downloads the encrypted Active Grants backup file.
- Measure: After the biometric scan and recovery of the biometric seed, the new device can successfully query the Vault’s Graph DB via blind indexing and find the user’s VCs, audit logs and active grants (recovering the backup).
10.2.5. Scenario 5: Migration (The Sovereignty Test)
- Stimulus: The user decides they no longer trust their current SDV provider and wants to self-host their Vault.
- Response: The user copies the raw Object Storage files and the RDF Triples to the new server.
- Measure: Because the revocation list is an append-only log of cryptographically signed ucan/revoke tokens (not a fragile SQL state), the new server immediately enforces all previous revocations without needing to "trust" the old server’s data.
11. Risks and Technical Debt
This section outlines the known technical risks, accepted architectural compromises, and accumulated technical debt within the SCAMPER Secure Data Vault (SDV) and Edge Wallet ecosystem. Identifying these upfront allows the engineering team to prioritize mitigation strategies in future sprints.
11.1. Architectural Risks
These are systemic risks introduced by the core design choices (documented in Chapter 9: Architecture Decisions).
- Risk 1: Edge-Side Key Recovery Failure (The "Lockout" Risk)
  - Description: Because SCAMPER uses a strict Zero-Knowledge, Zero-Trust architecture, there is no centralized "Forgot Password" database. If the user loses their device and the Secure Fuzzy Extraction (SFE) biometric algorithm fails to perfectly reconstruct their Master Seed, the user is permanently locked out of their Vault. The data becomes mathematically unrecoverable cryptographic noise.
  - Mitigation: (WP2 responsibility)
- Risk 2: Alias Registry Correlation
  - Description: As accepted in ADR 3, the SDV’s Alias Resolver maps multiple public-facing Service DIDs (e.g., Bank, University) to a single internal Storage DID to save disk space. A malicious Vault Administrator with database access could analyze this registry and deduce that the Bank and the University are interacting with the exact same human, degrading the theoretical Unlinkability of the system.
  - Mitigation: We accept this trade-off for storage efficiency. To mitigate it, the Vault infrastructure must employ strict internal access controls, disk encryption at rest, and automated audit logging for admin actions.
11.2. Technical Debt
These are engineering compromises made to accelerate development or integrate with existing standards.
- Debt 1: Building on Existing Solid Server
  - Description: By building upon a W3C Solid implementation and stripping out its native WebID/OIDC authentication to inject a custom UCAN middleware and DID resolver, we are possibly diverging from the upstream repository.
  - Impact: Every time the core Solid project releases a major security patch or architectural update, our team will incur the technical debt of porting those changes into our heavily customized fork, specifically ensuring our UCAN/DID pipeline does not break the LDP router.
- Debt 2: Revocation Log Computation Overhead
  - Description: Because we use an append-only cryptographic log for token revocations, the SDV must recalculate the signature chain of the log upon request. As the user revokes more connections over a span of years, this log will grow linearly.
  - Impact: Eventually, parsing the revocation log will introduce latency to standard file retrieval requests.
12. Glossary
This section defines the core terminology, acronyms, and domain-specific concepts used throughout the SCAMPER architecture. It serves as the ubiquitous language for the engineering, security, and product teams.
| Term | Definition |
|---|---|
| Alias Registry | A routing table inside the SDV that translates a public-facing Service DID (e.g., |
| Blind Indexing | A cryptographic technique (typically using HMAC) that hashes search terms into deterministic gibberish. It allows the Edge Wallet to search the Vault’s Graph Database for specific metadata without the Vault ever knowing what is being searched for. |
| DID (Decentralized Identifier) | A globally unique W3C standard identifier that resolves to a public key document without relying on a centralized registry (like a DNS provider or tech monopoly). |
| DEK (Data Encryption Key) | A randomly generated symmetric key used to encrypt a single data payload. The DEK itself is then encrypted (wrapped) with the public key of each authorized party, forming the core of SCAMPER’s Envelope Encryption structure. Multiple wrapped copies of the same DEK (one per authorized recipient) can coexist in a file’s header without exposing the plaintext to the Vault. |
| did:peer | A specific DID method used in SCAMPER. It generates pairwise, unique identifiers for every new connection, ensuring that Service Providers cannot collude or link a user’s activities across different domains. |
| Edge Wallet | The "Smart Client" mobile application controlled by the user. It is the sole cryptographic authority in the SCAMPER ecosystem, responsible for key derivation, data encryption, and UCAN issuance. |
| Envelope Encryption | A hybrid encryption pattern in which data is encrypted once with a fast symmetric key (the DEK), and the DEK is encrypted separately for each authorized recipient using their public key. This allows efficient multi-party sharing of large files: only the small DEK header needs to be re-encrypted when access is granted or revoked, not the file content itself. |
| HKDF (HMAC-based Key Derivation Function) | The mathematical algorithm used by the Edge Wallet to deterministically generate thousands of reproducible Pairwise DIDs and keys from a single Master Biometric Seed. |
| JWE (JSON Web Encryption) | The standard format used for Envelope Encryption in the Vault. The heavy file (e.g., PDF) is encrypted with a symmetric Data Encryption Key (DEK), and the DEK is encrypted for the specific Verifier. |
| Key Rewrapping | The operation of decrypting an existing DEK with one party’s private key and re-encrypting it with a different recipient’s public key, without touching the underlying ciphertext. In SCAMPER, the Edge Wallet performs Key Rewrapping to grant a Service Provider access to an already-stored file, appending a new wrapped DEK header for that provider’s Peer DID. |
| LDP-NR (Non-RDF Source) | A Solid protocol term for unstructured binary data (e.g., PDFs, JPEGs, encrypted JWE blobs). In SCAMPER, these are strictly routed to the Object Storage layer (MinIO). |
| LDP-RS (RDF Source) | A Solid protocol term for structured Linked Data (e.g., JSON-LD, Turtle, Triples). In SCAMPER, these are strictly routed to the Graph Database to enable lightning-fast metadata traversal. |
| PQC (Post-Quantum Cryptography) | Cryptographic algorithms designed to remain secure against attacks by quantum computers. SCAMPER uses a hybrid scheme combining a classical algorithm (X25519) with a post-quantum algorithm (ML-KEM / Kyber 768) into a single key exchange (X25519MLKEM768), ensuring security against both classical and future quantum adversaries without sacrificing compatibility with current systems. |
| SDV (Secure Data Vault) | The cloud-agnostic "Dumb Server" that hosts the user’s encrypted data. It possesses no private keys, cannot decrypt LDP-NR files, and cannot reverse LDP-RS blind indexes. |
| SFE (Secure Fuzzy Extraction) | The cryptographic process of turning noisy, analog biometric readings (like a 3D face scan) into a perfectly stable, reproducible cryptographic Master Seed, enabling deviceless key recovery. |
| SPARQL | The standard query language used to retrieve and manipulate data stored in Resource Description Framework (RDF) format. Used by the SDV’s Graph Database to execute blind searches and resolve helper data. |
| SSI (Self-Sovereign Identity) | A design philosophy in which individuals fully own and control their digital identities and personal data without relying on any centralized authority (government, corporation, or identity provider). SCAMPER is built on SSI principles: users hold their own cryptographic keys, control vault access via UCAN delegation, and can migrate their data at any time without dependence on their SDV provider. |
| TEE (Trusted Execution Environment) | A hardware-isolated, cryptographically attested compute context (e.g., Intel SGX, AMD SEV, ARM CCA, AWS Nitro Enclaves) whose code integrity can be verified by any external party through an attestation document. In SCAMPER, TEEs are optionally used by the SDV to perform expensive key rotation and re-encryption on behalf of the Edge Wallet for large files. The user’s wallet validates the TEE attestation (including enclave measurements against a known open-source build) before trusting it with DEK material. |
| UCAN 1.0 (User-Controlled Authorization Networks) | A decentralized authorization protocol based on JSON Web Tokens (JWTs). UCAN 1.0 uses two distinct token types: a delegation token (grants a named capability |
| Delegation Token | A UCAN 1.0 token issued by the Data Owner to grant a capability to another DID. Fields: |
| Invocation Token | A UCAN 1.0 token that constitutes an actual JSON-RPC request. Sent as the JSON-RPC 2.0 request body. Fields: |
| JSON-RPC | The transport protocol used for all authenticated interactions with the SDV’s Protected Data API. Each request is a UCAN 1.0 invocation token encoded as a JSON-RPC 2.0 method call over HTTPS. The invocation’s |
| VC (Verifiable Credential) | A tamper-evident digital credential (e.g., a digital passport or salary slip) whose authorship and integrity can be cryptographically verified by any party. |
| Verifier | A Service Provider (e.g., a bank, employer, or hospital) that receives a UCAN delegation from the Data Owner and uses it to retrieve and decrypt data from the SDV. Each Verifier establishes a unique Peer DID relationship with the Data Owner and holds its own private key to unwrap the copy of the DEK that was re-encrypted specifically for it during the Key Rewrapping step. |