Dataspace Protocol 2024-1

NOTE: For GitHub users, the link to the rendered content is https://docs.internationaldataspaces.org/dataspace-protocol/.
NOTE: The human-friendly version of this specification in the IDSA Knowledge base will always show the latest version of the document. The version history and changes are provided via the GitHub Repository.

About versions of the Dataspace Protocol

This version (2024-1) of the Dataspace Protocol specification is the release candidate and is considered stable. Further changes shall not affect conformity. Since version 0.8, the specification has been stable, with changes only in details. All changes made to the specification can be reviewed in the GitHub repository.

NOTE: A versioning scheme beyond the commits to the repository is not yet available but will be provided in the future.

Abstract

The Dataspace Protocol is a set of specifications designed to facilitate interoperable data sharing between entities governed by usage control and based on Web technologies. These specifications define the schemas and protocols required for entities to publish data, negotiate Agreements, and access data as part of a federation of technical systems termed a Dataspace.

Introduction

Sharing data between autonomous entities requires the provision of metadata to facilitate the transfer of Datasets by making use of a data transfer (or application layer) protocol. The Dataspace Protocol defines how this metadata is provisioned:

How Datasets are deployed as DCAT Catalogs and usage control is expressed as ODRL Policies.
How Agreements that govern data usage are syntactically expressed and electronically negotiated.
How Datasets are accessed using Transfer Process Protocols.
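
As a rough illustration of the first point, the sketch below models a Catalog entry as a JSON-LD-like Python dictionary and lists the ODRL policies attached to each Dataset. The property names come from the DCAT and ODRL vocabularies; the identifiers, the single permission, and the helper `offers_by_dataset` are hypothetical and are not taken from the normative Catalog Protocol.

```python
import json

# Hypothetical catalog: all identifiers and the permission are made up for illustration.
catalog = {
    "@type": "dcat:Catalog",
    "dcat:dataset": [
        {
            "@id": "urn:example:dataset:1",
            "@type": "dcat:Dataset",
            "odrl:hasPolicy": [
                {
                    "@id": "urn:example:offer:1",
                    "@type": "odrl:Offer",
                    "odrl:permission": [{"odrl:action": "odrl:use"}],
                }
            ],
        }
    ],
}


def offers_by_dataset(catalog: dict) -> dict[str, list[str]]:
    """Map each Dataset id to the ids of the ODRL policies attached to it."""
    result: dict[str, list[str]] = {}
    for dataset in catalog.get("dcat:dataset", []):
        policies = dataset.get("odrl:hasPolicy", [])
        result[dataset["@id"]] = [policy["@id"] for policy in policies]
    return result


if __name__ == "__main__":
    print(json.dumps(offers_by_dataset(catalog), indent=2))
```

A consumer could use such a mapping to decide which offer to reference when it starts a negotiation for a given Dataset.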

These specifications build on protocols located in the layers of the ISO OSI model (ISO/IEC 7498-1:1994), such as HTTPS. The purpose of this specification is to define the interactions between systems independently of such protocols, while describing how to implement them in an unambiguous and extensible way. To do so, the messages exchanged during the process are described, and the states and their transitions are specified as state machines, based on the key terms and concepts of a Dataspace. On this foundation, the bindings to data transfer protocols, such as HTTPS, are described.
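
To make the idea of protocol-independent state machines more concrete, here is a minimal sketch of a negotiation state machine that knows nothing about HTTPS or any other binding. The state names are modeled loosely on the Contract Negotiation Protocol, but the transition table and the `Negotiation` class are illustrative assumptions; the normative states and transitions are defined in the Contract Negotiation Protocol document.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "REQUESTED"
    OFFERED = "OFFERED"
    ACCEPTED = "ACCEPTED"
    AGREED = "AGREED"
    VERIFIED = "VERIFIED"
    FINALIZED = "FINALIZED"
    TERMINATED = "TERMINATED"


# Illustrative transition table (an assumption, not the normative one).
TRANSITIONS = {
    State.REQUESTED: {State.OFFERED, State.AGREED, State.TERMINATED},
    State.OFFERED: {State.REQUESTED, State.ACCEPTED, State.TERMINATED},
    State.ACCEPTED: {State.AGREED, State.TERMINATED},
    State.AGREED: {State.VERIFIED, State.TERMINATED},
    State.VERIFIED: {State.FINALIZED, State.TERMINATED},
    State.FINALIZED: set(),
    State.TERMINATED: set(),
}


class Negotiation:
    """Tracks the state of one negotiation, independent of any transport binding."""

    def __init__(self, initial: State = State.REQUESTED) -> None:
        self.state = initial

    def transition(self, target: State) -> None:
        """Move to `target` if the table allows it, otherwise raise ValueError."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {target.name}"
            )
        self.state = target
```

A binding layer would translate each incoming message into a call to `transition` and map a `ValueError` to whatever error response that binding defines.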

The specifications are organized into the following documents:

Dataspace Model and Dataspace Terminology documents that define key terms.
Common Functionalities and their Binding in HTTPS documents that declare cross-cutting functions, e.g., the declaration of supported versions of this Dataspace Protocol.
Catalog Protocol and Catalog HTTPS Binding documents that define how DCAT Catalogs are published and accessed as HTTPS endpoints respectively.
Contract Negotiation Protocol and Contract Negotiation HTTPS Binding documents that define how Contract Negotiations are conducted and requested via HTTPS endpoints.
Transfer Process Protocol and Transfer Process HTTPS Binding documents that define how Transfer Processes using a given data transfer protocol are governed via HTTPS endpoints.
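
The declaration of supported versions mentioned in the list above can be pictured as a small metadata document that a Connector publishes and a peer inspects before talking to it. The sketch below assumes such a document contains a `protocolVersions` list whose entries carry a `version` and a `path`; these field names, the sample values, and the helper `select_version` are assumptions for illustration, and the authoritative format is given in the Common Functionalities documents.

```python
from typing import Optional


def select_version(version_doc: dict, supported: list[str]) -> Optional[dict]:
    """Return the advertised entry for the first version in `supported`
    (the caller's order of preference) that the peer also advertises."""
    advertised = {
        entry["version"]: entry
        for entry in version_doc.get("protocolVersions", [])
    }
    for version in supported:
        if version in advertised:
            return advertised[version]
    return None


# Hypothetical version document; values are made up for illustration.
sample_doc = {
    "protocolVersions": [
        {"version": "1.0", "path": "/some/path/v1"},
        {"version": "0.8", "path": "/some/path/v08"},
    ]
}

if __name__ == "__main__":
    print(select_version(sample_doc, ["2.0", "1.0"]))
```

Returning `None` when there is no common version leaves it to the caller to decide whether to abort or to fall back to some other behavior.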

This specification does not cover the data transfer process as such.
While the data transfer is controlled by the Transfer Process Protocol mentioned above, e.g. the initiation of the transfer channels or their decommissioning, the data transfer itself, and especially the handling of technical exceptions, is the responsibility of the Transport Protocol.
As an implication, the data transfer can be conducted in a separate process if required, as long as this process is controlled by the Transfer Process Protocol to the specified extent.
Nevertheless, illustrative message examples are provided in the Transfer Process Protocol section. The best practices section may contain further non-normative examples and explanations.
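
The separation between controlling a transfer and moving the data can be sketched as two independent pieces: a control object that only tracks the lifecycle, and a transport callable that moves the bytes. Everything here is an assumption for illustration: the class `TransferProcess`, the reduced set of states, the helper `run_transfer`, and the choice to record a termination when the transport fails are one possible design, not the behavior prescribed by the Transfer Process Protocol.

```python
from typing import Callable, Iterable


class TransferProcess:
    """Control-side view of one transfer; it never touches the payload."""

    def __init__(self) -> None:
        self.state = "REQUESTED"

    def start(self) -> None:
        if self.state != "REQUESTED":
            raise ValueError(f"cannot start from {self.state}")
        self.state = "STARTED"

    def complete(self) -> None:
        if self.state != "STARTED":
            raise ValueError(f"cannot complete from {self.state}")
        self.state = "COMPLETED"

    def terminate(self) -> None:
        self.state = "TERMINATED"


def run_transfer(
    process: TransferProcess,
    transport: Callable[[], Iterable[bytes]],
) -> list[bytes]:
    """Open the channel, let the transport move the data, then close the channel.

    Errors raised by the transport are not interpreted here; the control side
    only records that the transfer ended and re-raises the error.
    """
    process.start()
    try:
        chunks = list(transport())
    except Exception:
        process.terminate()
        raise
    process.complete()
    return chunks
```

Because `transport` is just a callable, it could equally run in a separate process or service, as long as the control object is informed of the start and end of the transfer.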

Context of this specification

The Dataspace Protocol is used in the context of Dataspaces, as described and defined in the subsequent sections, with the purpose of supporting interoperability. In this context, the specification provides fundamental technical interoperability for Participants in Dataspaces. Beyond the technical interoperability measures described in this specification, semantic interoperability should also be addressed by the Participants. From the perspective of the Dataspace, interoperability also needs to be addressed at the level of trust, at organizational levels, and at legal levels. Cross-dataspace communication is not the subject of this document, as it is addressed by the Dataspaces' organizational and legal agreements.

The interaction of Participants in a Dataspace is conducted by the Participant Agents, so-called Connectors, which implement the protocols described above. While most interactions take place between Connectors, some interactions with other systems are required. The figure below provides an overview of the context of this specification.

An Identity Provider realizes the required interfaces and provides required information to implement the Trust Framework of a Dataspace. The validation of the identity of a given Participant Agent and the validation of additional claims is a fundamental mechanism. The structure and content of such claims and identities may, however, vary between different Dataspaces, as well as the structure of such an Identity Provider, e.g. a centralized system, a decentralized system or a federated system. Other specifications, like the Identity and Trust Protocol (IATP), define the respective functions.

A Connector will implement additional internal functionalities, like monitoring or policy engines, as appropriate. Whether and how a Connector implements such functionalities is not covered by this specification.

The same applies to the actual data that is transferred between the systems. While this document does not define the transport protocol or the structure, syntax, or semantics of the data, a specification for those aspects is required and is subject to the agreements of the Participants or the Dataspace.

Best Practices

The Dataspace Protocol is under development; the working group is actively working on this draft and has reviewed and improved the content multiple times. During the process, several aspects were discussed that are not considered part of the normative specification but are important to document as best practices in support of the users of this specification. The Best Practices are non-normative.

Users of this specification are invited to provide feedback such as, but not limited to:

What information is missing?
What information, including examples, would you like to see?
What did you like in this document?

Please provide your feedback as an Issue in our GitHub repository.
