Specification

This document outlines the Catalog Protocol. The terms used are defined in the Terminology section of the Dataspace Protocol.

1 Introduction

The Catalog Protocol defines how a Catalog is requested from a Catalog Service by a Consumer using an abstract message exchange format. The concrete message exchange wire format is defined in the binding specifications.

1.1 DCAT Vocabulary Mapping

This section describes how the DSP Information Model maps to DCAT resources.

1.1.1 Dataset

A Dataset is a DCAT Dataset with the following attributes:

odrl:hasPolicy

A Dataset must have 1..N hasPolicy attributes, each containing an ODRL Offer that defines the Usage Policy associated with the Catalog. Offers must NOT contain any explicit target attributes: the target of an Offer is the associated Dataset. This is in line with the semantics of hasPolicy as defined in the ODRL Information Model, which states that the subject (here, the Dataset) is automatically the target of each Rule. To prevent conflicts, the target attribute must not be set explicitly, for example in the Offer or in its Rules.
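As an illustration, a Dataset carrying one Offer might look like the following in compact JSON-LD, shown here as a Python dictionary together with a small check for the two rules above. The identifiers, the prefixed keys other than odrl:hasPolicy, and the helper names are assumptions made for this sketch and are not defined by this specification.

```python
# Hypothetical Dataset in compact JSON-LD form; all identifiers are made up.
dataset = {
    "@id": "urn:uuid:dataset-1",
    "@type": "dcat:Dataset",
    "odrl:hasPolicy": [
        {
            "@id": "urn:uuid:offer-1",
            "@type": "odrl:Offer",
            # No "odrl:target" on the Offer or its Rules:
            # the target is the enclosing Dataset.
            "odrl:permission": [{"odrl:action": "odrl:use"}],
        }
    ],
}

RULE_KEYS = ("odrl:permission", "odrl:prohibition", "odrl:obligation")


def as_list(value):
    """Compact JSON-LD may collapse a one-element array into a single value."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def has_explicit_target(offer):
    """Return True if the Offer or any of its Rules sets odrl:target."""
    if "odrl:target" in offer:
        return True
    return any(
        "odrl:target" in rule
        for key in RULE_KEYS
        for rule in as_list(offer.get(key))
        if isinstance(rule, dict)
    )


def check_dataset(ds):
    """Collect violations of the Dataset rules described above."""
    offers = as_list(ds.get("odrl:hasPolicy"))
    problems = []
    if not offers:
        problems.append("Dataset has no odrl:hasPolicy")
    problems += [
        f"Offer {offer.get('@id')} sets an explicit target"
        for offer in offers
        if has_explicit_target(offer)
    ]
    return problems
```

Calling check_dataset(dataset) on the example returns an empty list; a Dataset without any Offer, or with an Offer that sets odrl:target, yields one message per violation.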

1.1.2 Distributions

A Dataset may contain 0..N DCAT Distributions. Each Distribution must have at least one DataService that specifies where the Distribution is obtained. Specifically, a DataService specifies the endpoint for initiating a Contract Negotiation and Transfer Process.

A Distribution may have 0..N hasPolicy attributes, each containing an ODRL Offer that defines the Usage Policy associated with the Dataset and this specific Distribution. Offers must NOT contain any target attributes. The target of an Offer is the Dataset that contains the Distribution.

Support for hasPolicy attributes on a Distribution is optional. Implementations may choose not to support this feature, in which case they should return an appropriate error message to clients.
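The Distribution rules can be sketched as two small helpers: one that lists the implied target of every Offer, and one that finds Distributions lacking a DataService. This is a minimal sketch that assumes the standard DCAT properties dcat:distribution and dcat:accessService are used as keys in the compact JSON-LD; the function names are hypothetical.

```python
def as_list(value):
    """Compact JSON-LD may collapse a one-element array into a single value."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def offer_targets(dataset):
    """Yield (offer_id, target_id) pairs.

    The target is always the Dataset, whether the Offer is attached to the
    Dataset itself or to one of its Distributions.
    """
    dataset_id = dataset.get("@id")
    for offer in as_list(dataset.get("odrl:hasPolicy")):
        yield offer.get("@id"), dataset_id
    for distribution in as_list(dataset.get("dcat:distribution")):
        for offer in as_list(distribution.get("odrl:hasPolicy")):
            yield offer.get("@id"), dataset_id


def distributions_missing_service(dataset):
    """Return the Distributions that do not reference any DataService."""
    return [
        distribution
        for distribution in as_list(dataset.get("dcat:distribution"))
        if not as_list(distribution.get("dcat:accessService"))
    ]
```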

1.1.3 Data Service

A Data Service may specify an endpoint that supports the Dataspace Protocol, such as a Connector.

dspace:dataServiceType

If the Data Service refers to an endpoint that supports the Dataspace Protocol, it must include the property dspace:dataServiceType.

1.1.4 Participant Id

The identifier of the participant providing the Catalog is specified using the dspace:participantId attribute on that DCAT Catalog.

1.2 DCAT and ODRL Profiles

The Catalog is a DCAT Catalog with the following restrictions:

Each ODRL Offer must be unique to a Dataset since the target of the Offer is derived from its enclosing context.
A Catalog must not have an odrl:hasPolicy attribute, since access to Catalog objects is not intended to be negotiated. An implementation might, however, regulate the visibility and/or the content of its Catalog depending on the requester.
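Both restrictions can be checked mechanically. The sketch below reads "unique to a Dataset" as "the same Offer @id does not appear under more than one Dataset", which is one possible interpretation; it also assumes that Datasets are listed under the DCAT property dcat:dataset. The function name is hypothetical.

```python
def as_list(value):
    """Compact JSON-LD may collapse a one-element array into a single value."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_catalog_profile(catalog):
    """Return a list of profile violations found in a Catalog dictionary."""
    problems = []
    if "odrl:hasPolicy" in catalog:
        problems.append("Catalog must not have odrl:hasPolicy")

    # Map each Offer id to the set of Datasets that carry it.
    owners = {}
    for dataset in as_list(catalog.get("dcat:dataset")):
        for offer in as_list(dataset.get("odrl:hasPolicy")):
            offer_id = offer.get("@id")
            if offer_id is not None:
                owners.setdefault(offer_id, set()).add(dataset.get("@id"))

    problems += [
        f"Offer {offer_id} appears in {len(dataset_ids)} Datasets"
        for offer_id, dataset_ids in owners.items()
        if len(dataset_ids) > 1
    ]
    return problems
```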

2 Message Types

All messages must be serialized in JSON-LD compact form as specified in the JSON-LD 1.1 Processing Algorithms and API. Further Dataspace specifications may define additional optional serialization formats.

2.1 Catalog Request Message

Sent by: Consumer
Resulting state: TERMINATED
Response: ACK or ERROR
Schema: TTL Shape, JSON Schema
Example: Message

The Catalog Request Message is a message sent by a Consumer to a Catalog Service. The Catalog Service must respond with a Catalog, which is a valid instance of a DCAT Catalog.

The message may have a filter property which contains an implementation-specific query or filter expression type supported by the Catalog Service.
The Catalog Service may require an authorization token. Details for including that token can be found in the protocol binding, e.g., Catalog HTTPS Binding. Similarly, pagination may be defined in the protocol binding.
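A minimal sketch of how a Consumer might assemble this message is shown below. The context URL, the type name dspace:CatalogRequestMessage, and the key dspace:filter are assumptions for illustration; the authoritative names are those in the referenced schema, and the content of the filter is implementation-specific.

```python
import json

# Assumed context URL; use the one that belongs to the protocol version in use.
DSPACE_CONTEXT = "https://w3id.org/dspace/2024/1/context.json"


def catalog_request_message(filter_expression=None):
    """Build a Catalog Request Message as a compact JSON-LD dictionary.

    The filter is optional and opaque to the protocol; it is passed through
    unchanged for the Catalog Service to interpret.
    """
    message = {
        "@context": DSPACE_CONTEXT,
        "@type": "dspace:CatalogRequestMessage",
    }
    if filter_expression is not None:
        message["dspace:filter"] = filter_expression
    return message


# Hypothetical filter understood only by one particular Catalog Service.
body = json.dumps(catalog_request_message({"keyword": "traffic"}))
```

How the serialized body and any authorization token are transmitted is left to the protocol binding.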

2.2 Dataset Request Message

Sent by: Consumer
Resulting state: TERMINATED
Response: ACK or ERROR
Schema: TTL Shape, JSON Schema
Example: Message

The Dataset Request Message is a message sent by a Consumer to a Catalog Service. The Catalog Service must respond with a Dataset, which is a valid instance of a DCAT Dataset.

The message must have a dataset property which contains the id of the Dataset.
The Catalog Service may require an authorization token. Details for including that token can be found in the protocol binding, e.g., Catalog HTTPS Binding.
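Because the dataset property is mandatory, a Consumer-side builder can reject a missing id before anything is sent. This is a sketch only: the context URL, the type name dspace:DatasetRequestMessage, and the key dspace:dataset are assumptions, and the referenced schema remains authoritative.

```python
# Assumed context URL; use the one that belongs to the protocol version in use.
DSPACE_CONTEXT = "https://w3id.org/dspace/2024/1/context.json"


def dataset_request_message(dataset_id):
    """Build a Dataset Request Message; the Dataset id is required."""
    if not isinstance(dataset_id, str) or not dataset_id.strip():
        raise ValueError("a non-empty Dataset id is required")
    return {
        "@context": DSPACE_CONTEXT,
        "@type": "dspace:DatasetRequestMessage",
        "dspace:dataset": dataset_id,
    }
```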

3 Response Types

The ACK and ERROR response types are mapped onto a protocol such as HTTPS. A description of an error might be provided in protocol-dependent forms, e.g., for an HTTPS binding in the request or response body.

3.1 ACK - Catalog

Sent by: Provider
Schema: TTL Shape, JSON Schema
Example: Catalog Example

The Catalog contains all Datasets that the requester is permitted to see.

3.2 ACK - Dataset

Sent by: Provider
Schema: TTL Shape, JSON Schema
Example: Dataset Example

3.3 ERROR - Catalog Error

Sent by: Consumer, Provider
Schema: TTL Shape, JSON Schema
Example: Error

A Catalog Error is used when an error occurs following a Catalog Request Message or a Dataset Request Message and the Provider cannot provide its Catalog to the requester.

| Field | Type | Description |
|---|---|---|
| code | String | An optional implementation-specific error code. |
| reasons | Array[object] | An optional array of implementation-specific error objects. |
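Since both fields are optional, an error payload can be as small as a bare type. The sketch below uses the field names exactly as listed above; the type name dspace:CatalogError and the shape of the reason objects are assumptions, and the property names used on the wire are those of the referenced schema.

```python
def catalog_error(code=None, reasons=None):
    """Build a Catalog Error payload; both fields are optional."""
    error = {"@type": "dspace:CatalogError"}  # assumed type name
    if code is not None:
        error["code"] = str(code)
    if reasons:
        # Each reason is an implementation-specific object.
        error["reasons"] = list(reasons)
    return error


# Hypothetical error for an unknown Dataset id.
not_found = catalog_error("DATASET_NOT_FOUND", [{"message": "unknown dataset id"}])
```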

4 Technical Considerations

4.1 Queries and Filter Expressions

A Catalog Service may support Catalog queries or filter expressions as an implementation-specific feature. However, it is expected that query capabilities will be implemented by the Consumer against the results of a Catalog Request Message, since those results are expressed in an RDF vocabulary. Client-side querying can be scaled by periodically crawling the Provider's Catalog Services, caching the results, and executing queries against the locally stored Catalogs.
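The crawl, cache, and query approach can be sketched as follows. The class name, the fetch_catalog callable (which stands in for whatever protocol binding the Consumer uses), and the one-hour default are all hypothetical; Datasets are assumed to be listed under dcat:dataset.

```python
import time


class CatalogCache:
    """Keeps the most recent Catalog per Catalog Service and queries it locally."""

    def __init__(self, fetch_catalog, max_age_seconds=3600):
        self._fetch = fetch_catalog       # callable: service URL -> Catalog dict
        self._max_age = max_age_seconds
        self._entries = {}                # service URL -> (fetched_at, catalog)

    def refresh(self, service_urls, now=None):
        """Re-fetch every Catalog that is missing or older than max_age."""
        now = time.time() if now is None else now
        for url in service_urls:
            entry = self._entries.get(url)
            if entry is None or now - entry[0] >= self._max_age:
                self._entries[url] = (now, self._fetch(url))

    def query(self, predicate):
        """Yield every cached Dataset for which predicate(dataset) is true."""
        for _, catalog in self._entries.values():
            datasets = catalog.get("dcat:dataset", [])
            if isinstance(datasets, dict):   # single value in compact form
                datasets = [datasets]
            for dataset in datasets:
                if predicate(dataset):
                    yield dataset
```

A Consumer would call refresh on a schedule and run any number of queries in between without contacting the Catalog Services again.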

4.2 Replication Protocol

The Catalog Protocol is designed to be used by federated services without the need for a replication protocol. Each Consumer is responsible for issuing requests to 1..N Catalog Services and managing the results. It follows that a specific replication protocol is not needed or, more precisely, each Consumer replicates data from Catalog Services by issuing Catalog Request Messages.

The discovery protocol adopted by a particular Dataspace defines how a Consumer discovers Catalog Services.

4.3 Security

It is expected (although not required) that Catalog Services implement access control. A Catalog as well as individual Datasets may be restricted to trusted parties. The Catalog Service may require Consumers to include a security token along with a Catalog Request Message. The specifics of how this is done can be found in the relevant protocol binding, e.g., Catalog HTTPS Binding. The semantics of such tokens are not part of this specification.

4.3.1 The Proof Metadata Endpoint

When a Catalog contains protected Datasets, the Provider has two options: include all Datasets in the Catalog response and restrict access when a contract is negotiated; or require one or more proofs when the Catalog Request is made and filter the Datasets accordingly. The latter option requires a mechanism for clients to discover the types of proof that may be presented at request time. The specifics of proof types and of presenting a proof during a Catalog Request are outside the scope of the Dataspace Protocol. However, Catalog Protocol bindings should define a proof metadata endpoint for obtaining this information.

4.4 Catalog Brokers

A Dataspace may include Catalog Brokers. A Catalog Broker is a Consumer that has trusted access to 1..N upstream Catalog Services and advertises their respective Catalogs as a single Catalog Service. The broker is expected to honor upstream access control requirements.
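One possible broker design is sketched below: the broker fetches each upstream Catalog on behalf of the requester and nests the results under the DCAT property dcat:catalog, so that each upstream Catalog keeps its own dspace:participantId. Forwarding the requester's token upstream is only one way to honor upstream access control and is an assumption of this sketch, as are the function name and the fetch_catalog callable.

```python
def broker_catalog(upstream_urls, fetch_catalog, requester_token, broker_participant_id):
    """Aggregate upstream Catalogs into a single Catalog for one requester.

    fetch_catalog(url, token) is a hypothetical callable that returns the
    Catalog the upstream service is willing to show for that token, or None
    if the upstream service denies access.
    """
    visible = []
    for url in upstream_urls:
        catalog = fetch_catalog(url, requester_token)
        if catalog is not None:
            visible.append(catalog)
    return {
        "@type": "dcat:Catalog",
        "dspace:participantId": broker_participant_id,
        "dcat:catalog": visible,
    }
```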
