Making Digital Artifacts on the Web Verifiable and Reliable
Abstract:
The current Web has no
general mechanisms to make digital artifacts
such as datasets, code, texts, and images verifiable and permanent. For
digital artifacts that are supposed to be immutable, there is moreover no
commonly accepted method to enforce this immutability. These shortcomings have
a serious negative impact on the ability to reproduce the results of processes
that rely on Web resources, which in turn heavily impacts areas such as science
where reproducibility is important. To solve this problem, we propose trusty
URIs containing cryptographic hash values. We show how trusty URIs can be used
for the verification of digital artifacts, in a manner that is independent of
the serialization format in the case of structured data files such as
nanopublications. We demonstrate how the contents of these files become
immutable, including dependencies on external digital artifacts and thereby
extending the range of verifiability to the entire reference tree. Our approach
sticks to the core principles of the Web, namely openness and decentralized
architecture, and is fully compatible with existing standards and protocols.
Evaluation of our reference implementations shows that these design goals are
indeed accomplished by our approach, and that it remains practical even for
very large files.
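The basic idea of a URI that carries a hash of the content it identifies can be sketched in a few lines of Python. This is a simplified illustration, not the encoding defined by the authors: the suffix layout (a dot followed by the URL-safe Base64 SHA-256 digest) and the function names `make_hash_uri` and `verify_hash_uri` are assumptions made for this example.

```python
import base64
import hashlib


def make_hash_uri(base_uri: str, content: bytes) -> str:
    """Append a URL-safe Base64 SHA-256 digest of `content` to `base_uri`.

    Assumption: the '.<digest>' suffix is a simplified stand-in for the
    real trusty URI format.
    """
    digest = hashlib.sha256(content).digest()
    suffix = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{base_uri}.{suffix}"


def verify_hash_uri(uri: str, content: bytes) -> bool:
    """Recompute the digest of `content` and compare it with the URI suffix."""
    base_uri, _, _ = uri.rpartition(".")
    return make_hash_uri(base_uri, content) == uri


uri = make_hash_uri("http://example.org/dataset", b"col_a,col_b\n1,2\n")
print(uri)
print(verify_hash_uri(uri, b"col_a,col_b\n1,2\n"))  # content matches the URI
print(verify_hash_uri(uri, b"col_a,col_b\n1,3\n"))  # content was altered
```

Anyone who holds the URI can check retrieved content without trusting the server that delivered it, which is what makes the reference verifiable.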
EXISTING SYSTEMS:
· Our approach sticks to the core principles of the Web, namely openness and
decentralized architecture, and is fully compatible with existing standards and
protocols.
· There are a number of existing approaches that include hash values in URIs for
verifiability purposes, e.g. for legal documents.
· This reversibility is needed when an existing trusty URI resource containing
self-references has to be verified.
· We transformed these nanopublications into the formats N-Quads and TriX using
existing off-the-shelf converters.
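Why reversibility matters for self-references can be illustrated with a small sketch. When an artifact mentions its own URI, the hash cannot be computed over the final content, because the final content contains the hash. One possible way around this, assumed here for illustration, is to hash a draft in which the self-reference ends in a placeholder, and to reverse that substitution before verifying. The `PLACEHOLDER` marker, the suffix layout, and the function names are hypothetical and not taken from the original work.

```python
import base64
import hashlib
from typing import Tuple

PLACEHOLDER = "PLACEHOLDER"  # hypothetical marker standing in for the hash


def _code(text: str) -> str:
    """URL-safe Base64 SHA-256 digest of `text` (43 characters)."""
    raw = hashlib.sha256(text.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def finalize(draft: str, base_uri: str) -> Tuple[str, str]:
    """Hash the draft, then replace the placeholder self-references.

    The draft must refer to itself as base_uri + '.' + PLACEHOLDER.
    Returns the final URI and the final content.
    """
    final_uri = f"{base_uri}.{_code(draft)}"
    return final_uri, draft.replace(f"{base_uri}.{PLACEHOLDER}", final_uri)


def verify(final_uri: str, content: str) -> bool:
    """Reverse the substitution, then recompute and compare the hash."""
    base_uri, _, code = final_uri.rpartition(".")
    draft = content.replace(final_uri, f"{base_uri}.{PLACEHOLDER}")
    return _code(draft) == code


draft = '<http://example.org/np1.PLACEHOLDER> <http://purl.org/dc/terms/title> "Example" .'
uri, content = finalize(draft, "http://example.org/np1")
print(verify(uri, content))
```

Verification only works because the substitution can be undone exactly; if the draft could not be recovered from the final content, the hash could not be recomputed.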
DISADVANTAGE:
· The same input always leads to exactly the same hash value, whereas just a
minimally modified input returns a completely different value.
· The downside of such custom-made solutions is that custom-made software is
required to generate, resolve, and check the hash references.
· Here, we propose an approach that could replace such specific ones, thereby
establishing interoperability of systems and standard infrastructure for
creating, resolving, and checking hash references.
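The hash property mentioned in the first point above (identical input gives an identical hash, a minimal change gives a completely different one) can be observed directly with Python's standard `hashlib` module. The example strings and the helper names `sha256_hex` and `differing_bits` are made up for this illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Number of bit positions in which two hex-encoded digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


original = sha256_hex("The dataset contains 100 records.")
repeated = sha256_hex("The dataset contains 100 records.")
modified = sha256_hex("The dataset contains 101 records.")

print(original == repeated)  # same input, same hash value
print(original == modified)  # one character changed, different hash value
print(differing_bits(original, modified), "of 256 bits differ")
```

For a good cryptographic hash function, roughly half of the output bits are expected to flip after any change to the input, however small.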
PROPOSED SYSTEMS:
· We propose trusty URIs containing cryptographic hash values. We show how
trusty URIs can be used for the verification of digital artifacts, in a manner
that is independent of the serialization format in the case of structured data
files such as nanopublications.
· We propose an approach to make items on the (Semantic) Web verifiable,
immutable, and permanent.
· This approach includes cryptographic hash values in Uniform Resource
Identifiers (URIs) and adheres to the core principles of the Web, namely
openness and decentralized architecture.
· Nanopublications have been proposed as a new way of scientific publishing.
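Independence from the serialization format can be pictured as hashing the abstract content (a set of statements) rather than the bytes of a particular file. The sketch below uses two toy formats, a line-based one and a JSON one, as stand-ins for real formats such as N-Quads and TriX. The parsers, the sort-based normalization, and all function names are assumptions for illustration; real RDF data needs additional care, for example for blank nodes and literal escaping.

```python
import hashlib
import json
from typing import Set, Tuple

Triple = Tuple[str, ...]  # (subject, predicate, object)


def parse_lines(text: str) -> Set[Triple]:
    """Toy parser: one whitespace-separated triple per line."""
    return {
        tuple(line.strip().split(None, 2))
        for line in text.splitlines()
        if line.strip()
    }


def parse_json(text: str) -> Set[Triple]:
    """Toy parser: a JSON array of [subject, predicate, object] arrays."""
    return {tuple(t) for t in json.loads(text)}


def content_hash(triples: Set[Triple]) -> str:
    """Hash a sorted, canonical rendering of the triples, not the file bytes."""
    canonical = "\n".join("\t".join(t) for t in sorted(triples))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


as_lines = "ex:b ex:cites ex:c\nex:a ex:cites ex:b\n"
as_json = '[["ex:a", "ex:cites", "ex:b"], ["ex:b", "ex:cites", "ex:c"]]'

# Same statements, different formats and ordering, same hash.
print(content_hash(parse_lines(as_lines)) == content_hash(parse_json(as_json)))
```

Because the hash is computed after normalization, converting a file from one format to another does not invalidate a URI that carries this hash.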
ADVANTAGE:
· Nanopublications can cite other nanopublications via their URIs, thereby
creating complex citation networks.
· Published nanopublications are supposed to be immutable, but there is
currently no mechanism to enforce this.
· It is well-known that even artifacts that are supposed to be immutable tend
to change over time, while often keeping the same URI reference.
CONCLUSION:
We have presented a proposal for unambiguous URI references to make digital
artifacts on the (Semantic) Web verifiable, immutable,
and permanent. If adopted,
it could have a considerable impact on the structure and functioning
of the Web, could improve the efficiency and reliability of tools using Web resources,
and could become an important technical pillar for the Semantic Web, in
particular for scientific data, where provenance and verifiability are
important. Scientific data analyses, for example, might be conducted in the
future in a fully reproducible manner within “data projects” analogous to today’s
software projects. The dependencies in the form of datasets could be
automatically fetched from the Web,
similar to what Apache Maven does for software projects, but decentralized
and verifiable.
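The Maven-like scenario above can be sketched as a resolver that fetches an artifact, checks it against the hash in its URI, and then does the same for every hash-bearing URI the artifact mentions. Since an artifact's hash covers the URIs of its dependencies, checking the root in this way extends to the whole reference tree. Everything below is an assumption made for illustration: the suffix layout, the regular expression, the function names, and the in-memory dictionary `web` that stands in for actual HTTP retrieval.

```python
import base64
import hashlib
import re
from typing import Callable, Dict, Optional


def hash_uri(base_uri: str, content: bytes) -> str:
    """Simplified hash-bearing URI: base + '.' + URL-safe Base64 SHA-256."""
    raw = hashlib.sha256(content).digest()
    code = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    return f"{base_uri}.{code}"


# Matches http(s) URIs that end in a 43-character URL-safe Base64 code.
HASH_URI = re.compile(rb"https?://[^\s<>]+\.[A-Za-z0-9_-]{43}(?![A-Za-z0-9_-])")


def fetch_verified(
    uri: str,
    fetch: Callable[[str], bytes],
    seen: Optional[Dict[str, bytes]] = None,
) -> Dict[str, bytes]:
    """Fetch `uri`, verify it, and recurse into the hash URIs it references."""
    seen = {} if seen is None else seen
    if uri in seen:
        return seen
    content = fetch(uri)
    if hash_uri(uri.rpartition(".")[0], content) != uri:
        raise ValueError(f"content does not match {uri}")
    seen[uri] = content
    for ref in HASH_URI.findall(content):
        fetch_verified(ref.decode("utf-8"), fetch, seen)
    return seen


# In-memory stand-in for the Web (hypothetical example data).
web: Dict[str, bytes] = {}
data_uri = hash_uri("http://example.org/data", b"a,b\n1,2\n")
web[data_uri] = b"a,b\n1,2\n"
project = b"analysis depends on " + data_uri.encode("ascii") + b"\n"
project_uri = hash_uri("http://example.org/project", project)
web[project_uri] = project

resolved = fetch_verified(project_uri, web.__getitem__)
print(sorted(resolved))
```

If any artifact in the tree is altered after publication, `fetch_verified` raises an error instead of silently using the changed content.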