A web-based repository service for
vocabularies and alignments in the
Cultural Heritage domain
Lourens van der Meij
Antoine Isaac
Claus Zinn
• Authors not here
• Projects
Using SW techniques for CH data
Focus on vocabularies and alignments
• Knowledge Organization Systems (KOS) like thesauri
are used to describe cultural objects
• Many different KOSs are used in different institutions
• Merging them into one global vocabulary is neither
realistic nor desirable
Semantic matching as a solution to tackle
semantic heterogeneity
Eliciting needs for a repository
Application cases
• Semantic search and browsing
• (Re-)Indexing
Overall functions
• Uniform access to vocabularies
• Access & management of alignments
Experiment idea: test SW techniques for flexibility, ease
of re-use and linking models & data
Existing RDF best practices: SKOS
animals
    NT cats
cats
    UF domestic cats
    RT wildcats
    BT animals
    SN used only for domestic cats
domestic cats
    USE cats
wildcats
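As a rough illustration, the thesaurus entries above could be expressed with SKOS properties along these lines: NT as skos:narrower, BT as skos:broader, RT as skos:related, UF as skos:altLabel and SN as skos:scopeNote. The example.org namespace, the `slug` helper and the plain-tuple triple representation are assumptions made for this sketch, not part of the presented repository.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/animals/"  # hypothetical namespace, for illustration only

# Thesaurus codes that link two concepts, and codes that attach a literal value.
CONCEPT_LINKS = {"NT": "narrower", "BT": "broader", "RT": "related"}
LITERALS = {"UF": "altLabel", "SN": "scopeNote"}


def slug(term):
    """Turn a term into a URI in the hypothetical EX namespace."""
    return EX + term.replace(" ", "_")


def to_skos(entries):
    """Convert {preferred term: [(code, value), ...]} into (s, p, o) tuples.

    Non-preferred terms (USE entries) get no concept of their own here;
    they are covered by the UF line of the preferred term.
    """
    triples = []
    for term, relations in entries.items():
        subject = slug(term)
        triples.append((subject, SKOS + "prefLabel", term))
        for code, value in relations:
            if code in CONCEPT_LINKS:
                triples.append((subject, SKOS + CONCEPT_LINKS[code], slug(value)))
            elif code in LITERALS:
                triples.append((subject, SKOS + LITERALS[code], value))
    return triples


thesaurus = {
    "animals": [("NT", "cats")],
    "cats": [
        ("UF", "domestic cats"),
        ("RT", "wildcats"),
        ("BT", "animals"),
        ("SN", "used only for domestic cats"),
    ],
    "wildcats": [],
}

triples = to_skos(thesaurus)
```

The sketch does not distinguish URIs from literals in the object position; an RDF library would do that properly.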
Existing RDF best practices: SKOS
Crucial features for a repository
• Vocabulary membership
• Cross-vocabulary mapping properties
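A minimal sketch of these two features, assuming triples are held as plain tuples: vocabulary membership is read from skos:inScheme, and cross-vocabulary links from the SKOS mapping properties. The `ex1:`/`ex2:` identifiers and the helper names are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# SKOS mapping properties, meant for links between concepts of different schemes.
MAPPING_PROPERTIES = {
    SKOS + name
    for name in ("exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch")
}

# Hypothetical data: two small vocabularies and one mapping between them.
triples = [
    ("ex1:cats", SKOS + "inScheme", "ex1:scheme"),
    ("ex1:wildcats", SKOS + "inScheme", "ex1:scheme"),
    ("ex2:chats", SKOS + "inScheme", "ex2:scheme"),
    ("ex1:cats", SKOS + "exactMatch", "ex2:chats"),
    ("ex1:cats", SKOS + "related", "ex1:wildcats"),
]


def concepts_in(scheme, triples):
    """Return the concepts declared as members of the given concept scheme."""
    return {s for s, p, o in triples if p == SKOS + "inScheme" and o == scheme}


def mappings_between(source, target, triples):
    """Return mapping triples going from concepts of `source` to concepts of `target`."""
    src = concepts_in(source, triples)
    tgt = concepts_in(target, triples)
    return [
        (s, p, o)
        for s, p, o in triples
        if p in MAPPING_PROPERTIES and s in src and o in tgt
    ]
```

The intra-vocabulary skos:related link is deliberately left out of the result, since it is not a mapping property.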
Existing RDF best practices: OAEI
From Ontology Alignment Evaluation Initiative
• Mapping cells
– 2 entities being matched
– 1 relation type (any!)
– 1 measure
– Provide hook for annotations
• Alignments between ontologies as set of cells
– Can also be annotated
http://oaei.ontologymatching.org
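The cell model described above can be sketched as a small data structure: a cell holds two entities, one relation, one measure and optional annotations, and an alignment is a set of such cells that can itself carry annotations. The class and field names below are illustrative choices, and the [0, 1] range for the measure is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cell:
    """One mapping cell: two matched entities, a relation and a measure."""
    entity1: str
    entity2: str
    relation: str            # any relation type is allowed at this level
    measure: float = 1.0     # assumed here to lie in [0, 1]
    annotations: tuple = ()  # hook for annotations, as (key, value) pairs

    def __post_init__(self):
        if not 0.0 <= self.measure <= 1.0:
            raise ValueError("measure is expected to lie in [0, 1]")


@dataclass
class Alignment:
    """An alignment between two ontologies, seen as a set of cells."""
    onto1: str
    onto2: str
    cells: set = field(default_factory=set)
    annotations: dict = field(default_factory=dict)

    def add(self, cell):
        """Add a cell; identical cells are stored only once."""
        self.cells.add(cell)


alignment = Alignment("ex1:scheme", "ex2:scheme", annotations={"method": "manual"})
alignment.add(Cell("ex1:cats", "ex2:chats", "=", 0.9))
alignment.add(Cell("ex1:cats", "ex2:chats", "=", 0.9))  # duplicate, ignored by the set
```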
Need for a service API?
• Need for dedicated middleware: some reqs beyond
basic data access are not met by standard SPARQL
– Full-text search on labels
– Ranking of results
– Access control/authentication
– Query complexity control
– LoD data publication strategy
– Other data exchange formats (JSON)
• APIs are also a good way to structure practices in a
domain
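To make some of these requirements concrete, here is a minimal sketch of a middleware-style function that searches concept labels, ranks the hits and returns JSON. The label data, the scoring scheme and the function names are all hypothetical; a real service would rely on a proper text index.

```python
import json

# Hypothetical label index: concept URI -> labels.
LABELS = {
    "ex:cats": ["cats", "domestic cats"],
    "ex:wildcats": ["wildcats"],
    "ex:animals": ["animals"],
}


def score(query, label):
    """Crude relevance score: exact match > prefix match > substring match."""
    q, text = query.lower(), label.lower()
    if text == q:
        return 3
    if text.startswith(q):
        return 2
    if q in text:
        return 1
    return 0


def search_labels(query, labels=LABELS, limit=10):
    """Return a JSON list of matching concepts, best matches first."""
    hits = []
    for uri, names in labels.items():
        best_label = max(names, key=lambda name: score(query, name))
        best = score(query, best_label)
        if best > 0:
            hits.append({"uri": uri, "label": best_label, "score": best})
    # Sort by descending score, then by URI to keep the order deterministic.
    hits.sort(key=lambda hit: (-hit["score"], hit["uri"]))
    return json.dumps(hits[:limit])


results = json.loads(search_labels("cats"))
```

The `limit` parameter is a very simple stand-in for result-size control; access control and query complexity control are not covered by this sketch.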
API design
• API is inspired by both SKOS and OAEI APIs
• But dedicated to simple vocabularies
Not fully-fledged ontologies
• Dedicated to vocabularies and alignments
More than usual terminology repositories
• Alignments are for simple vocabularies
Restricting OAEI-based functions to SKOS mappings
Distributed service architecture
• A service can serve vocabularies, alignments,
or both
Fitting different stakeholder missions/interests
• One service can sit on several others
Distribution is intended as a scalability enabler
Sends reassuring message re. access control
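The idea of one service sitting on several others can be sketched as a simple composite that forwards each call to whichever underlying services support it. The class and method names are hypothetical and do not reflect the actual API of the presented repository.

```python
class VocabularyService:
    """Serves concepts only."""

    def __init__(self, concepts):
        self._concepts = dict(concepts)

    def get_concept(self, uri):
        return self._concepts.get(uri)


class AlignmentService:
    """Serves mappings only, stored as (source, relation, target) tuples."""

    def __init__(self, mappings):
        self._mappings = list(mappings)

    def get_mappings(self, uri):
        return [m for m in self._mappings if m[0] == uri]


class CompositeService:
    """Sits on top of other services and delegates to those that can answer."""

    def __init__(self, *backends):
        self._backends = backends

    def get_concept(self, uri):
        for backend in self._backends:
            if hasattr(backend, "get_concept"):
                concept = backend.get_concept(uri)
                if concept is not None:
                    return concept
        return None

    def get_mappings(self, uri):
        results = []
        for backend in self._backends:
            if hasattr(backend, "get_mappings"):
                results.extend(backend.get_mappings(uri))
        return results


library_a = VocabularyService({"ex1:cats": {"prefLabel": "cats"}})
library_b = VocabularyService({"ex2:chats": {"prefLabel": "chats"}})
mappings = AlignmentService([("ex1:cats", "exactMatch", "ex2:chats")])
portal = CompositeService(library_a, library_b, mappings)
```

Each institution keeps control of its own service, while the composite offers a single access point.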
CATCH service implementation
Plus: many alignments automatically created in
the STITCH project
Driven by “business” interests
E.g., KB has a list of relevant KOSs in its context
[Figure: KOSs relevant in the KB context and the collections they describe.
Groups shown: audience (Doelgroep); domain/discipline classifications (BISAC subject codes, NBC class., DDC Dewey decimal class., NUR, UNESCO class., other classifications); subject thesauri / subject heading lists (Brinkman, GTT, LCSH, RAMEAU, SWD); book collection datasets (KB Deposit Coll., KB Scientific Coll., LC (US Nat. Lib), BnF (French Nat. Lib), DNB (German Nat. Lib)); person/corporation data (LC authority file, Autorités BNF, Personennamendatei, KB Corporatie + Persoon).
Other labels: Biblion, Dutch Public Libraries, Dutch Booktrade, Johan Stapel.
Legend: links show overlap between book collections (thickness indicates degree of overlap); vertical alignment between a collection and KOSs denotes the KOSs being used to describe that collection.]
Deployment (1)
Vocabulary and
alignment browser
Deployment (2)
RAMEAU (French NL) as linked data
• Interlinked with LCSH (Library of Congress)
• Soon to SWD (German NL)
• Using manual mappings from the MACS project
http://stitch.cs.vu.nl/repository
Deployment (3)
STITCH re-indexing prototype (ISWC 2009)
• Plugged onto KB cataloguing system
Lessons learnt
• Middleware is still useful
– To match real application requirements
– To gather communities of practice around new usages
• But SW tools really help building it
• Relevance of existing models like SKOS
– Only one part of SKOS unused (collections) and one extension
required (concept scheme groups)
– Disclaimer: we were involved in SKOS 
• Interest from the Cultural Heritage domain
(Changing landscape of) Issues
• Some basic middleware functions like full-text search
are now tackled by vendor-specific SPARQL ext.
We prefer it that way :-)
• Working out the distributed architecture is difficult
Progress on federated RDF repositories can be useful
• Versioning/changes MUST be addressed at a fine-grained level (concepts)
Maybe the issue with the least mature solutions!
Future work
Already started!
CATCHplus: continuing CATCH efforts, bringing
them even closer to production
New repository and interface
Current work
• Refinement of HTTP API
E.g., Possibility to search for pairs of related concepts, with
constraints
Closer to SPARQL, but still limiting complexity
• Based on Openlink Virtuoso
– Disk-based implementation can handle huge datasets
– Built-in LOD function & full-text features
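One way to picture an API that is "closer to SPARQL, but still limiting complexity" is a function that accepts only a few constrained parameters and compiles them into a single fixed-shape SPARQL query with a capped LIMIT. This is a sketch under stated assumptions: the function name, parameters, allowed relations and the cap of 100 are hypothetical and not taken from the actual HTTP API.

```python
SKOS_PREFIX = "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>"

# Only these SKOS relations may be queried (assumed whitelist).
ALLOWED_RELATIONS = {
    "broader", "narrower", "related",
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}
MAX_LIMIT = 100  # hypothetical hard cap on result size
FORBIDDEN_IN_URI = set('<>"{}|^`\\ ')


def build_pair_query(relation, source_scheme=None, target_scheme=None, limit=20):
    """Build a SPARQL query for pairs of concepts linked by `relation`.

    Optional constraints restrict each side of the pair to a concept scheme.
    """
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unsupported relation: {relation}")
    for scheme in (source_scheme, target_scheme):
        if scheme is not None and FORBIDDEN_IN_URI & set(scheme):
            raise ValueError("invalid scheme URI")

    patterns = [f"?c1 skos:{relation} ?c2 ."]
    if source_scheme:
        patterns.append(f"?c1 skos:inScheme <{source_scheme}> .")
    if target_scheme:
        patterns.append(f"?c2 skos:inScheme <{target_scheme}> .")
    body = "\n  ".join(patterns)

    capped = max(1, min(int(limit), MAX_LIMIT))
    return f"{SKOS_PREFIX}\nSELECT ?c1 ?c2 WHERE {{\n  {body}\n}}\nLIMIT {capped}"


query = build_pair_query("exactMatch", "http://example.org/a", limit=5000)
```

Because clients never send raw SPARQL, the service decides the shape and cost of every query it runs.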
Current work
• Architecture is no longer distributed, for now!
Difficult conflict between requirements
– Some clients had requirements for SPARQL
– Federated SPARQL query is (was?) not yet mature
• Named graphs are being experimented with
– For representing KOS data bundles (file upload)
– For contextualizing triples (one shortcoming of SKOS/RDF)
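The two uses of named graphs listed above can be illustrated with a toy quad store, in which each triple carries the name of the graph it belongs to. The `QuadStore` class and the graph names are hypothetical; a real deployment would use the triple store's own named-graph support.

```python
class QuadStore:
    """Toy store of (subject, predicate, object, graph) quads."""

    def __init__(self):
        self._quads = set()

    def upload(self, graph, triples):
        """Load a bundle of triples (e.g. one uploaded KOS file) into a named graph."""
        for s, p, o in triples:
            self._quads.add((s, p, o, graph))

    def drop(self, graph):
        """Remove a whole bundle at once by dropping its graph."""
        self._quads = {q for q in self._quads if q[3] != graph}

    def triples(self, graph=None):
        """Return all triples, or only those from one graph."""
        return {q[:3] for q in self._quads if graph is None or q[3] == graph}

    def graphs_stating(self, s, p, o):
        """Return the graphs (contexts) in which a given triple is asserted."""
        return {g for s2, p2, o2, g in self._quads if (s2, p2, o2) == (s, p, o)}


store = QuadStore()
store.upload("graph:upload-1", [("ex1:cats", "skos:broader", "ex1:animals")])
store.upload("graph:alignment-a", [("ex1:cats", "skos:exactMatch", "ex2:chats")])
store.upload("graph:alignment-b", [("ex1:cats", "skos:exactMatch", "ex2:chats")])

contexts = store.graphs_stating("ex1:cats", "skos:exactMatch", "ex2:chats")
store.drop("graph:upload-1")
```

Here the same mapping is asserted in two alignments, and the graph name records which alignment each statement comes from.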
Thanks!
http://stitch.cs.vu.nl/repository