Describe Musical Works and Events for the LOD

Abstract

Music is everywhere. Files of recorded music are spread all over the web. Yet, although this knowledge is described, sometimes in great detail, in the information systems of many cultural and media institutions around the world, nothing is harder than finding the history of a musical piece, its composer, its cultural origins, its lyricists and influences, its covers and interpretations, etc.

DOREMUS is a research project that aims to develop tools and methods to describe, publish, interconnect and contextualize music catalogues on the web using semantic web technologies. Its primary objective is to provide common knowledge models and shared multilingual controlled vocabularies. The data modeling working group relies on the cataloguing expertise of three major cultural institutions: Radio France, BnF (French National Library), and Philharmonie de Paris.

FRBRoo is used as a starting point for its flexibility and its separation of concerns between a (musical) Work and an Event (interpretation). We have extended the model with classes and properties specific to musical data, and we have published a set of shared multilingual vocabularies. The result is intended to enable fine-grained descriptions of both traditional and classical music works as well as of the many concerts that are regularly held.

The aim of this tutorial is to provide in-depth explanations of those models and controlled vocabularies and to show different applications consuming those data, such as exploratory search and music recommendation applications.


Tutorial Structure

The tutorial has three parts:

The first part deals with the problems of modeling classical and traditional music while distinguishing the works and their interpretations (e.g. concerts) that generally survive their creators. We will describe conceptual models developed in the cultural heritage and digital library communities, such as CIDOC-CRM[1], FRBRoo[2], BIBFRAME[3] and the recent bib extension of schema.org. We will present the numerous controlled vocabularies that are being developed by the community, covering musical genres, musical instruments, etc. Furthermore, we will present the extensions of FRBRoo for representing musical works and musical events, on which the DOREMUS project is working.
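To make the distinction between a work and its interpretations concrete, here is a minimal Python sketch. The class names Work and Performance, their fields, and the two sample events are hypothetical illustrations; they are not FRBRoo or DOREMUS identifiers, and the events are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Work:
    """An abstract musical work, independent of any single rendition."""
    title: str
    composer: str


@dataclass
class Performance:
    """An event in which a work is interpreted at a given place and date."""
    work: Work
    performers: List[str]
    place: str
    date: str  # ISO 8601 date string


def performances_of(work: Work, events: List[Performance]) -> List[Performance]:
    """Return every event that interprets the given work."""
    return [event for event in events if event.work == work]


# One work, described once.
symphony = Work(title="Symphony No. 5", composer="Ludwig van Beethoven")

# Two invented concerts that both point back to the same work.
events = [
    Performance(symphony, ["Orchestra A"], "Paris", "2015-01-10"),
    Performance(symphony, ["Orchestra B"], "Lyon", "2015-03-02"),
]

if __name__ == "__main__":
    for event in performances_of(symphony, events):
        print(event.place, event.date)
```

Keeping the work separate from its events means that composer and title are stated once, while each concert carries only what is specific to it (performers, place, date).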

The second part deals with the problems of converting, interconnecting and publishing data following those models. The source data come in very different formats and models, ranging from MARC21 records to ad-hoc XML formats. We will demonstrate how the Datalift platform (http://datalift.org/) can be used to convert data to RDF following the models described previously. Next, we will show how OpenRefine (http://www.openrefine.org/) can be used to transform the numerous controlled vocabularies and code lists into SKOS. Finally, we will highlight the key ontology matching problems for interlinking those datasets, with an emphasis on multilingual issues. We will conclude with best practices for publishing such datasets and for consuming them through APIs.
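As a rough idea of what turning a code list into SKOS involves, the following Python sketch maps a small CSV table to Turtle using only the standard library. It is a stand-in for illustration, not how OpenRefine or Datalift perform the conversion. The namespace http://example.org/vocab/instrument/, the column names and the two sample rows are assumptions made for this example.

```python
import csv
import io

# Hypothetical code list: code, French label, English label (illustrative data only).
SAMPLE_CSV = """code,label_fr,label_en
vl,violon,violin
fl,flûte,flute
"""

BASE = "http://example.org/vocab/instrument/"  # assumed namespace, not a DOREMUS URI


def escape(literal: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def code_list_to_skos(csv_text: str, base: str = BASE) -> str:
    """Return a Turtle document with one skos:Concept per CSV row."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{base}> a skos:ConceptScheme .",
    ]
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines += [
            "",
            f"<{base}{row['code']}> a skos:Concept ;",
            f"    skos:inScheme <{base}> ;",
            f'    skos:notation "{escape(row["code"])}" ;',
            f'    skos:prefLabel "{escape(row["label_fr"])}"@fr ;',
            f'    skos:prefLabel "{escape(row["label_en"])}"@en .',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(code_list_to_skos(SAMPLE_CSV))
```

Each code becomes a concept with one language-tagged preferred label per language, which is the basic shape a multilingual controlled vocabulary takes in SKOS.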

The third part of this tutorial will present real-world applications that make use of those models, such as those of the French National Library (BnF), the Spanish National Library (BnE), the British Museum, etc. We will present some exploratory search interfaces and motivate the need for future music recommendation applications.


Motivation for the tutorial

We welcome all Linked Data and Semantic Web researchers and practitioners who are interested in how semantic approaches and LOD publication may be applied to the description of musical works and events on the Web. The tutorial will address subjects of interest for researchers in information extraction, knowledge modeling, multimedia processing, data mining, data science, human-computer interaction, humanities, and web information systems.

The tutorial is of special interest if you are involved in entertainment, cultural heritage or history, as it will cover some of the main ontologies of those domains. This tutorial will bring participants up to date with key conceptual models such as CIDOC-CRM, FRBRoo, BIBFRAME, the recent http://bib.schema.org/ extension and the DOREMUS extension for music.


Tutors

Raphaël Troncy

Raphaël Troncy is an associate professor in the Multimedia Department at EURECOM whose research mainly concerns the use of semantic web technologies in multimedia systems, knowledge modeling for information systems, data integration, knowledge extraction and web science in general. He is the primary investigator of a number of national (e.g., ACAV, Datalift, DOREMUS, ASRAEL, NexGen-TV) and EU projects (e.g., Apps4Europe, LinkedTV, MediaMixer, 3cixty) in the area of this tutorial. He has co-organised numerous workshops, including DeRiVE (2011, 2012, 2013, 2015), SemStats (2013, 2014, 2015) and LiME (2014, 2015), at ESWC, ISWC or WWW. He has also given numerous tutorials at WWW (2008, 2014) and ISWC (2008, 2009).

Konstantin Todorov

Konstantin Todorov is an associate professor at the University of Montpellier and at the LIRMM laboratory, where he is a member of the Open Data Research Group. His research interests lie in the field of web data science, and more precisely in data and knowledge management and integration. His recent publications deal with multilingual ontology alignment, the use of background knowledge for schema and instance matching, and dataset recommendation for data linking.

Jean Delahousse

Jean Delahousse is an ontologist and information systems architect with extensive experience in Semantic Web, Knowledge Engineering, Text Mining and Linked Open Data, with both industrial and research activities in those domains. He has contributed papers to numerous international and national conferences on topics such as Semantic Web, terminologies and ontologies. He is active in many relevant committees and professional groups: he leads the GFII Semantic Web Group, participates in the evolution of semantic web standards, and co-founded the Open Law association to promote business and technical innovations based on legal open data. Jean gives semantic web and LOD training to numerous organisations in Europe. In the DOREMUS research project, he is in charge of creating the pedagogic tools for the project.

Pierre Choffé

Pierre Choffé was a manager of orchestras and music ensembles for 20 years. Building on this experience, he now works at the French National Library (BnF) on the DOREMUS project, which aims to enrich our musical knowledge.

Martin Doerr

Martin Doerr is a Research Director at the Information Systems Laboratory and head of the Centre for Cultural Informatics of the Institute of Computer Science, FORTH. He has been leading the development of systems for knowledge representation and terminology, metadata and content management. He has led or participated in a series of national and international projects for cultural information systems. His long-standing interdisciplinary work and collaboration with the International Council of Museums on modeling cultural-historical information has resulted, among other things, in an ISO standard, ISO 21127:2006, a core ontology for the purpose of schema integration across institutions.

Chrysoula Bekiari

Chrysoula Bekiari is an R&D engineer in the Information Systems Laboratory and the Centre for Cultural Informatics of the Institute of Computer Science, FORTH. She holds a Master of Arts in Computer Science from Queens College of the City University of New York and a degree in Mathematics from the University of Patras. Her research interests include knowledge engineering and conceptual modeling, database design, heterogeneous and federated databases, data models, information retrieval, management and museum information systems, resource planning and monitoring.

[1] https://en.wikipedia.org/wiki/CIDOC_Conceptual_Reference_Model

[2] https://en.wikipedia.org/wiki/FRBRoo

[3] https://en.wikipedia.org/wiki/BIBFRAME