1 Linked Data for Production: Stanford projects. Love from afar: describing music audio & video recordings in BIBFRAME, the Performed Music Ontology & beyond. Good afternoon, and thank you for having me here and for being here yourselves. Today I’m here to talk to you about linked data in general, and in particular about the BIBFRAME 2.0 ontology and the Performed Music Ontology (PMO), a project to extend BIBFRAME to better support the description of performed music materials. Some of you, I know, will have heard parts of this before, but I promise there will be something new as well. For the first time in talking about this, I am going to use a real-life example: the opera L’Amour de loin (Love from afar in English) by the Finnish composer Kaija Saariaho, thus the title. With this focus, I hope to illustrate some of the refinements and extensions PMO brings to the cataloging of performed music using BIBFRAME, and the benefits that linked data may bring to discovery in a linked-data-driven library catalog. Nancy Lorimer, OLAC meeting, ALA 2017
2 L’Amour de loin (Love from afar): opera in 5 acts composed by Kaija Saariaho (2000); libretto by Amin Maalouf; libretto based on the semi-autobiographical poem “La vida breve” by the troubadour Jaufré Rudel; libretto has a published Turkish translation (from French); 2 recorded performances, 1 on video, 1 as a sound recording (both SACD & streaming); full score and vocal score, both published by Chester; has derivative work: Cinq reflets de L’Amour de loin. So first, an overview of our opera to put things in perspective. L’Amour de loin is a 5-act opera composed by Kaija Saariaho to a French libretto by Amin Maalouf for the Salzburg Festival. The libretto is based on a poem called “La vida breve” by the troubadour Jaufré Rudel, and is also published separately from the opera. It even has a published Turkish translation. There are two recordings of Love from afar: a video recording of the 2005 performance by the Finnish National Opera, and an SACD recorded by the Rundfunkchor Berlin and the Deutsches Symphonie-Orchester Berlin. The score is published by Chester and is available both as a full score and as a vocal score. There is also one derivative work, “Cinq reflets de L’Amour de loin”, also by Saariaho.
3 BIBFRAME 2.0: a very quick introduction… So, a short refresher on BIBFRAME. The Library of Congress initiated the development of BIBFRAME as a successor to the MARC format. BIBFRAME is designed to enable the discovery of bibliographic information on the web and in the broader networked world. It uses the Resource Description Framework (RDF), a data model consisting of statements expressed as triples. In RDF, every entity (such as author, subject, place, etc.) ideally should have a corresponding unique identifier in the form of a URI. Thus every entity and every relationship is expressed as a URI, not a text string. The text string can change, but the link will remain. BIBFRAME is loosely based on the Functional Requirements for Bibliographic Records (FRBR) model and is centered on three primary entities: Works, Instances, and Items. Other entities are defined in relation to one of those entities. Works are basically equivalent to the FRBR expression, whilst Instances are equivalent to the FRBR manifestation. There is no equivalent to the FRBR Work in the BIBFRAME conceptual model, but as we will see later, it can still be expressed. One aspect of BIBFRAME that moves beyond FRBR is the addition of bf:Event, which became an independent class with the publication of BIBFRAME 2.0. More about that later.
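A minimal sketch of the "links, not strings" idea, using plain Python tuples as (subject, predicate, object) triples rather than an RDF library. The bf: namespace is the published BIBFRAME one; every example.org URI and both helper functions are placeholders invented for illustration.

```python
# Triples are (subject, predicate, object). Only the BIBFRAME and RDFS
# namespaces are real; the example.org URIs are invented for this sketch.
BF = "http://id.loc.gov/ontologies/bibframe/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"

WORK = EX + "work/amour-de-loin-video"
INSTANCE = EX + "instance/amour-de-loin-dvd"
ITEM = EX + "item/copy-1"

graph = [
    (INSTANCE, BF + "instanceOf", WORK),   # Instance -> Work
    (ITEM, BF + "itemOf", INSTANCE),       # Item -> Instance
    (WORK, RDFS_LABEL, "L'Amour de loin. 2004"),
]


def objects(triples, subject, predicate):
    """Return every object attached to subject by predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def relabel(triples, subject, new_label):
    """Return a new graph in which the subject's label string is replaced."""
    kept = [t for t in triples if not (t[0] == subject and t[1] == RDFS_LABEL)]
    return kept + [(subject, RDFS_LABEL, new_label)]


renamed = relabel(graph, WORK, "Love from afar. 2004")
# The label string changed, but the URI-to-URI links are untouched.
print(objects(renamed, WORK, RDFS_LABEL))
print(objects(renamed, INSTANCE, BF + "instanceOf") == [WORK])
```

Because the Instance points at the Work's URI rather than at its title, changing the display label requires no change to any other statement.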
4 What BIBFRAME is and is not. Is: “an initiative to evolve bibliographic description standards to a linked data model”; an OWL ontology for use by the library community; a “core” ontology that covers the basics of bibliographic description as currently conceived; a potential successor to MARC for cataloging; a common framework for the creation of linked data by the library community. Is not: a “replacement” for MARC; a one-stop shop for linked data creation (it requires additional vocabularies and requires extensions for domains); your only choice… linked data is all about choice! There have been a lot of descriptions of BIBFRAME, of what it is and does, both laudatory and derogatory. So let’s summarize what is true and maybe not so true. [do trues] To note: several large libraries around the world have converted their library data to BIBFRAME or are moving toward doing so: the Swedish National Library, the Hungarian National Library & Museum, the National Library of Spain, and some libraries in Italy. [is nots] BIBFRAME is intended as a lightweight framework, not a fully developed ontology. To use BIBFRAME, the framework requires the “filling out” of vocabularies and relationships, either drawing on other ontologies or developing new subclasses or vocabularies within BIBFRAME. This is what we are doing in the Performed Music Ontology project.
5 Linked Data for Performed Music: the Performed Music Ontology. Develop a BIBFRAME-based ontology extension for performed music in all formats; domain-specific enhancements and/or extensions of BIBFRAME for use by the library community as a common standard; establish a model by which these standards can be created, endorsed, and maintained by the community; do this through partnering with domain communities and the PCC. Linked Data for Performed Music is a sub-project of Linked Data for Production, a Mellon-funded grant led by Stanford University Libraries in collaboration with 5 other research libraries: Columbia, Cornell, Harvard, Princeton, and the Library of Congress. While the overall focus of the grant is to lay the groundwork for moving technical services workflow into a linked data environment, a number of the subprojects concentrate on ontology development for specific domains. Stanford decided to concentrate on performed music. The primary goal of our project has been to develop a performed music extension to BIBFRAME, covering description of recorded sound from wire recordings to streamed audio to music video. Using BIBFRAME as a core ontology, we are recommending domain-specific vocabularies, enhancements and/or extensions to BIBFRAME for use by the library community as the initial basis for a common standard. As we do this, we hope to establish a model whereby similar standards can be created, endorsed, and most importantly maintained by the library community. Clearly, the ontology will change and develop over time, but we hope to create here a strong base for development. Because we are emphasizing a particular domain, community is of paramount importance to this project. 
We can develop a beautiful ontology, but it is of no use if it is not acceptable to the domain community who might use it. Stanford has thus been partnering with domain communities (the Music Library Association and the Association for Recorded Sound Collections) while keeping the Program for Cooperative Cataloging (PCC) in the loop.
6 Why ontologies? “An ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. It is thus a practical application of philosophical ontology, with a taxonomy.” (Wikipedia, viewed January 2, 2017). Note: this is an intellectual as well as a structural framework. Before we embark on how we’re doing in this wild adventure, let’s take a step back and think about why we are doing this. What are ontologies anyway, and why do we need them? An ontology is “…a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. It is thus a practical application of philosophical ontology, with a taxonomy.” While this is a Wikipedia definition, it does sum things up nicely. An ontology is where you name and define the entities contained in your domain and the relationships among them. As a practical application of philosophical ontology, creating an ontology is an intellectual endeavor, with choices based on structural and practical knowledge of the domain being modeled. A linked data ontology is built on RDF, RDFS, or OWL (the Web Ontology Language), but these can only express very basic, highly abstracted relationships: this is a subclass of that; this entity is related to that entity. An ontology brings in the specific vocabularies and relationships that define your domain, providing structure and vocabulary (or taxonomy, as Wikipedia puts it).
7 Metadata is the operating system for content: it defines the ecosystem in which content works. @DublinCore …and that ecosystem is an ontology. Or, in perhaps a more user-friendly fashion, we could say this. I would add that the “ecosystem” is your ontology.
8 [Diagram labels: L’Amour de loin. 2004, bf:Video (= LRM Work + Expression); L’Amour de loin (Love from afar); copy 1; performance 2004; Kaija Saariaho*, bf:Person; Operas, bf:GenreForm; Deutsche Grammophon, bf:Publisher; DVD-Video, bf:Carrier; Stanford Libraries. * + librettist, performers, directors, conductors, designers, engineers, etc.] To get back to our opera. With the BF basic diagram, I’ve mapped each of the bubbles with data about the opera DVD. Starting at the bottom, we have an item: we’ll call it copy 1. At the instance level, we have the video publication, and at the work level the expression of the work as bf:Video. It is important to remember that the bf:Work is equivalent to the expression level in FRBR or IFLA-LRM. There is no modelled equivalent to the FRBR work, though we can get around that later. I’ve given them all rather random names; these labels would depend on your cataloging standard or lack thereof. For the smaller bubbles [click], we have entities that describe or extend the fundamental classes. Copy 1 is held by Stanford with a specific barcode; the DVD is published by Deutsche Grammophon and is a DVD-Video (a bf:Carrier). It has a genre, “Operas”, and has a contributor, Kaija Saariaho, a bf:Person (as well as many other persons and some corporate bodies). Finally, we have that event [click]. It is linked to a work, the link being that an event is the “event content of” a work. That is the performance that is recorded in this video. Unlike our other metadata, the event as its own entity is not available in MARC cataloging and so opens new ground. We’ll talk about that more in a moment.
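The chain just described (item, instance, work, event) can be sketched as a small graph walk. This is only an illustration under stated assumptions: prefixed names stand in for full URIs, bf:itemOf, bf:instanceOf, bf:eventContent and bf:heldBy are BIBFRAME 2.0 properties, and the ex: identifiers and the follow helper are invented here.

```python
# Prefixed names stand in for full URIs; the ex: identifiers are invented.
graph = {
    ("ex:copy1", "bf:itemOf", "ex:dvd"),
    ("ex:copy1", "bf:heldBy", "ex:stanfordLibraries"),
    ("ex:dvd", "bf:instanceOf", "ex:video"),
    ("ex:video", "bf:eventContent", "ex:performance2004"),
}


def follow(triples, start, path):
    """Follow a list of predicates from start; return the node reached, or None."""
    node = start
    for predicate in path:
        # Sort so the choice is deterministic if several objects match.
        matches = sorted(o for s, p, o in triples if s == node and p == predicate)
        if not matches:
            return None
        node = matches[0]
    return node


# From the physical copy on the shelf to the performance it records.
event = follow(graph, "ex:copy1", ["bf:itemOf", "bf:instanceOf", "bf:eventContent"])
print(event)
```

Each hop is a separate statement, so the event can be reached from the copy without the event ever being mentioned in the item's own description.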
9 Extending BIBFRAME in PMO: better accommodation of performed music metadata. So, now that we’ve reviewed, we’ll move on to the Performed Music Ontology. In extending an ontology such as BF, there are several things you can do: first, you can simply choose or create vocabularies that provide individual values for your RDF statements; second, you can add classes, subclasses, properties and/or subproperties within the confines of the ontology as it stands; finally, you can literally extend the ontology by adding modeling in new areas, sometimes replacing something already there. PMO does all three of these.
10 Vocabularies: why? To provide specific relationships not already in BIBFRAME; to provide values for the objects of triples. Example: bf:FileType has no subclasses/individuals in BIBFRAME; want to add in: First, vocabularies. Vocabularies are really more part of an application profile than of the ontology itself, but it helps to explore possibilities while developing that ontology. As I mentioned earlier, we need more vocabulary to provide values for the objects of triples in BF, since it does not provide them itself. It is all very well to say that a work has a file type, but we want to know what that file type is. As individual members of the class “File type”, these values are known as “individuals” or “instances” of the class.
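A minimal sketch of what "individuals of a class" means in practice: each vocabulary term gets a URI and is typed as a member of bf:FileType, so a triple's object can be checked against the vocabulary. The ex: term URIs, their labels, and both helper functions are placeholders made up for illustration, not terms copied from a real registry.

```python
# A hypothetical file-type vocabulary: each term is an individual of bf:FileType.
FILE_TYPES = {
    "ex:fileType/audio": "audio file",
    "ex:fileType/video": "video file",
    "ex:fileType/text": "text file",
}


def type_triples(vocabulary, class_uri):
    """Declare every term in the vocabulary as an individual of class_uri."""
    return [(term, "rdf:type", class_uri) for term in sorted(vocabulary)]


def is_valid_value(triples, value, class_uri):
    """True only if value is a declared individual of class_uri."""
    return (value, "rdf:type", class_uri) in triples


vocab_graph = type_triples(FILE_TYPES, "bf:FileType")

# A term URI from the vocabulary is accepted; a bare text string is not.
print(is_valid_value(vocab_graph, "ex:fileType/video", "bf:FileType"))
print(is_valid_value(vocab_graph, "video file", "bf:FileType"))
```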
11 Addition of vocabularies. RDA vocabularies: bf:AppliedMaterial, bf:BaseMaterial, bf:Carrier, bf:Content, bf:EncodingFormat, bf:FileType, bf:MusicFormat, bf:MusicNotation, bf:TactileNotation, bf:GrooveCharacteristics, bf:PlaybackChannels, bf:PlaybackCharacteristic, bf:RecordingMedium. id.loc.gov vocabularies: bf:Role, pmo:MediumOfPerformance, bf:GenreForm. RDA unconstrained properties: work relationship properties. New vocabularies: bf:PlayingSpeed, pmo:DiscCutting, bf:TapeConfig, bf:EncodingFormat, generalist additions to bf:Carrier. Also listed: bf:RecordingMethod, bf:TrackConfig, bf:BroadcastStandard, bf:SoundContent, bf:VideoFormat. For the most part, PMO suggests using the RDA terms as found in the RDA registry to serve as individual members of various BF classes. We chose the RDA vocabularies because they cover a large number of BF classes rather than a few, and because their relative simplicity in modeling makes them easy to reuse. A number of vocabularies found in id.loc.gov are also encouraged. We also suggest the use of the RDA unconstrained properties to relate bf:Works to one another. Nowhere else is there an equally rich and developed vocabulary. Finally, we created a few new vocabularies ourselves where they were not otherwise available: playing speed, disc cutting method, tape configuration, encoding format, and a set of colloquial terms for bf:Carrier.
12 New classes/subclasses/properties. Subclasses of bf:Identifier: bf:AudioTake, bf:Gtin14Number, bf:MusicDistributorIdentifier, pmo:VideoGamePlatformIdentifer, pmo:AllmusicIdentifier, pmo:MusicBrainzIdentifier, pmo:DiscogsIdentifier, pmo:ImdbIdentifier. Other classes/properties: pmo:Tempo, pmo:DiscCuttingType, pmo:ThematicCatalogStatement, pmo:OpusNumberStatement, pmo:KeyModeStatement, pmo:phonogramCopyrightDate. Subclasses & properties for: medium of performance, events, works. Another step we took was to look through the ontology and add new classes, subclasses and/or properties that are required for cataloging performed music but were missing. For instance, we added several new subclasses to bf:Identifier: :AudioTake, :Gtin14Number (a number used by the publishing industry in describing packaging), :MusicDistributorIdentifier (recently defined in the MARC21 format), :VideoGamePlatformIdentifier, and links to Allmusic, MusicBrainz, Discogs, and IMDb. :AudioTake, :Gtin14Number and :MusicDistributorIdentifier have since been adopted into BF, and so have been removed from PMO. Still music-oriented, we also added classes for Tempo and Disc cutting type, and changed bf:musicThematicNumber, bf:musicOpusNumber and bf:musicKey to object properties, with light modeling of their various parts. Besides the classes listed, we are also currently working on classes and properties to support modeling of events, medium of performance, and works.
13 These additions and the ones that follow are recorded in Protégé, a well-known ontology editor based at Stanford that originated in the biomedical field. You can access the ontology through the web version (WebProtégé), though you need to register first. And as of just this week, the ontology is also available on GitHub in the LD4P space, in a rather less reader-friendly format.
14 Addition of: pmo:Performance (subclass of bf:Event); pmo:performanceOf. Other subclasses of bf:Event: Audition, Ceremony, Concert, Benefit concert, Command concert, ConcertSeries, ConcertTour, MasterClass, Performance, FirstPerformance, LivePerformance, OpenMicPerformance, RecordingSession, Rehearsal. Now on to modeling. A big part of our work in the last few months has been modeling the relationships of events, works, medium, and performers with one another. This entails a complex set of relationships that we were never able to express satisfactorily in a MARC record. Events are a new thing for catalogers to model. Events in MARC tend to be relegated to subjects or notes; really only festivals and meetings get access points. BF does include events in its modeling, but it is very barebones, as you can see above. A bf:Event can be the event content of a bf:Work, or inversely, a bf:Work can have Event Content in an Event. This is a bit too generic to be really useful, so one of the first things we did was to create subclasses of bf:Event, and of bf:contentOf. For music, the primary subclasses of bf:Event are pmo:Performance and pmo:RecordingSession. [click] With our sample work today, we are talking about a video recording of a live performance, and so will use pmo:Performance. A performance is thus the event content of a bf:Video.
15 The basic triad of relationships: pmo:Performance, bf:Work, bf:Audio. Added properties: pmo:hasRecording/recordingOf; pmo:realizedIn/realizationOf. In our previous screen, you saw an individual relationship of bf:Video to a performance, here tipped on its side somewhat. They are linked by the property “pmo:hasRecording” or “pmo:recordingOf”. Now, this basic relationship works if the performance is an improvisation or some other non-work-based performance. However, most classical and popular music performance is a performance of a previously existing work. Since we don’t always know the type of work or its specific expression, we will use the generic bf:Work to represent the work. It is linked to the performance by the properties pmo:performanceOf and pmo:hasPerformance. Between the generic work and the video work, the link can be bf:hasExpression/expressionOf if you are working in a FRBR environment, or pmo:realizedIn/realizationOf if not.
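The triad can be sketched as three asserted statements plus their inverses. This is an illustrative sketch only: the property pairs are the ones named above, while the ex: node names and the with_inverses helper are invented, and the inverse direction is derived here by a simple lookup rather than by an OWL reasoner.

```python
# Property pairs named in the talk; each maps to its inverse in both directions.
INVERSES = {
    "pmo:hasRecording": "pmo:recordingOf",
    "pmo:performanceOf": "pmo:hasPerformance",
    "pmo:realizedIn": "pmo:realizationOf",
}
INVERSES.update({v: k for k, v in list(INVERSES.items())})

# The three links of the triad, each asserted in one direction only.
asserted = {
    ("ex:performance", "pmo:hasRecording", "ex:video"),
    ("ex:performance", "pmo:performanceOf", "ex:work"),
    ("ex:work", "pmo:realizedIn", "ex:video"),
}


def with_inverses(triples):
    """Return the triples plus the inverse statement for every known property."""
    closed = set(triples)
    for s, p, o in triples:
        if p in INVERSES:
            closed.add((o, INVERSES[p], s))
    return closed


triad = with_inverses(asserted)
for triple in sorted(triad):
    print(triple)
```

With the inverses in place, the triad can be entered from any of its three corners: the video knows its performance and its work, and the work knows its performances and realizations.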
16 Performance: Name: Performance of Kaija Saariaho’s Love from afar. 2004; Date: September; Place: Helsinki, Finland; Performers: Gerald Finley (Jaufré Rudel, Prince de Blaye), Dawn Upshaw (Clémence, Comtesse de Tripoli), Finnish National Opera. Orchestra, Finnish National Opera. Chorus; Conductor: Esa-Pekka Salonen; Director: Peter Sellars; Performance of: Amour de loin; Has videorecording: Amour de loin. Video. 2005. An event may have several attributes: a name, a place, a date, as well as various contributors. On the left is what a data entry screen for an event might look like. I created this template using CEDAR, a template creator developed by the Stanford BioPortal group, who specialize in biomedical ontologies. While it looks like a simple database record, the template and the data together create a linked data graph. Of course, you wouldn’t want to display it like that, but perhaps more like what is on the right. To note: [click] at the bottom of each is a pair of links, from the event to the work and from the event to the videorecording, showing the triadic link of the previous page.
17 [Links shown: pmo:performanceOf, pmo:hasRecording.] Work: Title: Amour de loin; AKA: Love from afar; Composer: Saariaho, Kaija; Librettist: Maalouf, Amin; Genre: Operas; Subject: Jaufré Rudel; Commissioned by: Salzburg Festival, Théâtre du Châtelet; Libretto: Maalouf, Amin. Amour de loin (Libretto); Performance(s): World premiere performance of Kaija Saariaho’s Love from afar. 2000; Performance of Kaija Saariaho’s Love from afar. 2006; Performance of Kaija Saariaho’s Love from afar. 2008; Audio recordings: Love from afar. 2009; Video recordings: Love from afar. 2006. Video work: Title: Amour de loin. Video. 2006; Genre: Live performances, Filmed operas; Performers: Gerald Finley, baritone (Jaufré Rudel, Prince de Blaye), Dawn Upshaw, soprano (Clémence, Comtesse de Tripoli), Finnish National Opera. Orchestra, orchestra, Finnish National Opera. Chorus, chorus; Conductor: Esa-Pekka Salonen; Director: Peter Sellars; Recording of: World premiere of Kaija Saariaho’s Love from afar. 2000; Realization of: Amour de loin (Work). These links would lead directly to graphs representing the work and the video. Remember that while I have presented these to you as text with labels, underneath they consist of sets of URI triples.
18 Performance: Name: Performance of Kaija Saariaho’s Love from afar. 2004; Date: September; Place: Helsinki, Finland. Video work: Title: Amour de loin. Video. 2006; Genre: Live performances, Filmed operas; Performers: Gerald Finley, baritone (Jaufré Rudel, Prince de Blaye), Dawn Upshaw, soprano (Clémence, Comtesse de Tripoli), Finnish National Opera. Orchestra, orchestra, Finnish National Opera. Chorus, chorus; Conductor: Esa-Pekka Salonen; Director: Peter Sellars. Work: Title: Amour de loin; AKA: Love from afar; Composer: Saariaho, Kaija; Librettist: Maalouf, Amin. Or, to bring back our triadic diagram for a moment, with the actual work in question …
19 Medium of performance: requirements. Medium stated in the work, by the composer, arranger, etc.; medium as actually performed in a performance; distinguish between individual mediums and ensembles; distinguish specific roles (solo, ad lib, etc.); describe small ensembles both as an ensemble & as individual instruments; provide a count of each medium, and of the whole; link contributors to a particular medium; link contributors to a particular musical part (violin 1); be flexible enough to include media outside of the music sphere (actors). So those were the basics of events in PMO. Now a little about medium of performance. For music people this is, of course, a major point of interest. Any of you who follow the continuing saga of the MARC 382 field will understand. Our medium of performance modeling attempts to cover everything stated in a 382 and then some: relating an ensemble to its individual instruments, for instance, connecting performers to their mediums, and adding dramatic roles as well. As you can imagine, this can become complex very fast and would take a long time to explain. I’ll stick to a very narrow example here.
20 Modeling of the soprano soloist in L’Amour de loin. Additions: pmo:DramaticRole, pmo:hasDramaticRole, pmo:mediumType. Here we have a model representing a medium of performance with a singer, in this case Dawn Upshaw. If you look at the left, you can see that bf:Audio has a pmo:PerformedMedium. That performed medium has several performed medium parts (e.g., a soprano, a baritone, chorus, orchestra). Here I’ve only done one, the soprano, which is an individualVoice, a subclass of pmo:MediumOfPerformance. From the performer side, the bf:Audio has a bf:Contribution. That contribution has the following: an agent (Dawn Upshaw, a bf:Person); a bf:role (singer); and a pmo:performsMOP (soprano). This brings the contributor and her instrument (in this case, voice) together. To this is added the pmo:performanceType, which states that it is a “solo” part, and finally, because this is opera, a pmo:DramaticRole: Clémence, Comtesse de Tripoli.
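The performer side of this model can be sketched as a handful of triples hanging off one contribution node. This is a rough illustration under assumptions: the ex: identifiers and both helper functions are invented, and attaching pmo:performanceType and pmo:hasDramaticRole directly to the contribution is one reading of the diagram, not a confirmed detail of PMO.

```python
# One contribution node ties the agent, role, medium, and dramatic role together.
graph = [
    ("ex:audio", "bf:contribution", "ex:contribution1"),
    ("ex:contribution1", "rdf:type", "bf:Contribution"),
    ("ex:contribution1", "bf:agent", "ex:dawnUpshaw"),
    ("ex:contribution1", "bf:role", "ex:singer"),
    ("ex:contribution1", "pmo:performsMOP", "ex:soprano"),
    ("ex:contribution1", "pmo:performanceType", "ex:solo"),         # assumed placement
    ("ex:contribution1", "pmo:hasDramaticRole", "ex:clemence"),     # assumed placement
]


def value(triples, subject, predicate):
    """Return the first object for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def performers_of(triples, work, medium):
    """List (agent, dramatic role) pairs for contributions performing the medium."""
    result = []
    for s, p, contribution in triples:
        if s == work and p == "bf:contribution":
            if value(triples, contribution, "pmo:performsMOP") == medium:
                result.append((
                    value(triples, contribution, "bf:agent"),
                    value(triples, contribution, "pmo:hasDramaticRole"),
                ))
    return result


print(performers_of(graph, "ex:audio", "ex:soprano"))
```

This is the kind of question a flat list of performers and a separate list of instruments cannot answer: who sang soprano on this recording, and as which character.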
21 Modeling of the soprano soloist + orchestra in L’Amour de loin. As I said, that was one part. Here is the same graph with the orchestra added in. You can see why I’m not giving you the medium of performance for the entire opera!
22 [Diagram labels: Work (Title: Amour de loin; AKA: Love from afar; Composer: Saariaho, Kaija; Librettist: Maalouf, Amin); Text work: Libretto; Libretto in Turkish; Notated Music work: Reflets sur L’Amour de loin; Video work (Title: Amour de loin); Performance (Name: Performance; Date: September); Recording session (Name: Recording session; Date: October); Name: World premiere; Date: August 15; Date: March; Audio work; Notated Music work (Title: Amour de loin); Vocal score.] Finally, one last aspect that we touched on lightly earlier. You’ll remember that every recording in BIBFRAME is a bf:Audio or a bf:Video. I’ve also listed here 2 performances and 2 recording sessions: the one performance recorded as the video, the two recording sessions that somehow make up the bf:Audio, and at the top, the world premiere, which took place in August 2000 but for which no recording was released. We’ll add to that 2 notated music works [click] (the full score and the vocal score derived from it), a related work, “Reflets sur L’Amour de loin”, and [click] the separately published libretto and its translation. You can immediately see there is a slight problem: while these are all events or works based on or related to L’Amour de loin, there is nothing that brings them all together. You could separately link everything with a “relatedTo” type of property, but that could be very work-intensive. There is a better solution. Remember that generic work in the event triad? If we bring that back [click], we can connect all these works together. The performances and recording sessions [click] are linked to the bf:Work by pmo:hasPerformance, and the video, audio, and notated music works [click] by bf:hasExpression (or bf:hasRealization). That leaves the related works to be connected by other properties.
23 Bringing it back to a basic diagram, we see that the bf:Work connects the performance (only one shown here), the video, the audio, and the notated music. As a bonus, data that repeats in all the work subclasses can instead be written once in the generic work, since the subclasses inherit the attributes of the work. Now you may be saying this looks like a FRBR work, and indeed, if you are working in a FRBR (or IFLA-LRM) environment, that is what it is. In other environments, however, the implementation may be somewhat different. What this does show is how necessary this generic work is in some circumstances. Remember, though, that it is optional: there is no need to catalog it if it is not necessary to do so. It just so happens that in music it is usually necessary.
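A small worked comparison of the two linking strategies, as a sketch with invented names: linking every resource to every other with a "relatedTo"-style property, versus linking each one once to the generic work. The resource list is a simplification of the diagram (one recording session instead of two), and reachable_via_hub is a hypothetical helper.

```python
from itertools import combinations

# Simplified set of resources related to the opera (names are illustrative).
resources = ["video", "audio", "fullScore", "vocalScore",
             "premiere", "performance", "recordingSession"]

# Strategy 1: relate every pair directly; n * (n - 1) / 2 links.
pairwise_links = list(combinations(resources, 2))

# Strategy 2: relate each resource once to the generic work; n links.
hub_links = [("genericWork", r) for r in resources]


def reachable_via_hub(links, start):
    """Everything that shares a hub with start, found by going up and back down."""
    hubs = {hub for hub, r in links if r == start}
    return sorted(r for hub, r in links if hub in hubs and r != start)


print(len(pairwise_links), len(hub_links))   # 21 links versus 7
print(reachable_via_hub(hub_links, "video"))
```

With seven resources the pairwise approach needs 21 links and the hub approach 7, and every new resource adds only one link to the hub rather than one link per existing resource.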
24 Let’s then add in those related works: the libretto, which is based on an early poem, and the related notated music work. To link these, I’m using the RDA unconstrained properties. Remember that BF does not give you all the vocabulary you need; you have to make use of other existing vocabularies, and it is good practice to do so.
25 Discovery… Vale frowned. “Are you telling me this place [the Library] isn’t properly organized?” “It’s extremely organized”, Irene said defensively. “It’s just not very helpfully organized, from our point of view. Don’t worry. Nobody’s ever been lost. Well, not permanently.” Cogman, Genevieve. The Burning Page. An Invisible Library novel. Reprint edition. Ace: 2017. Now, in our last few minutes, I’d like to make a few comments about discovery using linked data, and there is nothing like turning to semi-popular fiction to give a perspective. This first quote, from Genevieve Cogman’s Invisible Library series, describes a first encounter with the great Library, which aims to house all the literature, and its unique variants, from the entire multiverse. The trio (at the end of an adventure) end up lost in the Library, having entered at a random point. The newcomer, a detective named Vale, asks …, to which Irene, the librarian, replies. That is the crux of the matter: our libraries are organized, but not necessarily in the way our patrons want to use or enter them.
26 Discovery… In the nineteenth century Brakebills had appointed a librarian with a highly Romantic imagination who had envisioned a mobile library in which the books fluttered from shelf to shelf like birds, reorganizing themselves spontaneously under their own power in response to searches. For the first few months the effect was said to have been quite dramatic… But the system turned out to be totally impractical. The wear and tear on the spines alone was too costly, and the books were horribly disobedient… Grossman, Lev. The Magicians: a novel. Magicians Trilogy. Penguin Books, 2010. Now a second quote, from Lev Grossman’s The Magicians, describing the library at the Brakebills College for Magicians. Now this is more like it!! An attempt to adapt the library organization to user needs! Unfortunately, it doesn’t work with physical objects. But with data, and particularly linked data, this kind of fine tuning can become a reality with a good search interface.
27 Here is an experimental example of such an interface. This is Casalini Libri’s Share-Virtual Discovery Environment. Casalini has converted 100,000 bib records from several libraries into BF, reconciled all the URIs, and created this flexible presentation interface. Here I have searched for Kaija Saariaho. You get her photo, various name forms, and links to various national identifiers, including Wikidata, which take you to more information, plus some locally added information: a Wikipedia article and articles in Google (on the right).
28 Further down we have a works list, any of which can be selected.
29 You may also start with the work. Again you get different forms of the title, and links to various contributors and to publications (here only one). Now if we extended this catalog, we could add in searching by event and by types of works (Audio, etc.), plus other facets such as genre. The point of entry to the catalog is whichever one the user has chosen, but it always leads out to the other elements. Instead of the materials re-sorting themselves physically, the metadata re-sorts itself on the screen. This is the type of thing possible with linked data. And that also ends my talk. I hope you’ve got something out of it and have lots of questions.
30 Thank you!! PMO on Duraspace:https://wiki.duraspace.org/display/LD4P/Performed+Music+Ontology PMO on GitHub: https://wiki.duraspace.org/pages/viewpage.action?pageId= Linked Data for Production site: Nancy Lorimer Stanford University
31 Modeling of Violins 1 & 2 of a string quartet linked to bf:Contribution