How students use a single search box to search for music materials

2 The "Black Box": How students use a single search box to search for music materials. Kirstin Dougan, Music and Performing Arts Library, University of Illinois at Urbana-Champaign. MLA 2017, Orlando, FL.

3 Native interfaces > one box vs. one box ≠ native interfaces
How we frame discussion of the single search box in the current environment is important. Librarians are expert searchers and know the ins and outs of the search features in the various research tools we use. We know when it's better to use the library catalog or Grove or RILM to find something, and how to search efficiently. It's therefore not surprising that I often hear the argument that "one-box" (i.e., Google-style) searching in a discovery layer or federated search tool is less powerful and precise than searching directly in the native interfaces of individual catalogs and indexes. This is of course true, but by and large our patrons are not expert searchers. Previous research shows that the lure of the single search box is strong, and students aren't necessarily using very sophisticated methods when searching in individual tools either. We should also remember that the benefits of discovery layers and federated search can outweigh the negatives at certain stages of the research process.

4 Research question: Does the nature of music materials lead to quantifiable patterns in how patrons search for them, and does it differ noticeably from searches done in other subjects?
With all of that in mind, I wanted to discover whether the unique nature of music materials (multiple formats, languages, etc.) leads to specific patterns in how patrons search for them, and whether patrons searching for music materials differ in any quantifiable way from patrons searching in other disciplines.

5 The tool
Today I'm going to talk about how patrons search in the music-specific module of the University of Illinois Library's Easy Search federated search tool, which is displayed prominently on our Library's homepage. This tool searches our library catalog, WorldCat, our various music article databases, newspaper tools, our streaming A/V tools, and Oxford Music Online. While I've done previous observational studies watching students search for specific things, I wanted to see how they search when no one is looking, so to speak. To do this, I analyzed the transaction logs of searches conducted from this box over a fifteen-month span. Because the data is anonymous, there is no way for me to tell who is conducting these searches, whether they are students or faculty, what their area of specialty is, or even whether they are in the music department, so I won't be able to draw any conclusions based on patron types. Since other people use our library and computers (with our homepage as the home screen in browsers), there are instances of searches for completely irrelevant things: engineering articles and even local business information, for example. However, the vast majority of searches in the dataset are for music-related materials.
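
For readers who want to try this kind of transaction-log analysis themselves, here is a minimal Python sketch. The log layout (tab-separated session ID, timestamp, and query columns) and the function name load_sessions are assumptions for illustration only; the actual Easy Search log format records more fields and is not specified here.

```python
import csv
from collections import defaultdict

def load_sessions(log_path):
    """Group raw search strings by session ID.

    Assumes a tab-separated log whose columns are session_id,
    timestamp, and query; the real Easy Search logs record more
    fields (referrer, index used, spelling suggestions accepted).
    """
    sessions = defaultdict(list)
    with open(log_path, newline="", encoding="utf-8") as f:
        for session_id, timestamp, query in csv.reader(f, delimiter="\t"):
            sessions[session_id].append(query.strip())
    return sessions
```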

6 Federated search vs. discovery layer
Federated search tools point searchers to other search tools; discovery layers point them to specific items in those tools. Bento displays are a hybrid approach that shows short lists of results organized by source. The U of I Easy Search tool is a home-grown federated search tool with both a generic multi-discipline search option and customized modules for different subject areas, and most of our branch libraries include this box on their homepage, which is how I am able to compare searches across disciplines. Most of you won't have a federated search tool like ours and will instead have a discovery layer, usually built on a central index. The two differ in their functionality, but some systems and schools are starting to use a bento display for their discovery layers, a hybrid approach that lists a subset of results from each of the target tools. Our generic Easy Search module, which targets interdisciplinary databases such as JSTOR, Scopus, and Academic Search, has this display option, but the MPAL Easy Search does not, because not all of the target tools have APIs we can use to pull in data (next slides).

7 Advanced search options are minimal and allow the user to select different subject modules (including more than one at a time) and/or to search by author and/or title. The next few slides give you an idea of what the results screen looks like for our module; I couldn't fit it all on one screen and still keep it readable.

8-9 [Screenshots of the MPAL Easy Search results screen]

10 What they are good* at finding
- Known citations (esp. articles, books, chapters, dissertations, etc.)
- Works with distinctive titles/single iterations (e.g., one work per recording or score)
*Or perhaps "better than bad" at
While the retrieval and display approaches of federated search and discovery layers differ, they share some inherent similarities. The first is that the search mechanism is rarely as powerful as in the original tools. However, both do a decent job of known-item searching, especially for articles and books. For scores and recordings, discovery layers can prove reasonable for finding things with distinctive titles, especially a single work per score or recording. However, many discovery layers try to FRBRize or group "like" items, which can cause problems: editions become hard to distinguish, and scores and recordings with different contents can mistakenly be grouped together (or overlooked).

11 The data
Now that we've done a quick overview of our tool and how it compares to discovery layers, I want to talk about what data I looked at in the transaction logs. This is just a sample screenshot of two of the fields; other data elements are recorded for each search, such as the session ID, how the user got to the search screen, what their previous search was (read from the bottom up), whether they used any search indexes besides keyword, and whether they took spelling suggestions made by the system.

12 Data points
- Searches per session
- Search terms per search
- Use of Boolean operators, quotation marks, parentheses, etc.
- Frequency of use of spelling suggestions
To get at the bigger question of how patrons search for music materials, I had to ask a lot of smaller questions and gather the corresponding data points (a sketch of how some of these can be computed follows).
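
As an illustration of how these data points could be tallied, here is a sketch that assumes the sessions mapping from the earlier load_sessions sketch. Counting only uppercase AND/OR/NOT as explicit Boolean operators is one of several defensible choices, not how the study itself coded them.

```python
import re

def summarize(sessions):
    """Compute the slide's data points over sessions built by load_sessions."""
    searches_per_session = [len(qs) for qs in sessions.values()]
    queries = [q for qs in sessions.values() for q in qs]
    words_per_search = [len(q.split()) for q in queries]
    return {
        "avg_searches_per_session": sum(searches_per_session) / len(searches_per_session),
        "avg_words_per_search": sum(words_per_search) / len(words_per_search),
        # Treat uppercase AND/OR/NOT as explicit Boolean operators.
        "boolean": sum(bool(re.search(r"\b(AND|OR|NOT)\b", q)) for q in queries),
        "quoted": sum('"' in q for q in queries),
        "parenthesized": sum("(" in q or ")" in q for q in queries),
    }
```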

13 Data points, cont'd: type of thing being sought/searched
- article/book
- score/recording
- journal name
- topic (just keywords)
- author/composer/performer etc. (just a name)
By looking at the search queries, I can make some educated guesses about what type of thing patrons are searching for (in some cases the search terms tell me exactly what). Of course this is somewhat subjective, but when I'm not sure what the patron might have been after, I can recreate their search and see what comes up. A rough illustration of this kind of categorization appears below.
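
To give a flavor of that categorization, here is a toy heuristic classifier. The rules and category labels are illustrative assumptions; the study's actual coding was done by hand, re-running ambiguous searches against the live tool.

```python
import re

def guess_query_type(query):
    """A deliberately crude heuristic, not the study's coding scheme."""
    q = query.lower()
    if re.search(r"\b10\.\d{4,}/\S+", query):
        return "DOI (pasted citation)"
    if re.search(r"\b(op|k|bwv|d)\.?\s*\d+", q):
        return "work number (likely a score or recording)"
    if re.search(r"\b(score|parts|facsimile|recording)\b", q):
        return "format-qualified search"
    if len(q.split()) <= 2:
        return "name or short title"
    return "topic/keyword search"
```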

14 Data points, cont'd: search string elements
- personal name and title keywords
- title or title keywords (e.g., West Side Story)
- general topic keywords
- genre/instrumentation (e.g., trumpet ensembles)
- work numbers
- publisher/label names
- format (score, facsimile, parts, etc.)
- DOIs

15 Questions I can't (necessarily) answer
- Were they looking for a piece of music or information about it?
- Were they looking for pieces by a composer or information about him/her?
- Did the patron do the "right" or "best" search?
- Did they find what they were looking for?
When doing observational studies, I found that patrons tended to stop when they found the first thing that met their criteria. When looking at search logs, I can't necessarily tell exactly what they were looking for unless it was a specific known-item search, and even then the search strings may not give me enough information to know what format or version they wanted. It can also be hard to determine whether they did the "best" or "right" search. With most tools there are multiple ways to skin a cat, as it were: keywords first and facets to narrow later, or limits and more keywords up front. One approach isn't necessarily "right." However, I can still learn a lot about how patrons are searching.

16 The findings so far
I looked at a year's worth of MPAL-specific data. For some questions I looked at the whole MPAL dataset (average strings per session and average words per search). For further analysis I looked at every tenth MPAL search session, or 2,500 sessions, with about 4,900 search strings total (a sketch of this sampling follows). I haven't made my way through all of them yet, I'm afraid, so I can only share some preliminary data. However, what I've found so far is already interesting (I think).
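
The every-tenth-session sampling described above is straightforward to express in code. This sketch again assumes the hypothetical sessions mapping from the earlier sketches and relies on Python dicts preserving insertion (log) order.

```python
def every_tenth_session(sessions):
    """Systematic sample of every tenth session, in log order (Python 3.7+)."""
    sample = dict(list(sessions.items())[::10])
    total_strings = sum(len(qs) for qs in sample.values())
    print(f"{len(sample)} sessions, {total_strings} search strings")
    return sample
```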

17 Comparison by source of searches
This shows the comparison between the generic Easy Search module (listed on our main Gateway) and six of the subject modules, including MPAL. You can see that MPAL has the third-most sessions during the period.

18 Comparison by source of searches
Research on Easy Search data overall (the "Gateway everything" tab above) by colleagues here at Illinois has shown that a lot of the searches being done are known-item searches, specifically pasted article or book citations, which tend to be longer than simple keyword searches. You can see this in the larger numbers in the "average words per search" column on the right. Not surprisingly, the…

19 How many searches/session?
- 63.2% of sessions have one search string
- 30.6% of sessions have 2-4 search strings
- 5% of sessions have 5-9 search strings
- <1% have 10 or more search strings
After that overview comparison I started looking at a smaller set of 2,500 sessions within the MPAL data. Most sessions contain only a few search strings: either patrons are finding what they want quickly or, more likely, after a few searches they are leaving the Easy Search tool to use the catalog or another tool directly. (A sketch of this binning appears below.)
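
Here is a small sketch of how sessions can be binned into the categories the slide reports, again assuming the hypothetical sessions mapping from the earlier sketches.

```python
from collections import Counter

def session_length_bins(sessions):
    """Bin sessions by search-string count and report percentages."""
    bins = Counter()
    for qs in sessions.values():
        n = len(qs)
        bins["1" if n == 1 else "2-4" if n <= 4 else "5-9" if n <= 9 else "10+"] += 1
    total = sum(bins.values())
    return {label: round(100 * count / total, 1) for label, count in bins.items()}
```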

20 Words per search string
You can see that the majority of MPAL searches (55%) are two- and three-word searches. That is not necessarily surprising if you are looking for a piece of music and search on the composer's name and part of the title, but it still makes for a broad search in many cases. Combined with a closer look at the search strings, this may indicate that MPAL patrons are using scores and recordings more than books and articles. That is something we could probably intuit, but it's good to know that many patrons are trying to navigate on just two or three words to start.

21 Rare searches
- Title or author index searches (only 207 and 141 of those, respectively)
- Including edition or label information
- Spelling change suggestions (just 83), e.g., suggesting Hilary Hahn instead of Hillary Hahn
The overwhelming majority of searches are keyword searches, which is what I advocate when teaching about our library catalog: we have post-search facets available, and not every composer or title has added entries, so sometimes a keyword search is the only way to find an item. But there are times, of course, when a title or author search (or a combination of the two) is more efficient. Easy Search does have a spell-check feature, but it doesn't know a lot of foreign words, so when a searcher got stuck on "scordatura" they had to get to the right spelling on their own (which they eventually did).

22 Unexpected search elements
- dick farney + booker pittman
- mendelssohn's violin concerto
- REICH: Sextet / Piano Phase / Eight Lines (Griffiths Kevin/ London Steve Reich Ensemble/ The/ Stephen Wallace) (Cpo: )
Some things, like plus signs for "and" and use of the possessive, were not entirely anticipated. It makes sense that you might want to search by a label number; we don't happen to have this recording.

23 Tricky search elements
Doesn't work well:
- Mozart k501
- mahler symphony no.9
- francesca lebrun sonata in f
Worked just fine:
- six quartets for bassoon and strings: opus 1 (recent researches in the music of the classical era)
The data that make a search more precise, like the work numbers in the first two examples, can cause problems. Of course these are problems in a regular catalog too, but patrons aren't getting any savvier about them. Sometimes, though, things work that you wouldn't expect, like the bassoon quartets example, which carries a lot of metadata.

24 Search variations
- french medieval poems
- medieval minstrel poem french
- medieval minstrel french
- medieval minstrel french and occitan
- medieval minstrel french and occian
- medieval minstrel and french
Probably one of the richest areas for exploration in this data set is the permutations that search strings undergo as patrons change their search terms. Identifying patterns here could have a big impact on teaching search strategy in information literacy sessions. (A sketch for surfacing these term changes follows.)
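
One simple way to surface these revision patterns is to diff the term sets of successive searches within a session. This sketch is an illustration only, not the analysis method used in the study.

```python
def reformulations(queries):
    """List the terms dropped and added between successive searches
    in one session, to surface revision patterns like those above."""
    steps = []
    for prev, curr in zip(queries, queries[1:]):
        a, b = set(prev.lower().split()), set(curr.lower().split())
        steps.append({"dropped": sorted(a - b), "added": sorted(b - a)})
    return steps

# For example, the session above begins:
# reformulations(["french medieval poems", "medieval minstrel poem french"])
# -> [{'dropped': ['poems'], 'added': ['minstrel', 'poem']}]
```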

25 Why does all of this matter?
- It gives us data to confirm our intuition and experiences.
- It helps us advocate for search features in catalogs and discovery layers that other subject areas may not use as frequently.
- It helps us understand where patrons are coming from when they search, so we can teach them better.
It provides data for what music librarians know intuitively and through experience: that known-item searching in music isn't as easy as known-item searching in other fields. Unlike most searches in the sciences, which are frequently done by pasting in a whole citation (often for articles or conference proceedings), music searches seldom come with that much data to match on. Our systems favor citation-pasting behavior: the more data points they can match on, the better the algorithms do. But searching for music materials will never be this simple, and we will continue to rely on subject headings, descriptions, facets, and so on to navigate to the appropriate item. Stay tuned for a full report on my findings some time this year!

26 Thank you! Questions? Kirstin Dougan, [email protected]