How Google Works: Are Search Engines Really Dumb and Should Educators Even Care? Paul Barron. All Rights Reserved.

Author: Elfrieda Rogers

1 How Google Works: Are Search Engines Really Dumb and Should Educators Even Care?
Paul Barron. All Rights Reserved. This presentation may be copied and distributed for nonprofit educational purposes only. Session revised

2 We know our students …
“Whereas libraries once seemed like the best answer to the question, Where do I find…? the search engine now rules.”
“No Brief Candle: Reconceiving Research Libraries for the 21st Century;” Part II; Council on Library and Information Resources
Jeff Stahler: (c) Columbus Dispatch. Dist. by Newspaper Enterprise Association, Inc.
For them, “to Google” is a lifestyle, a habit pattern. Do you agree?
14th Annual Longwood University Summer Literacy Institute

3 Research Sources for Middle & High School Students
1. Google or other online search engine (94%)
2. Wikipedia or other online encyclopedia (75%)
3. YouTube or other social media sites (52%)
4. Their peers (42%)
8. Online databases (EBSCO, JSTOR, or Grolier) (17%)
9. Research librarian at school (16%)
“How Teens Do Research in the Digital World”

4 The Definition of “Research”
“Middle and high school teachers suggest that the definition of ‘research’ has changed in the digital world, and that change is reflected in how students approach the task.”
“When asked how middle and high school students ‘do research,’ the first response in every student and teacher focus group was ‘Google’.”
“Some teachers say, for students today, ‘research = Googling’.”
“How Teens Do Research in the Digital World”

5 Love is blind!
“Students perceive themselves as skilled searchers of Google and every other search tool (because they’re ‘experts’ at searching Google).”
“[Educators] know that these perceptions aren’t true.”
“Undergraduate students rated their information literacy skills very high, but their search queries and behaviors did not support this. They were not sophisticated users of Google at all, let alone library resources.”
“What Do Librarians Do, Exactly?” Helen Georgas; The Informed Librarian Online

6 Unfortunate Facts at UC Berkeley
“At the undergraduate level, what is anecdotally apparent to most faculty and librarians: Students lack skills needed to use digital resources for research. As ‘digital natives’ they are adept at finding information for personal purposes; but … those skills often aren’t sufficient to accomplish their academic work effectively.”
“Report of the Commission on the Future of the UC Berkeley Library”

7 UC Berkeley World Ranking
https://www.timeshighereducation.com/world-university-rankings/2016/world-ranking#!/page/0/length/25
https://goo.gl/9ANmZY

8 Furthermore
“Students tended to overuse Google and misuse scholarly databases. Indeed, they’re not even very good at using Google for these purposes.”
“Google’s own research scientists have lamented that students are unable to take advantage of the resources that are readily available to those who know how to find them.”
We can’t use this database; it doesn’t look like Google!
Report of the Commission on the Future of the UC Berkeley Library

9 Daniel M. Russell, Google’s …
… Senior Research Scientist for Search Quality, says, “In universities a lot of the Google Generation do the dumbest things you can possibly imagine. Scholarly searching is not an intuitive skill; students cannot learn well by imitating peers.”
“That is where librarians come in; … teach them what is possible.”
Searching For Better Research Habits; Steve Kolowich

10 Do you agree that…
“There are consequences to our students and our educational system if we [allow] a search engine to define the parameters of effective research.”
The University of Google: Education in the (Post) Information Age; Tara Brabazon
Especially when our students do not know how Google determines the results.

11 If educators hope …
… to change students’ excessive use of Google, educators must embrace Google and learn how the search engine works, in order …
… to influence students to integrate Google use with other reliable sources of information.

12 Presentation Objective
Increase our understanding of how search engines and Google work by dispelling some search engine myths.

13 Presentation Objective: Dispel …
… search engine myths: Google accepts pay for placement, understands a searcher’s query, treats all sites and domains the same when determining results, and determines the results based on the popularity of the site with searchers.
But we’re not equal. I’m .edu. I’m .net.

14 Why learn how Google works? Because …
“We expect a lot [from] search engines. We ask them vague questions about topics [with which] we are unfamiliar and anticipate a concise organized response.”
“You would have better success if you laid your head on the keyboard and coaxed the computer to read your mind.”
Understanding Search Engines: Mathematical Modeling and Text Retrieval; Michael W. Berry and Murray Browne
Even though most search engine users don’t understand how search engines operate, we still expect them to understand our search queries and return the most relevant results. Two search engine developers have a different view. They think the likelihood of relevant results is increased if …

15 To understand how search engines work …
… we must understand, “search engines have no understanding of words or language. [They] don’t recognize user intent, can’t distinguish goal-oriented search from browsing search.”
A ResourceShelf Interview: 20 Questions with Dr. Gary Flake, Ph.D., Head of Yahoo Research Labs; Thursday, June 3, 2004

16 And today …
“Google announced the biggest change since 2000. Google will focus on trying to understand the meanings of phrases and concepts as opposed to matching keywords in a search query to the same words on Web pages.”
“Google Alters Search to Handle More Complex Queries;” New York Times; September 26, 2013
goo.gl/iuEtH8

17 If Google doesn’t understand my query …
… how does Google determine how to select and rank the results in response to my query?

18 Myth: Google Accepts “Pay for Ranking”
“At Google we take our commitment to delivering useful and impartial search results very seriously.”
“We don’t ever accept payment to add a site to our index, update it more often, or improve its ranking.”
Matt Cutts; Head of Google’s Web Spam Team

19 Google does accept payment for …
… advertising.

20 What Google Considers on the Webpage
Google’s algorithms rely on more than 200 unique signals to determine a ranking. For example: how often the search terms occur on the webpage, whether the search terms appear in the title or the URL, and whether synonyms of the search terms occur on the page.
Facts about Google and Competition
An Update to our Search Algorithms (8/10/12)
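
As a rough illustration of how on-page signals like these could be combined, here is a minimal Python sketch. It is not Google's method: the function name on_page_score, the weights, and the sample pages are all assumptions made up for this example. It only shows that term counts, title matches, URL matches, and synonym matches can each add to a page's score.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def on_page_score(query, title, url, body, synonyms=None):
    """Toy score built from the on-page signals named on the slide.

    The weights (1, 3, 2, 0.5) are invented for illustration only.
    """
    synonyms = synonyms or {}
    title_words = set(tokenize(title))
    url_words = set(tokenize(url))
    body_words = tokenize(body)
    score = 0.0
    for term in tokenize(query):
        score += 1.0 * body_words.count(term)      # how often the term occurs
        score += 3.0 * (term in title_words)       # term appears in the title
        score += 2.0 * (term in url_words)         # term appears in the URL
        for alt in synonyms.get(term, []):
            score += 0.5 * body_words.count(alt)   # synonyms occur on the page
    return score

# Two hypothetical pages scored against the same query.
page_a = on_page_score(
    "volcano eruption",
    title="Volcano eruption facts",
    url="https://example.org/volcano",
    body="A volcano eruption releases lava. Each volcano is different.",
)
page_b = on_page_score(
    "volcano eruption",
    title="Mountain news",
    url="https://example.org/news",
    body="The volcano was quiet this year.",
)
print(page_a, page_b)
```

Page A scores higher because the query words appear in its title, URL, and body. Nothing in the score measures whether the page is accurate.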

21 What Google Considers Off the Webpage
Links
PageRank – A measure of the number and the quality of links to a webpage.
Assumption – Important webpages receive more links from other webpages.
Facts about Google and Competition
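
The idea of counting both the number and the quality of links can be sketched with the simplified, textbook form of PageRank. This is a hedged illustration, not Google's production ranking: the damping value of 0.85, the iteration count, and the four-page toy web are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical web: A, B, and D all link to C; C links only to A.
toy_web = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(toy_web)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

In this toy web, C ranks first because it receives the most links. A receives only one link, but that link comes from the highly ranked C, so A outranks B and D, which receive none. The content of the pages plays no part in the calculation.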

22 Matt Cutts of Google states,
“Popularity is different from accuracy and PageRank is different than popularity.”
Therefore, PageRank is different from accuracy.
Let’s test that assertion by searching for …

23 Search Results
The first 36 results are from JewWatch.com, which is the most “popular and accurate result” for our search.

24 Jew Watch – A Popular & Accurate Site?

25 This ad used to be at the …
… bottom of the search results …
Google states, “We’re disturbed about these search results as well.”

26 Google’s Explanation
http://www.google.com/explanation.html
This page has been deleted from the Google database. For a copy see:

27 The Value of Quality Links
“With PageRank, five or six high-quality links from websites would be valued much more highly than twice as many links from less reputable or established sites.”
Librarian Central; How does Google collect and rank results?

28 Checking the Links to JewWatch.com
Google will return .edu sites that link to JewWatch.com.

29 Law School Links to JewWatch.com
Google evaluates not only the number of links but also the quality (reputation) of the linking site.

30 Please explain why Google does not consider …
… the fact that the site is popular with us, the searchers who view the sites!

31 Why not consider searchers’ preferences?
“We believe the approach which relies heavily on an individual’s tastes and preferences [to rank results] just doesn’t produce the quality and relevant ranking that our algorithms do.”
Amit Singhal; Google Fellow
“This is tough stuff;” 25 February 2010

32 Why!?!
First: “We have all been trained to trust Google and click on the first result.”
“College students trust Google; they click on the number one abstract most of the time, even when the abstracts are less relevant.”
In Google We Trust: Users’ Decisions on Rank, Position, and Relevance; Laura Granka; Journal of Computer-Mediated Communication
“How Google Measures Search Quality;” Datawocky

33 Trusting Google too Much?
“Second: For informational queries … if a result on page 4 provides better information than the results on the first three pages, users will not know this result exists!”
“Therefore, usage behavior does not provide the best feedback on the rankings.”
But we are the best results!
“How Google Measures Search Quality;” Datawocky

34 From 2005 to 2014
2005 Scan Pattern; 2014 Scan Pattern
“The average user scanned more results in 2014 vs. 2005, but spent less time looking at each result before clicking a result.”
The Evolution of Google Search Results Pages & Their Effects on User Behaviour

35 And in 2016
What explains the change in scan pattern?

36 Do students read webpages?
In 1997, the first study of how users read web content summarized the findings in two words: they don’t. Users scan it.
In 2006, research found that users frequently scan website … focusing on words at the top or left side of the page, while barely glancing at words that appeared elsewhere.
Recent research quantified this finding: given the duration of an average page view, users read at most 28% of the words on the page.
How Little Do Users Read?

37 Consider this …
“The computer screen is … literally a small thing [that] may display just over 300 words. If this world becomes our reality, we actually are relying on less information, not the more that is available.”
“The Google-ization of Knowledge;” Natasja Larson, Laura Servage, and Jim Parsons; Faculty of Education; University of Alberta

38 Google doesn’t need to consider …
… the popularity of a website with searchers because its algorithm is so up-to-date that Google always returns the best results. Right?
RIGHT! RIGHT!

39 Evaluating Google’s Opinion
Google returns all sites with the words martin and luther and king and school and flyers.
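
The "and" matching described on this slide can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Google actually evaluates a query: the function name matches_all_terms and the three sample pages are assumptions invented for the example.

```python
import re

def matches_all_terms(query, page_text):
    """Return True when every query word occurs somewhere in the page text."""
    page_words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    query_words = re.findall(r"[a-z0-9]+", query.lower())
    return all(word in page_words for word in query_words)

# Hypothetical page texts.
pages = {
    "page1": "Flyers about Martin Luther King for your school",
    "page2": "Martin Luther King biography",
    "page3": "School flyers and the king of pop, plus Martin guitars and Luther Vandross",
}
query = "martin luther king school flyers"
hits = [name for name, text in pages.items() if matches_all_terms(query, text)]
print(hits)
```

In this sketch, page3 matches even though it has nothing to do with Martin Luther King, because word matching only checks that each word is present. Page2 is on topic but is excluded because it lacks "school" and "flyers".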

40 Google’s 1st Result (3-26-2017)

41 Martin Luther King.org Homepage

42 Martin Luther King.org is hosted by …

43 The student wants to know …
Why was that site returned as the 1st result among the 828,000 results!?! I thought Google and other search engines always returned the best results.

44 Checking for .edu Links to the Webpage
Remember the importance of PageRank, which measures the number and quality of links to a webpage.
Link Check – Returns results that link to a site; for example, .edu sites that link to Martin Luther King.org.

45 Link Check Results
QUESTION: By reviewing the webpage description, can you determine the purpose of the .edu sites’ linking to Martin Luther King.org?

46 Linking and Webpage Relevance
When reputable [webpage] author(s) repeatedly link to a webpage, or when highly regarded colleges/universities, governments, or organizations link to a webpage, the rank of the linked-to webpage increases, regardless of whether the page is relevant.

47 Google’s opinion is important; …
What can I do to influence the results returned by Google?

48 Question.
Search Engine Components
Spider/Web Crawler/Robot
Index
Search Engine
The only feature that you can control is the query entered into the search engine.
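
To make the three components concrete, here is a toy Python pipeline with a spider, an index, and a search step. Everything in it is an assumption for illustration: the in-memory TOY_WEB stands in for real webpages, and the function names crawl, build_index, and search are hypothetical. A real search engine is far more elaborate.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("search engines crawl pages", ["b.example", "c.example"]),
    "b.example": ("an index maps words to pages", ["a.example"]),
    "c.example": ("a query is matched against the index", []),
    "d.example": ("no page links here so the crawler never finds it", []),
}

def crawl(start):
    """Spider: follow links breadth-first from a start URL."""
    seen, queue = {start}, deque([start])
    while queue:
        url = queue.popleft()
        for link in TOY_WEB[url][1]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

def build_index(urls):
    """Index: map each word to the set of URLs that contain it."""
    index = {}
    for url in urls:
        for word in TOY_WEB[url][0].split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Search engine: return URLs containing every query word."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*sets) if sets else set()

index = build_index(crawl("a.example"))
print(sorted(search(index, "index")))
print(sorted(search(index, "index query")))
print(sorted(search(index, "crawler")))
```

The spider and the index run before anyone searches, so the searcher cannot affect them. In this sketch the page d.example is never found because no crawled page links to it, and the only thing that changes from one search to the next is the query string.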

49 Keyword Searching
“Keyword-based search works well if the users know exactly what they want and formulate queries with the ‘right words.’”
“It does not help much and is sometimes even hopeless if the users only have vague concepts about what they are asking.”
Toward Topic Search on the Web; Microsoft Research; March 2011
Let’s go see the librarian.

50 Queries by Middle School Students
“A predominate difficulty students experience while performing Web-based research is constructing effective search strings.”
“[M]iddle school students demonstrate unsophisticated skills when constructing search strings, using mainly broad terms and phrases.”
“Internet Searching by K-12 Students: A Research-based Process Model”

51 Queries by High School Students
“[H]igh school students struggle with conceptualizing the topic for their query, sometimes omitting required concepts.”
“Internet Searching by K-12 Students: A Research-based Process Model”

52 Queries by College Students
“[S]earch engines generally performed poorly, a lack of computer skills and an inability to construct appropriate search statements limited college students’ success.”
Nowicki, Stacy. Student vs. Search Engine: Undergraduates Rank Results for Relevance. portal: Libraries and the Academy, Volume 3, Number 3, July 2003
I have query block!
Query block is similar to writer’s block. But this student has a useful resource on her desk. What is it?

53 What we know and understand is …
“Librarians realize that for their students learning a process as complex as research is like learning a new language. Librarians see the huge gaps in actual student ability and know that the problem is more than something requiring remedial attention.”
Process Not Product: Learning to be Information Literate; Tami Echavarria Robinson

54 He should have seen the librarian first!

55 The Importance of “Friends”
Remember these stats from the introduction?
4. Their peers (42%)
8. Online databases (17%)
9. Research librarian at school (16%)
Learning the Ropes: How Freshmen Conduct Course Research Once They Enter College

56 Google Search Resources
Search Help Center: https://support.google.com/websearch#topic= https://goo.gl/Vot32N
Advanced Search: https://www.google.com/advanced_search https://goo.gl/vGcSrY
Operators: https://support.google.com/websearch/answer/ https://goo.gl/CDc1P2