1 Programming Languages for XML
XPath
XQuery
Extensible Stylesheet Language (XSLT)
Jan. 2017 Yangjun Chen ACS-7102
2 XPath
XPath is a simple language for describing sets of similar paths in a graph of semistructured data.
The XPath Data Model
A sequence of items corresponds to a set of tuples in the relational algebra. An item is either:
1. A value of primitive type: integer, real, boolean, or string.
2. A node (there are three kinds of nodes).
Sept. 2014 Yangjun Chen ACS-7102
3 Three kinds of nodes:
(a) Documents. These are files containing an XML document, perhaps denoted by their local path name or URL.
(b) Elements. These are XML elements, including their opening tags, their matching closing tags if there are any, and everything in between (i.e., below them in the tree of semistructured data that an XML document represents).
(c) Attributes. These are found inside opening tags.
The items in a sequence needn’t all be of the same type, although often they will be.
4 A sequence of five items: 10 “ten” 10.0
5 Document Nodes
It is common to apply XPath to documents that are files. We can make a document node from a file by applying the function:
doc(file name)
The named file should be an XML document. We can name a file either by giving its local name or, if it is remote, its URL.
doc(“movie.xml”)
doc(“/usr/slly/data/movies.xml”)
doc(“infolab.stanford.edu/~hector/movies.xml”)
6 An XPath expression starts at the root of a document and gives a sequence of tags and slashes (/).
doc(file name)/T1/T2/…/Tn
doc(“movie.xml”)/StarMovieData/Star/Name
Evaluation of XPath expressions:
1. Start with a sequence of items consisting of one node: the document node.
2. Then, process each of T1, T2, …, Tn in turn. To process Ti, consider the sequence of items that results from processing the previous tag, if any. Examine those items, in order, and find, for each one, all of its subelements whose tag is Ti. These subelements, taken in order, form the sequence used for the next tag.
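The two evaluation steps above can be sketched in Python with the standard library. This is only an illustration: the sample document, the star names, and the helper name `evaluate` are assumptions made for the example, not part of the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag names follow the path on the slide.
XML = """
<StarMovieData>
  <Star starID="cf">
    <Name>Carrie Fisher</Name>
    <Address><Street>123 Maple St.</Street><City>Hollywood</City></Address>
  </Star>
  <Star starID="mh">
    <Name>Mark Hamill</Name>
    <Address><Street>456 Oak Rd.</Street><City>Brentwood</City></Address>
  </Star>
</StarMovieData>
"""

def evaluate(root, tags):
    """Evaluate /T1/T2/.../Tn on a document whose root element is `root`."""
    # Step 1: the document node has the root element as its only element
    # child, so T1 is matched against the root element itself.
    items = [root] if root.tag == tags[0] else []
    # Step 2: for each remaining tag, replace every item (in order) by its
    # child elements that carry that tag.
    for tag in tags[1:]:
        items = [child for item in items for child in item if child.tag == tag]
    return items

root = ET.fromstring(XML)
names = [n.text for n in evaluate(root, ["StarMovieData", "Star", "Name"])]
print(names)
```

Each pass of the loop corresponds to one slash-separated step of the path, so the list `items` plays the role of the "sequence of items" in the description.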
7 doc(“movie.xml”)/StarMovieData/Star/Name
<?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>
8 /StarMovieData/Star/Name
<?xml version = “1.0” … ?>
9 Relative Path Expressions
In several contexts, we shall use XPath expressions that are relative to the current node or sequence of nodes.
10 Attributes in Path Expressions
Path expressions allow us to find all the elements within a document that are reached from the root along a particular path. We can also end a path with an attribute name preceded by an at-sign (@).
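As a rough illustration of a path that ends in an attribute, the sketch below collects the `starID` attribute of every `Star` element. The sample document is hypothetical. Python's `xml.etree.ElementTree` does not accept a trailing `@name` step, so the sketch selects the elements first and then reads the attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """
<StarMovieData>
  <Star starID="cf"><Name>Carrie Fisher</Name></Star>
  <Star starID="mh"><Name>Mark Hamill</Name></Star>
</StarMovieData>
"""

root = ET.fromstring(XML)  # root is the StarMovieData element

# Rough analogue of /StarMovieData/Star/@starID:
# follow the element path, then read the attribute on each result.
star_ids = [star.get("starID") for star in root.findall("Star")]
print(star_ids)
```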
11 Axes
So far, we have only navigated through semistructured-data graphs in two ways: from a node to its children or to an attribute. In fact, XPath provides several axes to navigate a graph in different ways. Two of these axes are child (the default axis) and attribute, for which @ is really a shorthand.
Axes used in XPath expressions: self, parent, descendant, ancestor, next-sibling (named following-sibling in the XPath standard), following, preceding.
Shorthands:
/ stands for child
@ stands for attribute
. stands for self
.. stands for parent
// stands for descendant (more precisely, descendant-or-self)
12 Axes
//City
/child::StarMovieData/descendant::Star/attribute::starID
/descendant::City
For the example document, /StarMovieData//Star//City produces the same results as //City.
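The descendant shorthand can be tried out with a small sketch. The document below is a hypothetical sample in which every `City` lies below some `Star`, which is the situation in which the two expressions coincide. `ElementTree` paths are relative to an element, so `.//City` stands in for `//City`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: every City sits below a Star.
XML = """
<StarMovieData>
  <Star starID="cf">
    <Address><City>Hollywood</City></Address>
    <Address><City>Malibu</City></Address>
  </Star>
  <Star starID="mh">
    <Address><City>Brentwood</City></Address>
  </Star>
</StarMovieData>
"""

root = ET.fromstring(XML)

# Analogue of //City: all City descendants of the root element.
all_cities = [c.text for c in root.findall(".//City")]

# Analogue of /StarMovieData//Star//City: first reach the Star
# descendants, then the City descendants of each Star.
via_stars = [c.text for s in root.findall(".//Star") for c in s.findall(".//City")]

print(all_cities, via_stars)
```

If the document also contained a `City` outside any `Star`, the first list would include it and the second would not.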
13 Context of Expression
By “context”, we mean an element in a document, working as a reference point. So it makes sense to apply axes like parent, ancestor, or next-sibling to the elements in a sequence.
Wildcards
In an XPath expression, we can use * to say “any tag”. Likewise, @* says “any attribute”.
14 Conditions in Path Expressions
As we evaluate a path expression, we can restrict ourselves to follow only a subset of the paths whose tags match the tags in the expression. To do so, we follow a tag by a condition, surrounded by square brackets. Such a condition can be anything that has a boolean value.
Values can be compared by comparison operators such as =, >=, and !=. A compound condition can be constructed by connecting comparisons with the operators and and or.
/StarMovieData/Star[.//City = “Malibu”]/Name
15 Conditions in Path Expressions
Several other useful forms of condition are:
An integer [i] by itself is true only when applied to the ith child of its parent (among the children with the given tag, so Star[2] is the second Star child).
/StarMovieData/Stars/Star[2]
A tag [T] by itself is true only for elements that have one or more subelements with tag T.
/StarMovieData/Stars/Star[Address]
An attribute [A] by itself is true only for elements that have a value for the attribute A.
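The condition forms above can be tried with Python's `xml.etree.ElementTree`, which supports `[i]`, `[T]`, and `[@A]` predicates but not a general condition such as `[.//City = "Malibu"]`. That last one is written as an ordinary Python filter here. The sample document (without the `Stars` wrapper used in the slide's paths) and the helper `names` are assumptions for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """
<StarMovieData>
  <Star starID="cf">
    <Name>Carrie Fisher</Name>
    <Address><Street>123 Maple St.</Street><City>Hollywood</City></Address>
    <Address><Street>5 Locust Ln.</Street><City>Malibu</City></Address>
  </Star>
  <Star>
    <Name>Mark Hamill</Name>
    <Address><Street>456 Oak Rd.</Street><City>Brentwood</City></Address>
  </Star>
  <Star starID="hf">
    <Name>Harrison Ford</Name>
  </Star>
</StarMovieData>
"""

root = ET.fromstring(XML)

def names(stars):
    """Return the Name text of each Star element."""
    return [s.findtext("Name") for s in stars]

second = names(root.findall("Star[2]"))             # [i]: the second Star child
with_address = names(root.findall("Star[Address]"))  # [T]: has an Address subelement
with_id = names(root.findall("Star[@starID]"))       # [A]: has a starID attribute

# Star[.//City = "Malibu"]: true if some City descendant equals "Malibu".
in_malibu = names(
    s for s in root.findall("Star")
    if any(c.text == "Malibu" for c in s.iter("City"))
)

print(second, with_address, with_id, in_malibu)
```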
16 /Movies/Movie/Version[1]/@year
<?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>
17 XQuery
XQuery is an extension of XPath that has become a standard for high-level querying of databases containing XML data.
XQuery Basics
All values produced by XQuery expressions are sequences of items.
Items: primitive values; nodes (document, element, and attribute nodes).
XQuery is a functional language, which implies that any XQuery expression can be used in any place that an expression is expected.
18 FLWR Expressions
FLWR (pronounced “flower”) expressions are in some sense analogous to SQL select-from-where expressions. An XQuery expression may involve clauses of four types, called for-, let-, where-, and return-clauses (FLWR).
1. The query begins with zero or more for- and let-clauses. There can be more than one of each kind, and they can be interlaced in any order, e.g., for, for, let, for, let.
2. Then comes an optional where-clause.
3. Finally, there is exactly one return-clause.
19 Let Clause
let variable := expression
The intent of this clause is that the expression is evaluated and assigned to the variable for the remainder of the FLWR expression. Variables in XQuery must begin with a dollar-sign. More generally, a comma-separated list of assignments to variables can appear.
let $stars := doc(“stars.xml”)
for Clause
for variable in expression
let $movies := doc(“movies.xml”)
for $m in $movies/Movies/Movie
20 Stars.xml
<?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>
21 Movies.xml
<?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>
22 Where Clause
where condition
This clause is applied to an item, and the condition, which is an expression, evaluates to true or false.
where $s/Address/Street = “123 Maple St.” and $s/Address/City = “Malibu”
return Clause
return expression
This clause returns the values obtained by evaluating expression.
let $movies := doc(“movies.xml”)
for $m in $movies/Movies/Movie
return $m/Version/Star
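For readers more used to a general-purpose language, the let/for/return query above reads much like a list comprehension. The following Python sketch is only an analogy under assumed data: the contents of the sample `movies.xml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for movies.xml.
MOVIES_XML = """
<Movies>
  <Movie title="King Kong">
    <Version year="1933"><Star>Fay Wray</Star></Version>
    <Version year="1976"><Star>Jeff Bridges</Star><Star>Jessica Lange</Star></Version>
    <Version year="2005"/>
  </Movie>
  <Movie title="Footloose">
    <Version year="1984"><Star>Kevin Bacon</Star></Version>
  </Movie>
</Movies>
"""

# let $movies := doc("movies.xml")   (here the root element is Movies)
movies = ET.fromstring(MOVIES_XML)

stars = [
    star.text
    for m in movies.findall("Movie")       # for $m in $movies/Movies/Movie
    for star in m.findall("Version/Star")  # return $m/Version/Star
]
print(stars)
```

A where-clause would correspond to an `if` filter placed between the two `for` parts of the comprehension.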
23 let $movies := doc(“movies.xml”)
for $m in $movies/Movies/Movie
<?xml version = “1.0” encoding = “utf-8” … ?>
24 Replacement of Variables by Their Values
let $movies := doc(“movies.xml”)
for $m in $movies/Movies/Movie
return
25 Joins in XQuery
We can join two or more documents in XQuery in much the same way as in SQL. In each case, we need variables, each of which ranges over elements of one of the documents or tuples of one of the relations, respectively. In SQL, we use a from-clause to introduce the needed tuple variables. In XQuery, we use a for-clause.
let $movies := doc(“movies.xml”),
    $stars := doc(“stars.xml”)
for $s1 in $movies/Movies/Movie/Version/Star,
    $s2 in $stars/Stars/Star
where data($s1) = data($s2/Name)
return $s2/Address/City
Select ssn, lname, Dname
From employees s1, departments s2
Where s1.dno = s2.Dnumber
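The join can be mimicked in Python with two nested loops and a filter, which may make the role of the two for-variables easier to see. Both sample documents below are hypothetical, and comparing `.text` values plays the part of `data(...)`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for movies.xml and stars.xml.
MOVIES_XML = """
<Movies>
  <Movie title="Star Wars">
    <Version year="1977"><Star>Carrie Fisher</Star><Star>Mark Hamill</Star></Version>
  </Movie>
</Movies>
"""
STARS_XML = """
<Stars>
  <Star>
    <Name>Carrie Fisher</Name>
    <Address><Street>123 Maple St.</Street><City>Hollywood</City></Address>
    <Address><Street>5 Locust Ln.</Street><City>Malibu</City></Address>
  </Star>
  <Star>
    <Name>Mark Hamill</Name>
    <Address><Street>456 Oak Rd.</Street><City>Brentwood</City></Address>
  </Star>
</Stars>
"""

movies = ET.fromstring(MOVIES_XML)
stars = ET.fromstring(STARS_XML)

cities = [
    city.text
    for s1 in movies.findall("Movie/Version/Star")  # $s1 ranges over movie stars
    for s2 in stars.findall("Star")                 # $s2 ranges over star records
    if s1.text == s2.findtext("Name")               # where data($s1) = data($s2/Name)
    for city in s2.findall("Address/City")          # return $s2/Address/City
]
print(cities)
```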
26 let $movies := doc(“movies.xml”),
    $stars := doc(“stars.xml”)
for $s1 in $movies/Movies/Movie/Version/Star,
    $s2 in $stars/Stars/Star
where data($s1) = data($s2/Name)
return $s2/Address/City
<?xml version = “1.0” …. … ?>
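27 XQuery Comparison Operators
A query: find all the stars that live at 123 Maple St., Malibu. The following FLWR expression seems correct, but it does not work.
let $stars := doc(“stars.xml”)
for $s in $stars/Stars/Star
where $s/Address/Street = “123 Maple St.” and $s/Address/City = “Malibu”
return $s/Name
The comparison = is existential over sequences: each condition is true if any Address of the star satisfies it, so a star with one address at 123 Maple St. and a different address in Malibu is also returned.
<?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>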
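One way to see the problem is to spell out the existential meaning of = over sequences. The Python sketch below contrasts that reading with a test applied to one Address at a time. The sample data, the second star's name, and the function names `naive` and `per_address` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for stars.xml.
STARS_XML = """
<Stars>
  <Star>
    <Name>Carrie Fisher</Name>
    <Address><Street>123 Maple St.</Street><City>Hollywood</City></Address>
    <Address><Street>5 Locust Ln.</Street><City>Malibu</City></Address>
  </Star>
  <Star>
    <Name>Example Star</Name>
    <Address><Street>123 Maple St.</Street><City>Malibu</City></Address>
  </Star>
</Stars>
"""

stars = ET.fromstring(STARS_XML)

def naive(star):
    """Existential reading: some Street matches and some City matches."""
    streets = [e.text for e in star.findall("Address/Street")]
    cities = [e.text for e in star.findall("Address/City")]
    return "123 Maple St." in streets and "Malibu" in cities

def per_address(star):
    """Intended reading: a single Address has both the street and the city."""
    return any(
        a.findtext("Street") == "123 Maple St." and a.findtext("City") == "Malibu"
        for a in star.findall("Address")
    )

naive_result = [s.findtext("Name") for s in stars.findall("Star") if naive(s)]
correct_result = [s.findtext("Name") for s in stars.findall("Star") if per_address(s)]
print(naive_result, correct_result)
```

In XQuery itself, the per-address test would be expressed by putting both comparisons inside one condition on Address.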
28 Elimination of Duplicates
XQuery allows us to eliminate duplicates in sequences of any kind by applying the built-in function distinct-values.
Example. The result of the first query below may contain duplicates, but that of the second does not.
let $starsSeq := (
  let $movies := doc(“movies.xml”)
  for $m in $movies/Movies/Movie
  return $m/Version/Star
)
return
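The effect of duplicate elimination can be sketched in Python. The sample document is hypothetical, and `dict.fromkeys` is used here only as a convenient way to drop repeated values. It happens to keep first-occurrence order, which XQuery does not promise for distinct-values.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for movies.xml in which one star appears twice.
MOVIES_XML = """
<Movies>
  <Movie title="King Kong">
    <Version year="1976"><Star>Jeff Bridges</Star><Star>Jessica Lange</Star></Version>
  </Movie>
  <Movie title="Tron">
    <Version year="1982"><Star>Jeff Bridges</Star></Version>
  </Movie>
</Movies>
"""

movies = ET.fromstring(MOVIES_XML)

# The inner query: all Star values, possibly with duplicates.
stars_seq = [s.text for m in movies.findall("Movie") for s in m.findall("Version/Star")]

# Analogue of applying distinct-values to that sequence.
distinct_stars = list(dict.fromkeys(stars_seq))
print(stars_seq, distinct_stars)
```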
29 Quantification in XQuery
There are expressions that say, in effect, for all (∀) and there exists (∃):
every variable in expression1 satisfies expression2
some variable in expression1 satisfies expression2
Find the stars who have houses only in Hollywood.
let $stars := doc(“stars.xml”)
for $s in $stars/Stars/Star
where every $c in $s/Address/City satisfies $c = “Hollywood”
return $s/Name
Find the stars with a home in Hollywood.
let $stars := doc(“stars.xml”)
for $s in $stars/Stars/Star
where some $c in $s/Address/City satisfies $c = “Hollywood”
return $s/Name
(The keyword some can be avoided here: because = is existential, where $s/Address/City = “Hollywood” has the same effect.)
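The two quantifiers behave like Python's `all` and `any`, which gives a quick way to experiment with them. The sample data, including the third star, is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for stars.xml.
STARS_XML = """
<Stars>
  <Star>
    <Name>Carrie Fisher</Name>
    <Address><City>Hollywood</City></Address>
    <Address><City>Malibu</City></Address>
  </Star>
  <Star>
    <Name>Mark Hamill</Name>
    <Address><City>Brentwood</City></Address>
  </Star>
  <Star>
    <Name>Example Star</Name>
    <Address><City>Hollywood</City></Address>
  </Star>
</Stars>
"""

stars = ET.fromstring(STARS_XML)

def cities(star):
    """City values of all addresses of one star ($s/Address/City)."""
    return [c.text for c in star.findall("Address/City")]

# every $c in $s/Address/City satisfies $c = "Hollywood"
only_hollywood = [
    s.findtext("Name") for s in stars.findall("Star")
    if all(c == "Hollywood" for c in cities(s))
]

# some $c in $s/Address/City satisfies $c = "Hollywood"
some_hollywood = [
    s.findtext("Name") for s in stars.findall("Star")
    if any(c == "Hollywood" for c in cities(s))
]
print(only_hollywood, some_hollywood)
```

As with `all` on an empty list, an every-condition holds vacuously for a star that has no City at all.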
30 Select ssn, fname, salary from employee where salary > all (select salary from employee where dno = 4);
Select fname, lname from employee where exists (select * from dependent where essn = ssn);
31 Aggregation
XQuery provides built-in functions to compute the usual aggregations such as count, average (avg), sum, min, or max. They take any sequence as argument. That is, they can be applied to the result of any XQuery expression.
Find the movies with multiple versions.
let $movies := doc(“movies.xml”)
for $m in $movies/Movies/Movie
where count($m/Version) > 1
return $m
Select s.ssn, s.lname, count(r.lname)
from employee s, employee r
where s.ssn = r.superssn
group by s.ssn, s.lname
having count(r.lname) < 3;
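The count-based query has a direct Python analogue, with `len` of a result list playing the role of `count`. The sample document is hypothetical, and the sketch returns movie titles rather than whole `Movie` elements so that the result is easy to print.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for movies.xml.
MOVIES_XML = """
<Movies>
  <Movie title="King Kong">
    <Version year="1933"/>
    <Version year="1976"/>
    <Version year="2005"/>
  </Movie>
  <Movie title="Footloose">
    <Version year="1984"/>
  </Movie>
</Movies>
"""

movies = ET.fromstring(MOVIES_XML)

multi_version = [
    m.get("title")
    for m in movies.findall("Movie")   # for $m in $movies/Movies/Movie
    if len(m.findall("Version")) > 1   # where count($m/Version) > 1
]
print(multi_version)
```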
32 Branching in XQuery Expressions
There is an if-then-else expression in XQuery of the form:
if (expression1) then expression2 else expression3
The else part is required; else ( ) returns the empty sequence.
let $kk := …[… = “King Kong”]
for $v in $kk/Version
return if (… = …) then …
33 Movies.xml
<?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>
34 Ordering the Result of a Query
It is possible to sort the result as part of a FLWR query:
order by list of expressions
let $movies := doc(“movies.xml”)
for $m in $movies/Movies/Movie, $v in $m/Version
order by …
return …
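Sorting the bindings of a FLWR query corresponds to sorting the (movie, version) pairs before producing the result. In the Python sketch below, the sample document is hypothetical, and ordering by the `year` attribute is an assumption, since the slide's ordering expression is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for movies.xml.
MOVIES_XML = """
<Movies>
  <Movie title="King Kong">
    <Version year="1933"/>
    <Version year="1976"/>
    <Version year="2005"/>
  </Movie>
  <Movie title="Footloose">
    <Version year="1984"/>
  </Movie>
</Movies>
"""

movies = ET.fromstring(MOVIES_XML)

# for $m in $movies/Movies/Movie, $v in $m/Version
pairs = [(m, v) for m in movies.findall("Movie") for v in m.findall("Version")]

# Assumed ordering key: the numeric value of the version's year attribute.
pairs.sort(key=lambda mv: int(mv[1].get("year")))

# Return one (title, year) pair per binding, in sorted order.
result = [(m.get("title"), int(v.get("year"))) for m, v in pairs]
print(result)
```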
35 Movies.xml
36 Extensible Stylesheet Language
XSLT (Extensible Stylesheet Language Transformations) is a standard of the World-Wide-Web Consortium.
- Its original purpose was to allow XML documents to be transformed into HTML or similar forms that allowed the document to be viewed or printed.
- In practice, XSLT is another query language for XML, used to extract data from documents or turn one document form into another form.
XSLT Basics
Like XML Schema, XSLT specifications are XML documents, called stylesheets. The tags used in XSLT are found in a namespace.
37 At the highest level, a stylesheet looks like:
<?xml version = “1.0” encoding = “utf-8” ?>
38 Templates
An XPath expression can be either rooted (beginning with a slash) or relative.
39 <?xml version = “1.0” encoding = “utf-8” ?>
40 Obtaining Values from XML Data
41 Recursive Use of Templates
Powerful transformations require recursive application of templates at various elements of the input.
42 <?xml version = “1.0” encoding = “utf-8” ?>
43 Iteration in XSLT
We can put a loop within a template that gives us freedom over the order in which we visit certain subelements of the element to which the template is being applied.
44 <?xml version = “1.0” encoding = “utf-8” standalone = “yes” ?>
45 <OL> Carrie Fisher … more stars </OL> <OL> Hollywood Malibu </OL>
<?xml version = “1.0” encoding = “utf-8” ?>
46 Conditions in XSLT
We can introduce branching into our templates by using an if tag.
47
| Month | Savings |
|---|---|
| January | $100 |