This section provides information to get you started using queries. It does not provide complete information about how to define a query. Instead, it provides instructions for defining typical queries you might want to run. There are numerous cross-references to later sections that provide complete information about a particular query construct.
When you query a document, you do not usually want to obtain all marked-up text. However, an understanding of queries that return all marked-up text makes it easier to define a query that retrieves just what you want.
The following figure shows a complete query (
/bookstore) and the way the XPath processor interprets it:
This query returns the
bookstore element. Because the
bookstore element is the document element, which contains all elements and attributes in the document, this query returns all marked-up text.
In the query, the initial forward slash (/) instructs the XPath processor to start its search at the root node.
Suppose you run the following query on
bookstore.xml:
This query returns an empty set. It searches the immediate children of the root node for an element named
book. Because there is no such element, this query does not return any marked-up text. Note that this query does not return an error. The query runs successfully, but the XPath processor does not find any elements that match the query. All
book elements are grandchildren of the root node, and the XPath processor only checks the children of the root node.
Usually, you use a query to obtain a portion of an XML document. To obtain the particular elements that you want, you must understand how to obtain an element that is a child of the document element. With this information, you can obtain any elements in the document.
The following figure shows how the XPath processor interprets the
/bookstore/book query:
When the XPath processor starts its search at the root node, there is only one element among the immediate children of the root node. This is the document element. In this example,
bookstore is the document element.
The query in this figure returns the
book elements that are children of
bookstore. This query does not return the
my:book element, which is also a child of
bookstore.
Now you can define queries that obtain any elements you want. For example:
This query returns
title elements contained in
book elements that are contained in
bookstore.
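As a rough illustration of this kind of step-by-step path, the sketch below uses Python's standard xml.etree.ElementTree module. The XML is a hypothetical, abridged stand-in for bookstore.xml, not the real file. ElementTree evaluates paths relative to an element rather than from the root node, so the path that starts at the document element is written as book/title here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged stand-in for bookstore.xml (an assumption).
SAMPLE = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book><title>History of Trenton</title></book>
  <magazine><title>Tracking Trenton</title></magazine>
</bookstore>"""

# fromstring returns the document element (bookstore), not the root node.
root = ET.fromstring(SAMPLE)

# Children of bookstore that are named book.
books = root.findall("book")

# title elements that are children of those book elements.
book_titles = [t.text for t in root.findall("book/title")]
```

The magazine title is not selected because the path only descends through book elements.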
Sometimes you want all like-named elements regardless of where they are in a document. In this case, you do not need to start at the root node and navigate to the elements you want.
For example, the following query returns all
last-name elements in any XML document:
The double forward slash (//) at the beginning of a query instructs the XPath processor to start at the root node and search the entire document. In other words, the XPath processor searches all descendants of the root node.
If you perform this query on
bookstore.xml, it returns the
last-name elements that are children of
author elements, and it also returns the
last-name element that is a child of a
publication element.
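A minimal sketch of a whole-document search, again with Python's xml.etree.ElementTree and a hypothetical, simplified document (the names Bob and Smith are made up for illustration). ElementTree requires a leading dot, so the descendant search is written as .//last-name and is evaluated from the document element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document; element content is illustrative only.
SAMPLE = """<bookstore>
  <book>
    <author><last-name>Bob</last-name></author>
  </book>
  <book>
    <author>
      <last-name>Bob</last-name>
      <publication><last-name>Smith</last-name></publication>
    </author>
  </book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# ".//last-name" finds last-name elements at any depth below bookstore,
# in document order, regardless of which element contains them.
last_names = [e.text for e in root.findall(".//last-name")]
```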
Although sometimes you might want all like-named elements wherever they are in a document, other times you might want only those like-named elements from a particular part of the document (branch of the tree).
For example, you might want all
price elements contained in
book elements, but not
price elements contained in
magazine elements. The query to return such a result is:
This query returns all
price elements that are contained in
book elements. Some of these
price elements are immediate children of
book elements. One returned
price element is a great-grandchild of the second
book element. The following figure shows how the XPath processor interprets this query:
Some queries can look very similar but return very different results. The following figure shows this.
Suppose you want the titles of all the books. You might decide to define your query like this:
This query does return all titles of books, but it also returns the title of a magazine. This query instructs the XPath processor to start at the root node, search all descendants, and return all
title elements. In
bookstore.xml, this means that the query returns the title of the magazine in addition to the titles of books. In some other document, if all titles are contained in
book elements, this query returns exactly what you want.
To query and obtain only the titles of books, you can use either of the following queries. They obtain identical results. However, the first query runs faster.
The first query runs faster because it uses the child axis, while the second query uses the descendant-or-self axis. In general, the simpler axes, such as child, self, parent, and ancestor, are faster than the more complicated axes, such as descendant, preceding, following, preceding-sibling, and following-sibling. This is especially true for large documents. Whenever possible, use a simpler axis.
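The difference between searching all descendants and following an explicit child path can be sketched with Python's xml.etree.ElementTree. The document is a hypothetical, abridged stand-in for bookstore.xml, and the paths are ElementTree's relative equivalents of the queries discussed above. The sketch only compares results; it does not measure speed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged stand-in for bookstore.xml (an assumption).
SAMPLE = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book><title>History of Trenton</title></book>
  <magazine><title>Tracking Trenton</title></magazine>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# All title elements anywhere in the document, including the magazine's.
all_titles = [t.text for t in root.findall(".//title")]

# Only titles of books, reached through child steps from bookstore.
via_child = [t.text for t in root.findall("book/title")]

# Only titles of books, reached by first searching all descendants for book.
via_descendant = [t.text for t in root.findall(".//book/title")]
```

On this sample the last two lists are identical, while the first also contains the magazine title.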
To specify an attribute name in a query, precede the attribute name with an at sign (@). The XPath processor treats elements and attributes in the same way wherever possible. For example:
This query returns the
style attributes associated with the magazine, the three books, and the
my:book element. That is, it returns all the
style attributes in the document. It does not return the elements that contain the attributes.
Following is another query that includes an attribute:
This query returns the three
style attributes for the three
book elements.
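Python's xml.etree.ElementTree cannot return attribute nodes from a path, so the sketch below approximates the two attribute queries above by selecting elements and then reading their style values. The document, the attribute values, and the urn:example:my namespace URI are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bookstore.xml; values and namespace URI are made up.
SAMPLE = """<bookstore specialty="novel" xmlns:my="urn:example:my">
  <book style="autobiography"><title>Seven Years in Trenton</title></book>
  <book style="textbook"><title>History of Trenton</title></book>
  <magazine style="glossy"><title>Tracking Trenton</title></magazine>
  <book style="novel"><title>Trenton Today, Trenton Tomorrow</title></book>
  <my:book style="leather"><my:title>Who's Who in Trenton</my:title></my:book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# Every style attribute value in the document, whatever element carries it.
all_styles = [e.get("style") for e in root.iter() if "style" in e.attrib]

# Only the style values of book elements; my:book has a different
# (namespace-qualified) name, so it is not matched by "book".
book_styles = [b.get("style") for b in root.findall(".//book")]
```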
The following query returns the
style attribute of the context node:
If the context node does not have a
style attribute, the result set is empty.
The next query returns the
exchange attribute on
price elements in the current context:
Following is an example that is not valid because attributes cannot have subelements:
Following is a query that finds the
style attribute for all
book elements in the document:
Attributes cannot contain subelements. Consequently, you cannot apply a path operator to an attribute. If you try to, you receive a syntax error.
Attributes are inherently unordered. Consequently, you cannot apply a position number to an attribute. If you try to, you receive a syntax error.
You can use an at sign (@) and asterisk (*) together to retrieve a collection of attributes. For example, the following query finds all attributes in the current context:
Sometimes you want to retrieve only those elements that meet a certain condition. For example, you might want information about a particular book. In this case, you can include a filter in your query. You enclose filters in brackets ( [ ] ).
The following figure shows how the XPath processor interprets a query with a filter:
This query checks each
book element to determine whether it has a
title child element whose value is
"History of Trenton". If it does, the query returns the
book element. Using the sample data, this query returns the second
book element.
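The filter described above can be tried out with Python's xml.etree.ElementTree, which supports this simple form of predicate. The document is a hypothetical, abridged stand-in for the sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged stand-in for the sample data (an assumption).
SAMPLE = """<bookstore>
  <book><title>Seven Years in Trenton</title><price>12</price></book>
  <book><title>History of Trenton</title><price>55</price></book>
  <book><title>Trenton Today, Trenton Tomorrow</title><price>6.50</price></book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# Keep only book elements that have a title child with this exact text.
matches = root.findall("book[title='History of Trenton']")

# The filter selects the book element itself, not its title.
selected_tags = [m.tag for m in matches]
selected_prices = [m.findtext("price") for m in matches]
```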
The following topics provide details about filters:
Suppose you define the following filter:
If you need to specify this filter as part of an attribute value, use single quotation marks instead of double quotation marks. This is because the attribute value itself is (usually) inside double quotation marks. For example:
Strings within an expression may contain special characters such as [, {, &, `, /, and others, as long as the entire string is enclosed in double quotes ("). When the string itself contains double quotes, you may enclose it in single quotes (') instead. When a string contains both single and double quotes, split it into segments that each contain only one kind of quote, enclose each segment in the other kind of quote, and concatenate the segments.
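One way to apply these quoting rules programmatically is sketched below. The helper name xpath_literal is hypothetical, and the sketch assumes the XPath processor provides the standard concat() function for joining the segments.

```python
def xpath_literal(text: str) -> str:
    """Return text quoted as an XPath string literal (hypothetical helper)."""
    if '"' not in text:
        # No double quotes inside: enclose in double quotes.
        return f'"{text}"'
    if "'" not in text:
        # Only double quotes inside: enclose in single quotes.
        return f"'{text}'"
    # Both kinds present: split on double quotes, wrap each segment in
    # double quotes, and put a single-quoted double quote between segments.
    pieces = []
    for i, part in enumerate(text.split('"')):
        if i > 0:
            pieces.append("'\"'")
        if part:
            pieces.append(f'"{part}"')
    return "concat(" + ", ".join(pieces) + ")"


plain = xpath_literal("History of Trenton")
with_double = xpath_literal('The "Trenton" Story')
with_both = xpath_literal('it\'s "x"')
```

The result can then be placed inside a filter, for example after title= in a book filter.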
Following is another example of a query with a filter clause. This query returns
book elements if the
price of the book is greater than 25 dollars:
The next query returns
author elements if the author has a degree:
The next query returns the
date attributes that match
"3/1/00":
The next query returns
manufacturer elements in the current context for which the
rwdrive attribute of the
model is the same as the
vendor attribute of the
manufacturer:
You can apply constraints and branching to a query by specifying a filter clause. The filter contains a query, which is called the subquery. The subquery evaluates to a Boolean value, or to a numeric value. The XPath processor tests each element in the current context to see if it satisfies the subquery. The result includes only those elements that test true for the subquery.
The XPath processor always evaluates filters with respect to a context. For example, the expression
book[author] means for every
book element that is found in the current context, determine whether the
book element contains an
author element. For example, the following query returns all books in the current context that contain at least one excerpt:
The next query returns all titles of books in the current context that have at least one excerpt:
You can specify any number of filters in any level of a query expression. Empty filters
( [ ] ) are not allowed.
A query that contains one or more filters returns the rightmost element that is not in a filter clause. For example:
The previous query returns
author elements. It does not return
degree elements. To be exact, this query returns every author who has at least one degree and who belongs to a book that contains at least one excerpt. In other words, for all books in the current context that have excerpts, this query finds all authors with degrees.
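A small sketch of filters at two levels of one path, using Python's xml.etree.ElementTree. The document is hypothetical; the last names Adams, Baker, and Clark are invented so that the three authors are easy to tell apart.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; author names are invented for illustration.
SAMPLE = """<bookstore>
  <book>
    <author><last-name>Adams</last-name></author>
  </book>
  <book>
    <author><last-name>Baker</last-name><degree>Ph.D.</degree></author>
  </book>
  <book>
    <author><last-name>Clark</last-name><degree>B.A.</degree></author>
    <excerpt>It was a dark and stormy night.</excerpt>
  </book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# Books must contain an excerpt; authors of those books must have a degree.
# The result is the rightmost element outside a filter: author.
authors = root.findall("book[excerpt]/author[degree]")
author_names = [a.findtext("last-name") for a in authors]
result_tags = [a.tag for a in authors]
```

Baker has a degree but is excluded because the containing book has no excerpt.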
The following query finds each
book child of the current context that has an author with at least one degree:
The next query returns all books in the current context that have an excerpt and a title:
Following is a query that finds all child elements of the current context with
specialty attributes:
The following query returns all
book children in the current context with
style attributes:
The next query finds all
book child elements in the current context in which the value of the
style attribute of the
book is equal to the value of the
specialty attribute of the
bookstore element:
In a query, you can include an asterisk (*) to represent all elements. For example:
This query searches for all
book elements in
bookstore. For each
book element, this query returns all child elements that the
book element contains.
The * collection returns all elements that are children of the context node, regardless of their tag names.
The next query finds all
last-name elements that are grandchildren of
book elements in the current context:
The following query returns the grandchild elements of the current context.
Usually, the asterisk (*) returns only elements. It does not return processing instructions, attributes, or comments, nor does it include attributes or comments when it maintains a count of nodes. For example, the following query returns
title elements. It does not return
style attributes.
Wildcards in strings are not allowed. For example, you cannot define a query such as the following:
To use a wildcard for attributes, you can specify @*. For example:
For each
book element, this query returns all attributes. It does not return any elements.
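The element and attribute wildcards can be approximated in Python's xml.etree.ElementTree as sketched below. ElementTree supports * for child elements but has no @* path step, so the attributes are read from each element's attrib mapping instead. The document and the id attribute are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the id attribute is made up for illustration.
SAMPLE = """<bookstore>
  <book style="textbook" id="b1">
    <title>History of Trenton</title>
    <price>55</price>
    <!-- a comment is not an element -->
  </book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# "*" matches child elements only: no attributes and no comments.
child_tags = [e.tag for e in root.findall("book/*")]

# Rough equivalent of the attribute wildcard: all attributes of each book.
book_attributes = [dict(b.attrib) for b in root.findall("book")]
```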
The XPath processor provides many functions that you can call in a query. This section provides some examples to give you a sense of how functions in queries work. Many subsequent sections provide information about invoking functions in queries. For a complete list of the functions you can call in a query, see XPath Functions Quick Reference.
Following is a query that returns a number that indicates how many
book elements are in the document:
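Python's xml.etree.ElementTree does not evaluate XPath functions, so the following sketch only approximates a counting query by selecting the book elements and counting them in Python. The document is a hypothetical stand-in.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged stand-in for bookstore.xml (an assumption).
SAMPLE = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book><title>History of Trenton</title></book>
  <magazine><title>Tracking Trenton</title></magazine>
  <book><title>Trenton Today, Trenton Tomorrow</title></book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# Select every book element in the document, then count the matches.
book_count = len(root.findall(".//book"))
```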
In format descriptions, a question mark that follows an argument indicates that the argument is optional. For example:
This function returns a string. The name of the function is
substring. This function takes two required arguments (a string followed by a number) and one optional argument (a number).
Queries are case sensitive. This applies to every part of the query, including operators, strings, element and attribute names, and function names.
For example, suppose you try this query:
This query returns an empty set because the name of the document element is
bookstore and not
Bookstore.
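Case sensitivity of element names can be observed with Python's xml.etree.ElementTree as well. Because ElementTree paths are relative to the document element, this sketch shows the effect one level down, with book and Book, on a hypothetical document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged stand-in for bookstore.xml (an assumption).
SAMPLE = """<bookstore>
  <book><title>Seven Years in Trenton</title></book>
  <book><title>History of Trenton</title></book>
</bookstore>"""

root = ET.fromstring(SAMPLE)

# The name matches exactly, so both book elements are found.
lower_case = root.findall("book")

# "Book" differs only in case; the search succeeds but matches nothing.
capitalized = root.findall("Book")
```

As with the query described above, the mismatched name produces an empty result rather than an error.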
Blank spaces in queries are not significant unless they appear within quotation marks.
The precedence of query operators varies for XPath 1.0 and XPath 2.0, as shown in the following tables. In these tables, operators are listed in order of precedence, with highest precedence being first; operators in a given row have the same precedence.
| Operation Type | XPath Operators |
|---|---|
| Grouping | |
| Filter | |
| Unary minus | |
| Multiplication | |
| Addition | |
| Relational (Comparison) | |
| Union | |
| Negation | |
| Conjunction | |
| Disjunction | |