Using QueryPath 2.0.1
How to make the most of the QueryPath library.
About QueryPath
QueryPath is a library designed to help you quickly and efficiently search,
modify, and traverse XML and HTML documents. It implements many of the functions
found in jQuery, but it is optimized for server-side work.
Basic introductory documentation on QueryPath can be found at the
official QueryPath website.
That site is kept updated with all of the latest information.
This suite of documentation
provides detailed information about the QueryPath API.
Ready to get going? Start with the qp() function.
Along with the basic documentation, there are several examples linked from
there.
What can you do with QueryPath?
QueryPath can handle many different XML/HTML processing functions. Here
are a few examples:
- Work with remote web services like Flickr, Google, Twitter, and YouTube
- Import legacy HTML
- Crawl and parse sites
- Retrieve RDFa and Microformats from HTML
- Parse XML configuration files
- Access the semantic web
- Maintain pure HTML templates and merge data with QueryPath
QueryPath is a flexible library. We hope you find it useful.
The Structure of QueryPath
To successfully use QueryPath, you will need to be acquainted with only
one function (qp()) and one class (QueryPath).
The qp() Function
The qp() function is a factory for creating new QueryPath
objects, and it is the recommended way to construct them. The principal
job of the qp() function is to create a QueryPath instance and bind a
document to that object. Each QueryPath object can be bound to only one
document.
QueryPath knows how to handle XML, HTML, DOMDocument, SimpleXMLElement
objects, and a few other document data structures. It can also load files
and remote URLs and parse the contents.
When handling such documents, QueryPath may have to guess whether a
document is XML or HTML. To read more about the rules, see the API
documentation for qp().
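As a minimal sketch of the factory in use, the snippet below binds a QueryPath object to an XML string and selects elements with a CSS selector. The include path and the sample XML are assumptions made for illustration; adjust the path to match where QueryPath 2.x is installed.

```php
<?php
// Assumption: QueryPath 2.x is on the include path under this name.
require_once 'QueryPath/QueryPath.php';

// Illustrative document; any well-formed XML string would do.
$xml = '<?xml version="1.0"?>'
     . '<config><item name="a">first</item><item name="b">second</item></config>';

// qp() creates a QueryPath object bound to the document and, when a
// selector is given as the second argument, selects the matching elements.
$items = qp($xml, 'item');

// size() reports how many elements are currently matched.
$count = $items->size();

// attr() with only a name reads that attribute from the first match.
$firstName = $items->attr('name');
```

The same call accepts a file name or URL in place of the string, in which case qp() loads and parses the contents first.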
The QueryPath Class
The QueryPath class provides all of the tools for working with a
document. In addition to implementing all of the manipulation and traversal
methods of jQuery, it provides many other tools that may be useful when
managing documents on the server side or in a desktop application.
To learn about the rich set of methods available on a QueryPath
object, take a look at the QueryPath API documentation.
Note: To construct a QueryPath object, you should use the
qp() factory function. Rarely is there a reason to call
QueryPath's constructor directly.
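The following sketch shows the jQuery-style manipulation and traversal methods working together: it starts from the built-in HTML stub document, appends a paragraph, and then reads it back. The include path and the markup are assumptions for illustration only.

```php
<?php
// Assumption: QueryPath 2.x is on the include path under this name.
require_once 'QueryPath/QueryPath.php';

// QueryPath::HTML_STUB is a bare HTML document; select its <body>.
$page = qp(QueryPath::HTML_STUB, 'body');

// Manipulation: add a paragraph inside the selected <body>.
$page->append('<p class="note">Generated on the server.</p>');

// Traversal: narrow the selection to the new paragraph and read its text.
$note = $page->find('p.note')->text();

// writeHTML() would print the whole document to standard output:
// $page->writeHTML();
```

Most methods return the QueryPath object, so calls like these can also be chained into a single expression.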
Extending QueryPath
While QueryPath provides many useful features, it does not
(obviously) provide all features. It does, however, support a very simple
method of building extensions. Simply follow the extension pattern
exhibited in the QPList.php extension, and you can quickly add
your own methods to QueryPath in a maintainable and reusable way.
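A rough sketch of that pattern, as it appears in QueryPath 2.x extensions such as QPList.php, looks like the following. The class name WordCountExtension and its wordCount() method are hypothetical and exist only for this example; the include path is likewise an assumption.

```php
<?php
// Assumption: QueryPath 2.x is on the include path under this name.
require_once 'QueryPath/QueryPath.php';

/**
 * Hypothetical extension that adds a wordCount() method to QueryPath.
 */
class WordCountExtension implements QueryPathExtension {
  protected $qp;

  // Each extension receives the QueryPath object it is attached to.
  public function __construct(QueryPath $qp) {
    $this->qp = $qp;
  }

  // Count the words in the text of the currently matched elements.
  public function wordCount() {
    return str_word_count($this->qp->text());
  }
}

// Register the extension so its public methods become callable on
// QueryPath objects. Doing this before creating the objects that
// use it keeps things simple.
QueryPathExtensionRegistry::extend('WordCountExtension');

$doc = '<?xml version="1.0"?><doc><p>one two three</p></doc>';
$count = qp($doc, 'p')->wordCount();
```

Keeping the new method in its own class means it can be dropped into other projects without touching QueryPath itself.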
On very rare occasions, you may need to extend the QueryPath
class itself. This has been possible since QueryPath 2.0. You are
encouraged to do so only in cases where extensions cannot provide the
desired behavior.
Tutorials
The best place to start with QueryPath is the
IBM DeveloperWorks intro. It covers the basics of QueryPath
and provides a simple 10-line Twitter client.
A blog on QueryPath and HTML introduces the basics of working with
HTML. Another three-part blog on QPDB (1, 2, 3) introduces SQL
programming with QueryPath.
The original Tutorial also remains accurate. In that tutorial you will
find a few simple examples, and lots of links to more sophisticated
examples.
Resources
The guide here contains detailed documentation of each method in the
QueryPath library. But there are other sources of documentation.
The QueryPath.org website is a hub of
information about QueryPath. Check out screencasts, articles, and other
sources of information there.
Issue tracking and development of QueryPath are hosted on GitHub,
a shared development platform.
There are two mailing lists for QueryPath: one for support and one for
developers.
Matt Butcher regularly posts articles about QueryPath on
TechnoSophos.