An Introduction to XSLT for Digital Humanists

DHOXSS 2013

http://web.uvic.ca/~mholmes/dhoxss2013/

Introduction: what to expect

Welcome to the DHOXSS 2013 XSLT course. Below is a schedule for the work we will be doing during the week. Please note that it is tentative and may well change, especially later in the week, when presentations may be added or removed based on needs arising out of your projects.

I'm expecting that each of you will bring at least one TEI XML file from your own project, which will form the basis of much of your work during the hands-on sessions, especially later in the week. The overall objective of the course is not to teach you everything there is to know about XSLT, but to get you to a point where you are working productively on your own project and able to continue that work after the course finishes. If you don't have an XML file or project of your own, you are welcome to work with any of the examples listed in the Resources section below; we will be using these files as the basis for our early XPath and transformation work as a group.

XSLT is a programming language expressed in XML. If you have never done any programming before, you may find it a little intimidating at first. If you have done some traditional programming, you may still find XSLT a little difficult because it is quite different from conventional programming languages such as Python, Java or C++. You will probably find the learning curve particularly steep during the first couple of days, so be prepared to be challenged, stretched and occasionally frustrated. However, once you get over the "hump" into Wednesday, you can expect an increasing payoff in terms of what you're able to do with your own data.

We will be looking at three fundamental ways of using XSLT:

  1. Rendering a TEI document into XHTML for display on the Web. This is perhaps the most common use of XSLT. It requires that you know the basics of XHTML, so if you're a bit rusty, please take a little time to brush up your skills before the course. There is a "cheatsheet" for XHTML in the Resources list.
  2. Interrogating an XML document. This involves using XSLT (and in particular, XPath) to ask research questions of your documents. For instance, if you're working with drama, you might want to calculate which characters have the largest speaking roles, or which character has the shortest speeches.
  3. Fixing and enhancing your existing TEI document. This involves using something commonly referred to as an identity transform to make small changes to your document. For instance, you may want to work with one of the excellent Folger Shakespeare TEI-encoded texts, but there are many tags in the text which you don't need or want; with an identity transform, you can easily strip out the tags you don't want to use, and leave the rest of the encoding intact.
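
To give a rough idea of what the third approach looks like, here is a minimal sketch of an identity transform. It is only an illustration under stated assumptions: it supposes that the tags you want to strip are TEI <w> elements (a hypothetical choice; substitute whatever elements you don't need), and that your document is in the TEI namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override (assumption: <w> is the unwanted tag).
       The element itself is not copied, but its contents
       are still processed, so the text survives. -->
  <xsl:template match="w">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

The second template wins for <w> elements because a pattern that names an element has a higher default priority than the generic node() pattern; everything else falls through to the identity template and is copied as-is.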

Schedule (tentative)

Morning: 11:00-12:30

  Monday 8 July
    Welcome and introductions; logistics.
    Introduction to XSLT: Where does it fit in, and what does it look like?
    Navigating the XML tree and selecting nodes: XPath path expressions
  Tuesday 9 July
    A first few XPath functions (A)
    XSL Variables: storing information (A)
    Hands-on
  Wednesday 10 July
    Template Modes (repeated use of input)
    Hands-on
  Thursday 11 July
    Using data from another document (the doc() function)
    Hands-on
  Friday 12 July
    Comprehensive review and question period

Afternoon: 14:00-16:00

  Monday 8 July
    Transforming XML using XSLT (executing an XSLT program in Oxygen)
    The XSLT template paradigm (anatomy of an XSLT stylesheet)
    Built-in template rules and how to override them
    Hands-on
  Tuesday 9 July
    XSLT Constructors: dynamically constructed content
    XSLT Copy constructors and identity transforms: transforming TEI to TEI.
    Hands-on
  Wednesday 10 July
    Conditionals and looping (xsl:if, xsl:choose, and xsl:for-each) (A)
    Hands-on
  Thursday 11 July
    Named templates
    Hands-on
  Friday 12 July
    Multiple input files from one stylesheet (the collection() function)
    Hands-on

Late Afternoon: 16:30-17:30

  Monday 8 July
    XPath node predicates
    Hands-on
  Tuesday 9 July
    Hands-on
  Wednesday 10 July
    Hands-on
  Thursday 11 July
    Hands-on
  Friday 12 July
    Hands-on
    Wrap-up, where to go from here, staying in touch, etc.

Other topics we may cover if we have need or time:

Quizzes

Every evening after class I'll post one or two quizzes reviewing what we've done. You can find them here:

http://web.uvic.ca/~mholmes/dhoxss2013/quizzes/

Resources

Example files

This is a list of links to example files used in the course. Please download each of these files to the computer you'll be working on, so that you can open them in Oxygen to work with them.

Cheat sheets

The following files are simple "cheatsheets" you can use to remind you of XHTML, CSS and XPath while you work:

Simple PDF versions of the presentations are available alongside the cheatsheets here:

http://web.uvic.ca/~mholmes/dhoxss2013/handouts/

Many slides and handouts refer to the page or chapter of the reference work that we personally use:

References to "Kay" are to this book.

Getting help after the course

You are very welcome to email me (mholmes@uvic.ca) with any XSLT questions. I'll be glad to hear from you and find out how your work is going, and I'm happy to help (it makes me feel good to be useful). If your problem is complicated and I'm very busy I might write back and say that it will take me some time to answer, but I will answer.

If you're comfortable asking questions in a public forum, though, that's a much better approach, because then other people will benefit from the answer you get. The best place to ask questions is the Mulberry XSLT list:

http://www.mulberrytech.com/xsl/xsl-list/

Members of this list include many of the most important developers of XSLT standards and tools, including Michael Kay, the author of Saxon. The tone is professional but friendly. When you do post to the list, you should remember some important principles:

The list is gently moderated, so the first time you post you will have to wait for a moderator to approve your post.

If you like reading, you might want to buy Jeni Tennison's book Beginning XSLT 2.0: From Novice to Professional. I haven't read it myself, but I've heard good things about it, and Jeni Tennison has been teaching and explaining XSLT to the world with great success for many years.

Finally, a bit of advice on searching the web for help: you will find more bad advice than good advice if you do this. The main reason is that most questions and answers on the web relate to XSLT 1.0 problems. XSLT 1.0 was a very limited language, so in order to do some very simple things, you often had to perform very complicated operations. If you start following advice intended for users of XSLT 1.0, you will make your life much more difficult than it needs to be.
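
Grouping is a good illustration of the difference. The sketch below is not taken from the course materials; it assumes a TEI play in which each speech is an <sp> element carrying a who attribute, and it lists each speaker with the number of speeches they have, largest role first. In XSLT 2.0 this is a single xsl:for-each-group instruction; the usual XSLT 1.0 answer you will find on the web (often called Muenchian grouping) needs an xsl:key declaration and generate-id() comparisons to achieve the same thing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Plain text output: one line per speaker. -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Assumption: speeches are <sp> elements with a @who attribute.
         Speeches without @who are simply left out of the grouping. -->
    <xsl:for-each-group select="//sp" group-by="@who">
      <!-- Sort groups by number of speeches, largest first. -->
      <xsl:sort select="count(current-group())" order="descending"/>
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count(current-group())"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

If advice you find online relies on xsl:key and generate-id() just to group things, that is a fair sign it was written for XSLT 1.0.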

Acknowledgements

All of the presentations and materials used in this workshop were developed between 2011 and 2013 by Syd Bauman, David Birnbaum, Julia Flanders and Martin Holmes, and they have been used in many similar workshops given by all the developers. They are released under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. Other versions of these materials, along with many other useful related resources, are available from the Brown Women Writers Project Resources Page.

Contact information

Please feel free to contact Martin Holmes by email (mholmes@uvic.ca) with any questions or concerns.