The purpose of using schemas
The schema languages DTD and XML Schema (and DSD2 and RELAX NG)
Regular expressions – a commonly used formalism in schema languages
An Introduction to XML and Web Technologies
Schema Languages
Anders Møller & Michael I. Schwartzbach © 2006 Addison-Wesley
Motivation
We have designed our Recipe Markup Language...
...but so far only informally described its syntax
How can we make tools that check that an XML document is a syntactically correct Recipe Markup Language document (and thus meaningful)?
Implementing a specialized validation tool for Recipe Markup Language is not the solution...
XML Languages
XML language: a set of XML documents with some semantics
schema: a formal definition of the syntax of an XML language
schema language: a notation for writing schemas
Validation
[Figure: an instance document and a schema are given to a schema processor, which reports either "valid" together with a normalized instance document, or "invalid" together with an error message]
Why use Schemas?
Formal but human-readable descriptions
Data validation can be performed with existing schema processors
General Requirements
Expressiveness
Efficiency
Comprehensibility
Regular Expressions
Commonly used in schema languages to describe sequences of characters or elements
Σ: an alphabet (typically Unicode characters or element names)
σ∈Σ matches the string σ
α? matches zero or one α
α* matches zero or more α’s
α+ matches one or more α’s
α β matches any concatenation of an α and a β
α | β matches the union of α and β
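The operators listed under Regular Expressions apply to element names as well as to characters. The following is a minimal Python sketch of that idea, assuming a hypothetical content model `title ingredient* preparation comment?` (the model and the function name are illustrative and not taken from the slides). Each element name is encoded with a trailing space so that a whole name behaves like one symbol of the alphabet.

```python
import re

# Hypothetical content model over element names (an assumption for illustration):
#   title ingredient* preparation comment?
# Every name is followed by one space, so each name acts as a single symbol.
CONTENT_MODEL = re.compile(r"(title )(ingredient )*(preparation )(comment )?")


def matches_content_model(child_names):
    """Return True if the sequence of child element names matches the model."""
    encoded = "".join(name + " " for name in child_names)
    # fullmatch requires the entire encoded sequence to match, not just a prefix.
    return CONTENT_MODEL.fullmatch(encoded) is not None


if __name__ == "__main__":
    print(matches_content_model(["title", "ingredient", "preparation"]))
    print(matches_content_model(["ingredient", "title", "preparation"]))
```

This encoding is only one possible choice; a real schema processor works on the parsed element structure rather than on strings.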
Examples
A regular expression describing integers:
0|-?(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*
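The integer expression above can be tried out directly, since Python's `re` module accepts the same operators. This is a small sketch, not part of the original slides; the name `is_integer` is an assumption made for illustration. `re.fullmatch` is used because the slide's expression is meant to describe the whole string.

```python
import re

# The expression from the slide, copied verbatim.
INTEGER = re.compile(r"0|-?(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*")

# An equivalent, shorter form using character classes (Python notation).
INTEGER_SHORT = re.compile(r"0|-?[1-9][0-9]*")


def is_integer(text):
    """Return True if the whole string matches the slide's integer expression."""
    return INTEGER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["0", "42", "-7", "007", "-0", "", "12a"]:
        print(repr(candidate), is_integer(candidate))
```

Note that the expression rejects leading zeros and `-0`, so every integer has exactly one accepted spelling.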
DTD – Document