So, you’re looking at the top of a web page’s source code, and you see something like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
What’s the relationship between that and the actual code in the web page?
Well, a DOCTYPE declaration states what type of document this web page is by formally referencing a Document Type Definition (that’s what the “dtd” in the filename and in the declaration stands for). A DTD is a formal specification, written in its own declaration language, used to define the legal dialects of languages descended from SGML. Most prominently, these include HTML 4.01 and XHTML. Hence this walkthrough.
In our specific case, the declaration references the specification for XHTML 1.1, a modular reformulation of HTML expressed in XML.
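To make the parts of the declaration concrete, here is a minimal Python sketch that splits a DOCTYPE of the PUBLIC form into its root element, public identifier and system identifier (the URL of the DTD). The function name `parse_doctype` and the regular expression are illustrative assumptions; they only handle this one common form, not every legal DOCTYPE.

```python
import re

# Matches the common PUBLIC form only:
# <!DOCTYPE root PUBLIC "public identifier" "system identifier">
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s>]+)\s+PUBLIC\s+'
    r'"(?P<public_id>[^"]*)"\s+"(?P<system_id>[^"]*)"\s*>',
    re.IGNORECASE,
)


def parse_doctype(declaration):
    """Return a dict with root, public_id and system_id, or None if no match."""
    match = DOCTYPE_RE.search(declaration)
    if match is None:
        return None
    return match.groupdict()


doctype = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
           '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">')
parts = parse_doctype(doctype)
print(parts["root"])       # root element of the document
print(parts["public_id"])  # formal public identifier of the DTD
print(parts["system_id"])  # URL where the DTD can be downloaded
```

The system identifier is the part a browser or validator could fetch; the public identifier is a name that tools may recognise without downloading anything.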
Firstly, if we look in the declaration, we see the URL “http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd”, which, if you download it, turns out to be the DTD itself. For XHTML 1.1, this is a relatively short document; the specification largely consists of modules, which this document references. Let’s have a look.
Within the DTD, you’ll see this section (around line 121):
<!-- Text Module (Required) ..................................... -->
<![%xhtml-text.module;[ %xhtml-text.mod;]]>
This marks a module for inclusion; because it is INCLUDEd, the module is technically part of the DTD itself.
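The mechanism at work is a DTD conditional section: the parameter entity before the second bracket expands to either INCLUDE or IGNORE, and that keyword decides whether the body is kept. The following Python sketch imitates that decision for simple, non-nested sections. The function `resolve_conditionals` and the dictionary of entity values are hypothetical stand-ins for what a real DTD parser does, and the INCLUDE value passed in is an assumption based on the module being described as required.

```python
import re

# Matches a non-nested conditional section whose keyword comes from a
# parameter entity reference: <![%name;[ body ]]>
CONDITIONAL_RE = re.compile(r'<!\[\s*%([\w.\-]+);\s*\[(.*?)\]\]>', re.DOTALL)


def resolve_conditionals(dtd_text, entities):
    """Keep the body of INCLUDE sections and drop everything else.

    `entities` maps parameter entity names to their replacement text.
    In this sketch an undefined entity is treated like IGNORE; a real
    parser would report an error instead.
    """
    def replace(match):
        keyword = entities.get(match.group(1), "")
        if keyword == "INCLUDE":
            return match.group(2).strip()
        return ""

    return CONDITIONAL_RE.sub(replace, dtd_text)


snippet = "<![%xhtml-text.module;[ %xhtml-text.mod;]]>"
included = resolve_conditionals(snippet, {"xhtml-text.module": "INCLUDE"})
ignored = resolve_conditionals(snippet, {"xhtml-text.module": "IGNORE"})
print(repr(included))
print(repr(ignored))
```

With INCLUDE, what survives is the reference %xhtml-text.mod;, which a parser would in turn replace with the contents of the text module file. Switching the entity to IGNORE is how a customised DTD could leave a module out.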
If you navigate to the included module, “http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-text-1.mod”, you’ll see a further set of INCLUDEd items, for example:
This entry includes the inline structural elements br and span; further down the document, more included modules contain the inline phrasal elements (em, strong, etc.), the block structural elements (p and div), and the block phrasal elements (h1, h2, etc.).
In each, you’ll see the declarations for elements such as p, div, code, strong, em and so on.
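To give a feel for what such declarations hold, here is a rough Python sketch that pulls element names and their content models out of DTD text. The sample DTD below is a hypothetical, heavily simplified illustration and is not copied from the XHTML modules, which declare names and content models indirectly through parameter entities; the function name `element_declarations` is likewise an assumption.

```python
import re

# Matches <!ELEMENT name content-model> and captures both parts.
ELEMENT_RE = re.compile(r'<!ELEMENT\s+(\S+)\s+(.*?)\s*>', re.DOTALL)


def element_declarations(dtd_text):
    """Map each declared element name to its content model string."""
    return {name: model for name, model in ELEMENT_RE.findall(dtd_text)}


# Hypothetical, simplified DTD fragment for illustration only.
sample_dtd = """
<!ELEMENT p (#PCDATA | em | strong)*>
<!ELEMENT em (#PCDATA)>
<!ELEMENT strong (#PCDATA)>
<!ATTLIST p class CDATA #IMPLIED>
"""

declared = element_declarations(sample_dtd)
for name, model in declared.items():
    print(name, "may contain", model)
```

Each content model says what an element may legally contain, which is exactly the information a validator checks a page against; ATTLIST declarations, skipped by this sketch, do the same job for attributes.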
For comparison, have a look at the HTML 4.01 Strict DTD, which you can find by following its DOCTYPE:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
…and which is linked to from here: http://www.w3.org/TR/html4/strict.dtd. As you’ll see, it isn’t modular in the same way, but it still contains the declarations defining the elements (and their contents, attributes and so on) that are legal within the dialect concerned.