DTD - XML Building Blocks
The main building blocks of both XML and HTML documents are elements marked up with tags, such as <body>...</body>.
XML document building blocks
All XML documents (and HTML documents) are composed of the following simple building blocks:
- Elements
- Attributes
- Entities
- PCDATA
- CDATA
Below is a brief description of each building block.
Elements
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements are "note" and "message". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br", and "img".
Examples:
<body>body text in between</body>
<message>some message in between</message>
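To see the three kinds of element content side by side, the following minimal sketch parses a small document with Python's standard `xml.etree.ElementTree` module. The "note" document and its child elements are made up for illustration; any well-formed XML would behave the same way.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one element with text, one with a child element,
# and one empty element.
doc = (
    "<note>"
    "<message>some message in between</message>"
    "<body><p>nested</p></body>"
    "<br />"
    "</note>"
)
root = ET.fromstring(doc)

message = root.find("message")   # element containing text
body = root.find("body")         # element containing another element
empty = root.find("br")          # empty element

print(message.text)                  # the text between the tags
print([child.tag for child in body]) # names of the child elements
print(empty.text, len(empty))        # no text and no children
```

An empty element has neither text nor child elements, which is why its `text` is `None` and its length is 0.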
Attributes
Attributes provide additional information about elements.
Attributes are always placed inside the start tag of an element, and they always come in name/value pairs. The following "img" element has additional information about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty, it is closed with a "/".
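As a rough illustration of how a parser exposes these parts, the sketch below reads the same "img" element with Python's standard `xml.etree.ElementTree` module. The variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# Parse the empty "img" element from the text above.
img = ET.fromstring('<img src="computer.gif" />')

print(img.tag)          # element name
print(img.attrib)       # all attributes as a name/value dictionary
print(img.get("src"))   # value of a single attribute
```

The attribute name acts as a dictionary key and the attribute value as its entry, which mirrors the name/value pairing described above.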
Entities
Entities are variables that stand for commonly used text. An entity reference is a reference to an entity.
Most readers will know the HTML entity reference "&nbsp;". This "non-breaking space" entity is used in HTML to insert an extra space into a document.
Entities will be expanded when the document is parsed by an XML parser.
The following entities are predefined in XML:
Entity reference | Character |
---|---|
&lt; | < |
&gt; | > |
&amp; | & |
&quot; | " |
&apos; | ' |
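Entity expansion can be observed directly with a parser. The sketch below uses Python's standard `xml.etree.ElementTree` module; the `writer` entity and its value are hypothetical names chosen for the example and are declared in an internal DTD subset.

```python
import xml.etree.ElementTree as ET

# The five predefined entities need no declaration.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>")
print(predefined.text)

# A custom entity ("writer" is a made-up name) declared inside the document's
# own DTD and referenced in the content.
doc = """<!DOCTYPE note [
  <!ENTITY writer "Jane Doe">
]>
<note>Written by &writer;</note>"""
note = ET.fromstring(doc)
print(note.text)
```

In both cases the application never sees the entity references themselves, only the text they expand to.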
PCDATA
PCDATA stands for parsed character data.
Character data can be imagined as the text between the start and end tags of an XML element.
PCDATA is text that will be parsed by the parser. The parser examines the text for entities and markup.
Tags inside the text are treated as markup, and entities are expanded.
However, parsed character data should not contain any &, < or > characters; these should be written as the &amp;, &lt; and &gt; entities respectively.
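The effect of leaving these characters unescaped can be sketched with Python's standard library. The sample string is made up; `xml.sax.saxutils.escape` replaces &, < and > with the corresponding entities.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b && b > c"   # hypothetical text containing special characters

# Unescaped, the text is not well-formed parsed character data.
try:
    ET.fromstring(f"<code>{raw}</code>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False

# Escaped, it parses, and the parser expands the entities back to the original text.
escaped = escape(raw)
parsed = ET.fromstring(f"<code>{escaped}</code>")

print(unescaped_ok)
print(escaped)
print(parsed.text)
```

After escaping, the round trip through the parser returns exactly the original string.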
CDATA
CDATA stands for Character Data.
CDATA is text that will not be parsed by the parser. Tags inside the text will not be treated as markup, and entities will not be expanded.
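One way to observe unparsed character data is a CDATA section inside an element, shown in the sketch below with Python's standard `xml.etree.ElementTree` module. Using a CDATA section as the illustration is an assumption of this example, and the element name and content are made up.

```python
import xml.etree.ElementTree as ET

# Everything between <![CDATA[ and ]]> is passed through untouched.
doc = "<example><![CDATA[<b>bold?</b> &amp; more]]></example>"
el = ET.fromstring(doc)

print(el.text)    # the tags and the entity reference appear literally
print(len(el))    # no child elements were created from the <b> tag
```

The `<b>` tag does not become a child element and `&amp;` is not expanded to `&`, in contrast to the PCDATA behavior described earlier.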