PHP XML Expat Parser

The built-in Expat parser makes it possible to process XML documents in PHP.

What is XML?

XML is a markup language used to describe data. XML has no predefined tags; you must define your own.

If you want to learn more about XML, please visit our XML Tutorial.

What is Expat?

To read, create, update, or otherwise process an XML document, you need an XML parser.

There are two basic types of XML parsers:

  • Tree-based parser: Converts the XML document into a tree structure. It analyzes the entire document and provides an API for accessing the elements of the tree. The Document Object Model (DOM) is an example.
  • Event-based parser: Treats the XML document as a series of events. When a specific event occurs, the parser calls a function to handle it.

The Expat parser is an event-based parser.

An event-based parser focuses on the content of the XML document rather than its structure. Because it does not have to build the whole document tree before handing data to your code, an event-based parser can access data faster than a tree-based parser.

Please see the following XML snippet:

<from>John</from>

An event-based parser reports the above XML as a series of three events:

  • Starting element: from
  • Character data, value: John
  • Closing element: from
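
The event sequence can be observed directly by recording every call the parser makes. The following is a minimal sketch, not part of the original example: the $events array and the anonymous handler functions are illustrative names chosen here.

```php
<?php
// Collect each event the parser reports, in order.
$events = [];

$parser = xml_parser_create();

// Handlers for start and end tags. Element names arrive in upper case
// because case folding is enabled by default.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "start: $name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "end: $name";
    }
);

// Handler for the text between the tags.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        $events[] = "data: $data";
    }
);

// Parse the whole snippet in one call (third argument true = final chunk).
xml_parse($parser, "<from>John</from>", true);

print_r($events);
```

For this short snippet the array should hold one entry per event: the start of FROM, the text John, and the end of FROM. For longer text, the parser may deliver character data in several pieces.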

The above XML example is well-formed. However, it is not valid XML, because it has no associated Document Type Definition (DTD), either external or embedded.

This makes no difference when using the Expat parser. Expat is a non-validating parser and ignores any DTD.

As an event-based, non-validating XML parser, Expat is fast and lightweight, making it very suitable for PHP web applications.

Note: An XML document must be well-formed; otherwise, Expat will generate an error.
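
To illustrate the note, here is a small sketch of what happens when a document is not well-formed. The sample string and the variable names $xml, $ok and $message are assumptions made for this illustration.

```php
<?php
// A deliberately malformed document: <from> is never closed.
$xml = "<note><from>John</note>";

$parser = xml_parser_create();

// xml_parse() returns 1 on success and 0 on failure.
$ok = xml_parse($parser, $xml, true);

$message = "";
if (!$ok) {
    // Translate the numeric error code into a readable description.
    $message = sprintf(
        "XML Error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    );
    echo $message;
}
```

The exact wording of the error description depends on the underlying XML library, so it is best treated as a diagnostic message rather than something to compare against.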

Installation

The XML Expat parser functions are part of the PHP core. No installation is required to use them.
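
Some PHP builds are compiled without the XML extension, and some Linux distributions package it separately, so it can be worth checking before relying on it. A minimal sketch of such a check follows; the variable name $available is illustrative.

```php
<?php
// True when the XML parser extension is loaded in this PHP build.
$available = extension_loaded("xml") && function_exists("xml_parser_create");

echo $available
    ? "XML parser functions are available"
    : "XML parser functions are missing";
```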

XML File

The following XML file will be used in our example:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>George</to>
<from>John</from>
<heading>Reminder</heading>
<body>Don't forget the meeting!</body>
</note>

Initialize XML parser

We need to initialize an XML parser in PHP, define handlers for different XML events, and then parse this XML file.

Example

<?php
//Initialize the XML parser
$parser=xml_parser_create();
//Function to use at the start of an element
function start($parser,$element_name,$element_attrs)
  {
  switch($element_name)
    {
    case "NOTE":
    echo "-- Note --<br />";
    break; 
    case "TO":
    echo "To: ";
    break; 
    case "FROM":
    echo "From: ";
    break; 
    case "HEADING":
    echo "Heading: ";
    break; 
    case "BODY":
    echo "Message: ";
    }
  }
//Function to use at the end of an element
function stop($parser,$element_name)
  {
  echo "<br />";
  }
//Function to use when finding character data
function char($parser,$data)
  {
  echo $data;
  }
//Specify element handler
xml_set_element_handler($parser,"start","stop");
//Specify data handler
xml_set_character_data_handler($parser,"char");
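//Open XML file, stop if it cannot be opened
$fp=fopen("test.xml","r") or die("Could not open test.xml");
//Read data in chunks of 4096 bytes
//The third argument of xml_parse() tells the parser whether this is the last chunk
while ($data=fread($fp,4096))
  {
  xml_parse($parser,$data,feof($fp)) or 
  die (sprintf("XML Error: %s at line %d", 
  xml_error_string(xml_get_error_code($parser)),
  xml_get_current_line_number($parser)));
  }
//Close the XML file
fclose($fp);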
//Free the XML parser (this call has no effect since PHP 8.0, where the parser is released automatically)
xml_parser_free($parser);
?>

The output of the above code:

-- Note --
To: George
From: John
Heading: Reminder
Message: Don't forget the meeting!

Explanation of the working principle:

  • Initialize the XML parser with the xml_parser_create() function
  • Create the functions that will handle the different events
  • Call xml_set_element_handler() to specify which functions to execute when the parser encounters start and end tags
  • Call xml_set_character_data_handler() to specify which function to execute when the parser encounters character data
  • Parse the file "test.xml" with the xml_parse() function
  • If an error occurs, use xml_error_string() to convert the XML error code to a text description
  • Call xml_parser_free() to release the parser created by xml_parser_create() (since PHP 8.0 this call has no effect, because the parser is released automatically)
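
The same steps can also collect the data instead of printing it. The sketch below is one possible variation, not the method used in the example above: parse_note() is a hypothetical helper name, and it assumes a flat document such as the note file, where each element name appears once. It also turns off case folding. By default the parser reports element names in upper case, which is why the case labels in the example are written as "NOTE", "TO" and so on.

```php
<?php
// Hypothetical helper: parse a flat XML string into [element name => text].
function parse_note(string $xml): array
{
    $result  = [];
    $current = null; // name of the element most recently opened

    $parser = xml_parser_create();
    // Report element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$current) {
            $current = $name;
        },
        function ($parser, $name) use (&$current) {
            $current = null;
        }
    );

    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$result, &$current) {
            // Skip whitespace-only pieces between elements. Append rather
            // than assign, because text may arrive in more than one piece.
            if ($current !== null && trim($data) !== "") {
                $result[$current] = ($result[$current] ?? "") . $data;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }
    return $result;
}

$note = parse_note("<note><to>George</to><from>John</from></note>");
print_r($note);
```

Returning an array keeps parsing separate from output, so the same data can be rendered as HTML, stored, or tested without changing the handlers.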

More information about the PHP Expat parser

For more information about PHP Expat functions, please visit our PHP XML Parser reference manual.