HOW DO YOU CREATE XML DOCUMENTS WITH PHP?
XML, or Extensible Markup Language, is a very common document format frequently used for machine-to-machine data exchange protocols, for data storage, in web services and for many other uses.
Many web applications need to work with XML files. Some examples include web services, back ends for AJAX and remote applications, and scripts for data storage or data transfer.
PHP offers more than one extension for creating XML documents: SimpleXML, XMLWriter and DOM. Which one should you use? What are the pros and cons of each one? And which is faster and more efficient?
A simple XML document looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<Example>
    <node id="1" name="node 1">
        <subnode id="1.1" name="subnode 1.1"/>
        <subnode id="1.2" name="subnode 1.2">
            <inner_node id="1.2.1" name="inner node 1.2.1"/>
        </subnode>
    </node>
</Example>
XML documents can also be much more complex than that and include other markup elements (like namespaces or comments), but in this tutorial we will focus just on how to replicate this basic example, and on how to handle the two most basic XML elements: nodes and attributes.
While we could theoretically write an XML document using standard functions only (for example, “echoing” all the lines into a variable), this is a valid option only if the document we need to create is really simple and if we know exactly which data it will contain.
In practice, most of the time we will need to create XML documents from data read from somewhere (like a database) and dynamically add or edit nodes and attributes. Doing that with standard string functions alone quickly becomes tedious and error-prone.
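To see why plain string building gets fragile, consider a value that contains characters with a special meaning in XML. The sketch below is only an illustration: the `$name` value is made up, and escaping by hand with `htmlspecialchars()` and the `ENT_XML1` flag is just one possible workaround.

```php
<?php
// Hypothetical value, e.g. read from a database.
$name = 'Fish & "Chips" <fresh>';

// Naive concatenation: the result is NOT well-formed XML.
$broken = '<node name="' . $name . '"/>';

// Escaping by hand works, but has to be remembered for every single value.
$escaped = htmlspecialchars($name, ENT_XML1 | ENT_QUOTES, 'UTF-8');
$fixed = '<node name="' . $escaped . '"/>';

// Check both strings with a parser (errors are collected instead of printed).
libxml_use_internal_errors(true);
$broken_ok = simplexml_load_string($broken) !== false;
$fixed_ok = simplexml_load_string($fixed) !== false;
libxml_clear_errors();

var_dump($broken_ok, $fixed_ok);
```

The XML extensions discussed next take care of this escaping for you.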
Fortunately, PHP gives us some nice tools for handling XML documents. The first we are going to see is the SimpleXML extension, and its main class called SimpleXMLElement.
As the name suggests, this is probably the simplest and most straightforward extension you can use to generate XML documents.
Here is how you can use the SimpleXMLElement class to replicate the previous example:
/* SimpleXML */

// The constructor needs an existing XML string to start from
$xml_header = '<?xml version="1.0" encoding="UTF-8"?><Example></Example>';
$xml = new SimpleXMLElement($xml_header);

$node1 = $xml->addChild('node');
$node1->addAttribute('id', '1');
$node1->addAttribute('name', 'node 1');

$subnode1 = $node1->addChild('subnode');
$subnode1->addAttribute('id', '1.1');
$subnode1->addAttribute('name', 'subnode 1.1');

$subnode2 = $node1->addChild('subnode');
$subnode2->addAttribute('id', '1.2');
$subnode2->addAttribute('name', 'subnode 1.2');

$inner_node1 = $subnode2->addChild('inner_node');
$inner_node1->addAttribute('id', '1.2.1');
$inner_node1->addAttribute('name', 'inner node 1.2.1');

// Print the whole document
echo $xml->asXML();
Thanks to its simple syntax and its readable code, the SimpleXML extension is a good choice if you just need to do some basic XML editing.
However, this class has a few drawbacks. First of all, the class constructor needs an existing XML document to work with, so we need to manually create it ourselves. That is the reason why we need to define the $xml_header variable before creating the SimpleXMLElement object in the code above.
Also, this extension may not be the best choice if more complex editing is required, as it lacks some features. For example, the SimpleXMLElement class doesn’t have a method for removing existing nodes (you need to call unset on them yourself) and it cannot validate documents.
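As a minimal sketch of the unset workaround mentioned above, the following removes the first subnode from a cut-down version of the example document. The variable names are only illustrative.

```php
<?php
// Small document with two <subnode> children (same shape as the example above).
$xml = new SimpleXMLElement(
    '<Example><node id="1"><subnode id="1.1"/><subnode id="1.2"/></node></Example>'
);

// SimpleXMLElement has no remove method; unset() on the child does the job.
unset($xml->node->subnode[0]);

// Inspect what is left under <node>.
$remaining = count($xml->node->subnode);
$first_id = (string) $xml->node->subnode[0]['id'];

echo $xml->asXML();
```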
Be sure to know which operations you will need to perform before deciding to use this extension for your project, and check whether they are supported.
Now let’s see the next one.
The XMLWriter class is actually a wrapper for the libxml library. The first thing that stands out about this class is that it can be used for writing documents only; if you need to read an existing document, you have to use the XMLReader class or use another XML extension altogether.
Here is the code for creating our example document with XMLWriter:
$xml = new XMLWriter();
$xml->openURI('php://output'); // write straight to the output stream
$xml->startDocument('1.0', 'UTF-8');

$xml->startElement('Example');

$xml->startElement('node');
$xml->writeAttribute('id', '1');
$xml->writeAttribute('name', 'node 1');

$xml->startElement('subnode');
$xml->writeAttribute('id', '1.1');
$xml->writeAttribute('name', 'subnode 1.1');
$xml->endElement(); // closes subnode 1.1

$xml->startElement('subnode');
$xml->writeAttribute('id', '1.2');
$xml->writeAttribute('name', 'subnode 1.2');

$xml->startElement('inner_node');
$xml->writeAttribute('id', '1.2.1');
$xml->writeAttribute('name', 'inner node 1.2.1');
$xml->endElement(); // closes inner_node

$xml->endElement(); // closes subnode 1.2
$xml->endElement(); // closes node
$xml->endElement(); // closes Example

$xml->flush();
XMLWriter’s syntax is definitely not the most elegant and is quite verbose. Every node needs to be opened with the startElement method and closed with endElement. When more than a few nodes are nested, the code can easily become hard to read.
While not very pretty, this class lets you create almost any XML element inside your document, so it can be a better choice than SimpleXML for handling more complex XML structures.
XMLWriter’s main limitation is its strictly sequential approach: it doesn’t let you modify an XML element after it has been written to the document. This means that, if you need to add a node, you need to know all its attributes before adding it, as you cannot edit it later.
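One way to live with this constraint is to collect all the data first and write it out in a single pass. The sketch below assumes a made-up array layout (`tag`, `attrs`, `children`) and a hypothetical helper called `write_node()`; neither is part of XMLWriter itself.

```php
<?php
// Hypothetical helper: writes one node (and its children) in document order.
function write_node(XMLWriter $writer, array $node)
{
    $writer->startElement($node['tag']);
    // Attributes must be written before any child content.
    foreach ($node['attrs'] ?? [] as $name => $value) {
        $writer->writeAttribute($name, (string) $value);
    }
    foreach ($node['children'] ?? [] as $child) {
        write_node($writer, $child);
    }
    $writer->endElement();
}

// All the data is collected up front...
$tree = [
    'tag' => 'node',
    'attrs' => ['id' => '1', 'name' => 'node 1'],
    'children' => [
        ['tag' => 'subnode', 'attrs' => ['id' => '1.1', 'name' => 'subnode 1.1']],
        ['tag' => 'subnode', 'attrs' => ['id' => '1.2', 'name' => 'subnode 1.2']],
    ],
];

// ...and then written in a single pass.
$writer = new XMLWriter();
$writer->openMemory(); // buffer in memory instead of php://output
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('Example');
write_node($writer, $tree);
$writer->endElement();
$writer->endDocument();
$output = $writer->outputMemory();

echo $output;
```

The trade-off is that every value has to be in hand before its element is opened.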
This may or may not be an issue for you, depending on your application. Also, nothing stops you from using XMLWriter and SimpleXML (or any other XML extension) on the same document, as long as you use them one after the other rather than at the same time. For example, you could create the document from scratch with XMLWriter (taking advantage of its performance and its features) and edit it later with SimpleXML to change attributes or add new nodes.
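A rough sketch of that two-step approach could look like this. It buffers the XMLWriter output in memory with openMemory (instead of php://output) so the resulting string can be handed to SimpleXML; the attribute values are placeholders.

```php
<?php
// Step 1: create the document with XMLWriter, buffered in memory.
$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('Example');
$writer->startElement('node');
$writer->writeAttribute('id', '1');
$writer->writeAttribute('name', 'node 1');
$writer->endElement(); // closes node
$writer->endElement(); // closes Example
$writer->endDocument();
$generated = $writer->outputMemory();

// Step 2: load the finished string into SimpleXML and edit it.
$xml = new SimpleXMLElement($generated);
$xml->node['name'] = 'renamed node';        // change an existing attribute
$subnode = $xml->node->addChild('subnode'); // add a new node
$subnode->addAttribute('id', '1.1');

echo $xml->asXML();
```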
Now let’s see the last extension.
DOCUMENT OBJECT MODEL (DOM)
DOM is the most powerful PHP extension for creating and editing XML documents. You start by creating a DOMDocument object as the main document, and then create and manipulate multiple nodes as DOMElement objects.
Every document element (that is, the document itself and every single node) is therefore a separate object, and can be edited or moved at any time.
Here is the code to generate our example document with DOM:
$xml = new DOMDocument('1.0', 'UTF-8');

$example_element = $xml->createElement('Example');

$node1_element = $xml->createElement('node');
$node1_element->setAttribute('id', '1');
$node1_element->setAttribute('name', 'node 1');
$example_element->appendChild($node1_element);

$subnode1_element = $xml->createElement('subnode');
$subnode1_element->setAttribute('id', '1.1');
$subnode1_element->setAttribute('name', 'subnode 1.1');
$node1_element->appendChild($subnode1_element);

$subnode2_element = $xml->createElement('subnode');
$subnode2_element->setAttribute('id', '1.2');
$subnode2_element->setAttribute('name', 'subnode 1.2');
$node1_element->appendChild($subnode2_element);

$inner_node1_element = $xml->createElement('inner_node');
$inner_node1_element->setAttribute('id', '1.2.1');
$inner_node1_element->setAttribute('name', 'inner node 1.2.1');
$subnode2_element->appendChild($inner_node1_element);

// Attach the root element to the document and print it with indentation
$xml->appendChild($example_element);
$xml->formatOutput = true;
echo $xml->saveXML();
If you like an object-oriented code structure, then you will definitely like DOM’s syntax. DOM could be the right extension to use if you need to create complex XML documents, especially if you need to dynamically modify and extend them.
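To illustrate what editing or moving an element at any time can look like, here is a small sketch that changes, relocates and removes elements after they have been attached to the document. The second node with id 2 is made up for the example.

```php
<?php
$xml = new DOMDocument('1.0', 'UTF-8');
$root = $xml->appendChild($xml->createElement('Example'));

$node1 = $root->appendChild($xml->createElement('node'));
$node1->setAttribute('id', '1');
$node2 = $root->appendChild($xml->createElement('node'));
$node2->setAttribute('id', '2');

$subnode = $node1->appendChild($xml->createElement('subnode'));
$subnode->setAttribute('id', '1.1');

// Edit an element that is already part of the document.
$subnode->setAttribute('name', 'added later');

// Move it: appending an attached node to a new parent relocates it.
$node2->appendChild($subnode);

// Remove an element entirely.
$root->removeChild($node1);

$xml->formatOutput = true;
echo $xml->saveXML();
```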
The main issue with DOM is its relatively poor performance for large documents.
Now let’s see how all these extensions perform.
For handling small XML documents, all three extensions are perfectly fine to use. However, some applications need to handle very large documents, in which case the extension’s performance does matter.
I ran a test script that creates an XML document multiple times inside a loop and checks the execution time at the end (my test machine was configured with PHP 7.0). I ran the test a few times to be sure that the results were reliable.
(I also tried to check memory usage, but unfortunately I didn’t manage to get any consistent result.)
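The original test script is not reproduced here. As a rough idea of what such a timing loop might look like, the following sketch builds a much smaller document with each extension and measures the elapsed time with microtime. The helper name `time_builder()`, the iteration count and the document shape are all assumptions, so the numbers it prints are not comparable with the results reported in this post.

```php
<?php
// Hypothetical helper: runs $build $iterations times, returns elapsed seconds.
function time_builder(callable $build, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $build();
    }
    return microtime(true) - $start;
}

// One tiny document builder per extension.
$builders = [
    'SimpleXML' => function () {
        $xml = new SimpleXMLElement('<Example></Example>');
        $node = $xml->addChild('node');
        $node->addAttribute('id', '1');
        return $xml->asXML();
    },
    'XMLWriter' => function () {
        $xml = new XMLWriter();
        $xml->openMemory();
        $xml->startDocument('1.0', 'UTF-8');
        $xml->startElement('Example');
        $xml->startElement('node');
        $xml->writeAttribute('id', '1');
        $xml->endElement();
        $xml->endElement();
        $xml->endDocument();
        return $xml->outputMemory();
    },
    'DOM' => function () {
        $xml = new DOMDocument('1.0', 'UTF-8');
        $root = $xml->appendChild($xml->createElement('Example'));
        $node = $root->appendChild($xml->createElement('node'));
        $node->setAttribute('id', '1');
        return $xml->saveXML();
    },
];

$iterations = 10000; // arbitrary; raise it for more stable numbers
$timings = [];
foreach ($builders as $name => $build) {
    $timings[$name] = time_builder($build, $iterations);
    printf("%-10s %.4f s\n", $name, $timings[$name]);
}
```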
Here are the speed results. Numbers indicate the execution time in seconds, so less is better:
SPEED TEST RESULTS
As you can clearly see, DOM is the slowest library while XMLWriter is the fastest. For small documents the speed difference may be negligible, but if your application is going to handle large documents this difference can become significant.
XMLWriter runs the test in about half the time compared to DOM, while DOM and SimpleXML show similar results. For performance-critical applications (like web services or data storage routines) it could be a good idea to try to use XMLWriter, while DOM should only be used for generating small documents at a low rate.
SimpleXML can be used instead of DOM in cases where you don’t need DOM’s advanced features.
The figures below show the speed gain more clearly. They give each test’s execution time relative to DOM’s (which is used as the reference at 100%).
- DOM: 100%
- SimpleXML: 87.6%
- XMLWriter: 54.3%
Here are the pros and cons of each extension.
SimpleXML
Pros:
- Easy and straightforward to use.
- Good code readability.
Cons:
- Limited functionalities.
- Poor performance for large documents.
XMLWriter
Pros:
- Easy to learn.
- Best performance.
Cons:
- Cannot read or edit existing documents.
- Poor code readability.
DOM
Pros:
- Complete set of functionalities.
- Very good code readability.
Cons:
- Difficult to learn and sometimes too complex.
- Very poor performances for large documents.
Take-home message: use XMLWriter for large documents, fast-rate services and critical applications. In other cases, use SimpleXML if its features are enough for your needs; otherwise fall back to DOM.
I really thank you for reading this post, and if you liked it please take a second to share it!