2.22. XML Integration

XML functionality in Qore is built on the libxml2 library, which offers a powerful, stable, clean, and thread-safe basis for XML integration.

XML provides an excellent way to describe hierarchical data, and thanks to libxml2, Qore supports easy serialization and deserialization between XML strings and Qore data structures.

XML serialization (conversion from Qore data structures to XML strings) in Qore relies on the fact that Qore hashes retain insertion order, which means that conversion between Qore data structures and XML strings can be done without data loss and without reordering the XML elements. In general, XML serialization is relatively straightforward, but there are a few issues to be aware of, particularly regarding element attributes and lists. These issues are described in the following paragraphs.

First, a straightforward example:

$h = ( "record" : ( "name" : ( "first" : "Fred", "last" : "Smith" ) ) );
printf("%s\n", makeFormattedXMLString($h));

This produces the following result:

<?xml version="1.0" encoding="UTF-8"?>
<record>
  <name>
    <first>Fred</first>
    <last>Smith</last>
  </name>
</record>

To set XML attributes, the Qore value must be a hash, and the attributes are stored in another hash under the key ^attributes^. That is, the value of the ^attributes^ key must be a hash, and each member of this hash represents an attribute-value pair.

For example:

$h = ( "record" : ( "^attributes^" : ( "type" : "customer" ) , 
	            "name" : ( "first" : "Fred", "last" : "Smith" ) ) );
printf("%s\n", makeFormattedXMLString($h));

This produces the following result:

<?xml version="1.0" encoding="UTF-8"?>
<record type="customer">
  <name>
    <first>Fred</first>
    <last>Smith</last>
  </name>
</record>

To place text rather than child elements under the "record" node, set the ^value^ key of the hash along with the ^attributes^ key, as follows:

$h = ( "record" : ( "^attributes^" : ( "type" : "customer" ) , 
	            "^value^" : "NO-RECORD" ) );
printf("%s\n", makeFormattedXMLString($h));

This gives the following result:

<?xml version="1.0" encoding="UTF-8"?>
<record type="customer">NO-RECORD</record>

Lists are serialized with repeating node names as follows:

$h = ( "record" : ( "part" : ( "part-02-05", "part-99-23", "part-34-28" ) ) );
printf("%s\n", makeFormattedXMLString($h));

This produces the following result:

<?xml version="1.0" encoding="UTF-8"?>
<record>
  <part>part-02-05</part>
  <part>part-99-23</part>
  <part>part-34-28</part>
</record>

It gets a little trickier when a key should be repeated at the same level in an XML string but other keys come in between. For example, take the following XML string:

<?xml version="1.0" encoding="UTF-8"?>
<para>Keywords: <code>this</code>, <code>that</code>, and <code>the_other</code>.</para>

It's not possible to use a list here, because text is required between the elements. As described earlier, the ^value^ hash key can be used to serialize text in an XML string. In this case, we need several text nodes and several code nodes in mixed order to produce the XML string we want. Because Qore hashes have unique keys (the same key cannot be used twice in the same hash), we resort to a key-naming trick that lets us virtually duplicate key names: we append a '^' character to the end of the key name, followed by some unique text. When hash keys are serialized, any text after (and including) the '^' character is ignored. For the special key name ^value^, the final '^' character does not need to be duplicated; instead, we just add unique text after it. This ensures that the hash can contain all the data we want and that it is serialized to the XML string in the right order, as follows:

$h = ( "para" : ( "^value^" : "Keywords: ", 
                  "code" : "this", 
                  "^value^1" : ", ", 
                  "code^1" : "that", 
                  "^value^2" : ", and ", 
                  "code^2" : "the_other", 
                  "^value^3" : "." ) );
printf("%s\n", makeFormattedXMLString($h));

By ignoring the text after the '^' character, the above code will serialize to the XML string we want. In general, by using this convention, we can properly serialize multiple out-of-order keys without losing data and still have unique names for our hash keys.

Note that when deserializing XML strings to Qore data structures, the above rules are applied in reverse. If any out-of-order duplicate keys are detected, Qore will automatically generate unique hash key names based on the above rules.
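
The reverse direction can be sketched by parsing the mixed-content string shown earlier with parseXML() and dumping the result. This is only a minimal sketch: it assumes that parseXML() accepts the XML string as its only argument and that the %N format specifier of printf() prints a nested data structure over multiple lines. The generated key names should follow the ^value^ and code^ pattern described above; the exact suffixes are not reproduced here.

```qore
# parse a string with interleaved text and <code> elements
$xml = "<para>Keywords: <code>this</code>, <code>that</code>, and <code>the_other</code>.</para>";
$h = parseXML($xml);

# dump the resulting hash (assumption: %N prints nested data on multiple lines)
printf("%N\n", $h);

# serializing the parsed hash again should preserve the original element order
printf("%s\n", makeXMLString($h));
```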

Also note that CDATA text will be generated if a hash key starts with '^cdata'; such text will not be processed for escape code substitution. When deserializing XML strings to Qore data structures, CDATA text will be placed unmodified under such a hash key as well.
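
As an illustration of the CDATA rule, the following sketch serializes text containing characters that would otherwise be escaped. The key name "^cdata^" and the element name "script" are assumptions made for this example; the text above only requires that the key start with '^cdata'.

```qore
# assumption: "^cdata^" is used as the key name; any key starting with "^cdata" should qualify
# "script" is a hypothetical element name chosen for illustration
$h = ( "script" : ( "^cdata^" : "if (a < b && b > c) run();" ) );

# the '<', '>', and '&' characters should appear unescaped inside a CDATA section
printf("%s\n", makeFormattedXMLString($h));
```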

Table 2.93. Functions For XML Serialization and Deserialization

Function Name               Description
makeFormattedXMLFragment()  Serializes a hash into an XML string with formatting and without an XML header.
makeFormattedXMLString()    Serializes a hash into an XML string with formatting and an XML header.
makeXMLFragment()           Serializes a hash into an XML string without an XML header or formatting.
makeXMLString()             Serializes a hash into a complete XML string with an XML header and without formatting.
parseXMLAsData()            Parses an XML string as data (duplicate, out-of-order XML elements are collapsed into lists) and returns a Qore hash structure.
parseXMLAsDataWithSchema()  Parses an XML string as data (duplicate, out-of-order XML elements are collapsed into lists), validates it against an XSD schema string, and returns a Qore hash structure.
parseXML()                  Parses an XML string (XML element order is preserved by appending numeric suffixes to Qore hash key names when necessary) and returns a Qore hash structure.
parseXMLWithSchema()        Parses an XML string (XML element order is preserved by appending numeric suffixes to Qore hash key names when necessary), validates it against an XSD schema string, and returns a Qore hash structure.
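
The difference between parseXML() and parseXMLAsData() is easiest to see on input where a repeated element is interrupted by another element. The sketch below parses the same string with both functions and dumps the results; the element names are hypothetical, and the calls assume that each function accepts the XML string as its only argument. Going by the descriptions in the table, parseXML() should keep both "part" elements under separate keys so that the order survives, while parseXMLAsData() should collapse them into a single list.

```qore
# "part" is repeated, with a "note" element in between (hypothetical sample data)
$xml = "<record><part>part-02-05</part><note>fragile</note><part>part-99-23</part></record>";

# order-preserving parse: duplicate keys are expected to receive unique suffixes
printf("%N\n", parseXML($xml));

# data-oriented parse: duplicate elements are expected to be collapsed into a list
printf("%N\n", parseXMLAsData($xml));
```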


XML-RPC is a lightweight but powerful XML-over-HTTP web service protocol, and Qore includes built-in support for it. You can find more information about XML-RPC, including specifications and examples, at http://xmlrpc.org.

Table 2.94. Functions Providing XML-RPC Functionality

Function Name                             Description
makeFormattedXMLRPCCallString()           Serializes a hash into an XML-RPC call string with formatting.
makeFormattedXMLRPCCallStringArgs()       Serializes a hash into an XML-RPC call string with formatting, taking a single list argument for the argument list.
makeFormattedXMLRPCFaultResponseString()  Serializes a hash into an XML-RPC fault response string with formatting.
makeFormattedXMLRPCResponseString()       Serializes a hash into an XML-RPC response string with formatting.
makeFormattedXMLRPCValueString()          Serializes a hash into an XML string in XML-RPC Value format with formatting.
makeXMLRPCCallString()                    Serializes a hash into an XML-RPC call string without formatting.
makeXMLRPCCallStringArgs()                Serializes a hash into an XML-RPC call string without formatting, taking a single list argument for the argument list.
makeXMLRPCFaultResponseString()           Serializes a hash into an XML-RPC fault response string without formatting.
makeXMLRPCResponseString()                Serializes a hash into an XML-RPC response string without formatting.
makeXMLRPCValueString()                   Serializes a hash into an XML string in XML-RPC Value format without formatting.
parseXMLRPCCall()                         Deserializes an XML-RPC call string, returning a Qore hash representing the call information.
parseXMLRPCResponse()                     Deserializes an XML-RPC response string, returning a Qore hash representing the response information.
parseXMLRPCValue()                        Deserializes an XML-RPC value tree, returning a Qore hash representing the information.
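
A round trip through the XML-RPC functions might look like the following sketch. The argument layout is an assumption not stated in the table: the call is assumed to take the method name first, followed by the call parameters. The method name "examples.getStateName" and the parameter value are hypothetical, and %N is assumed to print a nested data structure.

```qore
# assumption: first argument is the method name, remaining arguments are the call parameters
$call = makeFormattedXMLRPCCallString("examples.getStateName", 41);
printf("%s\n", $call);

# parse the call string back into a Qore hash and dump it
printf("%N\n", parseXMLRPCCall($call));
```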