FoX documentation.


Other documentation

In January 2007, a workshop introducing FoX was held. This was iFax: Integrating Fortran and XML, organized by the National Institute for Environmental eScience.

The materials for this workshop included several talks, a set of documentation, and a number of example programs and problem sets to work through. All of these materials are available for download.

API documentation

COMMON interfaces

OUTPUT interfaces

INPUT interface

These documents describe all publicly usable APIs.

If a subroutine or function, or indeed one of its arguments, is not mentioned above, it is not to be considered part of the stable API, even if it is accessible.

The astute developer is reminded that at all times the final reference documentation is the source, which is publicly available.


FoX versioning

This documentation describes version 2.1.1 of the FoX library.

FoX was originally based on version 1.2 of the xmlf90 library, but has since evolved heavily.

This release version includes output modules for general XML and for CML, and also a Fortran version of the SAX2 input parser interface.

This is a stable branch, which will be maintained with important bugfixes, but on which no further major development will occur.

Version 2.1.1 has support for outputting complete XML documents, with support for all XML objects described in XML11, and XML Namespaces. A detailed description of its precise conformance level is in the WXML documentation.

In addition, there is a large suite of routines available for outputting valid CML documents.

There is also a SAX input module, compatible with the SAX 2 standard; precise conformance details are listed in the SAX documentation.

Input modules are under development for DOM and XPath, and will be released with a later version.


Configuration and compilation

You will have received the FoX source code as a tar.gz file.

Unpack it as normal, and change directory into the top-level directory, FoX.

Requirements for use

FoX requires a Fortran 95 compiler - not just Fortran 90. All currently available versions of Fortran compilers claim to support F95. If your favoured compiler is not listed as working below, I recommend the use of g95, which is free to download and use. And if your favourite compiler is listed as not working, then please send a bug report to your compiler vendor.

As of version 2.1.1, FoX has been tested with the following compiler versions.

Successfully:

Known failures:
* g95 prior to 25/5/7: fails to compile.
* PGI version 6.2-3: compilation failure in WCML. All common/ tests pass. Several failures due to memory allocation in wxml/.

Results from other compilers are welcome.

As of version 2.0.2, the following other compilers had been tested and are known to work:

and the following compilers tested and known to fail

Configuration

To configure FoX, run config/configure from the top-level directory. This should suffice for most installations. However:

  1. If you have more than one Fortran compiler available, or it is not on your PATH, you can force the choice by doing:

    config/configure FC=/path/to/compiler/of/choice

  2. It is possible that the configuration fails. In this case:

    • please tell me about it so I can fix it
    • all relevant compiler details are placed in the file arch.make; you may be able to edit that file to allow compilation. Again, if so, please let me know what you needed to do.
  3. By default the resultant files are installed under the objs directory. If you wish them to be installed elsewhere, you may do

    config/configure --prefix=/path/to/installation

Note that the configure process encodes the current directory location in several places. If you move the FoX directory later on, you will need to re-run configure.

Compilation

In order to compile the full library, now simply do:

make

This will build all the FoX modules, and all the examples. However, you may only be interested in building the libraries, or perhaps a subset of the libraries. In that case, the following targets are available:

wxml_lib
wcml_lib
sax_lib

Testing

Three test-suites are supplied; in common/test, wxml/test, and wcml/test. In each case, cd to the relevant directory and then run ./run_tests.sh.

(The sax testsuite is available separately. Please contact the author for details.)

The tests will run and then print out the number of passes and fails. Details of failing tests may be found in the file failed.out.

Known failures:
* test_xml_Close_2 sometimes unexpectedly fails - this is not a problem, ignore it.

If any other failures occur, please send a message to the mailing list (FoX@lists.uszla.me.uk) with details of compiler, hardware platform, and the nature of the failure.

Linking to an existing program

A script is provided which will provide the appropriate compiler and linker flags for you; this will be created after configuration, in the top-level directory, and is called FoX-config. It may be taken from there and placed anywhere.

FoX-config takes the following arguments:

If it is called with no arguments, it will expand to both compile and link flags, like so:

   f95 -o program program.f90 `FoX-config`

For compiling only against FoX, do the following:

f95 -c `FoX-config --fcflags` sourcefile.f90

For linking only to the FoX library, do:

f95 -o program `FoX-config --libs` *.o

or similar, according to your compilation scheme.

Note that by default, FoX-config assumes you are using all modules of the library. If you are only using part, then this can be specified by also passing the name of each module required, like so:

FoX-config --fcflags --wcml

Using FoX in your own project.

The recommended way to use FoX is to embed the full source code into an existing project.

(It would be possible to extract portions of the code, and embed just the ones that you need, but I recommend against it; it would be easy to lose parts of the code which are essential for generating good XML.)

In order to do this, you need to do something like the following:

  1. Put the full source code as a top-level subdirectory of the tree, called FoX. (you can of course delete the DoX/ and examples/ subdirectories if you wish to save space)
  2. Incorporate calls to FoX into the program.
  3. Incorporate building FoX into your build process.

To incorporate into the program

There is an example of suggested use in the examples/ subdirectory.

The easiest, and least intrusive, way is probably to create an F90 module for your program, looking something like example_xml_module.f90.

Then, somewhere (probably in your main program), you must use this module, calling initialize_xml_output() at the start and end_xml_output() at the end of the program.

In any of the subroutines where you want to output data to the XML file, you should then insert use example_xml_module at the beginning of the subroutine. You can then use any of the cml output routines with no further worries, as shown in the examples.
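As a rough illustration of such a wrapper module, the sketch below opens and closes a CML file through wcml. It is not a copy of the file shipped in examples/. The output file name, the shared variable xf and the unit=-1 argument (asking the library to pick a unit number itself) are assumptions made for this sketch.

```fortran
! Sketch of a wrapper module in the spirit of example_xml_module.f90.
! 'output.xml', the variable name xf and unit=-1 are illustrative
! assumptions, not taken from the FoX distribution.
module example_xml_module
  use FoX_wcml          ! re-exported, so users of this module see the cml routines
  implicit none

  type(xmlf_t), save :: xf   ! shared handle for the output file

contains

  subroutine initialize_xml_output()
    call cmlBeginFile(xf, 'output.xml', unit=-1)
    call cmlStartCml(xf)
  end subroutine initialize_xml_output

  subroutine end_xml_output()
    call cmlEndCml(xf)
    call cmlFinishFile(xf)
  end subroutine end_xml_output

end module example_xml_module
```

Because the module leaves everything public, a subroutine that contains use example_xml_module gains access both to xf and to the wcml output routines.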

It is easy to make the use of FoX optional, by the use of preprocessor defines. This can be done simply by wrapping each call to your XML wrapper routines in #ifdef XML, or similar.
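A minimal sketch of this pattern follows. It assumes the source file is passed through a C-style preprocessor (many compilers do this for files with a .F90 suffix) and that the macro XML is defined on the command line, for example with -DXML, when FoX output is wanted. The program name and the print statement are placeholders.

```fortran
! Without -DXML this compiles as a plain program with no FoX dependency.
program my_simulation
#ifdef XML
  use example_xml_module
#endif
  implicit none

#ifdef XML
  call initialize_xml_output()
#endif

  print *, 'running the calculation'   ! placeholder for the real work

#ifdef XML
  call end_xml_output()
#endif
end program my_simulation
```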

To incorporate into the build process:

If you have some sort of automatic Makefile configuration (for picking up which compiler to use, etc.), then within whatever script you use to do this, you should insert a sequence of commands like:

(cd FoX; config/configure; cd ..)

This will instruct FoX to perform its own automatic configuration process.

Within the Makefile itself, you need to alter your compiler flags in the following fashion. Assuming that you have some sort of FFLAGS Makefile variable, then it should be amended like so:

FFLAGS += `FoX/FoX-config --fcflags`

You must also alter the linking step to include the FoX subroutines. Again, assuming that you have some sort of variable LDFLAGS holding your linking flags, then it should be amended like so:

LDFLAGS += `FoX/FoX-config --libs`

If you don't have any automatic Makefile configuration, and rely on the user making hand-edited changes to Makefiles, then you must add to your documentation how to configure & build FoX.


FoX_common

FoX_common is a module exporting interfaces to a set of convenience functions common to all of the FoX modules, which are of more general use.

Currently, the only publicly available function is str, which converts primitive datatypes into strings in a consistent fashion, conformant with the expectations of XML processors.

It is fully described in StringFormatting.


String handling in FoX

Many of the routines in wxml, and indeed in wcml which is built on top of wxml, are overloaded so that data may be passed to the same routine as string, integer, logical or real data.

In such cases, a few notes on the conversion of non-textual data to text are in order. The standard Fortran I/O formatting routines do not offer the control required for useful XML output, so FoX performs all its own formatting.

This formatting is done internally through a function which is also available publicly to the user, str.

To use this in your program, import it via:

 use FoX_common, only: str

and use it like so:

 print*, str(data)

In addition, for ease of use, the // concatenation operator is overloaded, such that strings can easily be formed by concatenation of strings to other datatypes. To use this you must import it via:

 use FoX_common, only: operator(//)

and use it like so:

 integer :: data
 print*, "This is a number "//data

This will work for all native Fortran data types. Note, however, that the floating-point format control described below is available only through str(), not through concatenation.
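Putting the two imports together, a small program might look like the sketch below. The variable names and messages are invented for the illustration, and it must be compiled and linked against FoX (for instance with the flags reported by FoX-config).

```fortran
program str_demo
  use FoX_common, only: str, operator(//)
  implicit none
  integer :: n
  logical :: converged

  n = 42
  converged = .true.

  print *, str(n)                    ! integer as decimal text
  print *, str(converged)            ! logical as 'true' or 'false'
  print *, 'Steps taken: '//n        ! overloaded concatenation
  print *, 'Converged: '//converged
end program str_demo
```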

You may pass data of the following primitive types to str:

Scalar data

Character (default kind)

Character data is returned unchanged.

Logical (default kind)

Logical data is output such that True values are converted to the string 'true', and False to the string 'false'.

Integer (default kind)

Integer data is converted to the standard decimal representation.

Real numbers (single and double precision)

Real numbers, both single and double precision, are converted to strings in one of two ways, with some control offered to the user. The output will conform to the real number formats specified by XML Schema Datatypes.

The two notations are:

  1. Exponential notation, with variable number of significant figures. Format strings of the form "sn" are accepted, where n is the number of significant figures.

    Thus the number 111, when output with various formats, will produce the following output:

s1 1e2
s2 1.1e2
s3 1.11e2
s4 1.110e2

The number of significant figures should lie between 1 and the number of digits precision provided by the real kind. If a larger or smaller number is specified, output will be truncated accordingly. If unspecified, then a sensible default will be chosen.

This format is not permitted by XML Schema Datatypes 1.0, though it is in 2.0

  2. Non-exponential notation, with variable number of digits after the decimal point. Format strings of the form "rn" are accepted, where n is the number of digits after the decimal point.

    Thus the number 3.14159, when output with various formats, will produce the following output:

r0 3
r1 3.1
r2 3.14
r3 3.142

The number of decimal places must lie between 0 and whatever would output the maximum digits precision for that real kind. If a larger or smaller number is specified, output will be truncated accordingly. If unspecified, then a sensible default will be chosen.

This format is the only one permitted by XML Schema Datatypes 1.0.

If no format is specified, then a default of exponential notation will be used.

If a format is specified not conforming to either of the two forms above, a run-time error will be generated.
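The following sketch shows how the two kinds of format string might be passed to str. It assumes that the format is accepted as the second argument of str; the variable name is invented. The expected strings are those given in the tables above, so no output is repeated here.

```fortran
program real_format_demo
  use FoX_common, only: str
  implicit none
  real :: x

  x = 3.14159

  print *, str(x)          ! no format given: default exponential notation
  print *, str(x, 's3')    ! exponential notation, three significant figures
  print *, str(x, 'r2')    ! non-exponential, two digits after the decimal point
end program real_format_demo
```

A string such as 'x2', which matches neither form, would be expected to trigger the run-time error described above.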

NB Since by using FoX or str, you are passing real numbers through various functions, this means that they must be valid real numbers. A corollary of this is that if you pass in +/-Infinity, or NaN, then the behaviour of FoX is unpredictable, and may well result in a crash. This is a consequence of the Fortran standard, which strictly disallows doing anything at all with such numbers, including even just passing them to a subroutine.
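One possible safeguard is to test values before they reach FoX. The sketch below is not part of FoX and goes beyond Fortran 95: it assumes a compiler that provides the Fortran 2003 intrinsic module ieee_arithmetic. The module and function names are invented for the illustration.

```fortran
! Not part of FoX. Requires the Fortran 2003 ieee_arithmetic module.
module finite_guard
  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite
  implicit none
contains

  ! Returns .true. only for ordinary finite values, that is,
  ! neither +/-Infinity nor NaN.
  logical function is_finite_dp(x)
    double precision, intent(in) :: x
    is_finite_dp = ieee_is_finite(x)
  end function is_finite_dp

end module finite_guard
```

A caller could then skip or replace any quantity for which is_finite_dp returns .false. instead of passing it to str or to an output routine.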

Complex numbers (single and double precision)

Complex numbers will be output as pairs of real numbers, in the following way:

(1.0e0)+i(1.0e0)

where the two halves can be formatted in the way described for 'Real numbers' above; only one format may be specified, and it will apply to both.

All the caveats described above apply to complex numbers as well; that is, output of complex numbers either of whose components are infinite or NaN is illegal in Fortran, and more than likely will cause a crash in FoX.

Arrays and matrices

All of the above types of data may be passed in as arrays and matrices as well. In this case, a string containing all the individual elements will be returned, ordered as they would be in memory, each element separated by a single space.

If the data is character data, then there is an additional option to str, delimiter, which may be any single-character string, and will replace a space as the delimiter.
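As a short sketch of array conversion, the program below passes an integer array and a character array to str, using the delimiter option for the latter. The array names and contents are invented for the illustration.

```fortran
program array_str_demo
  use FoX_common, only: str
  implicit none
  integer :: counts(3)
  character(len=2) :: labels(3)

  counts = (/ 1, 2, 3 /)
  labels = (/ 'Si', 'Mg', 'Fe' /)

  print *, str(counts)                  ! elements separated by single spaces
  print *, str(labels, delimiter=',')   ! a comma replaces the space
end program array_str_demo
```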

wxml/wcml wrappers.

All functions in wxml which can accept arbitrary data (roughly, wherever you put anything that is not an XML name; attribute values, pseudo-attribute values, character data) will take scalars, arrays, and matrices of any of the above data types, with fmt= and delimiter= optional arguments where appropriate.

Likewise, wcml functions which can accept varied data will behave in the same way.


WXML

wxml is a general Fortran XML output library. It offers a Fortran interface, in the form of a number of subroutines, to generate well-formed XML documents. Almost all of the XML features described in XML11 and Namespaces are available, and wxml will diagnose almost all attempts to produce an invalid document. Exceptions below describes where wxml falls short of these aims.

First, Conventions describes the conventions used in this document.

Then, Functions lists all of wxml's publicly exported functions, in three sections:

  1. Firstly, the very few functions necessary to create the simplest XML document, containing only elements, attributes, and text.
  2. Secondly, those functions concerned with XML Namespaces, and how Namespaces affect the behaviour of the first tranche of functions.
  3. Thirdly, a set of more rarely used functions required to access some of the more esoteric corners of the XML specification.

Please note that where the documentation below is not clear, it may be useful to look at some of the example files. There is a very simple example in the examples/ subdirectory, which nevertheless shows the use of most of the features you will use.

A more elaborate example, using almost all of the XML features found here, is available in the top-level directory as wxml_example.f90. It will be automatically compiled as part of the build process.

Conventions and notes:

Conventions used below.

Note that where strings are passed in, they will be passed through entirely unchanged to the output file - no truncation of whitespace will occur.

It is strongly recommended that the functions be used with keyword arguments rather than relying on implicit ordering.

Derived type: xmlf_t

This is an opaque type representing the XML file handle. Each function requires this as an argument, so it knows which file to operate on (and it is an output of the xml_OpenFile subroutine). Since all subroutines require it, it is not mentioned below.

Function listing

Frequently used functions

Open a file for writing XML

By default, the XML will have no extraneous text nodes. This has the effect of it looking slightly ugly, since there will be no newlines inserted between tags.

This behaviour can be changed to produce slightly nicer looking XML, by switching on broken_indenting. This will insert newlines and spaces between some tags where they are unlikely to carry semantics. Note, though, that this does result in the XML produced being not quite what was asked for, since extra characters and text nodes have been inserted.

NB: The replace option should be noted. By default, xml_OpenFile will fail with a runtime error if you try to write to an existing file. If you are sure you want to continue in such a case, then you can specify replace=.true. and any existing files will be overwritten. If finer control is required over how to proceed in such cases, use the Fortran inquire statement in your code. There is no 'append' functionality by design: any XML file created by appending to an existing file would almost certainly be invalid.
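For the finer control mentioned above, a helper built on the standard inquire statement might look like the following sketch. It is plain Fortran and not part of FoX; the module and function names are invented for the illustration.

```fortran
! Not part of FoX: a small helper around the inquire statement.
module output_guard
  implicit none
contains

  ! Returns .true. if a file of the given name already exists.
  logical function file_exists(filename)
    character(len=*), intent(in) :: filename
    inquire(file=filename, exist=file_exists)
  end function file_exists

end module output_guard
```

A program could call file_exists before xml_OpenFile and then choose a different file name, or stop with its own message, rather than passing replace=.true. unconditionally.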

Close an opened XML file, closing all still-opened tags so that it is well-formed.

In the normal run of events, trying to close an XML file with no root element will cause an error, since this is not well-formed. However, an optional argument, empty, is provided in case it is desirable to close files which may be empty. In this case, a warning will still be emitted, but no fatal error generated.

Open a new element tag

Close an open tag

Add an attribute to the currently open tag.

By default, if the attribute value contains markup characters, they will be escaped automatically by wxml before output.

However, in rare cases you may not wish this to happen, for example if you wish to output Unicode characters or entity references. In this case, you should set escape=.false. for the relevant subroutine call. Note that if you do this, no checking of the validity of the output string is performed; the onus is on you to ensure well-formedness.

The value to be added may be of any type; it will be converted to text according to FoX's formatting rules, and if it is a 1- or 2-dimensional array, the elements will all be output, separated by spaces (except if it is a character array, in which case the delimiter may be changed to any other single character using an optional argument).

NB The type option is only provided so that in the case of an external DTD which FoX is unaware of, the attribute type can be specified (which gives FoX more information to ensure well-formedness and validity). Specifying the type incorrectly may result in spurious error messages.

Add text data. The data to be added may be of any type; they will be converted to text according to FoX's formatting rules, and if they are a 1- or 2-dimensional array, the elements will all be output, separated by spaces (except if it is a character array, in which case the delimiter may be changed to any other single character using an optional argument).

Within the context of character output, add a (system-dependent) newline character. This function can only be called wherever xml_AddCharacters can be called. (Newlines outside of character context are under FoX's control, and cannot be manipulated by the user.)
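The frequently used functions might be combined as in the sketch below, which writes one element carrying an attribute and some text. The routine names xml_AddAttribute and xml_Close, and the positional argument order shown, are assumptions here, since the argument listings are not reproduced in this section; the file, element and attribute names are invented.

```fortran
program wxml_minimal
  use FoX_wxml
  implicit none
  type(xmlf_t) :: xf

  ! replace=.true. overwrites any existing file of this name
  call xml_OpenFile('minimal.xml', xf, replace=.true.)

  call xml_NewElement(xf, 'measurement')
  call xml_AddAttribute(xf, 'units', 'K')
  call xml_AddCharacters(xf, 298.15, fmt='r2')   ! real data, formatted by FoX
  call xml_EndElement(xf, 'measurement')

  call xml_Close(xf)   ! also closes any tags still open
end program wxml_minimal
```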

Namespace-aware functions:

Add an XML Namespace declaration. This function may be called at any time, and its precise effect depends on when it is called; see below

Undeclare an XML namespace. This is equivalent to declaring a namespace with an empty URI, and renders the namespace ineffective for the scope of the declaration. For explanation of its scope, see below.

NB Use of xml_UndeclareNamespace implies that the resultant document will be compliant with XML Namespaces 1.1, but not 1.0; wxml will issue an error when trying to undeclare namespaces under XML 1.0.

Scope of namespace functions

If xml_[Un]declareNamespace is called immediately prior to an xml_NewElement call, then the namespace will be declared in that next element, and will therefore take effect in all child elements.

If it is called prior to an xml_NewElement call, but that element has namespaced attributes

To explain by means of example: In order to generate the following XML output:

 <cml:cml xmlns:cml="http://www.xml-cml.org/schema"/>

then the following two calls are necessary, in the prescribed order:

  call xml_DeclareNamespace(xf, 'cml', 'http://www.xml-cml.org/schema')
  call xml_NewElement(xf, 'cml:cml')

However, to generate XML output where the namespace refers to an attribute at the same level as the element, the correct XML will be generated as long as the xml_DeclareNamespace call is made before the element tag is closed (either by xml_EndElement, or by a new element tag being opened, or some text being added, etc.).

Two previously mentioned functions are affected when used in a namespace-aware fashion.

The element or attribute name is checked, and if it is a QName (i.e. if it is of the form prefix:tagName) then wxml will check that prefix is a registered namespace prefix, and generate an error if not.
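The scoping rules above might be exercised as in the following sketch, which declares one namespace before opening an element and a second one while that element's tag is still open. The keyword names nsURI and prefix are assumptions and should be checked against the function listing; the second namespace URI, the attribute name and the file name are invented for the illustration.

```fortran
program wxml_namespace_demo
  use FoX_wxml
  implicit none
  type(xmlf_t) :: xf

  call xml_OpenFile('ns.xml', xf, replace=.true.)

  ! Declared immediately before the element, so it applies to that element.
  call xml_DeclareNamespace(xf, nsURI='http://www.xml-cml.org/schema', prefix='cml')
  call xml_NewElement(xf, 'cml:cml')

  ! Declared while the tag is still open, so the attribute prefix is registered.
  call xml_DeclareNamespace(xf, nsURI='http://example.org/ns', prefix='ex')
  call xml_AddAttribute(xf, 'ex:note', 'demo')

  call xml_EndElement(xf, 'cml:cml')
  call xml_Close(xf)
end program wxml_namespace_demo
```

Using an undeclared prefix in either the element or the attribute name would be expected to produce the error described above.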

More rarely used functions:

If you don't know the purpose of any of these, then you don't need to.

Add XML declaration to the first line of output. If used, then the file must have been opened with addDecl = .false., and this must be the first wxml call to the document.

NB The only XML versions available are 1.0 and 1.1. Attempting to specify anything else will result in an error. Specifying version 1.0 results in additional output checks to ensure the resultant document is XML-1.0-conformant.

NB Note that if the encoding is specified, and is specified to not be UTF-8, then if the specified encoding does not match that supported by the Fortran processor, you may end up with output you do not expect.

Add an XML document type declaration. If used, this must be used prior to the first xml_NewElement call, and only one such call may be made.
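The ordering constraints on the declaration and the DOCTYPE might be sketched as follows. The routine name xml_AddXMLDeclaration and its version keyword are assumptions, as is the positional form of the xml_AddDOCTYPE call; the file and element names are invented.

```fortran
program wxml_prolog_demo
  use FoX_wxml
  implicit none
  type(xmlf_t) :: xf

  ! addDecl=.false. so that the declaration can be written explicitly
  call xml_OpenFile('prolog.xml', xf, replace=.true., addDecl=.false.)

  call xml_AddXMLDeclaration(xf, version='1.0')   ! first wxml call on the document
  call xml_AddDOCTYPE(xf, 'note')                 ! before the first element
  call xml_NewElement(xf, 'note')

  call xml_Close(xf)   ! closes the still-open root element
end program wxml_prolog_demo
```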

Define an internal entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Define an external entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Define a parameter entity for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Define a notation for the document. If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add an ELEMENT declaration to the DTD. The syntax of the declaration is not checked in any way, nor does this affect how elements may be added in the content of the XML document.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add an ATTLIST declaration to the DTD. The syntax of the declaration is not checked in any way, nor does this affect how attributes may be added in the content of the XML document.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add a reference to a Parameter Entity in the DTD. No check is made according to whether the PE exists, has been declared, or may legally be used.

If used, this call must be made after xml_AddDOCTYPE and before the first xml_NewElement call.

Add XML stylesheet processing instruction, as described in [Stylesheets]. If used, this call must be made before the first xml_NewElement call.

Add an XML Processing Instruction.

If data is present, nothing further can be added to the PI. If it is not present, then pseudoattributes may be added using the call below. Normally, the name is checked to ensure that it is XML-compliant. This requires that PI targets not start with [Xx][Mm][Ll], because such names are reserved. However, some such names are defined by later W3C specifications. If you wish to use such PI targets, then set xml=.true. when outputting them.

The output PI will look like: <?name data?>

Add a pseudoattribute to the currently open PI.

Add an XML comment.

This may be used anywhere that xml_AddCharacters may be, and will insert an entity reference into the contents of the XML document at that point. Note that if the entity inserted is a character entity, its validity will be checked according to the rules of XML-1.1, not 1.0.

If the entity reference is not a character entity, then no check is made of its validity, and a warning will be issued.

Functions to query XML file objects

These functions may be of use in building wrapper libraries:

Return the filename of an open XML file

Return the currently open tag of the current XML file (or the empty string if none is open)

Exceptions

Below are explained areas where wxml fails to implement the whole of XML 1.0/1.1; numerical references below are to the sections in [XML11]. These are divided into two lists: where wxml does not permit the generation of a particular well-formed XML document, and where it does permit the generation of a particular non-well-formed document.

Ways in which wxml renders it impossible to produce a certain sort of well-formed XML document:

  1. XML documents which are not namespace-valid may not be produced; that is, attempts to produce documents which are well-formed according to [XML11] but not namespace-well-formed according to [Namespaces] will fail.
  2. Unicode support ([XML11] section 2.2, http://www.w3.org/TR/xml11/#charsets) is limited. Due to the limitations of Fortran, wxml is unable to manipulate characters outwith 7-bit US-ASCII. wxml will ensure that characters corresponding to those in 7-bit ASCII are output correctly within the constraints of the version of XML in use, for a UTF-8 encoding. Attempts to directly output any other characters will have undefined effects. Output of other Unicode characters is possible through the use of character entities.
  3. Due to the constraints of the Fortran IO specification, it is impossible to output arbitrarily long strings without carriage returns. The size of the limit varies between processors, but may be as low as 1024 characters. To avoid overrunning this limit, wxml will by default insert carriage returns before every new element, and if an unbroken string of attribute or text data longer than 1024 characters is requested, then carriage returns will be inserted as appropriate (within whitespace if possible) to ensure it is broken up into smaller sections to fit within the limits. Thus unwanted text sections are created, and user output is modified.

wxml will try very hard to ensure that output is well-formed. However, it is possible to fool wxml into producing ill-formed XML documents. Avoid doing so if possible; for completeness these ways are listed here. In all cases where ill-formedness is a possibility, a warning can be issued. These warnings can be verbose, so are off by default, but if they are desired, they can be switched on by manipulating the warning argument to xml_OpenFile.

  1. If you specify a non-default text encoding, and then run FoX on a platform which does not use this encoding, then the result will be nonsense, and more than likely ill-formed. FoX will issue a warning in this case.
  2. When adding any text, if any characters are passed in (regardless of character set) which do not have equivalents within 7-bit ASCII, then the results are processor-dependent, and may result in an invalid document on output. A warning will be issued if this occurs. If you need a guarantee that such characters will be passed correctly, use character entities.
  3. Although entities may be output, their contents are not comprehensively checked. It is therefore possible to output combinations of entities which produce nonsense when referenced and expanded. FoX will issue a warning when this is possible.
  4. Whenever an external subset of the DTD is referenced, and the document is not standalone, then FoX is unable to check that any unknown references exist or are used correctly, and can therefore no longer guarantee well-formedness. In this case, a warning will be issued.
  5. When adding ELEMENT and ATTLIST declarations in the DTD, no checking at all is done on the contents of the declarations passed in, neither at the level of mere syntax, nor at the level of consistency; so that if the declaration is invalid syntactically, the resultant XML document will be ill-formed. A warning will be issued if either function is used.
  6. Within the DTD whenever any External ID is written with a SYSTEM ID, this should be a string which can be resolved to a correctly-formatted URI string. No such check is made by wxml though.

Finally, note that constraints on XML documents are divided into two sets: well-formedness constraints (WFC) and validity constraints (VC). The above applies only to WFC checks. wxml makes some minimal VC checks, but these are by no means complete, nor are they intended to be. If it is necessary to produce invalid but well-formed documents, VC checks may be switched off by manipulating the valid argument to xml_OpenFile.

References

[XML10]: W3C Recommendation, http://www.w3.org/TR/REC-xml/

[XML11]: W3C Recommendation, http://www.w3.org/TR/xml11

[Namespaces]: W3C Recommendation, http://www.w3.org/TR/xml-names11

[Stylesheets]: W3C Recommendation, http://www.w3.org/TR/xml-stylesheet


WCML

WCML is a library for outputting CML data. It wraps all the necessary XML calls, such that you should never need to touch any WXML calls when outputting CML.

The CML output is conformant to version 2.4 of the CML schema.

The available functions and their intended use are listed below. Quite deliberately, no reference is made to the actual CML output by each function.

WCML is not intended to be a generalized Fortran CML output layer. Rather, it is intended to be a library which allows the output of a limited set of well-defined syntactical fragments.

Further information on these fragments, and on the style of CML generated here, is available at http://www.uszla.me.uk/specs/subset.html.

This section of the manual will detail the available CML output subroutines.

Use of WCML

wcml subroutines can be accessed from within a module or subroutine by inserting

 use FoX_wcml

at the start. This will import all of the subroutines described below, plus the derived type xmlf_t needed to manipulate a CML file.

No other entities will be imported; public/private Fortran namespaces are very carefully controlled within the library.

Dictionaries.

The use of dictionaries with WCML is strongly encouraged. (For those not conversant with dictionaries, a fairly detailed explanation is available at http://www.xml-cml.org/information/dictionaries)

In brief, dictionaries are used in two ways.

Identification

Firstly, to identify and disambiguate output data. Every output function below takes an optional argument, dictRef="". It is intended that every piece of data output is tagged with a dictionary reference, which will look something like nameOfCode:nameOfThing.

So, for example, in SIESTA, all the energies are output with different dictRefs, looking like: siesta:KohnShamEnergy, or siesta:kineticEnergy, etc. By doing this, we can ensure that later on all these numbers can be usefully identified.

We hope that ultimately, dictionaries can be written for codes, which will explain what some of these names might mean. However, it is not in any way necessary that this be done - and using dictRef attributes will help merely by giving the ability to disambiguate otherwise indistinguishable quantities.

We strongly recommend this course of action. If you choose to follow our recommendation, then you should add a suitable Namespace to your code. That is, immediately after cmlBeginFile and before cmlStartCml, you should add something like:

call cmlAddNamespace(xf, 'nameOfCode', 'WebPageOfCode')

Again, for SIESTA, we add:

call cmlAddNamespace(xf, 'siesta', 'http://www.uam.es/siesta')

If you don't have a webpage for your code, don't worry; the address is only used as an identifier, so anything that looks like a URL, and which nobody else is using, will suffice.

Quantification

Secondly, we use dictionaries for units. This is compulsory (unlike dictRefs above). Any numerical quantity that is output through cmlAddProperty or cmlAddParameter is required to carry units. These are added with the units="" argument to the function. In addition, every other function below which takes numerical arguments will also take optional units, although defaults will be used if no units are supplied.

Further details are supplied in section Units below.

General naming conventions for functions.

Functions are named in the following way:

Conventions used below.

Note that where strings are passed in, they will be passed through entirely unchanged to the output file - no truncation of whitespace will occur.

Also note that wherever a real number can be passed in (including through anytype) then the formatting can be specified using the conventions described in StringFormatting

Where an array is passed in, it may be passed either as an assumed-shape array; that is, as an F90-style array with no necessity for specifying bounds; thusly:

integer :: array(50)
call cmlAddProperty(xf, 'coords', array)

or as an assumed-size array; that is, an F77-style array, in which case the length must be passed as an additional parameter:

integer :: array(*)
call cmlAddProperty(xf, 'coords', array, nitems=50)

Similarly, when a matrix is passed in, it may be passed in both fashions:

integer :: matrix(50, 50)
call cmlAddProperty(xf, 'coords', matrix)

or

integer :: matrix(3, *)
call cmlAddProperty(xf, 'coords', matrix, nrows=3, ncols=50)

All functions take as their first argument an XML file object, whose keyword is always xf. This file object is initialized by a cmlBeginFile function.

It is highly recommended that subroutines be called with keywords specified rather than relying on the implicit ordering of arguments. This is robust against changes in the library calling convention, and also sidesteps a significant cause of errors when using subroutines with large numbers of arguments.

Units

Note below that the functions cmlAddParameter and cmlAddProperty both require that units be specified for any numerical quantities output.

If you are trying to output a quantity that is genuinely dimensionless, then you should specify units="units:dimensionless"; or if you are trying to output a countable quantity (eg number of CPUs) then you may specify units="units:countable".

For other properties, all units should be specified as namespaced quantities. If you are using only a few common units, it may be easiest to borrow definitions from the provided dictionaries:

(These links do not resolve yet.)

cmlUnits: http://www.xml-cml.org/units/units
siUnits: http://www.xml-cml.org/units/siUnits
atomicUnits: http://www.xml-cml.org/units/atomic

There is also a default units dictionary, containing only the very basic units that wcml needs to know about. It has the namespace http://www.uszla.me.uk/FoX/units, and wcml assigns it automatically to the prefix units.

This is added automatically, so attempts to add it manually will fail.

The contents of all of these dictionaries, plus the wcml dictionary, may be viewed at: http://www.uszla.me.uk/unitsviz/units.cgi.

Otherwise, you should feel at liberty to construct your own namespace; declare it using cmlAddNamespace, and markup all your units as:

 units="myNamespace:myunit"
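As an illustration of the three cases described in this section, the sketch below declares a units namespace of its own and then outputs one quantity in a custom unit, one dimensionless quantity, and one countable quantity. The prefix myNamespace and its URI are placeholders, and the module name FoX_wcml, the type xmlf_t, the routines cmlEndCml and cmlFinishFile, the arguments of cmlBeginFile after xf, and the title keyword are assumptions to be checked against the function listings.

```fortran
program units_sketch
  ! Sketch only: module name FoX_wcml and type xmlf_t are assumptions.
  use FoX_wcml
  implicit none

  type(xmlf_t) :: xf

  call cmlBeginFile(xf, 'output.cml')   ! argument list after xf is assumed

  ! Hypothetical prefix and URI for a code-specific units dictionary.
  ! The built-in "units" prefix is added automatically and must not be redeclared.
  call cmlAddNamespace(xf, 'myNamespace', 'http://www.example.org/myCode/units')

  call cmlStartCml(xf)

  ! A quantity in a unit defined in our own namespace.
  call cmlAddProperty(xf=xf, value=1.5d0, title='Lattice constant', &
                      units='myNamespace:myunit')

  ! A genuinely dimensionless quantity.
  call cmlAddProperty(xf=xf, value=0.25d0, title='Mixing weight', &
                      units='units:dimensionless')

  ! A countable quantity.
  call cmlAddProperty(xf=xf, value=8, title='Number of CPUs', &
                      units='units:countable')

  call cmlEndCml(xf)      ! assumed name
  call cmlFinishFile(xf)  ! assumed name
end program units_sketch
```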

Functions for manipulating the CML file:

This takes care of all calls to open a CML output file.

This takes care of all calls to close an open CML output file, once you have finished with it. It is compulsory to call this - if your program finishes without calling this, then your CML file will be invalid.

This adds a namespace to a CML file.
NB This may only ever be called immediately after a cmlBeginFile call, before any output has been performed. Attempts to do otherwise will result in a runtime error.

This will be needed if you are adding dictionary references to your output. Thus for siesta, we do:

call cmlAddNamespace(xf, 'siesta', 'http://www.uam.es/siesta')

and then output all our properties and parameters with dictRef="siesta:something".

This pair of functions begins and ends the CML output to an existing CML file. They take care of namespaces.

Note that unless specified otherwise, there will be a convention attribute added to the cml tag specifying FoX_wcml-2.0 as the convention. (see http://www.uszla.me.uk/FoX for details)

Start/End sections

This pair of functions open & close a metadataList, which is a wrapper for metadata items.

This pair of functions open & close a parameterList, which is a wrapper for input parameters.

This pair of functions open & close a propertyList, which is a wrapper for output properties.

Start/end a list of bands (added using cmlAddBand below)

Start/end a list of k-points (added using cmlAddKpoint below)

Note that in most cases where you might want to use a serial number, you should probably be using the cmlStartStep subroutine below.

This pair of functions open & close a module of a computation which is unordered, or loosely-ordered. For example, METADISE uses one module for each surface examined.

This pair of functions open and close a module of a computation which is strongly ordered. For example, DLPOLY uses steps for each step of the simulation.

Adding items.

This adds a single item of metadata. Metadata vocabulary is completely uncontrolled within WCML. This means that metadata values may only be strings of characters. If you need your values to contain numbers, then you need to define the representation yourself, and construct your own strings.

This function adds a tag representing an input parameter

This function adds a tag representing an output property

Outputs an atomic configuration.

Outputs information about a unit cell, in lattice-vector form

Outputs information about a unit cell, in crystallographic form

Output eigenvalues for a band.

Output a k-point

Output a set of eigenvalues and eigenvectors

Common arguments

All cmlAdd and cmlStart routines take the following set of optional arguments:


Debugging with FoX.

Following experience integrating FoX into several codes, here are a few tips for debugging any problems you may encounter.

Compilation problems

You may encounter problems at the compiling or linking stage, with error messages along the lines of: 'No Specific Function can be found for this Generic Function' (exact phrasing depending on compiler, of course.)

If this is the case, it is possible that you have accidentally got the arguments to the offending subroutine out of order. If so, then use the keyword form of the arguments to ensure correctness; that is, instead of doing:

call cmlAddProperty(file, name, value)

do:

call cmlAddProperty(xf=file, name=name, value=value)

This will prevent argument mismatches, and is recommended practice in any case.

Runtime problems

You may encounter run-time issues. FoX performs many run-time checks to ensure the validity of the resultant XML code. In so far as it is possible, FoX will either issue warnings about potential problems, or try to handle safely any errors it encounters. In both cases, a warning will be output on stderr, which will hopefully help diagnose the problem.

Sometimes, however, FoX will encounter a problem it can do nothing about, and must stop. In all cases, it will try and write out an error message highlighting the reason, and generate a backtrace pointing to the offending line. Occasionally though, the compiler will not generate this information, and the error message will be lost.

If this is the case, you can either investigate the coredump to find the problem, or (if you are on a Mac) look in ~/Library/Logs/CrashReporter to find a human-readable log.

If this is not enlightening, or you cannot find the problem, then some of the most common issues we have encountered are listed below. Many of them are general Fortran problems, but sometimes are not easily spotted in the context of FoX.

Incorrect formatting.

Make sure, whenever you are writing out a real number through one of FoX's routines, and specifying a format, that the format is correct according to StringFormatting. Fortran-style formats are not permitted, and will cause crashes at runtime.

Array overruns

If you are outputting arrays or matrices, and are doing so in the traditional Fortran style - by passing both the array and its length to the routine, like so:

 call xml_AddAttribute(xf=file, name=name, value=array, nvalue=n)

then if n is wrong, you may end up with an array overrun, and cause a crash.

We highly recommend wherever possible using the Fortran-90 style, like so:

 call xml_AddAttribute(xf=file, name=name, value=array)

where the array length will be passed automatically.

Uninitialized variables

If you are passing variables to FoX which have not been initialized, you may well cause a crash. This is especially true, and easy to cause if you are passing in an array which (due to a bug elsewhere) has been partly but not entirely initialized. To diagnose this, try printing out suspect variables just before passing them to FoX, and look for suspiciously wrong values.

Invalid floating point numbers.

If during the course of your calculation you accidentally generate Infinities, or NaNs, then passing them to any Fortran subroutine can result in a crash - therefore trying to pass them to FoX for output may result in a crash.

If you suspect this is happening, try printing out suspect variables before calling FoX.
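Where the compiler provides the Fortran 2003 ieee_arithmetic intrinsic module, the check can be automated rather than done by eye. The following sketch is independent of FoX; the array energies and its contents are invented for illustration, and one element is deliberately set to NaN to stand in for a bug elsewhere.

```fortran
program finite_check
  ! Requires a compiler that supports the Fortran 2003 ieee_arithmetic module.
  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite, ieee_value, &
                                           ieee_quiet_nan
  implicit none

  real :: energies(3)

  energies = (/ 1.0, 2.0, 3.0 /)

  ! Simulate a bug elsewhere in the calculation by planting a NaN.
  energies(2) = ieee_value(energies(2), ieee_quiet_nan)

  ! ieee_is_finite is elemental: it is false for both NaN and Infinity.
  if (all(ieee_is_finite(energies))) then
    print *, 'All values are finite: safe to pass to FoX.'
  else
    print *, 'NaN or Infinity found: not writing this array.'
  end if
end program finite_check
```

A guard of this kind placed immediately before the output call turns a crash into a diagnosable message.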


SAX

SAX stands for Simple API for XML, and was originally a Java API for reading XML. (Full details at http://saxproject.org). SAX implementations exist for most common modern computer languages.

FoX includes a SAX implementation, which translates most of the Java API into Fortran, and makes it accessible to Fortran programs, enabling them to read in XML documents in a fashion as close and familiar as possible to other languages.

SAX is a stream-based, event callback API. Conceptually, running a SAX parser over a document results in the parser generating events as it encounters different XML components, and sending those events to the main program, which can read them and take suitable action.

Events

Events are generated when the parser encounters, for example, an element opening tag, or some text, and most events carry some data with them - the name of the tag, or the contents of the text.

The full list of events is quite extensive, and may be seen below. For most purposes, though, it is unlikely that most users will need more than the 5 most common events, documented here.

Given these events and accompanying information, a program can extract data from an XML document.

Invoking the parser.

Any program using the FoX SAX parser must a) use the FoX module, and b) declare a derived type variable to hold the parser, like so:

   use FoX_sax
   type(xml_t) :: xp

The FoX SAX parser then works by requiring the programmer to write a module containing subroutines to receive any of the events they are interested in, and passing these subroutines to the parser.

Firstly, the parser must be initialized, by passing it XML data. This can be done either by giving a filename, which the parser will open and read, or by passing a string containing an XML document. Thus:

  call open_xml_file(xp, "input.xml", iostat)

The iostat variable will report back any errors in opening the file.

Alternatively,

  call open_xml_string(xp, XMLstring)

where XMLstring is a character variable.

To now run the parser over the file, you simply do:

 call parse(xp, list_of_event_handlers)

And once you're finished, you can close the file, and clean up the parser, with:

 call close_xml_t(xp)
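The calls above can be combined into a short program that checks iostat before parsing. This is a sketch under two assumptions: that a non-zero iostat always indicates failure to open the file, and that parse may be called with no event handlers at all (every handler shown in this document is passed by keyword, which suggests they are optional). Run this way, the parser simply reads the document and halts with an error message if it is not well-formed.

```fortran
program wellformed_check
  use FoX_sax
  implicit none

  type(xml_t) :: xp
  integer :: iostat

  call open_xml_file(xp, 'input.xml', iostat)
  if (iostat /= 0) then
    ! The file could not be opened, so there is nothing to parse.
    print *, 'Could not open input.xml; iostat = ', iostat
    stop
  end if

  ! No handlers are passed (assumed to be permitted): the document is
  ! read purely to check that it is well-formed.
  call parse(xp)

  ! Close the file and clean up the parser.
  call close_xml_t(xp)
end program wellformed_check
```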

Receiving events

To receive events, you must construct a module containing event handling subroutines. These are subroutines of a prescribed form - the input & output is predetermined by the requirements of the SAX interface, but the body of the subroutine is up to you.

The required forms are shown in the API documentation below, but here are some simple examples.

To receive notification of character events, you must write a subroutine which takes as input one string, which will contain the characters received. So:

module event_handling
  use FoX_sax
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    print*, chars
  end subroutine
end module

That does very little - it simply prints out the data it receives. However, since the subroutine is in a module, you can save the data to a module variable, and manipulate it elsewhere; alternatively you can choose to call other subroutines based on the input.

So, a complete program which reads in all the text from an XML document looks like this:

module event_handling
  use FoX_sax
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    print*, chars
  end subroutine
end module

program XMLreader
  use FoX_sax
  use event_handling
  type(xml_t) :: xp
  call open_xml_file(xp, 'input.xml')
  call parse(xp, characters_handler=characters_handler)
  call close_xml_t(xp)
end program
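As a variation on the program above, the handler can save what it receives in module variables instead of printing it. The sketch below counts the non-blank chunks of text and their total length; the module name text_collector and its variables are invented for this example.

```fortran
module text_collector
  implicit none
  ! Module variables keep their values between handler calls.
  integer :: n_chunks = 0
  integer :: n_chars  = 0
  ! Characters treated as whitespace: space, tab, line feed, carriage return.
  character(len=4), parameter :: whitespace = &
       ' ' // achar(9) // achar(10) // achar(13)
contains

  subroutine characters_handler(chars)
    character(len=*), intent(in) :: chars

    ! verify returns 0 when every character of chars is whitespace;
    ! such chunks are skipped.
    if (verify(chars, whitespace) == 0) return

    n_chunks = n_chunks + 1
    n_chars  = n_chars + len(chars)
  end subroutine characters_handler
end module text_collector

program count_text
  use FoX_sax
  use text_collector
  implicit none

  type(xml_t) :: xp

  call open_xml_file(xp, 'input.xml')
  call parse(xp, characters_handler=characters_handler)
  call close_xml_t(xp)

  ! The totals gathered by the handler are available once parsing is done.
  print *, n_chunks, ' non-blank text chunks, ', n_chars, ' characters in total'
end program count_text
```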

Attribute dictionaries.

The other event likely to be most commonly used is the startElement event. Handling this involves writing a subroutine which takes as input three strings (which are the namespace URI, local name, and fully qualified name of the tag) and a dictionary of attributes.

An attribute dictionary is essentially a set of key:value pairs - where the key is the attribute's name, and the value is its value. (When considering namespaces, each attribute also has a URI and localName.)

Full details of all the dictionary-manipulation routines are given in the Attributes dictionaries section, but here we shall show the most common.

So, a simple subroutine to receive a startElement event would look like:

module event_handling
  use FoX_sax
contains

 subroutine startElement_handler(URI, localname, name,attributes)
   character(len=*), intent(in)   :: URI  
   character(len=*), intent(in)   :: localname
   character(len=*), intent(in)   :: name 
   type(dictionary_t), intent(in) :: attributes

   integer :: i

   print*, name

   do i = 1, len(attributes)
      print*, getKey(attributes, i), '=', getValue(attributes, i)
   enddo

  end subroutine startElement_handler
end module

program XMLreader
 use FoX_sax
 use event_handling
 type(xml_t) :: xp
 call open_xml_file(xp, 'input.xml')
 call parse(xp, startElement_handler=startElement_handler)
 call close_xml_t(xp)
end program

Again, this does nothing but print out the name of the element, and the names and values of all of its attributes. However, by using module variables, or calling other subroutines, the data could be manipulated further.
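For instance, a handler might count elements of one kind and look up a particular attribute by name rather than by index. In the sketch below the element name atom and attribute name elementType are only examples, and hasKey is assumed to be the name of the dictionary function that tests whether an attribute is present; looking up a value by passing the attribute name to getValue is described in the dictionary function list.

```fortran
module atom_counter
  use FoX_sax
  implicit none
  ! Module variable holding the running count between handler calls.
  integer :: n_atoms = 0
contains

  subroutine startElement_handler(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI
    character(len=*), intent(in)   :: localname
    character(len=*), intent(in)   :: name
    type(dictionary_t), intent(in) :: attributes

    ! Ignore every element except the one we are interested in.
    if (name /= 'atom') return

    n_atoms = n_atoms + 1

    ! hasKey is an assumed function name; it guards against the attribute
    ! being absent before its value is requested by name.
    if (hasKey(attributes, 'elementType')) then
      print *, 'atom ', n_atoms, ': ', getValue(attributes, 'elementType')
    end if
  end subroutine startElement_handler
end module atom_counter

program count_atoms
  use FoX_sax
  use atom_counter
  implicit none

  type(xml_t) :: xp

  call open_xml_file(xp, 'input.xml')
  call parse(xp, startElement_handler=startElement_handler)
  call close_xml_t(xp)

  print *, 'Total atoms found: ', n_atoms
end program count_atoms
```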

Error handling

The SAX parser detects all XML well-formedness errors. By default, when it encounters an error, it will simply halt the program with a suitable error message. However, it is possible to pass in an error handling subroutine if some other behaviour is desired - for example it may be nice to report the error to the user, and carry on with some other task.

In any case, once an error is encountered, the parser will finish. There is no way to continue reading past an error.

An error handling subroutine works in the same way as any other event handler, with the event data being an error message. Thus, you could write:

subroutine error_handler(msg)
  character(len=*), intent(in) :: msg

  print*, "The SAX parser encountered an error:"
  print*, msg
  print*, "Never mind, carrying on with the rest of the calculation."
end subroutine

Full API

Derived types

There is one derived type, xml_t. This is entirely opaque, and is used as a handle for the parser.

Subroutines

There are four subroutines:

This opens a file. xp is initialized, and prepared for parsing. string must contain the name of the file to be opened. iostat reports on the success of opening the file. A value of 0 indicates success.

This closes down the parser (and closes the file, if input was coming from a file.) xp is left uninitialized, ready to be used again if necessary.

(Advanced: By default, this will be done in a non-validating way, testing only for well-formedness errors. However, if validate is set to true, FoX will attempt to diagnose validation errors. Note that FoX is not a full validating parser, and will not read external entities, so do not rely on this behaviour.)

The full list of event handlers is in the next section. To use them, the interface must be placed in a module, and the body of the subroutine filled in as desired; then it should be specified as an argument to parse as:
name_of_event_handler = name_of_user_written_subroutine
Thus a typical call to parse might look something like:

  call parse(xp, startElement_handler = mystartelement, endElement_handler = myendelement, characters_handler = mychars)

where mystartelement, myendelement, and mychars are all subroutines written by you according to the interfaces listed below.
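To show how three such handlers can cooperate, here is a sketch that extracts the text content of every element named title (an arbitrary choice for this example). The interface used for the endElement handler, three strings mirroring the startElement handler without the attribute dictionary, is an assumption and should be checked against the callback listing. The fixed 200-character buffer is also just a simplification; longer text is truncated.

```fortran
module title_extractor
  use FoX_sax
  implicit none
  logical :: in_title = .false.      ! are we currently inside <title>?
  character(len=200) :: title = ''   ! fixed-size buffer for the text
  integer :: title_len = 0           ! number of characters stored so far
contains

  subroutine mystartelement(URI, localname, name, attributes)
    character(len=*), intent(in)   :: URI, localname, name
    type(dictionary_t), intent(in) :: attributes

    if (name == 'title') then
      in_title  = .true.
      title     = ''
      title_len = 0
    end if
  end subroutine mystartelement

  ! Assumed interface: as for startElement, but without the dictionary.
  subroutine myendelement(URI, localname, name)
    character(len=*), intent(in) :: URI, localname, name

    if (name == 'title') then
      in_title = .false.
      print *, 'title: ', title(1:title_len)
    end if
  end subroutine myendelement

  subroutine mychars(chars)
    character(len=*), intent(in) :: chars
    integer :: n

    if (.not. in_title) return

    ! Text may arrive as several consecutive events, so append to the
    ! buffer rather than overwrite it; anything beyond 200 characters is dropped.
    n = min(len(chars), len(title) - title_len)
    title(title_len+1:title_len+n) = chars(1:n)
    title_len = title_len + n
  end subroutine mychars
end module title_extractor

program print_titles
  use FoX_sax
  use title_extractor
  implicit none

  type(xml_t) :: xp

  call open_xml_file(xp, 'input.xml')
  call parse(xp, startElement_handler = mystartelement, &
                 endElement_handler   = myendelement,   &
                 characters_handler   = mychars)
  call close_xml_t(xp)
end program print_titles
```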


Callbacks.

All of the callbacks specified by SAX 2 are implemented. Documentation of the SAX 2 interfaces is available in the JavaDoc at http://saxproject.org, but as the interfaces needed adjustment for Fortran, they are listed here.

For documentation on the meaning of the callbacks and of their arguments, please refer to the Java SAX documentation.

Triggered when some character data is read from between tags.

NB Note that all character data is reported, including whitespace. Thus you will probably get a lot of empty characters events in a typical XML document.

NB Note also that it is not required that large chunks of character data all come as one event - they may come as multiple consecutive events.

Triggered when the parser reaches the end of the document.

Triggered by a closing tag.

Triggered when a namespace prefix mapping goes out of scope.

Triggered when whitespace is encountered within an element declared as EMPTY. (Only active in validating mode.)

Triggered by a Processing Instruction

Triggered when either an external entity, or an undeclared entity, is skipped.

Triggered when the parser starts reading the document.

Triggered when an opening tag is encountered. (See the Attributes dictionaries section for documentation on handling attribute dictionaries.)

Triggered when a namespace prefix mapping comes into scope.

Triggered when a NOTATION declaration is made in the DTD

Triggered when an unparsed entity is declared

Triggered when a normal parsing error is encountered. Parsing will cease after this event.

Triggered when a fatal parsing error is encountered. Parsing will cease after this event.

Triggered when a parser warning is generated. Parsing will continue after this event.

Triggered when an attribute declaration is encountered in the DTD.

Triggered when an element declaration is encountered in the DTD.

Triggered when a parsed external entity is declared in the DTD.

Triggered when an internal entity is declared in the DTD.

Triggered when a comment is encountered.

Triggered by the end of a CData section.

Triggered by the end of a DTD.

Triggered at the end of entity expansion.

Triggered by the start of a CData section.

Triggered by the start of a DTD section.

Triggered by the start of entity expansion.


Exceptions.

Although FoX tries very hard to work to the letter of the XML and SAX standards, it falls short in a few areas.

(This includes non-ASCII characters present only by character reference.)

It will, however, happily accept documents labelled as UTF-8 encoded.

Beyond this, any aspects of XML and SAX which FoX fails to do justice to are bugs.

Note that (as permissible within XML) FoX acts primarily as a non-validating parser, and thus all constraints marked as Validity Constraints by XML-1.0/1.1 are ignored by default. A subset of them will be picked up by FoX's validation mode, but only a small subset.

Note also that FoX will not read external entities when processing an XML document.


What of Java SAX 2 is not included in FoX?

The differences between Java & Fortran mean that none of the SAX APIs can be copied directly. However, FoX offers data types, subroutines, and interfaces covering a large proportion of the facilities offered by SAX. Where it does not, this is mentioned here.

org.xml.sax:

org.xml.sax.ext:

org.xml.sax.helpers:


Attributes dictionaries.

When parsing XML using the FoX SAX module, attributes are returned contained within a dictionary object.

All of the attribute dictionary objects and functions are exported through FoX_common and FoX_sax - you must USE one of these modules to enable them. The dictionary API is described here.

An attribute dictionary consists of a list of entries, one for each attribute. The entries all have the following pieces of data:

and for namespaced attributes:


Derived types

There is one derived type of interest, dictionary_t.

It is opaque - that is, it should only be manipulated through the functions described here.

Functions

Returns an integer with the length of the dictionary, ie the number of dictionary entries.

Returns an integer with the length of the dictionary, ie the number of dictionary entries. Identical to the len function.

Returns a logical value according to whether the dictionary contains an attribute named key or not.

Returns a logical value according to whether the dictionary contains an attribute with the correct URI and localname.

The following functions may be used to retrieve data from a dictionary

Return the full name of the ith dictionary entry.

If an integer is passed in - the value of the ith attribute.

If a single string is passed in, the value of the attribute with that name.

If two strings are passed in, the value of the attribute with that uri and localname.

Returns a string containing the nsURI of the ith attribute.

Returns a string containing the localName of the ith attribute.


UTILS

FoX_utils is a collection of general utility functions that the rest of FoX depends on, but which may be of independent use. They are documented here.

All functions are accessible from the FoX_utils module.

NB Unlike the APIs of WXML, WCML, and SAX, the UTILS APIs may not remain constant between FoX versions. While some effort will be expended to ensure they don't change unnecessarily, no guarantees are made.

For any end-users interested in the code who are worried about interface changes, it is recommended that the relevant code (all found in the utils/ directory) be lifted directly and imported into other projects, rather than accessed through the FoX interfaces.

Currently only one utility function is provided, generate_UUID.

UUID

UUIDs (see RFC 4122) are Universally Unique IDentifiers. Each is a 128-bit number, represented as a 36-character string. For example:

 f81d4fae-7dec-11d0-a765-00a0c91e6bf6

The intention of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. Thus, anyone can create a UUID and use it to identify something with reasonable confidence that the identifier will never be unintentionally used by anyone for anything else.

This property also makes them useful as Uniform Resource Names, to refer to a given document without requiring a position in a particular URI scheme. Thus the above UUID could be referred to as

urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6

UUIDs are used by WCML to ensure that every document generated has a unique ID. This enables users to go back later on and have confidence that they are examining the same document, regardless of where it might have ended up in file-system hierarchies or databases.

In addition, UUIDs come in several flavours, one of which stores the time of creation to 100-nanosecond accuracy. This can later be extracted (see, for example this service) to verify creation time.

This may well be useful for other XML document types, or indeed in non-XML applications. Thus, UUIDs may be generated by the following function, with one optional argument.

This function returns a 36-character string containing the UUID.

version identifies the version of UUID to be used (see section 4.1.3 of the RFC). Only versions 0, 1, and 4 are supported. Version 0 generates a nil UUID; version 1 a time-based UUID, and version 4 a pseudo-randomly-generated UUID.

Version 1 is the default, and is recommended.

(Note: all pseudo-random-numbers are generated using the high-quality Mersenne Twister algorithm, using the Fortran implementation of Scott Robert Ladd.)
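A minimal sketch of calling the function follows. It assumes that version is an integer which may be passed positionally, and that omitting it selects the default as described above.

```fortran
program uuid_sketch
  use FoX_utils
  implicit none

  ! A UUID is always returned as a 36-character string.
  character(len=36) :: id

  ! Default call: a version 1 (time-based) UUID.
  id = generate_UUID()
  print '(a)', 'urn:uuid:' // id

  ! Explicitly request a version 4 (pseudo-random) UUID.
  ! Passing the version as a positional integer is an assumption.
  id = generate_UUID(4)
  print '(a)', 'urn:uuid:' // id
end program uuid_sketch
```

Prefixing the result with urn:uuid: gives the Uniform Resource Name form shown earlier.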


Further information

FoX evolved from the initial codebase of xmlf90, which was written largely by Alberto Garcia <albertog@icmab.es> and Jon Wakelin <jon.wakelin@bristol.ac.uk>.

FoX is the work of Toby White <tow21@cam.ac.uk>, and all bug reports/complaints/bouquets of roses should be sent to him.

There is a FoX website at http://www.uszla.me.uk/software/FoX/.

There is also a mailing list for announcements/queries/bug reports. Information on how to subscribe may be found at http://www.uszla.me.uk/cgi-bin/mailman/listinfo/FoX/.

This manual is © Toby White 2006.


Licensing

FoX is licensed under the agreement below. This is intended to make it as freely available as possible, subject only to retaining copyright notices and acknowledgements.

If for any reason this license causes issues with your intended use of the code, please contact the author.

The license can also be found within the distributed source, in the file FoX/LICENSE

Copyright:
© 2003, 2004, Alberto Garcia, Jon Wakelin
© 2005, 2006, 2007, Toby White
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE

Third-party code.

In addition, FoX includes a random number library, written by Scott Robert Ladd, which is licensed as follows:

! This computer program source file is supplied "AS IS". Scott Robert
! Ladd (hereinafter referred to as "Author") disclaims all warranties,
! expressed or implied, including, without limitation, the warranties
! of merchantability and of fitness for any purpose. The Author
! assumes no liability for direct, indirect, incidental, special,
! exemplary, or consequential damages, which may result from the use
! of this software, even if advised of the possibility of such damage.
!
! The Author hereby grants anyone permission to use, copy, modify, and
! distribute this source code, or portions hereof, for any purpose,
! without fee, subject to the following restrictions:
!
! 1. The origin of this source code must not be misrepresented.
!
! 2. Altered versions must be plainly marked as such and must not
! be misrepresented as being the original source.
!
! 3. This Copyright notice may not be removed or altered from any
! source or altered source distribution.
!
! The Author specifically permits (without fee) and encourages the use
! of this source code for entertainment, education, or decoration. If
! you use this source code in a product, acknowledgment is not required
! but would be appreciated.