Serving XML: Record Streams

Daniel Parker


The Record
Record Readers and Writers
Adapting a Record Stream to XML Content
Adapting XML Content to a Record Stream
Performing Tasks Repeatedly within a Record Filter

This is the second of three articles describing the ServingXML pipeline language.

This article discusses pipelines whose input, output, or both are sequences of records.

The Record

ServingXML supports the notion of records that have fields, possibly multi-valued, and nested subrecords, possibly repeating.

A record may be represented in BNF as follows:


Record ::= name (Field+) (Record*) |
  name (Field*) (Record+)

Field ::= name (value*)

Here is a sample XML representation of a record.

        
<Employee>
  <Employee-No>0001</Employee-No>
  <Employee-Name>Matthew</Employee-Name>
  <Children>Joe</Children>
  <Children>Julia</Children>
  <Children>Dave</Children>
</Employee>

      

The record is of type "Employee" and has three fields named Employee-No, Employee-Name and Children. Children is a multi-valued field; here it holds three values.
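The sample above contains only fields. As a sketch of a record that also matches the nested part of the grammar, the fragment below adds a repeating subrecord. The Address subrecord and its Street and City fields are hypothetical names introduced for illustration, and the fragment assumes that a nested subrecord is represented in the same element-per-field style as its parent.

```xml
<!-- Hypothetical example: Address, Street and City are illustrative names -->
<Employee>
  <Employee-No>0001</Employee-No>
  <Employee-Name>Matthew</Employee-Name>
  <!-- Multi-valued field: one element per value -->
  <Children>Joe</Children>
  <Children>Julia</Children>
  <!-- Repeating nested subrecord: each Address is itself a record with fields -->
  <Address>
    <Street>1 Example Street</Street>
    <City>Exampleville</City>
  </Address>
  <Address>
    <Street>2 Sample Road</Street>
    <City>Sampletown</City>
  </Address>
</Employee>
```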

Record Readers and Writers

The example below illustrates the idea of record readers and writers with a flat file reader that reads a stream of records from a positional flat file, and a flat file writer that writes the stream to a delimited flat file. Here, we pair an sx:flatFileReader with an sx:flatFileWriter, but we could just as easily pair an sx:flatFileReader with an sx:sqlWriter, or an sx:sqlReader with an sx:flatFileWriter.

Figure 1. Record readers and writers


<sx:resources xmlns:sx="http://www.servingxml.com/core">
   
  <sx:service id="new-books"> 
    <sx:recordStream>
      <sx:flatFileReader>
        <sx:flatFile ref="oldBooksFlatFile"/>
      </sx:flatFileReader>
      
      <sx:flatFileWriter>
        <sx:flatFile ref="newBooksFlatFile"/>
      </sx:flatFileWriter>
    </sx:recordStream>
  </sx:service>
  
  <sx:flatFile id="newBooksFlatFile">
    <sx:flatFileHeader>
      <sx:flatRecordType ref="newBookType"/>
    </sx:flatFileHeader>
    <sx:flatFileBody>
      <sx:flatRecordType ref="newBookType"/>
    </sx:flatFileBody>
  </sx:flatFile>      

  <sx:flatRecordType id="newBookType" name="newBookType">
    <sx:fieldDelimiter value="|"/>
    <sx:delimitedField name="author" label="Author"/>
    <sx:delimitedField name="category" label="Category"/>
    <sx:delimitedField name="title" label="Title"/>
    <sx:delimitedField name="price" label="Price"/>
  </sx:flatRecordType>
  
  <sx:flatFile id="oldBooksFlatFile">
    <sx:flatFileHeader>
      <sx:flatRecordType ref="oldBookType"/>
      <sx:annotationRecord/>
    </sx:flatFileHeader>
    <sx:flatFileBody>
      <sx:flatRecordType ref="oldBookType"/>
    </sx:flatFileBody>
    <sx:flatFileTrailer>
      <sx:annotationRecord></sx:annotationRecord>
      <sx:annotationRecord>This is a trailer record</sx:annotationRecord>
    </sx:flatFileTrailer>
  </sx:flatFile>      

  <sx:flatRecordType id="oldBookType" name="oldBookType">
    <sx:positionalField name="category" width="1"/>
    <sx:positionalField name="author" width="30"/>
    <sx:positionalField name="title" width="30"/>
    <sx:positionalField name="price" width="10" justify="right"/>
  </sx:flatRecordType>
  
</sx:resources>
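To make the flow in Figure 1 concrete, here is a sketch of how one record read by the sx:flatFileReader might look in the XML representation introduced in The Record. It assumes the record takes its type name from the name attribute of the sx:flatRecordType; the field values are invented for illustration.

```xml
<!-- Assumed XML view of one record from the positional file; values are illustrative -->
<oldBookType>
  <category>F</category>
  <author>Example Author</author>
  <title>Example Title</title>
  <price>9.99</price>
</oldBookType>
```

Under the newBookType layout, the writer would be expected to emit the same four fields in the order author, category, title, price, separated by the | delimiter.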


Adapting a Record Stream to XML Content

The next example shows how to use an sx:recordContent element to adapt a record stream to XML content. Once we have XML content, we can apply all of the XML pipeline instructions described in Serving XML: Pipeline Language.

Figure 2. Adapting a record stream to XML content


<sx:resources xmlns:sx="http://www.servingxml.com/core"
                         xmlns:myns="http://mycompany.com/mynames/">

  <sx:service id="books"> 
    <sx:serialize>
      <sx:transform>
        <sx:content ref="books"/>
      </sx:transform>
    </sx:serialize>
  </sx:service>
  
  <sx:recordContent id="books">
    <sx:flatFileReader>
      <sx:flatFile ref="oldBooksFlatFile"/>
    </sx:flatFileReader>
    <sx:recordMapping ref="booksToXmlMapping"/>
  </sx:recordContent>

  <sx:recordMapping id="booksToXmlMapping">
    <myns:books>
      <sx:onRecord>
        <myns:book>
          <sx:fieldElementMap field="title" element="myns:title"/>  
          <sx:fieldAttributeMap field="category" attribute="categoryCode"/>
          <sx:fieldElementMap field="author" element="myns:author"/>
          <sx:fieldElementMap field="price" element="myns:price"/>
        </myns:book>  
      </sx:onRecord>
    </myns:books>
  </sx:recordMapping>  
   
  <sx:flatFile id="oldBooksFlatFile">
    <sx:flatFileHeader>
      <sx:flatRecordType ref="oldBookType"/>
      <sx:annotationRecord/>
    </sx:flatFileHeader>
    <sx:flatFileBody>
      <sx:flatRecordType ref="oldBookType"/>
    </sx:flatFileBody>
    <sx:flatFileTrailer>
      <sx:annotationRecord></sx:annotationRecord>
      <sx:annotationRecord>This is a trailer record</sx:annotationRecord>
    </sx:flatFileTrailer>
  </sx:flatFile>      

  <sx:flatRecordType id="oldBookType" name="oldBookType">
    <sx:positionalField name="category" width="1"/>
    <sx:positionalField name="author" width="30"/>
    <sx:positionalField name="title" width="30"/>
    <sx:positionalField name="price" width="10" justify="right"/>
  </sx:flatRecordType>
  
</sx:resources>
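As a sketch of what the mapping in Figure 2 might produce, the fragment below shows the expected shape of the output for a single input record. The field values are invented for illustration, and details of the actual serialization (such as namespace declarations and whitespace) are assumptions.

```xml
<!-- Assumed output for one input record; field values are illustrative -->
<myns:books xmlns:myns="http://mycompany.com/mynames/">
  <!-- category is mapped to an attribute; the other fields become child elements -->
  <myns:book categoryCode="F">
    <myns:title>Example Title</myns:title>
    <myns:author>Example Author</myns:author>
    <myns:price>9.99</myns:price>
  </myns:book>
</myns:books>
```

Each record in the stream would contribute one myns:book element inside the single myns:books wrapper, because only the content of sx:onRecord is repeated per record.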


Adapting XML Content to a Record Stream

The next example shows how to use an sx:xmlRecordReader element to adapt XML content to a record stream. Once we have a record stream, we can apply any record writer, including the sx:flatFileWriter or the sx:sqlWriter.

Figure 3. Adapting XML content to a record stream


<sx:resources xmlns:sx="http://www.servingxml.com/core"
                        xmlns:myns="http://mycompany.com/mynames/">
   
  <sx:service id="books2pos"> 
    <sx:recordStream>
      <sx:xmlRecordReader>
        <sx:inverseRecordMapping ref="booksToFileMapping"/>
        <sx:transform>
          <sx:document/>
        </sx:transform>
      </sx:xmlRecordReader>
      <sx:flatFileWriter>
        <sx:flatFile ref="booksFile"/>
      </sx:flatFileWriter>
    </sx:recordStream>
  </sx:service>

  <sx:inverseRecordMapping id="booksToFileMapping">
    <sx:onSubtree path="/myns:books/myns:book">
      <sx:flattenSubtree recordType="book">
        <sx:subtreeFieldMap select="myns:title" field="title"/>
        <sx:subtreeFieldMap select="@categoryCode" field="category"/>
        <sx:subtreeFieldMap select="myns:author" field="author"/>
        <sx:subtreeFieldMap select="myns:price" field="price"/>
        <sx:subtreeFieldMap select="myns:reviews/myns:review[1]" field="review1"/>
        <sx:subtreeFieldMap select="myns:reviews/myns:review[2]" field="review2"/>
      </sx:flattenSubtree>
    </sx:onSubtree>
  </sx:inverseRecordMapping>

  <sx:flatFile id="booksFile">
    <sx:flatFileHeader>
      <sx:flatRecordType ref="bookType"/>
      <sx:annotationRecord/>
    </sx:flatFileHeader>
    <sx:flatFileBody>
      <sx:flatRecordType ref="bookType"/>
    </sx:flatFileBody>
    <sx:flatFileTrailer>
      <sx:annotationRecord></sx:annotationRecord>
      <sx:annotationRecord>This is a trailer record</sx:annotationRecord>
    </sx:flatFileTrailer>
  </sx:flatFile>      

  <sx:flatRecordType id="bookType" name="bookType">
    <sx:positionalField name="category" label="Category" width="1"/>
    <sx:positionalField name="author" label="Author" width="30"/>
    <sx:positionalField name="title" label="Title" width="30"/>
    <sx:positionalField name="price" label="Price" width="10" justify="right"/>
  </sx:flatRecordType>
  
</sx:resources>
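For reference, here is a sketch of an input document whose structure matches the paths used by the inverse mapping in Figure 3. The element values are hypothetical; only the element and attribute names are taken from the select expressions above.

```xml
<!-- Hypothetical input document; element values are illustrative -->
<myns:books xmlns:myns="http://mycompany.com/mynames/">
  <!-- Each myns:book subtree matches /myns:books/myns:book -->
  <myns:book categoryCode="F">
    <myns:title>Example Title</myns:title>
    <myns:author>Example Author</myns:author>
    <myns:price>9.99</myns:price>
    <myns:reviews>
      <!-- Selected positionally as review[1] and review[2] -->
      <myns:review>First review text</myns:review>
      <myns:review>Second review text</myns:review>
    </myns:reviews>
  </myns:book>
</myns:books>
```

Each myns:book subtree would be flattened into one record of type book with the fields title, category, author, price, review1 and review2. Note that bookType in Figure 3 declares positional fields only for the first four of these.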


Performing Tasks Repeatedly within a Record Filter

In the previous examples we paired a record reader and a record writer inside an sx:recordStream element. The reader reads a stream of records and the writer writes out the records.

A record reader can contain record filters that do some processing on the records as they pass through. Normally the records go on to a writer, but a writer is optional: the processing can take place entirely within the filters. The example below shows a lone record reader inside an sx:recordStream element. This record reader is an sx:directoryReader, which reads all the file names in the data directory. The sx:restrictRecordFilter that follows it skips any names that do not match the pattern "books.*[.]txt". The resulting stream of file names passes through an sx:processRecord element containing another sx:recordStream element, which reads each books file and writes out the records to a similarly named file with a -new suffix in the output directory.

Figure 4. Processing selected files in a directory


<sx:resources xmlns:sx="http://www.servingxml.com/core">

  <sx:service id="all-books"> 
    <sx:recordStream>
      <sx:directoryReader directory="data"/>
      <sx:restrictRecordFilter>
        <sx:fieldRestriction field="name" pattern="books.*[.]txt"/>
      </sx:restrictRecordFilter>
      <sx:processRecord>
        <sx:parameter name="output-file">
          <sx:replace match="(books.*)[.]txt" replaceWith="$1-new.txt"><sx:toString value="{name}"/></sx:replace>
        </sx:parameter>   
        <sx:recordStream>
          <sx:flatFileReader>
            <sx:fileSource directory="{parentDirectory}" file="{name}"/>
            <sx:flatFile ref="oldBooksFlatFile"/>
          </sx:flatFileReader>
          <sx:flatFileWriter>
            <sx:fileSink directory="output" file="{$output-file}"/> 
            <sx:flatFile ref="newBooksFlatFile"/>
          </sx:flatFileWriter>
        </sx:recordStream>
      </sx:processRecord>
    </sx:recordStream>
  </sx:service>

  <sx:flatFile id="newBooksFlatFile">
    <sx:flatFileHeader>
      <sx:flatRecordType ref="newBookType"/>
    </sx:flatFileHeader>
    <sx:flatFileBody>
      <sx:flatRecordType ref="newBookType"/>
    </sx:flatFileBody>
  </sx:flatFile>      

  <sx:flatRecordType id="newBookType" name="newBookType">
    <sx:fieldDelimiter value="|"/>
    <sx:delimitedField name="author" label="Author"/>
    <sx:delimitedField name="category" label="Category"/>
    <sx:delimitedField name="title" label="Title"/>
    <sx:delimitedField name="price" label="Price"/>
  </sx:flatRecordType>

  <sx:flatFile id="oldBooksFlatFile">
    <sx:flatFileHeader>
      <sx:flatRecordType ref="oldBookType"/>
      <sx:annotationRecord/>
    </sx:flatFileHeader>
    <sx:flatFileBody>
      <sx:flatRecordType ref="oldBookType"/>
    </sx:flatFileBody>
    <sx:flatFileTrailer>
      <sx:annotationRecord></sx:annotationRecord>
      <sx:annotationRecord>This is a trailer record</sx:annotationRecord>
    </sx:flatFileTrailer>
  </sx:flatFile>      

  <sx:flatRecordType id="oldBookType" name="oldBookType">
    <sx:positionalField name="category" width="1"/>
    <sx:positionalField name="author" width="30"/>
    <sx:positionalField name="title" width="30"/>
    <sx:positionalField name="price" width="10" justify="right"/>
  </sx:flatRecordType>
  
</sx:resources>
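To illustrate what flows through the outer stream in Figure 4, here is a sketch of a single record as the sx:directoryReader might produce it, using the XML representation from The Record. The field names name and parentDirectory are the ones referenced in the figure; the record type name file and the field values are assumptions made for illustration.

```xml
<!-- Hypothetical record for one directory entry; the type name "file" is an assumption -->
<file>
  <parentDirectory>data</parentDirectory>
  <name>books1.txt</name>
</file>
```

For this record, and assuming conventional regular expression semantics, the sx:replace expression would capture books1 as $1 and rewrite books1.txt as books1-new.txt, so the inner stream would read books1.txt from the data directory and write books1-new.txt to the output directory.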