org.biojava.bio.program.sax.blastxml
Class BlastXMLParser

java.lang.Object
  extended by org.biojava.utils.stax.StAXContentHandlerBase
      extended by org.biojava.bio.program.sax.blastxml.BlastXMLParser
All Implemented Interfaces:
StAXContentHandler

public class BlastXMLParser
extends StAXContentHandlerBase

This class parses NCBI Blast XML output.

It has two modes:

i) Single output document mode: this takes a document containing a single BlastOutput element and parses it. NCBI BLAST generates such a document when a single query is searched against a sequence database.

ii) Multiple query document mode: unfortunately, NCBI BLAST concatenates the results of multiple searches in one file. This leads to an ill-formed document that violates every XML format known to the human race and other nearby civilisations, since it repeats the XML declaration, the DOCTYPE declaration and the root element for each query. This parser will take a bowdlerised version of this output that is wrapped in a blast_aggregate element.

The massaged form is generated by stripping the XML declarations and DOCTYPE declarations and wrapping all the BlastOutput elements in a single blast_aggregate element. On Linux, this can be done with:

 #!/bin/sh
 # Converts a Blast XML output to something vaguely well-formed
 # for parsing.
 # Use: blast_aggregate INPUT_FILE OUTPUT_FILE

 # strips all lines containing <?xml or <!DOCTYPE
 # encapsulates the multiple <BlastOutput> elements into <blast_aggregate>

 sed '/<?xml/d' "$1" | sed '/<!DOCTYPE/d' | sed '1i\
 <blast_aggregate>
 $a\
 </blast_aggregate>' > "$2"
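
For callers who would rather do the same massaging from Java, the following is a minimal sketch of the equivalent line filter. The class name BlastAggregateWrapper and its wrap method are hypothetical and are not part of BioJava. Like the sed script, it removes whole lines, so it assumes each XML declaration and DOCTYPE declaration sits on a single line of its own.

```java
/** Hypothetical helper for illustration only; not part of BioJava. */
final class BlastAggregateWrapper {
    private BlastAggregateWrapper() {}

    /**
     * Drops every line that contains an XML declaration or a DOCTYPE
     * declaration and wraps the remaining lines in one blast_aggregate element.
     */
    static String wrap(String concatenatedBlastXml) {
        StringBuilder out = new StringBuilder("<blast_aggregate>\n");
        for (String line : concatenatedBlastXml.split("\r?\n")) {
            // Same line-based filter as the sed script: matching lines are removed whole.
            if (line.contains("<?xml") || line.contains("<!DOCTYPE")) {
                continue;
            }
            out.append(line).append('\n');
        }
        return out.append("</blast_aggregate>\n").toString();
    }
}
```

The returned string has a single root element and can then be handed to an XML parser in multiple query document mode.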

Author:
David Huen

Field Summary
 org.biojava.bio.program.sax.blastxml.StAXFeatureHandler staxenv
          Nesting class that provides callback interfaces to nested class
 
Constructor Summary
BlastXMLParser()
           
 
Method Summary
protected  void addHandler(ElementRecognizer rec, org.biojava.bio.program.sax.blastxml.StAXHandlerFactory handler)
          Registers a handler factory to be used for elements matched by the given recognizer.
 void endElement(String nsURI, String localName, String qName, StAXContentHandler handler)
          Handles basic exit processing.
 void endElementHandler(String nsURI, String localName, String qName, StAXContentHandler handler)
          Element-specific exit handler. Subclass to do anything useful.
 ContentHandler getListener()
          Gets the ContentHandler for this parser.
 void setContentHandler(ContentHandler listener)
          Sets the ContentHandler for this object.
 void startElement(String nsURI, String localName, String qName, Attributes attrs, DelegationManager dm)
          Overrides the superclass startElement method to determine the start tag type and use it to set up delegation for the superclass.
 
Methods inherited from class org.biojava.utils.stax.StAXContentHandlerBase
characters, endPrefixMapping, endTree, ignorableWhitespace, processingInstruction, setDocumentLocator, skippedEntity, startPrefixMapping, startTree
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

staxenv

public org.biojava.bio.program.sax.blastxml.StAXFeatureHandler staxenv
Nesting class that provides callback interfaces to nested class

Constructor Detail

BlastXMLParser

public BlastXMLParser()
Method Detail

setContentHandler

public void setContentHandler(ContentHandler listener)
Sets the ContentHandler for this object.


startElement

public void startElement(String nsURI,
                         String localName,
                         String qName,
                         Attributes attrs,
                         DelegationManager dm)
                  throws SAXException
Overrides the superclass startElement method to determine the start tag type and use it to set up delegation for the superclass.

Specified by:
startElement in interface StAXContentHandler
Parameters:
nsURI - the namespace URI of the element
localName - the local name of the element
qName - the qualified name of the element
attrs - the attributes attached to the start tag
dm - the DelegationManager used to delegate handling of the element
Throws:
SAXException - if the element cannot be processed
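
The idea of inspecting the first start tag to decide how the rest of the document is handled can be illustrated with plain SAX. This is a sketch only and is not how BioJava implements its StAX delegation: the class BlastRootSniffer, its Mode enum and its methods are hypothetical names. It assumes a non-namespace-aware parser, so the element name arrives in qName.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical illustration; not part of BioJava. */
final class BlastRootSniffer extends DefaultHandler {
    enum Mode { SINGLE, AGGREGATE, UNKNOWN }

    private Mode mode;            // null until the root start tag has been seen
    private int blastOutputCount; // number of BlastOutput elements encountered

    @Override
    public void startElement(String nsURI, String localName, String qName, Attributes attrs) {
        if (mode == null) {
            // The first start tag is the document root; it decides the mode.
            if ("BlastOutput".equals(qName)) {
                mode = Mode.SINGLE;
            } else if ("blast_aggregate".equals(qName)) {
                mode = Mode.AGGREGATE;
            } else {
                mode = Mode.UNKNOWN;
            }
        }
        if ("BlastOutput".equals(qName)) {
            blastOutputCount++;
        }
    }

    Mode mode() {
        return mode == null ? Mode.UNKNOWN : mode;
    }

    int blastOutputCount() {
        return blastOutputCount;
    }

    /** Parses the given XML text and returns the handler holding the result. */
    static BlastRootSniffer sniff(String xml) {
        BlastRootSniffer handler = new BlastRootSniffer();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return handler;
    }
}
```

A real handler would hand each BlastOutput subtree to a dedicated delegate at the point where this sketch merely counts it.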

endElementHandler

public void endElementHandler(String nsURI,
                              String localName,
                              String qName,
                              StAXContentHandler handler)
                       throws SAXException
Element-specific exit handler. Subclass to do anything useful.

Parameters:
nsURI - the namespace URI of the element
localName - the local name of the element
qName - the qualified name of the element
handler - the StAXContentHandler supplied with the end-of-element event
Throws:
SAXException - if the element cannot be processed

addHandler

protected void addHandler(ElementRecognizer rec,
                          org.biojava.bio.program.sax.blastxml.StAXHandlerFactory handler)
Registers a handler factory to be used for elements matched by the given recognizer.

Parameters:
rec - the ElementRecognizer that selects the elements to handle
handler - the StAXHandlerFactory that supplies the handler for those elements

getListener

public ContentHandler getListener()
Gets the ContentHandler for this parser.


endElement

public void endElement(String nsURI,
                       String localName,
                       String qName,
                       StAXContentHandler handler)
                throws SAXException
Handles basic exit processing.

Specified by:
endElement in interface StAXContentHandler
Overrides:
endElement in class StAXContentHandlerBase
Parameters:
nsURI - the namespace URI of the element
localName - the local name of the element
qName - the qualified name of the element
handler - the StAXContentHandler supplied with the end-of-element event
Throws:
SAXException - if the element cannot be processed