DOM, SAX and StAX
SAX is a unidirectional, read-only API that follows a push model: the parser reads the document and pushes events to the application as it encounters them (the SAX section below shows this in practice). SAX and StAX are close relatives compared to DOM, which takes a completely different approach to XML processing.
A DOM parser is built around the concept of a tree. The whole XML document is parsed into an object model, a tree of nodes held in memory, and the application then reads the document by traversing that tree. This requires a lot of memory and processing power. For small documents this kind of DOM processing is fine, but with a large document we will run into performance issues.
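To make the "tree in memory" idea concrete, here is a minimal sketch. The class name DomNavigationSketch, the helper idForLocation and the inline two-employee document are all made up for this illustration. It finds a location element and then walks back up to its parent, which is only possible because DOM keeps the whole tree available.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomNavigationSketch {
    // Hypothetical document, used only for this illustration.
    static final String XML =
          "<employees>"
        + "<employee id=\"111\"><location>Bangalore</location></employee>"
        + "<employee id=\"112\"><location>Delhi</location></employee>"
        + "</employees>";

    /** Returns the id of the first employee at the given location, or null if none matches. */
    static String idForLocation(String xml, String location) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Look up every <location> element anywhere in the tree.
        NodeList locations = doc.getElementsByTagName("location");
        for (int i = 0; i < locations.getLength(); i++) {
            Element loc = (Element) locations.item(i);
            if (location.equals(loc.getTextContent().trim())) {
                // The whole tree is in memory, so we can step back up to the parent.
                Element employee = (Element) loc.getParentNode();
                return employee.getAttribute("id");
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(idForLocation(XML, "Delhi"));
    }
}
```

Moving from a child back to its parent like this is not available in SAX or StAX, where anything already read is gone unless the application stored it itself.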
StAX follows a streaming model and can both read and write. Imagine feeding a whole XML document through a tube. At any moment one XML element is in focus, and then we move on to the next one; the cursor only moves forward. This kind of processing is not new to us: it resembles iterating over a forward-only ResultSet in the JDBC API. Streaming has its advantage when we want to process large documents sequentially, because only the current position is held in memory, so performance stays good irrespective of the size of the document. Now that mobile phones and apps are getting popular, we also need to think about processing on a smaller scale than desktops and servers.
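Since the rest of this post only reads XML, here is a small sketch of the writing side of StAX. It assumes the standard XMLStreamWriter from javax.xml.stream. The class name StaxWriterSketch and the helper writeEmployee are invented for this example, and the exact form of the XML declaration in the result may vary between implementations.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxWriterSketch {
    /** Streams a single employee element into a String (hypothetical helper). */
    static String writeEmployee(String id, String firstName) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("employee");
        writer.writeAttribute("id", id);
        writer.writeStartElement("firstName");
        writer.writeCharacters(firstName);
        writer.writeEndElement(); // closes firstName
        writer.writeEndElement(); // closes employee
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeEmployee("111", "Rakesh"));
    }
}
```

The document is written piece by piece in the same forward-only fashion in which it is read, so no tree has to be built first.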
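The sample XML considered in the examples (xml/employee.xml) has the following structure:
<employees>
  <employee id="111">
    <firstName>Rakesh</firstName>
    <lastName>Mishra</lastName>
    <location>Bangalore</location>
  </employee>
  <employee id="112">
    <firstName>John</firstName>
    <lastName>Davis</lastName>
    <location>Delhi</location>
  </employee>
  <employee id="113">
    <firstName>Rajesh</firstName>
    <lastName>Sharma</lastName>
    <location>Pune</location>
  </employee>
</employees>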
And the object into which the XML content is to be extracted is defined as below:
class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ")" + location;
    }
}
- DOM Parser
- SAX Parser
- StAX Parser
Using DOM Parser
This example uses the DOM parser implementation that comes with the JDK (JDK 7 here). The DOM parser loads the complete XML content into a tree structure, and we iterate through the Node and NodeList objects to get the content of the XML. The code for XML parsing using the DOM parser is given below.
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DOMParserDemo {
    public static void main(String[] args) throws Exception {
        // Get the DOM Builder Factory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Get the DOM Builder
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Load and parse the XML document;
        // document contains the complete XML as a tree.
        Document document = builder.parse(
                ClassLoader.getSystemResourceAsStream("xml/employee.xml"));
        List<Employee> empList = new ArrayList<>();
        // Iterate through the nodes and extract the data.
        NodeList nodeList = document.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            // Whitespace between tags shows up as text nodes,
            // so only Element nodes are <employee> tags.
            if (node instanceof Element) {
                Employee emp = new Employee();
                emp.id = node.getAttributes().getNamedItem("id").getNodeValue();
                NodeList childNodes = node.getChildNodes();
                for (int j = 0; j < childNodes.getLength(); j++) {
                    Node cNode = childNodes.item(j);
                    // Identify the child tag of the employee encountered.
                    if (cNode instanceof Element) {
                        // getTextContent() returns "" for an empty element,
                        // so an empty tag does not cause a NullPointerException.
                        String content = cNode.getTextContent().trim();
                        switch (cNode.getNodeName()) {
                            case "firstName":
                                emp.firstName = content;
                                break;
                            case "lastName":
                                emp.lastName = content;
                                break;
                            case "location":
                                emp.location = content;
                                break;
                        }
                    }
                }
                empList.add(emp);
            }
        }
        // Print the Employee list populated.
        for (Employee emp : empList) {
            System.out.println(emp);
        }
    }
}

class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ")" + location;
    }
}
The output:
Rakesh Mishra(111)Bangalore
John Davis(112)Delhi
Rajesh Sharma(113)Pune
Using SAX Parser
The SAX parser differs from the DOM parser in that it doesn't load the complete XML into memory. Instead it reads the XML sequentially, triggering events as and when it encounters constructs such as an opening tag, a closing tag, character data and so on. This is why the SAX parser is called an event-based parser.
Along with the XML source file, we also register a handler which extends the DefaultHandler class. The DefaultHandler class provides different callbacks out of which we would be interested in:
- startElement() – called when the start of a tag is encountered.
- endElement() – called when the end of a tag is encountered.
- characters() – called when text data is encountered; the parser may deliver the text of one element in several calls.
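To see the order in which these callbacks fire, the following sketch records each event in a list. The class name SaxEventTrace, the helper trace and the one-employee input are made up for this illustration; only DefaultHandler and its three callbacks come from the SAX API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventTrace {
    /** Parses the given XML and returns the callbacks in the order the parser made them. */
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Ignore whitespace-only text between tags.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical one-employee document, used only for this illustration.
        String xml = "<employee id=\"111\"><firstName>Rakesh</firstName></employee>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

The application never asks for the next event here; it only reacts to whatever the parser hands over, which is what "push" means.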
The code for parsing the XML using SAX Parser is given below:
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserDemo {
    public static void main(String[] args) throws Exception {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        SAXParser parser = parserFactory.newSAXParser();
        SAXHandler handler = new SAXHandler();
        parser.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"),
                handler);
        // Print the list of employees obtained from XML
        for (Employee emp : handler.empList) {
            System.out.println(emp);
        }
    }
}

/**
 * The Handler for SAX Events.
 */
class SAXHandler extends DefaultHandler {
    List<Employee> empList = new ArrayList<>();
    Employee emp = null;
    // Text of the current element; characters() may deliver it in several chunks.
    StringBuilder content = new StringBuilder();

    @Override
    // Triggered when the start of a tag is found.
    public void startElement(String uri, String localName,
            String qName, Attributes attributes) throws SAXException {
        // Start collecting text afresh for every new tag.
        content.setLength(0);
        switch (qName) {
            // Create a new Employee object when the start tag is found
            case "employee":
                emp = new Employee();
                emp.id = attributes.getValue("id");
                break;
        }
    }

    @Override
    public void endElement(String uri, String localName,
            String qName) throws SAXException {
        String text = content.toString().trim();
        switch (qName) {
            // Add the employee to the list once its end tag is found
            case "employee":
                empList.add(emp);
                break;
            // For all other end tags the employee has to be updated.
            case "firstName":
                emp.firstName = text;
                break;
            case "lastName":
                emp.lastName = text;
                break;
            case "location":
                emp.location = text;
                break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        content.append(ch, start, length);
    }
}

class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ")" + location;
    }
}
The output:
Rakesh Mishra(111)Bangalore
John Davis(112)Delhi
Rajesh Sharma(113)Pune
Using StAX Parser
StAX stands for Streaming API for XML. The StAX parser differs from the DOM parser in the same way the SAX parser does: it does not load the whole document into memory. It also differs from the SAX parser in a subtle way:
- The SAX parser pushes data to the application, whereas with StAX the application pulls the data it requires from the XML.
- The StAX parser maintains a cursor at the current position in the document and lets the application extract the content available at the cursor, whereas the SAX parser issues events as and when certain data is encountered.
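The pull style can be sketched as follows. Because the application drives the cursor, it can simply stop once it has what it needs. The class name StaxPullSketch, the helper firstLocation and the inline document are hypothetical; XMLInputFactory, XMLStreamReader and getElementText() are part of the standard javax.xml.stream API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPullSketch {
    // Hypothetical document, used only for this illustration.
    static final String XML =
          "<employees>"
        + "<employee id=\"111\"><location>Bangalore</location></employee>"
        + "<employee id=\"112\"><location>Delhi</location></employee>"
        + "</employees>";

    /** Returns the text of the first location element, or null if there is none. */
    static String firstLocation(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // We ask for the next event; nothing is delivered unless we ask.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "location".equals(reader.getLocalName())) {
                    // Read the text at the cursor and stop; the rest is never parsed.
                    return reader.getElementText().trim();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(firstLocation(XML));
    }
}
```

With SAX the handler would keep receiving callbacks for the remainder of the document unless it aborted parsing by throwing an exception.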
XMLInputFactory and XMLStreamReader are the two types used to load an XML file. As we read through the XML file using XMLStreamReader, each call to next() returns an event as an integer value, which is then compared with the constants in XMLStreamConstants. The code below shows how to parse XML using the StAX parser:
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxParserDemo {
    public static void main(String[] args) throws XMLStreamException {
        List<Employee> empList = new ArrayList<>();
        Employee currEmp = null;
        String tagContent = null;
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(
                ClassLoader.getSystemResourceAsStream("xml/employee.xml"));

        while (reader.hasNext()) {
            int event = reader.next();
            switch (event) {
                case XMLStreamConstants.START_ELEMENT:
                    if ("employee".equals(reader.getLocalName())) {
                        currEmp = new Employee();
                        // Look the attribute up by name rather than by position.
                        currEmp.id = reader.getAttributeValue(null, "id");
                    }
                    break;

                case XMLStreamConstants.CHARACTERS:
                    tagContent = reader.getText().trim();
                    break;

                case XMLStreamConstants.END_ELEMENT:
                    switch (reader.getLocalName()) {
                        case "employee":
                            empList.add(currEmp);
                            break;
                        case "firstName":
                            currEmp.firstName = tagContent;
                            break;
                        case "lastName":
                            currEmp.lastName = tagContent;
                            break;
                        case "location":
                            currEmp.location = tagContent;
                            break;
                    }
                    break;
            }
        }

        // Print the employee list populated from XML
        for (Employee emp : empList) {
            System.out.println(emp);
        }
    }
}

class Employee {
    String id;
    String firstName;
    String lastName;
    String location;

    @Override
    public String toString() {
        return firstName + " " + lastName + "(" + id + ") " + location;
    }
}
The output:
Rakesh Mishra(111) Bangalore
John Davis(112) Delhi
Rajesh Sharma(113) Pune
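The integer events that XMLStreamReader reports are easier to follow once they are translated back into names. The sketch below does that for a tiny document. The class name StaxEventNamesSketch and its two helpers are invented for this example, and only the five most common constants from XMLStreamConstants are mapped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxEventNamesSketch {
    /** Maps an event code to its constant name; codes not covered here fall through to OTHER. */
    static String name(int event) {
        switch (event) {
            case XMLStreamConstants.START_DOCUMENT: return "START_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT:  return "START_ELEMENT";
            case XMLStreamConstants.CHARACTERS:     return "CHARACTERS";
            case XMLStreamConstants.END_ELEMENT:    return "END_ELEMENT";
            case XMLStreamConstants.END_DOCUMENT:   return "END_DOCUMENT";
            default:                                return "OTHER(" + event + ")";
        }
    }

    /** Returns the names of all events seen while reading the given XML. */
    static List<String> eventNames(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        // A freshly created reader is positioned on START_DOCUMENT.
        names.add(name(reader.getEventType()));
        while (reader.hasNext()) {
            names.add(name(reader.next()));
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(eventNames("<location>Pune</location>"));
    }
}
```

Note that START_DOCUMENT is the reader's initial state rather than something next() returns, so a loop that only inspects the result of next() does not see it.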
With this I have covered parsing the same XML document and performing the same task of populating the list of Employee objects using all three parsers, namely:
- DOM Parser
- SAX Parser
- StAX Parser