Category: Java
2010-08-04 17:43:58
Hist:
20100804, draft
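SAX is short for Simple API for XML. SAX processes an XML document by scanning it as a stream and raising events, which makes it particularly well suited to reading XML. The other popular approach is DOM (Document Object Model): DOM builds an equivalent tree of the XML document in memory, so that every operation on the XML becomes an operation on that tree and its nodes.
As SAX reads the XML, it invokes callbacks (interface ContentHandler) whenever it reaches certain boundaries, and the programmer implements the desired behavior inside those callbacks. The callback events are: the start and end of the document (startDocument(), endDocument()), the start and end of a namespace prefix mapping (startPrefixMapping(), endPrefixMapping()), the start and end of each element (startElement(), endElement()), character content (characters()), and a few others.
In SAX the order of the callbacks matters: they fire in the order in which the corresponding items appear in the XML document. So if information obtained in one callback is needed in a later callback, the programmer has to save it.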
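To illustrate saving information between callbacks, here is a minimal sketch. The class StateKeepingDemo, its PathHandler, and the sample XML are hypothetical names made up for this illustration. The handler remembers which elements are open and accumulates text, so that endElement() can combine both. It assumes elements contain either text or child elements, not a mix.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StateKeepingDemo {

    /** Hypothetical handler: keeps state so that later callbacks can use it. */
    static class PathHandler extends DefaultHandler {
        private final Deque<String> openElements = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();
        final List<String> results = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            openElements.addLast(qName); // saved for the callbacks that follow
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // One text node may arrive in several characters() calls, so accumulate.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String trimmed = text.toString().trim();
            if (!trimmed.isEmpty()) {
                results.add(String.join("/", openElements) + "=" + trimmed);
            }
            openElements.removeLast();
            text.setLength(0);
        }
    }

    /** Parses the given XML text and returns "path=text" entries in document order. */
    public static List<String> collect(String xml) throws Exception {
        PathHandler handler = new PathHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<a><b>one</b><c>two</c></a>")); // prints [a/b=one, a/c=two]
    }
}
```

The stack of open elements and the text buffer are exactly the kind of state that SAX does not keep for you.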
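Using SAX is not complicated. It takes the following steps; the order of steps 2, 3 and 4 does not matter.
1. Create the XMLReader parser object.
2. Set the parser's ContentHandler, implementing your own ContentHandler to define the callback methods.
3. (optional) Set the parser's ErrorHandler, so that the programmer can decide how abnormal conditions and errors are handled. If no ErrorHandler is set, warnings and recoverable errors are not reported to the application, and a fatal error surfaces only as an exception thrown from parse().
4. (optional) Set the parser's features. Two of them, http://xml.org/sax/features/namespaces and http://xml.org/sax/features/namespace-prefixes, need special attention and are described later.
5. Call parse() to do the parsing.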
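The five steps can be sketched compactly against an in-memory document. FiveStepsDemo and its event strings are hypothetical names for this illustration. It obtains the XMLReader through SAXParserFactory, which is one of several ways to create it, and the ErrorHandler policy shown (record warnings and errors, rethrow fatal errors) is only one possible choice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class FiveStepsDemo {

    /** Parses the XML text and returns a list of the events seen. */
    public static List<String> parse(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        // Step 1: create the XMLReader.
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Step 2: set the ContentHandler with our own callbacks.
        parser.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + localName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + localName);
            }
        });

        // Step 3 (optional): set the ErrorHandler.
        parser.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                events.add("warning");
            }

            @Override
            public void error(SAXParseException e) {
                events.add("error");
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                throw e; // parsing cannot continue reliably
            }
        });

        // Step 4 (optional): set features; with namespaces on, localName is filled in.
        parser.setFeature("http://xml.org/sax/features/namespaces", true);

        // Step 5: parse.
        parser.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<a><b/></a>")); // prints [start:a, start:b, end:b, end:a]
    }
}
```

Passing text that is not well-formed, such as `<a><b></a>`, makes parse() throw a SAXParseException through the fatalError() callback.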
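Prototype of ContentHandler's startElement:
void startElement(String uri, String localName, String qName, Attributes atts)
        throws SAXException
where
uri is the namespace URI,
localName is the local name without the prefix,
qName is the full name including the prefix,
atts is the attribute list.
To get the prefix, parse qName: whatever comes before the ':' is the prefix.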
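Extracting the prefix from qName can be sketched as below. QNames and prefixOf are hypothetical helper names for this illustration, and returning an empty string for an unprefixed name is an assumption of this sketch.

```java
public class QNames {

    /** Returns the part of qName before the ':' or "" if there is no prefix. */
    public static String prefixOf(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? "" : qName.substring(0, colon);
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("qn:class"));          // prints qn
        System.out.println(prefixOf("project").isEmpty()); // prints true
    }
}
```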
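The two features mentioned above affect the values these three arguments receive in the callback.
Feature | Effect
http://xml.org/sax/features/namespaces | When true, uri and localName are meaningful and carry values. When false, uri and localName are not guaranteed to carry values and may be empty. (The JDK's default implementation returns empty strings.)
http://xml.org/sax/features/namespace-prefixes | When true, atts includes the namespace declaration attributes and qName carries a value. When false, atts does not include the namespace declaration attributes and qName is not guaranteed to carry a value.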
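The effect of namespace-prefixes on atts can be checked with a small sketch. FeatureDemo and attributeNames are hypothetical names for this illustration, and the XMLReader is obtained through SAXParserFactory as one possible way to create it. It keeps namespaces set to true and toggles only namespace-prefixes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class FeatureDemo {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    /** Returns the qNames of all attributes reported to startElement(). */
    public static List<String> attributeNames(String xml, boolean namespacePrefixes) throws Exception {
        final List<String> names = new ArrayList<>();
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        parser.setFeature(NAMESPACES, true);
        parser.setFeature(NAMESPACE_PREFIXES, namespacePrefixes);
        parser.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    names.add(atts.getQName(i));
                }
            }
        });
        parser.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<qn:class xmlns:qn=\"org.xml.sax.helpers\" name=\"DefaultHandler\"/>";
        // With namespace-prefixes on, the xmlns:qn declaration is reported as an attribute.
        System.out.println(attributeNames(xml, true));
        // With it off, only the ordinary attribute "name" is reported.
        System.out.println(attributeNames(xml, false));
    }
}
```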
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

/**
 * @author zcatt
 */
public class DemoSax {
    public static void main(String[] args) throws SAXException, IOException {
        // Alternative way to obtain the XMLReader
        // (needs javax.xml.parsers.* and "throws ParserConfigurationException"):
        // SAXParserFactory factory = SAXParserFactory.newInstance();
        // SAXParser parserWrapper = factory.newSAXParser();
        // XMLReader parser = parserWrapper.getXMLReader();
        XMLReader parser = XMLReaderFactory.createXMLReader();

        MyContentHandler contentHandler = new MyContentHandler();
        parser.setContentHandler(contentHandler);

        // The two features discussed above; change the values to compare the results.
        parser.setFeature("http://xml.org/sax/features/namespaces", true);
        parser.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

        InputStream inputStream = new FileInputStream("zcatt.xml");
        InputSource inputSource = new InputSource(inputStream);
        parser.parse(inputSource);

        System.out.println("DemoSax game over!");
    }
}
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class MyContentHandler extends DefaultHandler {

    ////////////////////////////////////////////////////////////////////
    // Implementation of the ContentHandler interface.
    ////////////////////////////////////////////////////////////////////

    @Override
    public void setDocumentLocator(Locator locator) {
        // no op
    }

    @Override
    public void startDocument() throws SAXException {
        System.out.println("startDocument event");
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("endDocument event");
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        System.out.println("startPrefixMapping event. prefix=" + prefix + " uri=" + uri);
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        System.out.println("endPrefixMapping event");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        System.out.println("startElment event. uri=" + uri + " localName=" + localName + " qName=" + qName);
        System.out.println("attributes: ");
        // Print every attribute reported for this element.
        for (int i = 0; i < attributes.getLength(); i++) {
            System.out.print("qname=" + attributes.getQName(i) + " value=" + attributes.getValue(i) + "; ");
        }
        System.out.println("");
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        System.out.println("endElement event.");
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        System.out.println("characters event.");
        String s = new String(ch, start, length);
        System.out.println("--<<<<<<<<<<<<<");
        System.out.println(s);
        System.out.println("-->>>>>>>>>>>>>");
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        System.out.println("ignorableWhitespace event.");
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        // no op
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        // no op
    }

    ////////////////////////////////////////////////////////////////////
    // Implementation of the ErrorHandler interface. (not used here)
    ////////////////////////////////////////////////////////////////////

    @Override
    public void warning(SAXParseException e) throws SAXException {
        System.out.println("warning event.");
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        System.out.println("error event.");
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        System.out.println("fatalError event.");
        throw e;
    }
}
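The XML sample being processed (zcatt.xml), with element and attribute names as they are reported in the output below:
<project name="DemoSax" default="main">
    <interface name="org.xml.sax.attributes">This interface allows access to a list of
        attributes in three different ways: 1. attribute index;
        2. Namespace-qualified name; or 3. qualified (prefixed) name.
    </interface>
    <class name="org.xml.sax.InputSource">This
        class allows a SAX application to encapsulate information about an input
        source in a single object, which may include a public identifier, a system
        identifier, a byte stream (possibly with a specified encoding), and/or a
        character stream.</class>
    <qn:class xmlns:qn="org.xml.sax.helpers" xmlns:parent="org.xml.sax" name="DefaultHandler">This class is available as a convenience
        base class for SAX2 applications.<qn:subclass name="DefaultHandler2"/>
        <qn:interface name="ContentHandler"/>
    </qn:class>
</project>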
Below are the results with namespaces and namespace-prefixes set to (true, true), (true, false) and (false, false) respectively.
namespaces=true, namespace-prefixes=true
startDocument event
startElment event. uri= localName=project qName=project attributes: qname=name value=DemoSax; qname=default value=main;
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName=interface qName=interface attributes: qname=name value=org.xml.sax.attributes;
characters event. --<<<<<<<<<<<<< This interface allows access to a list of attributes in three different ways: 1. attribute index; -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< 2. Namespace-qualified name; or 3. qualified (prefixed) name. -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName=class qName=class attributes: qname=name value=org.xml.sax.InputSource;
characters event. --<<<<<<<<<<<<< This class allows a SAX application to encapsulate information about an input source in a single object, which may include a public identifier, a system identifier, a byte stream (possibly with a specified encoding), and/or a character stream. -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startPrefixMapping event. prefix=qn uri=org.xml.sax.helpers
startPrefixMapping event. prefix=parent uri=org.xml.sax
startElment event. uri=org.xml.sax.helpers localName=class qName=qn:class attributes: qname=xmlns:qn value=org.xml.sax.helpers; qname=xmlns:parent value=org.xml.sax; qname=name value=DefaultHandler;
characters event. --<<<<<<<<<<<<< This class is available as a convenience base class for SAX2 applications. -->>>>>>>>>>>>>
startElment event. uri=org.xml.sax.helpers localName=subclass qName=qn:subclass attributes: qname=name value=DefaultHandler2;
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri=org.xml.sax.helpers localName=interface qName=qn:interface attributes: qname=name value=ContentHandler;
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
endPrefixMapping event
endPrefixMapping event
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
endDocument event
DemoSax game over!
namespaces=true, namespace-prefixes=false
startDocument event
startElment event. uri= localName=project qName=project attributes: qname=name value=DemoSax; qname=default value=main;
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName=interface qName=interface attributes: qname=name value=org.xml.sax.attributes;
characters event. --<<<<<<<<<<<<< This interface allows access to a list of attributes in three different ways: 1. attribute index; -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< 2. Namespace-qualified name; or 3. qualified (prefixed) name. -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName=class qName=class attributes: qname=name value=org.xml.sax.InputSource;
characters event. --<<<<<<<<<<<<< This class allows a SAX application to encapsulate information about an input source in a single object, which may include a public identifier, a system identifier, a byte stream (possibly with a specified encoding), and/or a character stream. -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startPrefixMapping event. prefix=qn uri=org.xml.sax.helpers
startPrefixMapping event. prefix=parent uri=org.xml.sax
startElment event. uri=org.xml.sax.helpers localName=class qName=qn:class attributes: qname=name value=DefaultHandler;
characters event. --<<<<<<<<<<<<< This class is available as a convenience base class for SAX2 applications. -->>>>>>>>>>>>>
startElment event. uri=org.xml.sax.helpers localName=subclass qName=qn:subclass attributes: qname=name value=DefaultHandler2;
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri=org.xml.sax.helpers localName=interface qName=qn:interface attributes: qname=name value=ContentHandler;
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
endPrefixMapping event
endPrefixMapping event
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
endDocument event
DemoSax game over!
namespaces=false, namespace-prefixes=false
startDocument event
startElment event. uri= localName= qName=project attributes: qname=name value=DemoSax; qname=default value=main;
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName= qName=interface attributes: qname=name value=org.xml.sax.attributes;
characters event. --<<<<<<<<<<<<< This interface allows access to a list of attributes in three different ways: 1. attribute index; -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< 2. Namespace-qualified name; or 3. qualified (prefixed) name. -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName= qName=class attributes: qname=name value=org.xml.sax.InputSource;
characters event. --<<<<<<<<<<<<< This class allows a SAX application to encapsulate information about an input source in a single object, which may include a public identifier, a system identifier, a byte stream (possibly with a specified encoding), and/or a character stream. -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName= qName=qn:class attributes: qname=xmlns:qn value=org.xml.sax.helpers; qname=xmlns:parent value=org.xml.sax; qname=name value=DefaultHandler;
characters event. --<<<<<<<<<<<<< This class is available as a convenience base class for SAX2 applications. -->>>>>>>>>>>>>
startElment event. uri= localName= qName=qn:subclass attributes: qname=name value=DefaultHandler2;
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
startElment event. uri= localName= qName=qn:interface attributes: qname=name value=ContentHandler;
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
characters event. --<<<<<<<<<<<<< -->>>>>>>>>>>>>
endElement event.
endDocument event
DemoSax game over!