I'm trying to find a way to validate a large XML file against an XSD. I saw the question ...best way to validate an XML... but the answers all pointed to using the Xerces library for validation. The only problem is, when I use that library to validate a 180 MB file then I get an OutOfMemoryException.
Are there any other tools,libraries, strategies for validating a larger than normal XML file?
EDIT: The SAX solution worked for java validation, but the other two suggestions for the libxml tool were very helpful as well for validation outside of java.
-
Instead of using a DOMParser, use a SAXParser. This reads from an input stream or reader so you can keep the XML on disk instead of loading it all into memory.
SAXParserFactory factory = SAXParserFactory.newInstance(); factory.setValidating(true); factory.setNamespaceAware(true); SAXParser parser = factory.newSAXParser(); XMLReader reader = parser.getXMLReader(); reader.setErrorHandler(new SimpleErrorHandler()); reader.parse(new InputSource(new FileReader ("document.xml")));
From jodonnell -
Use libxml, which performs validation and has a streaming mode.
From John Millikin -
Personally I like to use XMLStarlet which has a command line interface, and works on streams. It is a set of tools built on Libxml2.
From dlamblin -
SAX and libXML will help, as already mentioned. You could also try increasing the maximum heap size for the JVM using the -Xmx option. E.g. to set the maximum heap size to 512MB:
java -Xmx512m com.foo.MyClass
From GaZ -
XML ValidatorBuddy from http://www.xml-tools has an own command to validate huge XML files (multiple GB). It uses the Xerces-C SAX parser for this purpose.
The tool also allows to specify a certain XSD for validation so you don't need to edit the large XML file (to add the schema reference).
From xml-tools.com
No comments:
Post a Comment