October 26, 2014, 9:06 AM
XMLLINT command in linux : a validating XML parser

xmllint - command line XML tool

       xmllint [--format --dtdvalid ...] {XML-FILE(S)... | -}

For help:       xmllint --help



  • The xmllint program parses one or more XML files, specified on the command line as XML-FILE (or the standard input if the filename provided is - ).
  • It prints various types of output, depending upon the options selected. It is useful for detecting errors both in XML code and in the XML parser itself.


We will work with the same file, testxml.xml, that we tested in the xmlwf article. We have deliberately introduced an error into it, and we will check the file with the xmllint command.

/home/shanky/test:> cat testxml.xml
<?xml version="1.0" standalone="yes"?>

/home/shanky/test:> xmllint testxml.xml
testxml.xml:3: parser error : error parsing attribute name
testxml.xml:3: parser error : attributes construct error
testxml.xml:3: parser error : Couldn't find end of Start Tag student line 2
testxml.xml:3: parser error : Extra content at the end of the document

The output says that the parser could not find the end of the start tag for student, which begins at line 2; the problem was detected at line 3. Now we will correct the error and retry.
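To see this kind of failure on a file of your own, a minimal sketch follows. The file name bad.xml and its contents are made up for illustration (they are not the testxml.xml used above), and --noout is used so that only the error messages are printed.

```shell
# Hypothetical sample file: the quote that opens the id value is missing,
# so the <student> start tag on line 2 is malformed.
tmpdir=$(mktemp -d)
cat > "$tmpdir/bad.xml" <<'EOF'
<?xml version="1.0" standalone="yes"?>
<student id=1">
  <name>Asha</name>
</student>
EOF

# --noout suppresses the result tree; parser errors go to stderr.
if xmllint --noout "$tmpdir/bad.xml" 2>"$tmpdir/errors.txt"; then
    bad_status=0
else
    bad_status=$?
fi

echo "exit status: $bad_status"
cat "$tmpdir/errors.txt"
```

A non-zero exit status means the file did not parse, which makes the command easy to use inside scripts.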

/home/shanky/test:> cat testxml.xml
<?xml version="1.0" standalone="yes"?>

Here you may notice that we have rectified the parser error. But we have also put all the content on one line, like a story, i.e. the content is not well formatted. Now we try the command again, first without and then with the "--format" option.

/home/shanky/test:> xmllint testxml.xml
<?xml version="1.0" standalone="yes"?>
/home/shanky/test:> xmllint --format testxml.xml
<?xml version="1.0" standalone="yes"?>



Can you see the difference between the two outputs?

The first is just the content of the XML file as it is, and the second is well formatted (indented). This is what the --format option does.

If we have a very big XML file which is not formatted, it is difficult to read, so we can use this command for better readability:

xmllint --format xml-file
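As a rough illustration, the sketch below writes a one-line XML file and reformats it. The file names and the class/student/name/marks elements are made up for this example.

```shell
tmpdir=$(mktemp -d)

# Hypothetical input: the whole document sits on a single line.
printf '%s\n' '<?xml version="1.0"?><class><student><name>Asha</name><marks>81</marks></student></class>' > "$tmpdir/oneline.xml"

# --format re-indents the parsed tree; the result is written to stdout,
# which is redirected to a new file here.
xmllint --format "$tmpdir/oneline.xml" > "$tmpdir/pretty.xml"

# Compare the line counts and look at the formatted result.
wc -l < "$tmpdir/oneline.xml"
wc -l < "$tmpdir/pretty.xml"
cat "$tmpdir/pretty.xml"
```

The original file is left untouched; only the copy in pretty.xml is indented.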

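--dtdvalid and --valid options:

We can supply the DTD file to validate against with the --dtdvalid option. The --valid option validates the document against the DTD declared in the document itself, so it can be left out when the document has no DOCTYPE declaration. See the command below: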

xmllint --dtdvalid dtdfile --valid xmlfile
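A small sketch of DTD validation, assuming a made-up DTD (class.dtd) and two made-up documents, one that follows the DTD and one that does not. Only --dtdvalid is used here because the sample documents carry no DOCTYPE of their own.

```shell
tmpdir=$(mktemp -d)

# Hypothetical DTD: a class holds students, each with a name and marks.
cat > "$tmpdir/class.dtd" <<'EOF'
<!ELEMENT class (student+)>
<!ELEMENT student (name, marks)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT marks (#PCDATA)>
EOF

printf '%s\n' '<class><student><name>Asha</name><marks>81</marks></student></class>' > "$tmpdir/valid.xml"
# The marks element is missing, so this document breaks the DTD.
printf '%s\n' '<class><student><name>Asha</name></student></class>' > "$tmpdir/invalid.xml"

if xmllint --noout --dtdvalid "$tmpdir/class.dtd" "$tmpdir/valid.xml"; then
    valid_status=0
else
    valid_status=$?
fi

if xmllint --noout --dtdvalid "$tmpdir/class.dtd" "$tmpdir/invalid.xml"; then
    invalid_status=0
else
    invalid_status=$?
fi

echo "valid.xml:   exit status $valid_status"
echo "invalid.xml: exit status $invalid_status"
```

The valid document should exit with status 0, while the invalid one prints validity errors and exits with a non-zero status.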

--path and --valid options:

We can also give this command the directories in which to look for the DTD files (or entities) used for validation. The list of paths, separated by spaces or colons, should be enclosed in single or double quotes when it contains spaces. See the command below:

xmllint --path 'path1 path2' --valid xmlfile
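The following sketch shows one way this can be used, assuming a made-up layout in which the document lives in docs/ and the DTD it names lives in dtds/. All file and directory names are hypothetical.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/dtds" "$tmpdir/docs"

cat > "$tmpdir/dtds/class.dtd" <<'EOF'
<!ELEMENT class (student+)>
<!ELEMENT student (name)>
<!ELEMENT name (#PCDATA)>
EOF

# The DOCTYPE names class.dtd, but no such file sits next to the document.
cat > "$tmpdir/docs/doc.xml" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE class SYSTEM "class.dtd">
<class><student><name>Asha</name></student></class>
EOF

# --path adds the dtds directory to the places searched for class.dtd;
# --valid then validates against the DTD named in the DOCTYPE.
if xmllint --noout --path "$tmpdir/dtds" --valid "$tmpdir/docs/doc.xml"; then
    path_status=0
else
    path_status=$?
fi

echo "with --path: exit status $path_status"
```

Running the same command without --path should fail to load class.dtd, since the file is not in the document's own directory.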


All OPTIONS: xmllint accepts the following options 

       --auto Generate a small document for testing purposes.

       --catalogs
              Use the SGML catalog(s) from SGML_CATALOG_FILES. Otherwise XML catalogs starting from /etc/xml/catalog are used by default.

       --chkregister
              Turn on node registration. Useful for developers testing libxml(3) node tracking code.

       --compress
              Turn on gzip(1) compression of output.

       --copy Test the internal copy implementation.

       --c14n Use the W3C XML Canonicalisation (C14N) to serialize the result of parsing to stdout. It keeps comments in the result.

       --dtdvalidfpi FPI
              Use the DTD specified by a Formal Public Identifier FPI for validation. Note that this requires a catalog exporting that Formal Public Identifier to work.

       --debug
              Parse a file and output an annotated tree of the in-memory version of the document.

       --debugent
              Debug the entities defined in the document.

       --dropdtd
              Remove DTD from output.

       --dtdattr
              Fetch external DTD and populate the tree with inherited attributes.

       --encode ENCODING
              Output in the given encoding.

       --html Use the HTML parser.

       --htmlout
              Output results as an HTML file. This causes xmllint to output the necessary HTML tags surrounding the result tree output so the results can be displayed/viewed in a browser.

       --insert
              Test for valid insertions.

       --loaddtd
              Fetch an external DTD.

       --load-trace
              Display all the documents loaded during the processing to stderr.

       --maxmem NNBYTES
              Test the parser memory support. NNBYTES is the maximum number of bytes the library is allowed to allocate.
              This can also be used to make sure batch processing of XML files will not exhaust the virtual memory of the server running them.

       --memory
              Parse from memory.

       --noblanks
              Drop ignorable blank spaces.

       --nocatalogs
              Do not use any catalogs.

       --nocdata
              Substitute CDATA section by equivalent text nodes.

       --noent
              Substitute entity values for entity references. By default, xmllint leaves entity references in place.

       --nonet
              Do not use the Internet to fetch DTDs or entities.

       --noout
              Suppress output. By default, xmllint outputs the result tree.

       --nowarning
              Do not emit warnings from the parser and/or validator.

       --nowrap
              Do not output HTML doc wrapper.

       --noxincludenode
              Do XInclude processing but do not generate XInclude start and end nodes.

       --nsclean
              Remove redundant namespace declarations.

       --output FILE
              Define a file path where xmllint will save the result of parsing. Usually the program builds a tree and saves it on stdout; with this option the resulting XML instance is saved to a file.

       --pattern PATTERNVALUE
              Used to exercise the pattern recognition engine, which can be used with the reader interface to the parser.
              It allows selecting some nodes in the document based on an XPath (subset) expression. Used for debugging.

       --postvalid
              Validate after parsing has completed.

       --push Use the push mode of the parser.

       --recover
              Output any parsable portions of an invalid document.

       --relaxng SCHEMA
              Use RelaxNG file named SCHEMA for validation.

       --repeat
              Repeat 100 times, for timing or profiling.

       --shell
              Run a navigating shell. Details on the commands available in shell mode are in the xmllint(1) man page.

       --stream
              Use streaming API - useful when used in combination with --relaxng or --valid options for validation of files that are too large to be held in memory.

       --testIO
              Test user input/output support.

       --timing
              Output information about the time it takes xmllint to perform the various steps.

       --valid
              Determine if the document is a valid instance of the included Document Type Definition (DTD). A DTD to be validated against also can be specified at the command line using the --dtdvalid option. By default, xmllint also checks to determine if the document is well-formed.

       --version
              Display the version of libxml(3) used.

       --walker
              Test the walker module, which is a reader interface but for a document tree; instead of using the reader API on an unparsed document, it works on an existing in-memory tree. Used for debugging.

       --xinclude
              Do XInclude processing.

       --xmlout
              Used in conjunction with --html. Usually when HTML is parsed the document is saved with the HTML serializer.
              But with this option the resulting document is saved with the XML serializer. This is primarily used to generate XHTML from HTML input.
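Several of these options are commonly combined. As a small sketch (the file names and sample content are made up), the commands below save a formatted copy with --output and then use --noout to check that copy for well-formedness without printing it.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '<?xml version="1.0"?><class><student><name>Asha</name></student></class>' > "$tmpdir/in.xml"

# --output saves the formatted result to a file instead of stdout.
xmllint --format --output "$tmpdir/out.xml" "$tmpdir/in.xml"

# --noout prints nothing for a well-formed file; the exit status tells the result.
if xmllint --noout "$tmpdir/out.xml"; then
    check_status=0
else
    check_status=$?
fi

echo "out.xml check: exit status $check_status"
```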


Category: Open System-Linux | Added by: shanky
