This is the mail archive of the
docbook-apps@lists.oasis-open.org
mailing list.
Re: Validate XML documents within Emacs
- From: Michael Smith <smith at xml-doc dot org>
- To: Sidney Lu <lu at sinica dot edu dot tw>
- Cc: docbook-apps <docbook-apps at lists dot oasis-open dot org>,Dave Pawson <dpawson at nildram dot co dot uk>
- Date: Fri, 29 Nov 2002 14:55:59 +0900
- Subject: Re: DOCBOOK-APPS: Validate XML documents within Emacs
- References: <3DE5C326.8050807@sinica.edu.tw>
Hi Sidney,
[cc'ing DaveP to give him a heads-up that this might be worth adding to
the FAQ]
You wrote:
> How can I set the XML_CATALOG_FILES to XML catalog file within Emacs
> such that I can evaliate(C-c C-v) XML documents using OASIS XML
> Catalogs(XML syntax catalog files)? There's no problems using
> SGML_CATALOG_FILES with TR9401:1997 syntax.
For one thing, you'll need to make sure that C-c C-v invokes a tool
like xmllint that groks XML catalogs.
By default, xmllint looks for an XML catalog in /etc/xml/catalog. So,
assuming that:
* you've got xmllint installed and it's in your path somewhere
* from within Emacs/psgml, you don't want/need to invoke
nsgmls/onsgmls or any other SGML tools (like jade/openjade) that
might rely on the Emacs/psgml "sgml-declaration" variable
then, all you should need to do is change the values of a couple of
psgml variables:
1. Set the value of the "sgml-declaration" variable to nil.
The reason you need to do this is that psgml uses this value as a
parameter in the "sgml-validate-command" variable, which it then
feeds to the external command it invokes to do the validation;
nsgmls needs this value, but xmllint has no use for it at all.
2. If you have the XML catalog you want to use set up (or pointed to)
in /etc/xml/catalog, just set "sgml-validate-command" to:
"xmllint --noout --postvalid %s %s"
- or -
2. If you have the XML catalog you want to use set up somewhere other
than in /etc/xml/catalog, then set "sgml-validate-command" to:
"XML_CATALOG_FILES=/foo/bar xmllint --noout --postvalid %s %s"
where /foo/bar is the path to the catalog file you want to use.
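Another possibility (a sketch, not one of the two options above): set
the environment variable from inside Emacs with the built-in setenv
function. Commands that Emacs launches inherit its environment, so
xmllint should then see the catalog without the variable appearing in
the command string. /foo/bar is the same placeholder path as above.
Note that this changes the environment for everything you start from
that Emacs session, not just validation.

```elisp
;; Export the catalog location to every subprocess Emacs starts.
;; /foo/bar is a placeholder; use the path to your own catalog file.
(setenv "XML_CATALOG_FILES" "/foo/bar")

;; The validation command then needs no environment prefix.
(setq sgml-validate-command "xmllint --noout --postvalid %s %s")
```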
Note that you have a couple of different choices for where to set the
values of those variables:
* in your .emacs file:
;; validating with xmllint, so the SGML declaration is not needed
(setq sgml-declaration nil)
;; invoke xmllint for external validation.
;; if you have /etc/xml/catalog properly set up,
;; omit the XML_CATALOG_FILES=/foo/bar part
(setq sgml-validate-command
      "XML_CATALOG_FILES=/foo/bar xmllint --noout --postvalid %s %s")
The advantage of putting those in your .emacs is that xmllint will
automatically be used for external validation of anything you edit
with psgml; disadvantage is, well, that xmllint will automatically
be used to validate any files you edit with psgml -- including SGML
files, which xmllint can't process. No big deal if you only work
with XML and don't need to work with any SGML files in psgml.
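If you do work with both XML and SGML files, a possible middle ground
is to make the settings buffer-local from a mode hook, so that only XML
buffers get xmllint. The following is a minimal sketch under a few
assumptions: that psgml runs sgml-mode-hook for XML buffers as well as
SGML ones, that your XML files end in ".xml" (an illustrative test, not
something psgml requires), and that /foo/bar is your catalog path.

```elisp
;; Use xmllint only in buffers visiting *.xml files; SGML buffers
;; keep the default nsgmls-style validation command.
(add-hook 'sgml-mode-hook
          (lambda ()
            (when (and buffer-file-name
                       ;; assumption: XML files are named *.xml
                       (string-match "\\.xml\\'" buffer-file-name))
              ;; make-local-variable confines each change to this buffer
              (set (make-local-variable 'sgml-declaration) nil)
              (set (make-local-variable 'sgml-validate-command)
                   "XML_CATALOG_FILES=/foo/bar xmllint --noout --postvalid %s %s"))))
```

Any file-local variables in a given document are processed after the
mode hooks run, so they would still override these values.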
- or -
* put the following lines at the end of each XML file you want to
be able to externally validate from within Emacs/psgml:
<!--
Local variables:
sgml-declaration: nil
sgml-validate-command:
"XML_CATALOG_FILES=/foo/bar xmllint \-\-noout \-\-postvalid %s %s"
End:
-->
(You need the backslashes because the string "--" isn't allowed inside
a comment in a well-formed XML file; if you try it, any conforming XML
parser will bark at you about it.)
A few caveats:
* you might see some unexpected side effects from setting
sgml-declaration to nil, but I doubt it
The psgml docs make a point of saying that psgml itself "does not
understand the SGML declaration". I think the only thing psgml
itself uses sgml-declaration for is as a parameter to pass to the
command that sgml-validate runs.
So the only side effect I can see is if you're using some other
customization to Emacs/psgml that needs the sgml-declaration to be
set (maybe something that invokes jade or openjade, for example).
* the validation error messages that xmllint emits are a little
different from the ones that nsgmls emits. It might take you a while
to get accustomed to the differences.
Example:
If you try to validate the following (invalid) DocBook document:
<!DOCTYPE article
PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
>
<article>
</article>
then, xmllint will give you the following error message:
Element article content does not follow the DTD
Expecting [content model for article here -- omitted for brevity]
Document foo.xml does not validate
while nsgmls will just give you this:
nsgmls:foo.xml:6:9:E: end tag for "article" which is not finished
A couple of big differences are:
- for an element that's missing required content, xmllint emits
the element's entire content model from the DTD (which can be
really helpful, but for DocBook documents, can also be a biiig
chunk of info, and make the error messages look really verbose)
- nsgmls emits the line number of the place in your doc where it
figures the error is; as far as I know, there's no option for
getting xmllint to do something similar
* the change to sgml-validate-command only affects the command that
Emacs/psgml invokes to do external validation -- it doesn't have any
effect at all on the interactive error-checking "validation" that
psgml does internally.
So, just because you're now able to invoke an external command that
understands XML catalogs, it doesn't mean that psgml itself can do
anything with them (though hopefully XML catalog support will be
added to psgml in the future).
* the --postvalid switch (instead of --valid) is usually the best one
to use with xmllint for doing validation; there might be some cases
where you'd want to use --valid instead, but I can't think of many.
HTH,
--Mike