This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list .



Re: Validate XML documents within Emacs


Hi Sidney,

[cc'ing DaveP to give him a heads-up that this might be worth adding to
the FAQ]

You wrote:

> How can I set XML_CATALOG_FILES to an XML catalog file within Emacs
> such that I can validate (C-c C-v) XML documents using OASIS XML
> Catalogs (XML syntax catalog files)? There are no problems using
> SGML_CATALOG_FILES with TR9401:1997 syntax.

For one thing, you'll need to make sure that C-c C-v invokes a tool
like xmllint that groks XML catalogs.

By default, xmllint looks for an XML catalog in /etc/xml/catalog. So,
assuming that:

  * you've got xmllint installed and it's in your path somewhere

  * from within Emacs/psgml, you don't want/need to invoke
    nsgmls/onsgmls or any other SGML tools (like jade/openjade) that
    might rely on the Emacs/psgml "sgml-declaration" variable

then, all you should need to do is change the values of a couple of
psgml variables:

  1. Set the value of the "sgml-declaration" variable to nil.

     The reason you need to do this is that psgml uses this value as a
     parameter in the "sgml-validate-command" variable, which it then
     feeds to the external command it invokes to do the validation;
     nsgmls needs this value, but xmllint has no use for it at all.

  2. If you have the XML catalog you want to use set up (or pointed to)
     in /etc/xml/catalog, just set "sgml-validate-command" to:

       "xmllint --noout --postvalid %s %s"

     - or -

  2. If you have the XML catalog you want to use set up somewhere other
     than in /etc/xml/catalog, then set "sgml-validate-command" to:

       "XML_CATALOG_FILES=/foo/bar xmllint --noout --postvalid %s %s"

     where /foo/bar is the path to the catalog file you want to use.
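
As a rough illustration of step 1, here's a shell sketch of how the two
%s slots in the command string get filled in. It assumes psgml puts the
sgml-declaration value in the first %s and the file name in the second,
and that a nil declaration ends up as an empty string (both are my
assumptions). The expand_validate_command function and the xml.dcl file
name are made up for the example, and nothing here actually runs
xmllint.

```shell
# Hypothetical helper, not part of psgml: fill the two %s slots of a
# validate-command template with a declaration and a file name.
expand_validate_command() {
  # $1 = template, $2 = declaration ("" when sgml-declaration is nil),
  # $3 = file to validate
  printf "$1\n" "$2" "$3"
}

# With a declaration set, xmllint would be handed xml.dcl as if it were
# another document to parse:
expand_validate_command "xmllint --noout --postvalid %s %s" "xml.dcl" "foo.xml"

# With sgml-declaration set to nil, the first slot is simply empty:
expand_validate_command "xmllint --noout --postvalid %s %s" "" "foo.xml"
```

The second call leaves only foo.xml as an argument, which is what you
want xmllint to see.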

Note that you have a couple of different choices for where to set the
values of those variables:

  * in your .emacs file:

      ;; validating with xmllint, so no SGML declaration is needed
      (setq sgml-declaration nil)

      ;; invoke xmllint for external validation.
      ;; if you have /etc/xml/catalog properly set up,
      ;; omit the XML_CATALOG_FILES=/foo/bar part
      (setq sgml-validate-command
        "XML_CATALOG_FILES=/foo/bar xmllint --noout --postvalid %s %s")

    The advantage of putting those in your .emacs is that xmllint will
    automatically be used for external validation of anything you edit
    with psgml; disadvantage is, well, that xmllint will automatically
    be used to validate any files you edit with psgml -- including SGML
    files, which xmllint can't process. No big deal if you only work
    with XML and don't need to work with any SGML files in psgml.

    - or -

  * put the following lines at the end of each XML file you want to
    be able to externally validate from within Emacs/psgml:

      <!--
      Local variables:
      sgml-declaration: nil
      sgml-validate-command:
        "XML_CATALOG_FILES=/foo/bar xmllint \-\-noout \-\-postvalid %s %s"
      End:
      -->

    (You need the backslashes because the string "--" isn't allowed
    inside a comment in a well-formed XML file; if you try it, any
    conforming XML parser will bark at you about it.)
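
If you want to sanity-check a local-variables block before pasting it
into a comment, something like the following shell sketch might do. The
xml_comment_body_ok function is a made-up helper, not part of any tool.
It looks for "--" anywhere in the text, and also for a trailing "-",
which the XML 1.0 comment grammar rules out as well.

```shell
# Hypothetical helper: succeed only if the text is safe to place between
# "<!--" and "-->" (no "--" anywhere, and no trailing "-").
xml_comment_body_ok() {
  case "$1" in
    *--*|*-) return 1 ;;
    *)       return 0 ;;
  esac
}

# Escaped form, as used in the local-variables block:
xml_comment_body_ok 'xmllint \-\-noout \-\-postvalid %s %s' && echo "ok"

# Unescaped form, which would break the comment:
xml_comment_body_ok 'xmllint --noout --postvalid %s %s' || echo "contains --"
```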

A few caveats:

  * you might see some unexpected side effects from setting
    sgml-declaration to nil, but I doubt it
    
    The psgml docs make a point of saying that psgml itself "does not
    understand the SGML declaration". I think the only thing psgml
    itself uses sgml-declaration for is as a parameter to pass to the
    command that sgml-validate runs.
    
    So the only side effect I can see is if you're using some other
    customization to Emacs/psgml that needs the sgml-declaration to be
    set (maybe something that invokes jade or openjade, for example).

  * the validation error messages that xmllint emits are a little
    different from the ones that nsgmls emits. It might take you a while
    to get accustomed to the differences.

    Example:

      If you try to validate the following (invalid) DocBook document:

        <!DOCTYPE article
          PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
            "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
        >
        <article>
        </article>
      
      then, xmllint will give you the following error message:

        Element article content does not follow the DTD
        Expecting [content model for article here -- omitted for brevity]
        Document foo.xml does not validate

      while nsgmls will just give you this:

        nsgmls:foo.xml:6:9:E: end tag for "article" which is not finished

    A couple of big differences are:
    
      - for an element that's missing required content, xmllint emits
        the element's entire content model from the DTD (which can be
        really helpful, but for DocBook documents, can also be a biiig
        chunk of info, and make the error messages look really verbose)

      - nsgmls emits the line number of the place in your doc where it
        figures the error is; as far as I know, there's no option for
        getting xmllint to do something similar

  * the change to sgml-validate-command only affects the command that
    Emacs/psgml invokes to do external validation -- it doesn't have any
    effect at all on the interactive error-checking "validation" that
    psgml does internally.
    
    So, just because you're now able to invoke an external command that
    understands XML catalogs, it doesn't mean that psgml itself can do
    anything with them (though hopefully XML catalog support will be
    added to psgml in the future).

  * the --postvalid switch (instead of --valid) is usually the best
    one to use with xmllint for doing validation; there might be some
    cases where you'd want to use --valid instead, but I can't think of
    many
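
On the line-number point in the caveats above: here's a small shell
sketch that pulls the location out of the nsgmls message shown in the
example. It assumes the colon-separated layout program:file:line:column:flag:
message, which I'm inferring from that one sample line (the flag is
presumably a severity, E for error). parse_nsgmls_error is a made-up
name, and file names containing colons would confuse it.

```shell
# Hypothetical helper: split one nsgmls-style message into its fields.
# Runs in a subshell so the variables it sets don't leak out.
parse_nsgmls_error() (
  # Split on ":" only; the last variable takes the rest of the line.
  IFS=: read -r prog file line col sev msg <<EOF
$1
EOF
  msg=${msg# }   # drop the single space after the last colon
  printf '%s line %s column %s [%s] %s\n' "$file" "$line" "$col" "$sev" "$msg"
)

parse_nsgmls_error 'nsgmls:foo.xml:6:9:E: end tag for "article" which is not finished'
```

For the sample message, that reports line 6, column 9 of foo.xml, which
is the closing </article> tag in the example document.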

HTH,

  --Mike

