Neel Smith on github Openly available work in digital classics

Automatic XML validation simutaneously in two validation systems

Contributors to the Homer Multitext project use software preinstalled in a virtual machine running Ubuntu 14.04 to validate and verify their work. The VM includes a very nice XML editor, XML Copy Editor, but some members are already accustomed to editing XML using oXygen in their host machine. No problem sharing files between the VM and host machine, but with some complex Relax NG schemas, XML Copy Editor seems to hang.

If our project was about developing an XML editor, we would dig into this, but it’s not: we’re creating an archive of scholarly material about the Iliad, so an easy workaround is just as good. I’m sharing this quick kludge since others might find it helpful, too.

XML Copy Editor performs flawlessly on the same complex texts when given a DTD. Since we’re working with TEI compliant XML, we’re rescued by the ODD approach: we can use Relax NG schemas in oXygen, and DTDs in XML Copy Editor with some confidence that we’ll get identical syntax validation (and in either case, will then apply our-project specific content validation software suite to evaluate the contents of the documents’ text nodes).

Oxygen has a preference setting to “Ignore the DTD for validation if a schema is specified”.

oXygen preferences

In my installation, it was already checked. (Whether I made that choice manually ages ago, or it was checked by default, I’m not sure.) When it is checked, we can include both the XML processing instructions oXygen recognizes, and a PUBLIC DTD statement to validate with the DTD in XML Copy Editor. To keep the example simple, here is an opening set of declarations to work with the TEI’s tei_all customization:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" type="application/xml"
schematypens="http://purl.oclc.org/dsdl/schematron"?>
<!DOCTYPE TEI PUBLIC "-//TEI P5//DTD Main Document Type//EN" "http://www.tei-c.org/release/xml/tei/custom/schema/dtd/tei_all.dtd">

Now we can open the same XML file in oXygen on the host OS and XML Copy Editor in the VM, and validate without changing either the processing instructions of the DOCTYPE declaration. Syntax validation, context-aware popups, and all the other end-user support that makes these editors so nice to work in, function correctly in both editors.