Dealing with XML Documents
Evaluation and discussion of various approaches for using XML documents to provide SKEN content.
The primary goal of SKEN is to provide content that is accessible both
via web browsers and via digital data mining efforts such as NSDL and
OAI-PMH. To this end, we will use XML as the primary format for
delivery of SKEN content.
Plone and Zope do not currently deal directly with native XML documents; however, there are a number of tools and add-on products that allow Plone to handle XML properly. The approaches below are discussed in the order in which we discovered them, so the order does not indicate preference.
Approach 1A : insert XML as a text file and use embedded XSLT or CSS to control display
This was the first approach we investigated. It solves the initial display problem quickly and perhaps even elegantly, but it makes it virtually impossible to control edits, because Plone treats the content as one big text string that it delivers to the browser. It is the browser that does all of the work to format the content and make it "pretty".
Internet discussion suggests that the Zope text file type was created to provide a way to transition existing web sites with many pages of stable, complex HTML content into Plone/Zope. It was anticipated that these types of pages would continue to be generated and updated in tools outside Plone/Zope.
You can't get to the editor at all from the XML file's display page (try it with the example at development:examples:OrgChart.xml ). You have to go up one level to its container (in this case examples), select the contents tab, and choose the XML file from the list. If the XML file was displayed directly from a link, you can't edit it at all.
When you access the XML file from its container's content view, you get a "view" of the XML that is no longer pretty, much less readable by a normal human, and the edit tab shows up. The edit form puts up a message that says "This is a text file, you can edit its contents directly" and displays the default plain text editor. It won't let you use any other editor, such as Epoz, on the file. So you must know the proper use of the XML tags if you want to add stylized content, though you can make simple textual changes relatively easily.
When you have saved your changes and are done editing, you return to the text content view with XML tags and all, not the formatted browser view. To see the formatted view, you have to go up to the browser's location bar and delete the /file_view part of the path, or go back to the parent (i.e. examples) navigation option (left side nav section) and reselect the page.
The edit process probably makes this approach too unfriendly to be a long-term solution.
Another drawback of treating XML as a Zope file is that it is hard to repurpose the content, for example to get clutch sizes for all birds in the Catskill Mtns. This type of functionality requires broad access to data from many individual elements in many accounts/files. It would be far more efficient to have element-based objects to query, from both a retrieval and a display point of view.
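To make the repurposing problem concrete, here is a minimal sketch of what a query such as "clutch sizes for all birds in the Catskill Mtns" looks like when each account is stored as one big text file. The account contents, the file names, and the element names (account, species, region, clutch_size) are all invented for illustration; they are not the SKEN schema. The point is only that every file has to be parsed in full on every query, because the store knows nothing about the elements inside it.

```python
import xml.etree.ElementTree as ET

# Hypothetical species accounts, each held as a single text string,
# which is how a Zope File object would store them.
ACCOUNTS = {
    "robin.xml": """<account>
  <species>American Robin</species>
  <region>Catskill Mtns</region>
  <clutch_size>4</clutch_size>
</account>""",
    "thrush.xml": """<account>
  <species>Wood Thrush</species>
  <region>Catskill Mtns</region>
  <clutch_size>3</clutch_size>
</account>""",
    "plover.xml": """<account>
  <species>Piping Plover</species>
  <region>Long Island</region>
  <clutch_size>4</clutch_size>
</account>""",
}


def clutch_sizes(accounts, region):
    """Return {species: clutch size} for every account in the given region."""
    results = {}
    for text in accounts.values():
        # Each whole document must be parsed just to read two small elements.
        root = ET.fromstring(text)
        if root.findtext("region") != region:
            continue
        results[root.findtext("species")] = int(root.findtext("clutch_size"))
    return results


if __name__ == "__main__":
    print(clutch_sizes(ACCOUNTS, "Catskill Mtns"))
```

With element-based objects, the region and clutch size could instead be indexed and queried directly, without reparsing each document.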
Approach 1B : insert XML as a custom file object type and use external XSLT to control display
In this approach we would customize the existing Plone/Zope File object with new display and edit functionality. It fixes many of the problems of approach 1A.
Using this approach, when a browser asks for the document, we first pipe the XML document through an XSLT processor, producing real HTML. This HTML could be further styled using CSS. In addition, we should be able to use Plone/Zope templating to better integrate the document into the normal Plone view. This removes the XSL processing from the browser and puts it on the server, so browser support would be more universal than in approach 1A.
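The server-side flow described above can be sketched as follows. Python's standard library does not include an XSLT processor, so this sketch substitutes a hand-written function for the stylesheet; it shows only the shape of the idea (XML in, finished HTML out, before anything reaches the browser), not real XSLT. The org chart structure (orgchart, unit, person) and the function name render_html are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical document; the real OrgChart.xml structure is not known here.
ORG_CHART = """<orgchart>
  <unit name="Development">
    <person>A. Smith</person>
    <person>B. Jones</person>
  </unit>
</orgchart>"""


def render_html(xml_text):
    """Turn the XML into an HTML fragment on the server.

    Stands in for the XSLT step: the browser receives plain HTML and
    needs no XSL support of its own.
    """
    root = ET.fromstring(xml_text)
    parts = ['<div class="orgchart">']
    for unit in root.findall("unit"):
        parts.append("<h2>%s</h2>" % escape(unit.get("name", "")))
        parts.append("<ul>")
        for person in unit.findall("person"):
            parts.append("<li>%s</li>" % escape(person.text or ""))
        parts.append("</ul>")
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_html(ORG_CHART))
```

The resulting fragment is what a Plone/Zope template would wrap in the normal site layout, and the class attribute gives CSS something to style.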
Editing of this type of content could be done by downloading the document from the site to the client machine and editing it with a tool such as Authentic or InfoPath, just like Connexions. The issues in approach 1A with the "edit" and "view" tabs should be fixable during the customization of the Plone file object. This editing option is type 3 in the SKEN document editing option diagram.
This approach does not solve the problem of holding the content in large chunks, which makes it harder to find and reuse small pieces.
This approach may work fine for an interim solution.
Approach 2 : use XMLDocument add-on product
XMLDocument allows you to use XML documents as objects in the Zope environment. It was written by Zope Corp. in 1999 and 2000 and hasn't been updated since. It is still used in a number of small sites. It was superseded by ParsedXML due to a change in author and architecture (and partly due to a name conflict with Microsoft .NET's XMLDocument object).
It provides some valuable insights into how Zope works, but isn't a very robust candidate. Lack of formal support is a major issue.
Approach 3 : use ParsedXML add-on product
ParsedXML is a follow-on to the XMLDocument product.
There are two groups working on it :
- Members of Zope.org deliver a version at http://www.zope.org/Members/faassen/ParsedXML. This version is a couple of years old.
- Infrae, an Open Source company in the Netherlands, delivers more current releases of ParsedXML. The Infrae site also has a companion product, XMLWidgets, which provides display templates for ParsedXML files. These add-ons are an integral part of the company's Silva content management platform. Because of this, we suspect that support for these two products is very good.
Approach 4 : use Archetypes add-on product
Archetypes provides a simple, extensible framework that simplifies the creation of new content types in CMF (the Zope Content Management Framework) and Plone. It is currently available from the SourceForge Archetypes Project. Documentation is available from the Plone Collective doc site.
Creating a new Type using Archetypes involves creating a schema file (which uses Python syntax, not XML) that defines the fields and other objects within the new Type, including all properties and behaviors. Archetypes' strengths are :
1. Automatically generates forms and pages needed to add, edit, and view the data. The idea of using generated code is appealing for two reasons.
2. Properly registers content types with the CMF tools - a non-trivial task without Archetypes
3. Easy installation of the generated content objects as a CMF/Plone product
4. Configurability of CMF actions
5. Basic storage transparency - if you choose to use the ZODB
Further investigation is ongoing.
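As a rough illustration of why schema-driven form generation is attractive, the sketch below declares a content type's fields once, in Python syntax, and derives an edit form from that declaration. It deliberately uses plain Python and none of Archetypes' actual classes or API; the schema layout, the field names, and the edit_form function are hypothetical and exist only to show the idea.

```python
from html import escape

# Hypothetical schema: (field name, label, widget kind).
# This is NOT Archetypes' real Schema/Field API, only the same idea.
SPECIES_SCHEMA = (
    ("common_name", "Common name", "string"),
    ("clutch_size", "Clutch size", "integer"),
    ("description", "Description", "text"),
)


def edit_form(schema, values):
    """Generate an HTML edit form from the schema and current values."""
    rows = []
    for name, label, kind in schema:
        value = escape(str(values.get(name, "")), quote=True)
        if kind == "text":
            control = '<textarea name="%s">%s</textarea>' % (name, value)
        else:
            input_type = "number" if kind == "integer" else "text"
            control = '<input type="%s" name="%s" value="%s">' % (
                input_type, name, value)
        rows.append("<label>%s %s</label>" % (escape(label), control))
    return "<form>\n%s\n</form>" % "\n".join(rows)


if __name__ == "__main__":
    print(edit_form(SPECIES_SCHEMA,
                    {"common_name": "Wood Thrush", "clutch_size": 3}))
```

Adding a field to the schema tuple is enough to make it appear in the form, which is the property that makes generated add, edit, and view pages appealing.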
Approach 4 : use ArchGenXML generator for Archetypes
ArchGenXML is a command-line utility that generates Plone Products based on the Archetypes framework from UML models using XMI and XSD (XML Schema) files. It is also available from the SourceForge Archetypes Project. Documentation is available from the Plone Collective doc site.
XMI (XML Metadata Interchange) is a standard format for metadata interchange that combines UML and XML to provide the metadata necessary to exchange object definitions among applications. ArchGenXML takes advantage of this. If an XML Schema is used, it must be based on a MOF (Meta-Object Facility) model. XMI files are commonly generated by UML modelling tools according to the XMI spec. Both MOF and XMI are published by OMG (the Object Management Group).
Adoption of an Archetypes solution is a prerequisite to use of ArchGenXML.
Our limited testing has shown that there are some undocumented requirements on the format of the XML Schema file. Further investigation is ongoing.
Approach 5 : use Connexions repository ?
We will know more after the conference call on Dec 17.
Approach 6 : use Zope 3 schema ?
Zope 3 will supposedly have native support for XML, but it's sometimes hard to find the necessary info for an evaluation.