DITA documentation distro specification Part II: A quicksilver-like RDFa editor

Introduction
If we want to make editing the DITA documentation as easy as possible, the editing tools need to be simple and as foolproof as possible.
We previously implemented a WYSIWYG editor for RDFa in Drupal, which basically made it possible to add custom tags around selected text. The classic WYSIWYG editor however had a couple of downsides when used for this purpose:
- The set of markup buttons is limited and becomes hard to access once there are more than 10 properties
- It is not intuitive what goes where in the RDFa subject-predicate-object model
- It is difficult to recognize the parent entity to which a property is being added
- Validation issues
We can make the WYSIWYG editor smarter so that it only lets you add valid markup, and we can add a display that gives meta information about where you are in the XML tree. But if there is sufficient budget we could also implement a Quicksilver-like UI for managing triples.

Many people swear by Quicksilver (or tools similar to it like Gnome Do) as the ideal shortcut tool: it allows users to select a tool, an action for the tool and where available a parameter for that action. This 3-panel interface gave me the idea to use the Quicksilver UI for triples. In this post I will further explore this idea and explain how we could extend existing Drupal modules to implement such a system.
Existing modules
Descriptions here are adapted from the respective module descriptions on the project pages at Drupal.org.
Cobalt
Cobalt is a keyboard shortcut driven access system for Drupal 6 made in the image of Quicksilver for Mac (if you use Gnome you might be familiar with Gnome Do, another Quicksilver clone). It requires Google Gears because Cobalt needs to store data locally for fast access.
Cobalt is invoked through pressing [Alt+space] or [Ctrl+space]. Then you only have to start typing to quickly get what you want. Use the arrow up/down to select one of the suggestions and [Alt+left/right] to switch between the pages. Cobalt will learn from your choices so the amount of selecting and paging will decrease over time. If you are satisfied with the default action (Go to for menu items) you can just press enter to execute it. If you want to do something else (maybe you would like to assign a shortcut) just press [tab] and select one of the actions using the arrow keys or typing part of the action name and then press enter to execute the selected action.
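The "learning from your choices" behaviour described above can be illustrated with a small ranking sketch. This is not Cobalt's actual algorithm, which the module description does not spell out; the `Suggester` class, its substring matching and its per-query usage counter are all assumptions made for illustration.

```python
from collections import Counter


class Suggester:
    """Hypothetical suggestion list that promotes previously chosen items."""

    def __init__(self, items):
        self.items = list(items)
        # Counts how often an item was chosen for a given typed query.
        self.uses = Counter()

    def suggest(self, query):
        q = query.lower()
        matches = [item for item in self.items if q in item.lower()]
        # Most frequently chosen first, then alphabetical as a stable tie-break.
        return sorted(matches, key=lambda item: (-self.uses[(q, item)], item.lower()))

    def choose(self, query, item):
        """Record that the user executed `item` after typing `query`."""
        self.uses[(query.lower(), item)] += 1


suggester = Suggester(["Administer", "Add content", "Admin theme"])
before = suggester.suggest("ad")
suggester.choose("ad", "Administer")
after = suggester.suggest("ad")
print(before)
print(after)
```

After a single choice the selected item moves to the top for that query, so less selecting and paging is needed the next time.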
RDF module
The RDF module for Drupal is an API that provides a series of basic functionalities that can be used by our modules.
Especially of interest are:
- Output of an XHTML+RDFa DOCTYPE (e.g. declaring the RDFa namespaces used in the header of pages)
- Declaration of custom namespaces
- Possibility to have local repositories
Neologism installation profile
Neologism is a simple web-based RDF Schema vocabulary editor and publishing system. It can be used to create RDF classes and properties, which are needed to publish data on the Semantic Web. Its main goal is to dramatically reduce the time required to create, publish and modify vocabularies for the Semantic Web. Neologism is currently in alpha and its project page on Drupal.org doesn't show much activity. The project page at Google Code, however, shows a lot of activity.
The latest release can be downloaded from:
http://code.google.com/p/neologism/downloads/list
We could use this to build a Drupal documentation vocabulary (that defines things like modules, common issues, functionality groups).
RDF external vocabulary importer (evoc)
Being able to reuse RDF vocabularies across sites is one of the key elements for the semantic web to take off. The RDF external vocabulary importer module (evoc) will cache any external RDF vocabulary in Drupal, and expose its classes and properties to other modules.
We could use this to import our vocabulary built in Neologism into a Drupal site. We could then use the internally cached vocabulary to build valid RDFa triplets.
Editor use cases
User calls cobalt, text selected in text field form
Defaults
When a user calls cobalt after selecting text, it normally means that the user wants to make the selected text the literal (piece of text) object of a triple. The selected text is therefore pre-filled in the 3rd box of the cobalt UI.
The first box of the UI is pre-filled with the parent resource in the DOM tree that would be used as the subject of the RDFa triplet when parsed.
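The default subject lookup described above can be sketched as a walk up the DOM tree. This is a minimal sketch under a simplifying assumption: only the `about` attribute is considered, while full RDFa processing has more rules for establishing the subject. The function name `nearest_subject` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def nearest_subject(root, element, default_subject):
    """Return the `about` value of the closest ancestor-or-self of `element`.

    Falls back to `default_subject` (assumed here to be the URL of the
    current node) when no ancestor declares a resource.
    """
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        about = node.get("about")
        if about is not None:
            return about
        node = parents.get(node)
    return default_subject


doc = ET.fromstring(
    '<div about="/node/1#cobalt">'
    "<p>Cobalt requires <em>Google Gears</em>.</p>"
    "</div>"
)
selected = doc.find(".//em")
print(nearest_subject(doc, selected, "/node/1"))
```

The value returned is what would be pre-filled in the first box before the user changes anything.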
Object and Subject are correct
When both object and subject are what the user intended them to be, she can just select one of the properties that has the resource in the first box in its domain and a literal (what's in the 3rd box) as range.
When the command is submitted a span is added around the selected text with the appropriate property.
Result:
<span property="property_XYZ">the selected text</span>
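A minimal sketch of this step, assuming the editor works on the raw text of the form field and knows the character offsets of the selection. The function name `wrap_property` and the sample sentence are hypothetical; `property_XYZ` is the placeholder used in this post.

```python
from html import escape


def wrap_property(text, start, end, prop):
    """Wrap text[start:end] in a span that carries the chosen RDFa property."""
    if not 0 <= start < end <= len(text):
        raise ValueError("selection must be a non-empty range inside the text")
    selected = text[start:end]
    # Only the attribute value is escaped; the selection is existing field
    # content and is inserted back unchanged.
    span = '<span property="{}">{}</span>'.format(escape(prop, quote=True), selected)
    return text[:start] + span + text[end:]


text = "Cobalt requires Google Gears."
start = text.index("Google Gears")
end = start + len("Google Gears")
marked_up = wrap_property(text, start, end, "property_XYZ")
print(marked_up)
```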
Subject is not the desired one
If the user realizes that the subject displayed in the interface (the first valid parent resource in the DOM tree) is not right for the triple, she can replace it with a new resource name typed in as a literal, or with one of the available resources (these could be resources that were defined previously in the document).
When the command is submitted 2 spans are added: the outer one declares a parent resource using the “about” attribute and the inner one is the actual declared property. The result would be something like:
In case this was a new resource:
<span about="/URL_to_current_node#subject_resource_name">
<a id="subject_resource_name"></a>
<span property="property_XYZ">the selected text</span>
</span>
In case this was an existing resource:
<span about="/URL_to_current_node#subject_resource_name">
<span property="property_XYZ">the selected text</span>
</span>
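The two variants above differ only in the anchor that is added for a new resource. A sketch under that assumption, where `wrap_with_subject` and its parameters are hypothetical names and the output is kept on one line for brevity:

```python
def wrap_with_subject(selected, prop, node_url, resource_name, is_new):
    """Build the nested spans for a triple whose subject was chosen by the user."""
    about = "{}#{}".format(node_url, resource_name)
    # A new resource gets an anchor so that the fragment URI resolves in the page.
    anchor = '<a id="{}"></a>'.format(resource_name) if is_new else ""
    inner = '<span property="{}">{}</span>'.format(prop, selected)
    return '<span about="{}">{}{}</span>'.format(about, anchor, inner)


new_markup = wrap_with_subject("the selected text", "property_XYZ", "/node/1", "cobalt", True)
existing_markup = wrap_with_subject("the selected text", "property_XYZ", "/node/1", "cobalt", False)
print(new_markup)
print(existing_markup)
```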
Object is not the desired one
If the user wants to add a different literal than the selected text into the triple, she can change it in the 3rd box.
Result:
<span property="property_XYZ" content="overwritten_text">the selected text</span>
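In RDFa the attribute that overrides the literal value of an element is `content`. A small sketch of how the editor could decide whether that attribute is needed; `literal_span` is a hypothetical helper name.

```python
from html import escape


def literal_span(selected, prop, literal=None):
    """Return the span for a literal triple, adding `content` only on override."""
    if literal is None or literal == selected:
        # The visible text is the literal, so no override is needed.
        return '<span property="{}">{}</span>'.format(prop, selected)
    return '<span property="{}" content="{}">{}</span>'.format(
        prop, escape(literal, quote=True), selected
    )


print(literal_span("the selected text", "property_XYZ"))
print(literal_span("the selected text", "property_XYZ", "overwritten_text"))
```

Escaping the override keeps quotes typed into the 3rd box from breaking the attribute.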
Defining/identifying resources
Both “about” and “typeof” can be used to declare a (new) data item. This means that the selected text (if there is any) should become the subject of the triple.
One way to do this would be to move the selected text to the first box when about or typeof is selected in the predicate box. The user would then still have to fill in the object box. This could however be confusing (switching between the 3rd and 1st box), so this will need some user testing.
Result:
- about
  - if object is a literal:
    <span about="/URL_to_current_node#object_resource_name">the selected text</span>
  - if object is a URI:
    <span about="URI">the selected text</span>
- typeof
  - object has to be a predefined class:
    <span typeof="namespace:type">the selected text</span>
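The three cases above could be handled by one small function. This is a sketch with assumptions: a value counts as a URI when it has an http or https scheme, the set of known classes is passed in by the caller (for example from the vocabulary cached by evoc), and `declare_resource` and `doc:Module` are hypothetical names.

```python
from urllib.parse import urlparse


def declare_resource(selected, predicate, obj, node_url, known_classes=()):
    """Build the span that declares a resource via `about` or `typeof`."""
    if predicate == "about":
        is_uri = urlparse(obj).scheme in ("http", "https")
        # A plain name becomes a fragment of the current node's URL.
        about = obj if is_uri else "{}#{}".format(node_url, obj)
        return '<span about="{}">{}</span>'.format(about, selected)
    if predicate == "typeof":
        if obj not in known_classes:
            raise ValueError("unknown class: {}".format(obj))
        return '<span typeof="{}">{}</span>'.format(obj, selected)
    raise ValueError("unsupported predicate: {}".format(predicate))


print(declare_resource("Cobalt", "about", "cobalt", "/node/1"))
print(declare_resource("Cobalt", "about", "http://example.org/cobalt", "/node/1"))
print(declare_resource("Cobalt", "typeof", "doc:Module", "/node/1", {"doc:Module"}))
```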
User clicks on triplet icon of markup element
This is a mechanism to view and update triplets already declared in RDFa. Normally it should be possible for more than one triplet to be defined in a single XHTML element (e.g. a link element with both rel and property). As far as I know this cannot happen with the system described here, so we will leave it as an unanswered edge case for the time being.
When Cobalt is also used as a markup editor, there would be two things that could be displayed for an element: its markup and its triples. We could however decide to use the triplet icon only for manipulating the RDFa triples.
Marking up urls and images
Since more than 1 value needs to be set, we can keep using a dedicated multi-field form in overlays for this.
User calls cobalt, cursor in text field form, no text selected
Since there is no literal that can be used as object, the user needs to define one.
Result:
<span property="property_XYZ" content="entered_text"></span>
Remarks on the implementation
Extending cobalt
Hierarchical structure
Quicksilver has a hierarchical command selection system: you can use the autocomplete function to find an element that you know and then use the left and right arrows to browse to the element you actually want to use. This is very valuable for calling menu items, but it could also be of value for finding entities or classes or even properties (go to namespace and then browse for properties).
RDF
We would need an extension that adds resources (instances of classes) and properties to Cobalt (much like the nodes and taxonomy sub-modules of Cobalt).
RDF triple validation
Cobalt's multi-pane UI is a perfect fit for creating RDF triples for RDFa markup but also for RDF tagging (e.g. when on a node, call Cobalt and add a triple to the node). This could be a perfect tool for adding content to a repository that you can then query in Graphmind.
What is needed is some way to provide only relevant options in the 3 panes. For that we would need to be able to interpret the range and domain of properties and use them to filter the options in the panes.
So when resources are selected for pane 1 and/or 3, they should be used to filter the properties that can be selected.
When a property is selected, only resources that lie in the domain and range of the property should be available for selection.
Alternatively we could allow the selection of a property that doesn't fit the ranges, but drop object and/or subject if they don't fit the range/domain respectively of the property.
- Subject: Domain (valid subject resource classes for a given predicate)
- Predicate
- Object: Range (valid object resource classes for a given predicate)
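The filtering rules above can be sketched with a tiny in-memory vocabulary. Everything in this block is an assumption for illustration: the `doc:` classes and properties are invented, each property has exactly one domain and one range class, and subclass relations are ignored. A real implementation would read domains and ranges from the cached vocabulary.

```python
from dataclasses import dataclass

LITERAL = "rdfs:Literal"


@dataclass(frozen=True)
class Property:
    name: str
    domain_class: str  # valid subject class
    range_class: str   # valid object class


# Hypothetical documentation vocabulary.
VOCABULARY = [
    Property("doc:summary", "doc:Module", LITERAL),
    Property("doc:hasIssue", "doc:Module", "doc:Issue"),
    Property("doc:status", "doc:Issue", LITERAL),
]


def properties_for(vocabulary, subject_class=None, object_class=None):
    """Properties still selectable given what is in pane 1 and/or pane 3."""
    return [
        p for p in vocabulary
        if (subject_class is None or p.domain_class == subject_class)
        and (object_class is None or p.range_class == object_class)
    ]


def resources_for(resources, wanted_class):
    """Names of the resources (name -> class) that belong to `wanted_class`."""
    return [name for name, cls in resources.items() if cls == wanted_class]


def drop_misfits(prop, subject_class, object_class):
    """Alternative rule: keep the property, clear whatever does not fit it."""
    subject = subject_class if subject_class == prop.domain_class else None
    obj = object_class if object_class == prop.range_class else None
    return subject, obj


print([p.name for p in properties_for(VOCABULARY, "doc:Module", LITERAL)])
print(resources_for({"cobalt": "doc:Module", "gears-missing": "doc:Issue"}, "doc:Issue"))
print(drop_misfits(VOCABULARY[2], "doc:Module", LITERAL))
```

The first two functions implement the strict filtering; `drop_misfits` implements the more permissive alternative, in which a mismatched subject or object is cleared instead of hiding the property.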
Cobalt as a markup editor
Default XHTML markup elements can also be added to the options in the UI, so that you could call these in the same manner (this is probably a lot faster than the button system).
The above is part of a first proposal for the specification of the documentation system we want to build as part of the modulecraft project. It is by no means complete, and it strongly needs your feedback. This is our first encounter with DITA and our ideas should really be checked by technical writers who have extensive experience using DITA. The actual specification is being built as a wiki at groups.drupal.org. Comments can be added here or you can edit the wiki directly; any feedback will be incorporated into the wiki.









Great job on such a topic. In my opinion, documentation is more of a people problem than a technological one, and I'm not sure this will help solve it.
I greatly appreciate all the effort and thinking that goes into this, and Drupal documentation definitely needs improving.
But isn't this getting a little overcomplicated? Processes, tools and standards can be useful, but this introduces a lot of overhead: DITA, RDF, XML fingerprints, editing widgets, a new support platform (modulecraft).
In short, I think documentation is more of a people problem than a technological one, and I'm not sure this will help solve it.
Again, I think it's awesome that you are tackling this problem, I'm just not so sure about the approach.