Wednesday, April 17, 2013

GUT

"Grand Unification Theory" may be a touch grandiose, but the underlying libraries used in the Homer Multitext project now generate RDF statements that fully express all three types of CITE-architecture information: textual archives, archives of data collections, and indices relating citable objects to other citable objects or to raw data. Because all three kinds of statement live in one graph, there will be lots of interesting connections to explore across the resulting unified body of scholarly material.
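To make the "unified graph" idea concrete, here is a minimal sketch in Python, not the project's actual code. The URNs and the `ex:` predicate names are invented for illustration and are not the vocabulary the HMT libraries emit. It shows a text passage, a collection object, and an index statement linking them sitting in one graph, so that a single traversal crosses all three kinds of information.

```python
from collections import defaultdict, deque

# Illustrative identifiers only (assumptions, not real HMT data).
PASSAGE = "urn:cts:example:group.work.edition:1.1"
FOLIO = "urn:cite:example:pages.12r"
IMAGE = "urn:cite:example:img.1"

# Hypothetical triples covering the three kinds of CITE information.
TRIPLES = [
    (PASSAGE, "rdf:type", "ex:TextPassage"),      # textual archive
    (FOLIO, "rdf:type", "ex:CollectionObject"),   # data collection
    (PASSAGE, "ex:appearsOn", FOLIO),             # index: object to object
    (FOLIO, "ex:illustratedBy", IMAGE),           # index: object to image
]


def build_index(triples):
    """Group (predicate, object) pairs by subject."""
    index = defaultdict(list)
    for subj, pred, obj in triples:
        index[subj].append((pred, obj))
    return index


def reachable(index, start):
    """Return every node reachable from start by following any predicate."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _pred, obj in index.get(node, []):
            if obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


if __name__ == "__main__":
    for node in sorted(reachable(build_index(TRIPLES), PASSAGE)):
        print(node)
```

Starting from the passage, the traversal reaches the folio and then the image, which is the kind of cross-archive connection a single graph makes cheap to ask about.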

In parallel with this, I've now implemented the CTS protocol, the CITE Collections Service protocol, and its extension with the CHS Image protocol in servlets drawing on a SPARQL endpoint, so creating a complete CITE environment can be reduced to:

- build all RDF (automatically), and import into a triple store
- drop the three servlets for CITE services into a servlet container
- install the iipsrv FastCGI module for working with binary image data.  This is the most troublesome step on many platforms, but happily iipsrv is now available as a package under Debian.
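
Since the servlets all draw on a SPARQL endpoint, each service request boils down to a query against the triple store. The following Python sketch shows the general shape of such a request. It is not the servlets' code: the endpoint address and the example URN are assumptions, and the query simply asks for every statement about one URN. The only thing taken from the SPARQL protocol itself is that a query can be sent as a `query` parameter on a GET request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint address; substitute whatever your triple store exposes.
ENDPOINT = "http://localhost:3030/ds/query"


def describe_query(urn):
    """Build a SPARQL query listing every predicate and object for a URN."""
    # Refuse characters that would break out of the <...> IRI syntax.
    if any(ch in urn for ch in '<>" '):
        raise ValueError("URN contains characters not allowed in an IRI")
    return "SELECT ?p ?o WHERE { <%s> ?p ?o . }" % urn


def build_request_url(endpoint, urn):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": describe_query(urn)})


if __name__ == "__main__":
    # Hypothetical URN, for illustration only.
    print(build_request_url(ENDPOINT, "urn:cite:example:pages.12r"))
```

The printed URL can be fetched with any HTTP client once a triple store is running at that address.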

Not bad.  Chris Blackwell is preparing an image for the < $50 Raspberry Pi with these requirements preinstalled:  a complete CITE Box roughly the size of an Altoids container.

As we review the schemas used in the services this month, we'll begin looking at defining a more permanent RDF vocabulary.  I'm not sure at this point whether we need to break out a generic CITE vocabulary distinct from a specific HMT vocabulary, or whether one ontology will suffice.  We'll be looking at other projects' work:  thanks to Joel Kalvesmaki for pointing to the useful list here.




