CDS/ISIS example


This collection is built on a CDS/ISIS database of bibliography entries. Here is an example record.

How the collection works

The collection configuration file, etc/collectionConfig.xml, specifies the ISISPlugin plugin, which processes CDS/ISIS databases. These databases consist of several files, but ISISPlugin uses just three: CDS.fdt (where CDS is the name of the database), which contains the field names used in the database; CDS.xrf, a cross-reference file; and CDS.mst, which contains the actual records. Whenever ISISPlugin encounters an ".mst" file, it looks for the corresponding ".fdt" and ".xrf" files. In this case the plugin has been given an input_encoding argument because some entries in the database contain extended characters (in an encoding that was used in early versions of the DOS operating system). It has also been given a subfield separator argument, whose purpose is explained below. The -OIDtype incremental plugin option was used to give identifiers that are consistent across different operating systems (which may not happen with HASH identifiers), so that we can link to a document in this description.
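The file lookup described above can be illustrated with a short Python sketch. This is not ISISPlugin's own code; the function names companion_files and missing_companions are hypothetical, and the sketch assumes the companion files share the master file's base name and use lowercase extensions.

```python
from pathlib import Path


def companion_files(mst_path: Path) -> dict[str, Path]:
    """Return the .fdt and .xrf paths expected alongside a .mst file.

    Assumption: companions share the master file's base name and directory.
    """
    if mst_path.suffix.lower() != ".mst":
        raise ValueError(f"not a CDS/ISIS master file: {mst_path}")
    return {
        "fdt": mst_path.with_suffix(".fdt"),  # field definition table
        "xrf": mst_path.with_suffix(".xrf"),  # cross-reference file
    }


def missing_companions(mst_path: Path) -> list[str]:
    """List the companion extensions that do not exist on disk."""
    return [ext for ext, path in companion_files(mst_path).items()
            if not path.exists()]
```

For a master file at import/CDS.mst, companion_files returns paths ending in CDS.fdt and CDS.xrf, and missing_companions reports which of them, if any, cannot be found.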

Like the bibliography collection, this collection incorporates a form-based search interface that allows fielded searching. This is specified by the line format SearchTypes "form,plain" in the configuration file; the plain argument ensures that a plain full-text search is available as well (it can be selected from the Preferences page). The <importOption name="groupsize" value="100"/> line in the collectionConfig.xml file puts documents together into groups of 100 (as explained in the bibliography collection).

Some fields in CDS/ISIS databases have subfields. For example, in this case the Imprint field has subfields Imprint.a for place, Imprint.b for publisher and Imprint.c for date. For each field and subfield, ISISPlugin generates a metadata element -- in this case there will be metadata called Imprint^a, Imprint^b and Imprint^c. (There could be a field called just Imprint, although in this case there is not.) ISISPlugin also generates a metadata element called Imprint^all that gives all subfields concatenated together, separated by the character string that was specified as a plugin argument (in this case ", ").
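As a rough illustration of the metadata naming just described, the following Python sketch builds the per-subfield elements and the ^all element for one field. It is not taken from ISISPlugin; the function name subfield_metadata is hypothetical and the sample values are invented for the example.

```python
def subfield_metadata(field: str, subfields: dict[str, str],
                      separator: str = ", ") -> dict[str, str]:
    """Build Field^x metadata for each subfield plus a Field^all element.

    Assumption: subfields are concatenated in the order they are given.
    """
    metadata = {f"{field}^{code}": value for code, value in subfields.items()}
    metadata[f"{field}^all"] = separator.join(subfields.values())
    return metadata


# Invented sample values for place (a), publisher (b) and date (c).
imprint = subfield_metadata(
    "Imprint", {"a": "Hamilton", "b": "Example Press", "c": "1999"}
)
print(imprint["Imprint^all"])  # Hamilton, Example Press, 1999
```

Passing a different separator argument changes only the ^all element; the individual subfield elements are unaffected.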

The designer of this collection has decided to create searchable indexes on all the ^all metadata fields, as well as one on text, which makes the raw records searchable too. Of course, the designer could have created searchable indexes on any of the subfields instead -- or as well.

There are two browsing classifiers, an AZList based on Title metadata and an AZCompactList based on Keyword metadata. Recall that the AZCompactList classifier is like AZList but generates a bookshelf for duplicate items. The VList format specification applies to both the search results list and the Title classifier, while the CL2VList specification shows the number of documents associated with each keyword, as described in the MARC example collection. In Greenstone, and in CDS/ISIS, any metadata item can have several different values. The VList specification sibling(All'; ') gathers together all the values, separated (in this case) by a semicolon and a space.
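The two ideas in this paragraph, joining all values of a multi-valued metadata item and counting the documents under each keyword, can be sketched in Python as follows. This only mimics the visible result and is not how Greenstone implements format statements or classifiers; the function names and the sample documents are hypothetical.

```python
from collections import defaultdict


def join_siblings(values: list[str], separator: str = "; ") -> str:
    """Join every value of one metadata item, as sibling(All'; ') displays them."""
    return separator.join(values)


def group_by_keyword(documents: list[dict]) -> dict[str, list[str]]:
    """Map each Keyword value to the titles of the documents that carry it."""
    shelves = defaultdict(list)
    for doc in documents:
        for keyword in doc["Keyword"]:
            shelves[keyword].append(doc["Title"])
    return dict(shelves)


# Invented sample documents, each with several Keyword values.
documents = [
    {"Title": "First report", "Keyword": ["water", "soil"]},
    {"Title": "Second report", "Keyword": ["water"]},
]

shelves = group_by_keyword(documents)
for keyword in sorted(shelves):
    # A keyword shared by several documents would appear as a bookshelf
    # with a document count next to it.
    print(f"{keyword} ({len(shelves[keyword])})")

print(join_siblings(documents[0]["Keyword"]))  # water; soil
```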

The DocumentContent format specification incorporates the same mechanism for hiding and showing raw records as explained for the bibliography collection, using DocumentHeading to show the formatted record and DocumentContent to show (or hide) the original database entry.