Simple image collection

This is a basic image collection that contains no text and no explicit metadata. Several JPEG files are placed in the import directory before the collection is imported and built; nothing more is needed.

The images in this collection have been produced by members of the Department of Computer Science, University of Waikato. The University of Waikato holds copyright. The images may be distributed freely, without restriction.

How the collection works

Here is a sample document in the collection. The configuration file, collectionConfig.xml, specifies no indexes, so the search button is suppressed.

There is only one plugin, ImagePlugin, aside from the others that are always present (crucially GreenstoneXMLPlugin, MetadataXMLPlugin, ArchivesInfPlugin, DirectoryPlugin). ImagePlugin relies on the existence of two programs from the ImageMagick suite (www.imagemagick.org): convert and identify. Greenstone 3 binaries come bundled with ImageMagick as a component that can optionally be installed. Greenstone will not be able to build the collection correctly unless ImageMagick is installed on your computer.
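Before building, it can be useful to confirm that the two ImageMagick programs are reachable. The following is a minimal sketch, not part of Greenstone: the function names find_imagemagick_tools and missing_tools are hypothetical, and the check assumes the programs are on the system PATH. A copy of ImageMagick bundled with Greenstone may live elsewhere, and on Windows an unrelated system utility is also named convert, so a positive result there is not conclusive.

```python
import shutil


def find_imagemagick_tools(names=("convert", "identify"), which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The `which` parameter exists only so the lookup can be replaced in tests.
    """
    return {name: which(name) for name in names}


def missing_tools(found):
    """Return the sorted names of tools whose lookup returned None."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_imagemagick_tools()
    missing = missing_tools(found)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        for name, path in found.items():
            print(f"{name}: {path}")
```

If either program is reported missing, installing ImageMagick (or the optional Greenstone component) before importing is the likely remedy.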

ImagePlugin automatically creates a thumbnail and generates the following metadata for each image in the collection:

Image          Name of file containing the image
ImageWidth     Width of image (in pixels)
ImageHeight    Height of image (in pixels)
Thumb          Name of gif file containing thumbnail of image
ThumbWidth     Width of thumbnail image (in pixels)
ThumbHeight    Height of thumbnail image (in pixels)
thumbicon      Full pathname specification of thumbnail image
assocfilepath  Pathname of image directory in the collection's assoc directory
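To give a feel for where width and height values of this kind come from, here is a rough sketch that asks ImageMagick's identify program for an image's dimensions and derives a proportional thumbnail size. It is an illustration only and does not reproduce ImagePlugin's own code. The function names and the 100-pixel maximum side are assumptions made for the example, not values taken from this collection.

```python
import subprocess


def parse_dimensions(identify_output):
    """Parse 'WIDTH HEIGHT' from the first line of identify's output."""
    first_line = identify_output.strip().splitlines()[0]
    width, height = first_line.split()
    return int(width), int(height)


def image_dimensions(path):
    """Run `identify -format "%w %h\\n" path` and return (width, height) in pixels.

    Requires ImageMagick's identify program on PATH.
    """
    result = subprocess.run(
        ["identify", "-format", "%w %h\n", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_dimensions(result.stdout)


def thumbnail_size(width, height, max_side=100):
    """Scale (width, height) so the longer side is at most max_side.

    max_side=100 is an assumed value for illustration. Images that are
    already small enough are left at their original size.
    """
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, thumbnail_size(640, 480) gives (100, 75), and an image of 50 by 40 pixels is left unchanged.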

The image is stored as an "associated file" in the assoc subdirectory of the collection's index directory. (Index is where all files necessary to serve the collection are placed, to make it self-contained.) For any document, its thumbnail and image are both in a subdirectory whose name is given by assocfilepath. The metadata element thumbicon is set to the full pathname specification of the thumbnail image, and can be used in the same way as srcicon (see the MSWord and PDF demonstration collection).
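The directory layout just described can be sketched as a small path-building helper. This is only an illustration of the layout as stated above: the function name assoc_file_path is hypothetical, and the collection directory, assocfilepath value, and file name in the usage note are made-up examples.

```python
from pathlib import PurePosixPath


def assoc_file_path(collection_dir, assocfilepath, filename):
    """Build <collection>/index/assoc/<assocfilepath>/<filename>.

    assocfilepath is the metadata value attached to the document;
    filename is the image or thumbnail file name.
    """
    return PurePosixPath(collection_dir) / "index" / "assoc" / assocfilepath / filename
```

With the hypothetical values "/gs/collect/image-e", "HASHabcd.dir" and "thumb.gif", the helper yields /gs/collect/image-e/index/assoc/HASHabcd.dir/thumb.gif.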

The browse format statement in the collection configuration file, collectionConfig.xml, dictates how the document will appear, and this is the result. There is no document text (if there were, it could be produced by <xsl:call-template name="documentNodeText"/> in format statements). What is shown is the image itself, along with some metadata extracted from it.

The configuration file specifies one classifier, a List based on Image metadata, shown here. The format statement shows the thumbnail image along with some metadata. (Any other classifiers would have the same format, since this statement does not name the classifier.)

You may wonder why the thumbnail image is generated and stored explicitly, when the same effect would be obtained by using the original image and scaling it (as would happen if you did not have ImageMagick installed).

The reason is to save communication bandwidth by not sending large images when small ones would do.

For a more comprehensive image collection, see the kiwi aircraft images in the New Zealand Digital Library. The structure of this collection is quite different, however: it is a collection of web pages that include many images along with the text. The HTML plugin HTMLPlugin also processes image files, but it does so in a different way from ImagePlugin (for example, it does not produce the metadata described above).