GREENSTONE DIGITAL LIBRARY FROM PAPER TO COLLECTION
Chapter 5 Creating an electronic collection
Three important aspects should be kept in mind when deciding to create digital collections. First, the collection must be organized. The more content there is, the greater the need for indexes and powerful search systems. For collections of 3000 to 5000 pages or more, indexes and search systems are essential. Second, the needs of end-users must prevail. The target groups that will use the collection should be identified, and a process of regular consultation set up. Third, the available budget will determine how much can be done.
5.1 Methods of collection building
There are many examples of excellent CD-ROMs that are created on the web-page model. HTML, PDF or Word documents are added and linked using hyperlinks. Navigation is made simple and attractive by the use of hyperlinks, frames, keywords, indexes and so on. Such systems work well up to a few thousand pages, but from 3000 to 5000 pages onwards it is important to have a well-structured collection and a powerful search facility. This is where the Greenstone software can help.
The Greenstone Digital Library software creates a structured digital library that includes a very powerful search and retrieval engine. Up to 150,000 pages can be indexed on a single CD-ROM. Every CD-ROM can become an Internet server. Greenstone is open-source software, and is freely available under the GNU General Public License.
The companion manuals describe how to build Greenstone collections. There are essentially three different ways of building collections:
The first method is the “librarian” interface, described in the Greenstone Digital Library User's Guide (Chapter 3, “Making Greenstone Collections”). This is a comprehensive interactive facility for collection-building. With it, you can collect sets of documents, import or assign metadata, and build them into a Greenstone collection.
The second method is the “Collector” subsystem, described in Chapter 4 of the User's Guide. This is an older facility that provides an alternative way of building collections of web pages or other documents. It guides you through a sequence of interactive web pages that request the information needed. However, it does not provide any way of adding metadata to the documents and, because it is a web interface, it is not really suitable for collections that take more than a few minutes to build.
The third method is to run the collection-building programs directly from the command line; this is described in the Greenstone Digital Library Developer's Guide (Chapter 1). This gives more flexibility in running programs individually and saving intermediate results, which may be desirable for collections that take many hours to build. You will also need to read Chapter 2 of the Developer's Guide in order to harness the full power of Greenstone to build advanced collections.
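The command-line route can be sketched as a small Python driver. This is only a minimal sketch under stated assumptions, not the procedure from the Developer's Guide: it assumes the Greenstone Perl scripts mkcol.pl, import.pl and buildcol.pl are on the path once the Greenstone environment has been set up, and the helper names (build_commands, run_build), the collection name and the e-mail address are hypothetical. By default it only prints the commands, so each step can be run and inspected individually.

```python
import subprocess
from typing import List


def build_commands(collection: str, creator: str) -> List[List[str]]:
    """Return the assumed sequence of Greenstone build commands.

    The script names and the -creator option are assumptions; check
    Chapter 1 of the Developer's Guide for the exact invocation.
    """
    return [
        ["perl", "-S", "mkcol.pl", "-creator", creator, collection],
        ["perl", "-S", "import.pl", collection],
        ["perl", "-S", "buildcol.pl", collection],
    ]


def run_build(collection: str, creator: str, dry_run: bool = True) -> List[str]:
    """Print each command; execute it only when dry_run is False."""
    shown = []
    for cmd in build_commands(collection, creator):
        line = " ".join(cmd)
        print(line)
        shown.append(line)
        if not dry_run:
            # check=True stops the sequence as soon as one step fails
            subprocess.run(cmd, check=True)
    return shown


if __name__ == "__main__":
    # Hypothetical collection name and creator address
    run_build("mytest", "me@example.com")
```

Keeping the steps as separate commands mirrors the flexibility described above: a long import can be run once and its intermediate results kept while the build step is repeated. Consult Chapter 1 of the Developer's Guide for the exact options and for any steps needed after the build.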
There is a fourth method for creating and editing the material associated with a collection, a program called the Collection Organizer. However, its functionality has been superseded by the librarian interface mentioned above. It is described in a legacy document entitled Using the Organizer.
5.2 Getting started in seven steps and 15 minutes
The best way of getting the look and feel of the librarian interface is to create a small test library. If you have 15 minutes, please follow these steps and you will understand the program much better.
Before getting started, install Greenstone (see the Greenstone Installer's Guide), which includes the Demo collection in DLS format and its source files. Note that if you want to be able to add to your collection any of the 140 documents in the DLS collection (rather than just the 11 of them that are in the Greenstone Demo collection), you should install DLS as one of the sample Greenstone libraries. The Demo and DLS collections will be installed in C:\Program Files\gsdl\collect, in subdirectories demo and dls respectively. If you previously installed Greenstone without DLS, you can re-insert your Greenstone CD-ROM and add this collection; it is not necessary to uninstall Greenstone first.
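Before starting, it can help to confirm that the sample collections are where the steps expect them. The following is a minimal sketch, assuming the default install path quoted above; the function name missing_collections is hypothetical and nothing here is part of Greenstone itself.

```python
from pathlib import Path
from typing import Iterable, List

# Default location quoted in the text; adjust if Greenstone was installed elsewhere.
DEFAULT_COLLECT_DIR = Path(r"C:\Program Files\gsdl\collect")


def missing_collections(
    collect_dir: Path, names: Iterable[str] = ("demo", "dls")
) -> List[str]:
    """Return the collection names that have no subdirectory under collect_dir."""
    return [name for name in names if not (collect_dir / name).is_dir()]


if __name__ == "__main__":
    missing = missing_collections(DEFAULT_COLLECT_DIR)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Demo and DLS collections found.")
```

If dls is reported as missing, the DLS collection can be added from the Greenstone CD-ROM as described above.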
We suggest that you print the instructions below and follow them step by step:
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”