Chapter 2 Using Greenstone Collections


Using a Greenstone CD-ROM
Finding information
Changing the preferences

The Greenstone software is designed to be easy to use. Web-based and CD-ROM collections have interfaces that are identical. Installing the Greenstone software from CD-ROM on any Windows or Linux computer is very easy indeed; a standard installation setup program is used in conjunction with pre-compiled binaries. A collection can be used locally on the computer where it is installed; also, if this computer is connected to a network, the software automatically and transparently allows all other computers on the network to access the same collection.

The next section describes how to install a Greenstone CD-ROM. Then we look at the searching and browsing facilities offered by a typical Greenstone collection, the “Demo” collection that is supplied with the Greenstone software. Other collections offer similar facilities; if you can use one, you can use them all. The following section explains how to customize the interface for your own requirements using the Preferences page.

2.1 Using a Greenstone CD-ROM

The Greenstone digital library software itself comes on a CD-ROM, and you or your system manager have probably installed it on your system, following the instructions in the Greenstone Digital Library Installer's Guide. If so, Greenstone is already installed on your computer and you should skip the rest of this section.

Some Greenstone collections come on a self-contained Greenstone CD-ROM that includes enough of the software to run just that collection. To use it, simply put it into the CD-ROM drive on any Windows PC. Most likely (if “autorun” is enabled on your PC), a window will appear inviting you to install the Greenstone software. If not, locate the CD-ROM drive (on current Windows systems, click the My Computer icon on the desktop), double-click it, and then double-click the Setup.exe file inside it. This starts the Greenstone Setup program, which guides you through the setup procedure. Most people respond yes to all the questions.

When the installation procedure has finished, you'll find the library in the Programs submenu of the Windows Start menu, under the name of the collection (for example, “Development Library” or “United Nations University”).

Once the software has been installed, the library will start automatically every time you re-insert the CD-ROM, provided autorun is enabled.

2.2 Finding information

The easiest way to learn how to use a Greenstone collection is to try it out. Don't worry—you can't break anything. Click liberally: most images that appear on the screen are clickable. If you hold the mouse stationary over an image, most browsers will soon pop up a message that tells you what will happen if you click.

Experiment! Choose common words like “the” and “and” to search for; they should return plenty of results, and nothing will break.

Greenstone digital library systems usually comprise several separate collections—for example, computer science technical reports, literary works, internet FAQs, magazines. There will be a home page for the digital library system which allows you to access any publicly-accessible collection; in addition, each collection has its own “about” page that gives you information about how the collection is organized and the principles governing what is included in it. To get back to the “about” page at any time, just click on the “collection” icon that appears at the top left side of all searching and browsing pages.

Figure 1 shows a screenshot of the “Demo” collection supplied with the Greenstone software, which is a very small subset of the Development Library collection; we will use it as an example to describe the different ways of finding information. (If you can't find the Demo collection, use the Development Library instead; it looks just the same.) First, almost all icons are clickable. Several icons appear at the top of almost every page; Table 1 shows you what they mean.

Figure 1  Using the Demo collection

Table 1  What the icons at the top of each page mean

This takes you to the “about” page

This takes you to the Digital Library's home page, from which you can select another collection

This provides help text similar to what you are reading now

This allows you to set some user interface and searching options that will then be used henceforth

The “search … subjects … titles a-z … organization … how to” bar underneath gives access to the searching and browsing facilities. The leftmost button is for searching, and the ones to the right of it (four, in this collection) invoke different browsing facilities. These last four may differ from one collection to another.

How to find information

Table 2 shows the five ways to find information in the Demo collection.

Table 2  What the icons on the search/browse bar mean

search          Search for particular words
subjects        Access publications by subject
titles a-z      Access publications by title
organization    Access publications by organization
how to          Access publications by “how to” listing

You can search for particular words that appear in the text from the “search” page. (This is just like the “about” page shown in Figure 1, except that it doesn't contain the about this collection text.) The search page can be reached from other pages by pressing the search button. The other four buttons give browsing access to the publications:

  • subjects brings up a list of subjects, represented by bookshelves that can be further expanded by clicking on them.
  • titles a-z brings up a list of books in alphabetic order.
  • organization brings up a list of organizations.
  • how to brings up a list of “how to” hints.

All these buttons are visible in Figure 1.

How to read the documents

In the Demo collection, you can tell when you have arrived at an individual book because there is a photograph of its front cover (Figure 2). Beside the photograph is a table of contents: the entry in bold face marks where you are, in this case Introduction and Summary, Section 1 of the chosen book. This table is expandable: click on the folders to open them or close them. Click on the open book at the top to close it.

Underneath is the text of the current section (“The international demand for tropical butterflies …” in the example, beginning at the very bottom of the illustration). When you have read through it, there are arrows at the end to take you on to the next section or back to the previous one.

Below the photograph are four buttons. Click on detach to make a new browser window for this book. (This is useful if you want to compare books, or read two at once.) If you have reached this book through a search, the search terms will be highlighted: the no highlighting button turns this off. Click on expand text to expand out the whole text of the current section, or book. Click on expand contents to expand out the whole table of contents so that you can see the titles of all chapters and subsections.

In some collections, the documents do not have this kind of hierarchical structure. In this case, no table of contents is displayed when you get to an individual document—just the document text. In some cases, the document is split into pages, and you can read sequentially or jump about from one page to another.

Figure 2  A book in the Demo collection

What the icons mean

When you are browsing around the collection, you will encounter the items shown in Table 3.

How to search for particular words

From the search page, follow these simple steps to make a query:

  • Specify what units you want to search: in the Demo collection you can search section titles or the full text of the books.
  • Say whether you want to search for all or just some of the words
  • Type in the words you want to search for into the query box
  • Click the Begin Search button

When you make a query, the titles of up to twenty matching documents will be shown. There is a button at the end to take you on to the next twenty. From there you will find buttons to take you on to the third twenty or back to the first twenty, and so on. However, for efficiency reasons a maximum of 100 is imposed on the number of documents returned. You can change these numbers by clicking the preferences button at the top of the page.
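The paging behaviour described above can be pictured with a short sketch. This is not Greenstone's own code: the function name page_of_results and its parameters are hypothetical, and the defaults simply mirror the figures quoted in the text (twenty titles per page, at most 100 documents returned).

```python
def page_of_results(matches, page, per_page=20, max_results=100):
    """Return the titles shown on a given results page (pages count from 1)."""
    # Only the first max_results matches are ever returned.
    capped = matches[:max_results]
    start = (page - 1) * per_page
    return capped[start:start + per_page]


# Pretend that 250 documents matched the query.
matches = [f"Document {i}" for i in range(1, 251)]
first_page = page_of_results(matches, 1)   # Document 1 to Document 20
fifth_page = page_of_results(matches, 5)   # Document 81 to Document 100
sixth_page = page_of_results(matches, 6)   # empty: the cap of 100 has been reached
```

Raising either number on the Preferences page corresponds to passing larger per_page or max_results values in this sketch.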

Table 3  Icons that you will encounter when browsing

Click on a book icon to read the corresponding book

Click on a bookshelf icon to look at books on that subject

View this document

Open this folder and view contents

Click on this icon to close the book

Click on this icon to close the folder

Click on the arrow to go on to the next section ...

... or back to the previous section

Open this page in a new window

Expand table of contents

Display all text

Highlight search terms

Click the title of any document, or the little icon beside it, to open it. The icon may show a book, or a folder, or a page: it will be a book icon if you are searching books; otherwise if you are searching sections it will be a folder or page icon depending on whether or not the section found has subsections.

Search terms

Whatever you type into the query box is interpreted as a list of words called “search terms.” Each search term contains nothing but alphabetic characters and digits, and terms are separated by white space. Any other characters, such as punctuation, separate terms just as though they were spaces and are otherwise ignored, so you can't search for words that include punctuation.

For example, the query

Agro-forestry in the Pacific Islands: Systems for Sustainability (1993)

will be treated the same as

Agro forestry in the Pacific Islands Systems for Sustainability 1993
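The rule can be illustrated with a few lines of Python. This is only a sketch of the behaviour described above, not Greenstone's implementation; the function name search_terms is hypothetical, and treating only ASCII letters and digits as term characters is an assumption made to keep the example short.

```python
import re


def search_terms(query):
    """Split a query into search terms made up only of letters and digits."""
    # Any run of other characters (spaces, hyphens, colons, brackets, ...)
    # acts as a separator and is then discarded.
    return re.findall(r"[A-Za-z0-9]+", query)


with_punctuation = search_terms(
    "Agro-forestry in the Pacific Islands: Systems for Sustainability (1993)"
)
without_punctuation = search_terms(
    "Agro forestry in the Pacific Islands Systems for Sustainability 1993"
)
print(with_punctuation == without_punctuation)  # True
```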

Query type

There are two different kinds of query.

  • Queries for all the words. These look for documents (or chapters, or titles) that contain all the words you have specified. Documents that satisfy the query are displayed.
  • Queries for some of the words. Just list some terms that are likely to appear in the documents you are looking for. Documents are displayed in order of how closely they match the query. When determining the degree of match,
    • the more search terms a document contains, the closer it matches;
    • rare terms are more important than common ones;
    • short documents match better than long ones.

Use as many search terms as you like—a whole sentence, or even a whole paragraph. If you specify only one term, it doesn't much matter whether you use an all or a some query, except that in the second case the results will be sorted by the search term's frequency of occurrence.
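To see how the three matching rules for some queries interact, here is a toy ranking function. The scoring formula is an assumption chosen purely for illustration (a generic rarity weight divided by a length factor); this guide does not state the formula Greenstone actually uses, and the name rank_some is hypothetical.

```python
import math


def rank_some(query_terms, docs):
    """Rank docs (name -> list of words) for a 'some' query. Toy scoring only."""
    terms = set(query_terms)
    n_docs = len(docs)
    # Document frequency: in how many documents each query term appears.
    df = {t: sum(1 for words in docs.values() if t in words) for t in terms}
    scores = {}
    for name, words in docs.items():
        score = 0.0
        for t in terms:
            if t in words:
                # Each matching term adds to the score; rare terms add more.
                score += math.log(1 + n_docs / df[t])
        if score > 0:
            # Dividing by a length factor favours short documents.
            scores[name] = score / math.sqrt(len(words))
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "short": ["butterfly", "farming"],
    "long": ["butterfly", "farming", "filler", "filler", "filler", "filler"],
    "one": ["butterfly", "trade"],
    "none": ["soil", "erosion"],
}
print(rank_some(["butterfly", "farming"], docs))  # ['short', 'long', 'one']
```

In this example both “short” and “long” contain both terms, but the shorter document ranks first; “one” contains only the more common term and ranks last; “none” matches nothing and is not returned.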

Scope of queries

In most collections you can choose different indexes to search. For example, there might be author or title indexes. Or there might be chapter or paragraph indexes. Generally, the full matching document is returned regardless of which index you search.

If documents are books, they will be opened at the appropriate place.

Advanced search features

While the above is enough to meet most searching needs, some more advanced search features are provided. These are activated from the Preferences page, which is reached by clicking the preferences button at the top of the page—see Section 2.3 below. After changing your preferences, do not click your browser's Back button—that would undo the changes. Instead, click any of the buttons on the search/browse bar.

Case sensitivity and stemming

When you specify search terms, you can choose whether upper and lower case must match between the query and the document: this is called “case sensitivity.” You can also choose whether to ignore word endings or not: this is called “stemming.”

Under Search options on the Preferences page you will see a pair of buttons labeled ignore case differences and upper/lower case must match; these control the case sensitivity of your queries. Below is a pair of buttons labeled ignore word endings and whole word must match: these control stemming.

For example, if the buttons ignore case differences and ignore word endings are selected, the query

African building

will be treated the same as

africa builds

because the uppercase letter in “African” will be transformed to lowercase, and the suffixes “n” and “ing” will be removed from “African” and “building” respectively (also, “s” would be removed from “builds”).

Generally case differences and word endings should be ignored unless you are querying for particular names or acronyms.
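The effect of the two options can be sketched as follows. This is a deliberately crude illustration: the suffix list is hypothetical and was chosen only to reproduce the example above, whereas a real stemmer applies far more careful rules. The function name normalize and its parameters are likewise assumptions.

```python
def normalize(term, ignore_case=True, ignore_endings=True):
    """Toy term normalizer: fold case and strip a few endings."""
    if ignore_case:
        term = term.lower()
    if ignore_endings:
        # Hypothetical suffix list, just enough for the example in the text.
        for suffix in ("ing", "s", "n"):
            # Keep at least three characters so very short words are untouched.
            if term.endswith(suffix) and len(term) - len(suffix) >= 3:
                return term[: -len(suffix)]
    return term


first = [normalize(t) for t in "African building".split()]
second = [normalize(t) for t in "africa builds".split()]
print(first, second)  # ['africa', 'build'] ['africa', 'build']
```

With ignore_case=False the term “African” becomes “Africa”, which no longer matches “africa”; this is why exact case is mainly useful for names and acronyms.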

Phrase searching

If your query includes a phrase in quotation marks, only documents containing that phrase, exactly as typed, will be returned.

If you want to use phrase searching, you need to learn a little about how it works. Phrases are processed by a post-retrieval scan. First the query is issued in the normal way—all the words in the phrase are included as search terms—and then the documents returned are scanned to eliminate those in which that phrase does not appear.

During the post-retrieval scan, phrases are checked just as they are, including any punctuation. For example, the query

what's a “post-retrieval scan?”

will first retrieve all documents that match all of the words

what s a post retrieval scan

and then the documents returned will be checked for the phrase

post-retrieval scan?

Phrase matches are case-insensitive if ignore case differences is set on the Preferences page.
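The two stages can be sketched in Python. This is an illustration of the idea rather than Greenstone's code: the function name phrase_search, the representation of documents as a dictionary of name to text, and the choice to ignore case during the first stage are all assumptions.

```python
import re


def phrase_search(query, docs, ignore_case=True):
    """Two-stage phrase search over docs (name -> text). Illustrative only."""
    # Phrases may be wrapped in straight or curly double quotes.
    phrases = re.findall(r'["“]([^"”]+)["”]', query)
    # Stage 1: an ordinary all-words query. Punctuation, including the
    # quotation marks, only separates terms.
    terms = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", query)]
    hits = []
    for name, text in docs.items():
        words = {w.lower() for w in re.findall(r"[A-Za-z0-9]+", text)}
        if all(t in words for t in terms):
            hits.append(name)

    # Stage 2: scan the hits and keep those containing each phrase exactly
    # as typed, punctuation included.
    def contains(text, phrase):
        if ignore_case:
            return phrase.lower() in text.lower()
        return phrase in text

    return [n for n in hits if all(contains(docs[n], p) for p in phrases)]


docs = {
    "a": "So what's a post-retrieval scan? It filters results.",
    "b": "What's a scan? A post retrieval step.",
}
print(phrase_search('what\'s a "post-retrieval scan?"', docs))  # ['a']
```

Both documents pass the first stage because each contains all six words, but only document “a” contains the phrase with its hyphen and question mark intact.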

Advanced query mode

In advanced query mode, which can be selected on the Preferences page, the queries for all of the words, described above, are actually Boolean queries. They consist of a list of terms joined by logical operators & (and), | (or), and ! (not). A missing operator between search terms is interpreted as & (and): thus a query without any operators returns documents that match all the terms.

If the words AND, OR, and NOT appear in your query they are treated as ordinary search terms, not operators. For operators you must use &, |, and !. In addition, parentheses can be used for grouping.
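A small evaluator makes these rules concrete. It is a sketch under stated assumptions, not Greenstone's parser: the function name matches is hypothetical, the query is assumed to be well formed, and the operator precedence used here (! binds tightest, then &, then |) is an assumption, since this guide does not specify one.

```python
import re


def matches(query, words):
    """Evaluate a Boolean query against a set of lowercase document words."""
    tokens = re.findall(r"[A-Za-z0-9]+|[&|!()]", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        # A missing operator between terms is read as & (and).
        while peek() is not None and peek() not in ("|", ")"):
            if peek() == "&":
                take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        tok = take()
        if tok == "!":
            return not parse_not()
        if tok == "(":
            value = parse_or()
            take()  # consume the closing ")"
            return value
        # Anything else, including the words AND, OR and NOT, is a plain term.
        return tok.lower() in words

    return parse_or()


words = {"butterfly", "farming", "trade"}
print(matches("butterfly farming", words))           # True
print(matches("butterfly & (soil | trade)", words))  # True
print(matches("butterfly !trade", words))            # False
print(matches("butterfly AND farming", words))       # False: "and" is a term
```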

Using search history

When you switch on the “search history” feature on the Preferences page you will be shown your last few searches, along with a summary of how many results they generated. Click the button beside one of the previous searches to copy the text into the search box. This makes it easy to repeat slightly modified versions of previous queries.

2.3 Changing the preferences

Figure 3  The Preferences page

When you click the preferences button at the top of the page you will be able to change some features of the interface to suit your own requirements. The preferences depend on the collection; an example is shown in Figure 3. When you adjust your search preferences, you should press the set preferences button shown in Figure 3. After setting preferences, do not use your browser's “back” button—that would unset them! Instead, click one of the buttons on the access bar near the top of the page.

Collection preferences

Some collections comprise several subcollections, which can be searched independently or together, as one unit. If so, you can select which subcollections to include in your searches on the Preferences page.

Language preferences

Each collection has a default presentation language, but you can switch to a different language if you like. You can also alter the encoding scheme used by Greenstone for output to the browser. The software chooses sensible defaults, but with some browsers better visual results can be obtained by switching to a different encoding scheme. All collections allow you to switch from the standard graphical interface format to a textual one. This is particularly useful for visually impaired users who use large screen fonts or speech synthesizers for output.

Presentation preferences

Depending on the collection, there may be other options you can set that control the presentation. Collections of web pages allow you to suppress the Greenstone navigation bar at the top of each document page, so that once you have done a search you land at the exact web page that matches without any Greenstone header. To do another search you will have to use your browser's “back” button. These collections also allow you to suppress Greenstone's warning message when you click a link that takes you out of the digital library collection and on to the web itself. And in some web collections you can control whether the links on the “Search Results” page take you straight to the actual URL in question, rather than to the digital library's copy of the page.

Search preferences

Under Search preferences in Figure 3, the first pair of buttons allows you to get a large query box, so that you can easily do paragraph-sized searching. In Greenstone, it is surprisingly quick to search for large amounts of text. The next two pairs of buttons control the kind of text matching in the searches that you make. The first set (labeled “case differences”) controls whether upper and lower case must match. The second (“word endings”) controls whether to ignore word endings or not.

Using the next button pair you can switch to the “advanced” query mode described above, which allows you to specify more precise queries by combining terms using AND (&), OR (|), and NOT (!). You can turn the search history feature, described above, on and off. Finally, you can control the number of hits returned, and the number presented on each screenful, through the last entry in Figure 3.

