Waikato visit report from John Rose

admin. Wednesday, March 26th, 2008.

I have been a volunteer research associate in the Greenstone team for more than two years, and was very pleased to be able to visit the University of Waikato, at the invitation of Prof. Ian Witten, from 5 to 19 March 2008 (this was also my first visit to New Zealand).

I live in France and have been working, mainly through the internet, to promote the use of Greenstone in developing countries. As a corollary activity, I have also been collaborating with Anna Huang to improve and test the Greenstone language interfaces with emphasis on those needed in developing countries. I had met Ian several times in Paris, and also David Bainbridge, but this visit was my first opportunity to meet the other members of the team.

During my visit I was able to experiment with Greenstone functions which were new to me, discuss problems encountered and future improvements, and consider with the team our strategies for more effectively reaching and involving users in developing countries.

Here are some of the highlights of what was learned and discussed:

Possible problems with Windows XP Home edition

I had followed the instructions for setting up an Apache web server (file library.txt in the Greenstone home directory) under Greenstone 2.80, and found that access to existing collections from the same computer was only possible when the collect sub-directory was shared with all network users (a contradiction since only one user was concerned for client and server).

Similarly, I followed the instructions for installation of the GLI Client and could neither create new collections nor access existing collections.

These two problems were consistent and replicable on my computer for several days, but without explanation they both stopped. I personally feel that there is some interference with the file sharing system under Windows XP Home edition, which mysteriously ended with the many manipulations that were done to understand the problems (there seem to be some internal system user names which may have been involved). Kathy Don is experimenting with Greenstone on this version of Windows. Users who are having similar problems are invited to report them on the Greenstone users list.

The reason for the problem that I was having with the GLI applet was found: the directory where the Java SDK was installed was not in the PATH environment variable, which prevented the keytool/jarsigner sequence from functioning. When it was added to PATH, the applet worked fine. I added a warning to this effect in the GLI applet installation instructions.
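A quick way to check for this condition before starting the applet installation is to test whether the two Java tools can be found on PATH. The following is only a minimal sketch in Python, not part of Greenstone; the helper name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("keytool", "jarsigner"), path=None):
    """Return the names in `tools` that cannot be found on the search path.

    With path=None, shutil.which searches the PATH environment variable.
    """
    return [name for name in tools if shutil.which(name, path=path) is None]


missing = missing_tools()
if missing:
    # The Java SDK bin directory probably needs to be added to PATH.
    print("Not found on PATH:", ", ".join(missing))
else:
    print("keytool and jarsigner are both on PATH")
```

If either tool is reported as missing, adding the bin directory of the Java SDK to PATH (as described above) should resolve it.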

OAI-PMH

The Open Archives Initiative – Protocol for Metadata Harvesting is a powerful method for open-access sharing of metadata on the web (see tutorial).

I tested the OAI server under Greenstone 2.80 and it works fine. This is documented only very briefly in the OAI Demo documented example collection, but its operation is simple: one needs to have the Web Library – not the Local Library – running and to have previously edited the etc/oai.cfg file according to the instructions found in it. When this option is active, one or more specified collections serve OAI data to OAI harvesters while the normal web access to these collections continues as usual.
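To see what a harvester actually sends to such a server, here is a minimal sketch in Python that builds OAI-PMH request URLs. The verbs (Identify, ListRecords) and the metadataPrefix argument come from the OAI-PMH specification; the base URL is a placeholder and the function name oai_request_url is hypothetical, since the real address depends on how the Web Library is installed.

```python
from urllib.parse import urlencode

# Placeholder address (an assumption); replace with the OAI server URL of your own installation.
BASE_URL = "http://example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its optional arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository to describe itself.
print(oai_request_url(BASE_URL, "Identify"))
# Ask for all records in unqualified Dublin Core.
print(oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

Opening the first URL in a browser is a simple way to confirm that the server is responding before pointing a harvester at it.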

I also tested the OAI downloading function as presented in a tutorial on the wiki. This function, potentially very useful for collecting external documents for local Greenstone collections, makes use of the fact that, although OAI-PMH is formally designed only to share metadata, this metadata normally provides information on the location of the original document in the dc.identifier metadata field. But two major constraints were identified:

  • The provision of a simple URL in this field (as done in the “Rocky” collection at Virginia Tech used in the OAI Demo documented example collection) is not widespread; most OAI repositories provide a handle reference (DSpace) or the URL of a webpage containing a link to the original document (EPrints).
  • In Greenstone version 2.80, the metadata imported under OAI-PMH cannot be edited. This is justifiable in the sense that they were assigned by the original creator, but inconvenient if documents are to be integrated into a new special collection.
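The first constraint can be illustrated with a small Python sketch that reads the dc.identifier values from an unqualified Dublin Core record and guesses whether each one points directly at a document. The sample record is made up for illustration, and the suffix test is only an assumed heuristic; it is not how Greenstone itself decides what to download.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements used in oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Made-up record for illustration only.
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example report</dc:title>
  <dc:identifier>http://example.org/docs/report.pdf</dc:identifier>
  <dc:identifier>http://hdl.handle.net/0000/1</dc:identifier>
</oai_dc:dc>
"""

# Assumed heuristic: identifiers ending in one of these look like direct document links.
DOCUMENT_SUFFIXES = (".pdf", ".doc", ".ps", ".txt")


def identifiers(record_xml):
    """Return the text of every dc:identifier element in the record."""
    root = ET.fromstring(record_xml)
    return [el.text.strip() for el in root.iter(DC_NS + "identifier") if el.text]


def looks_like_direct_link(identifier):
    """Guess whether an identifier is the URL of the document itself."""
    lowered = identifier.lower()
    return lowered.startswith("http") and lowered.endswith(DOCUMENT_SUFFIXES)


for ident in identifiers(SAMPLE_RECORD):
    kind = "direct link" if looks_like_direct_link(ident) else "handle or landing page"
    print(ident, "->", kind)
```

Identifiers in the second category need an extra step (resolving the handle, or following a link on the landing page) before the original document can be retrieved.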

While I was at Waikato, David Bainbridge improved the OAI download facilities to recover the original documents in all of the above cases, and to convert the metadata to editable form if desired. These improvements will be included in version 2.81 of Greenstone.

Depositor

This undocumented function enables a remote user of a Greenstone web library to submit documents to a collection, and to assign metadata to them, through the web without installing Greenstone or GLI. One need only enable the depositor (by changing “disabled” to “enabled” in the main.cfg file in the etc directory); the Depositor can then be called from a button on the Greenstone home page.

This function should be very useful in creating institutional repositories with Greenstone. It will be documented in version 2.81 (careful: to test it now, you have to assign the user to the “colbuilder” group, even though this has now been replaced by “all-collections-editor” or “personal-collections-editor” for authentication in Greenstone).

Formatting Documents within GLI

If Greenstone users want to manage the formatting of documents in a collection, they are presently obliged to do it outside of GLI (either by reformatting the original document or by creating a formatted html document from the original). Anupama Krishnan has developed a prototype function enabling the user to convert the original document (e.g. in Word or pdf format) to html and subsequently edit it within GLI (for example to define section headings and sub-headings or to improve the style of presentation) before building the collection. This function, to be included in version 2.81, will enable users in many cases to reduce the size of their collections and/or improve the quality of presentation by eliminating the need to present both the original document for display and the html version for searching.

Greenstone3

I was able to install Greenstone3 without any difficulties. It currently performs most of the functions of Greenstone2. The main difference for the basic user is that the formatting language for displaying documents is different, and may appear, at least at first, more complicated than the formatting language of Greenstone2. Dave Nichols is preparing to develop a graphical user interface to facilitate the formatting process, but this will have to await the completion of the basic formatting interface. Given the substantial benefits of Greenstone3 for advanced programmers, and the substantial overhead in maintaining two versions, there is a consensus within the Greenstone team that Greenstone3 should be developed and stabilised as soon as possible to replace Greenstone2.

Updates and documentation

I was able to point out some shortcomings in the latest update (version 2.80):

  • Several of the language interfaces (including Malayalam, Tamil and Telugu) are not activated upon installation (the user should add them to the main.cfg file if needed).
  • Example collections not updated on Sourceforge (now fixed).

It was agreed that the checklist for issuing new versions should be tightened and more closely controlled for future distributions.

In addition we discussed ways to:

Collaboration with users

The Greenstone team consists overwhelmingly of faculty members who are doing research in the area of digital libraries. Some technical staff (one full-time and several part-time, including Ph.D. students) are available to support the research effort, including, as appropriate, helping to incorporate new research results into Greenstone, but resources to ensure support for the international Greenstone community are extremely modest. I participated, in some sense on behalf of the users, in discussions of the Greenstone team on how to improve user support and collaboration within the existing constraints.

The following ideas were expressed:

  • Users as well as developers should be encouraged to use the bug reporting system, which can be used to report interface presentation problems as well as technical problems.
  • The regional and linguistic user communities should be encouraged to participate more actively in helping users in their regions and beyond, while in turn the Greenstone team could more closely follow and support organised user efforts, especially in the developing countries (already Kathy Don is providing technical support for the southern African network, Anupama Krishnan for the South Asian network, and Anna Huang for the language interfaces, all with support from myself on the “soft” aspects).
  • The possibility of more closely involving institutions in developing countries in Greenstone research and development activities should be explored. For example, major research thrusts in digitisation of newspapers and in audio-visual collections could perhaps include the development and testing of relevant applications in developing countries.

3 Responses to “Waikato visit report from John Rose”

  1. K Rajasekharan Says:

    Dear Dr Rose,

    Thank you for your nice and informative description of problems, solutions and future course in brief.

    Best Regards,

    K Rajasekharan

  2. Misheck Nyaluso Says:

    Dear John,

    Your report on your visit and meeting with the Greenstone Team at the University of Waikato is very informative.
    The problems and solutions that you discussed with the Greenstone team will go a long way to enlighten the new users and help them appreciate the effort being made by the Greenstone developers to smooth out some bugs.

    However, from your report one is also left with an impression that if we are to help people learn and develop skills to use Greenstone effectively in our regions (i.e. Southern Africa), we need more people with a technical understanding of Greenstone to be able to bridge the knowledge and skill gap between the developers and the novice users in our regions.
    The very fact that some problems are posted on the users lists repeatedly shows that probably many of us don't have much time to devote to following what is happening on the lists. If that was happening, then probably many people would be learning from the solutions that are being suggested to those that posted problems.

    It is pleasing to learn that Kathy Don is providing technical support to us in the southern African network to improve user support and collaboration within the existing constraints. I guess she will be able to fill up where the tutorials on the wiki seem to be too brief in certain areas.

    Now that you have learnt a few more skills and discussed problems encountered and future improvements, I can only hope for more of your continued support and guidance on strategies to effectively reach and involve users in developing countries whose many problems evolve from lack of resources and relevant technical skills to fully benefit from the use of Greenstone to build digital collections.

    Lastly, I would also like to request those that have experience in scanning and OCRing to share it with us (both on hardware and software). This seems to be an exercise that people are taking lightly, only to realise later that it's a major aspect of the whole exercise of building digital collections.

    Best regards

    Misheck

  3. M.G. Sreekumar Says:

    Dear Prof. John:

    It is heartening to note your recent visit to Waikato has been very fruitful and productive. You have practically touched upon almost every pressing issue Greenstone has been confronting and found a way out in addressing them with the help of the Greenstone team or by initiating action plans towards meeting them soon.

    Your interventions from the perspective of a user as well as a volunteer were very useful and timely in giving the Greenstone team a true feedback on these issues.

    We are all eagerly looking forward to the much more powerful Greenstone3 soon becoming a DL production software.

    Kudos to Prof. Ian and team for that wonderful farsightedness in Greenstone. And also to the numerous volunteers lined up in strengthening it for the cause of science, humanity or philanthropy. Only when we dive deep into Greenstone do we understand its real depth…

    Affly,
    Sreekumar
    UNESCO Coordinator, Greenstone Support, South Asia