Archive for March, 2012

Sam’s Greenstone Blog 30/3/2012

admin. Friday, March 30th, 2012.

Just a short entry today: this week has mostly consisted of work on minor features and fixes. The major new features are now largely done, and we are nearing the final testing stage before we can release Greenstone 3.

Anu’s blog entry for 5 March – 23 March

ak19. Friday, March 23rd, 2012.

The first two weeks involved:

  • generating some files for translation of the Greenstone interface (Mongolian, Bhutanese) and committing changes translators had submitted (Laotian)
  • fixing up the GS2 CORBA code, including bringing it up to speed with the rest of GS2’s runtime code, so that CORBA works again: it can now compile once more, and the corbaserver and corbarecptldd client program run well against each other when on the same machine. Running the server against the client in a remote situation does not yet work, but it did not work in the demo/hello-1 example of the now-updated MICO package either.
  • fixing a remaining small error in the way the PDFBox extension tests for Java, which stopped the extension from working with JDK 1.7. The test for the presence of Java now has to run java -version rather than just java, since the return value in Java 7 is different from that in Java 6.
  • when testing the Powerpoint plugin, it was found that the OpenOffice extension needed to be corrected to make jodconverter use the same port as that which OpenOffice is run on. It was moreover discovered that users can’t already have the graphical user interface of OO running in the background, nor can they start this, during Greenstone’s processing of documents using the OO extension.
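
As an illustration of the Java test described in the list above, here is a minimal sketch in Python of checking for Java by running java -version and looking at the exit status. The extension's own test is not written in Python and is not shown in this post; the function name `java_available` is made up for this example.

```python
import shutil
import subprocess


def java_available(java_cmd="java"):
    """Return True if `<java_cmd> -version` can be run and exits with status 0.

    Hypothetical helper for illustration only; not the extension's real code.
    """
    # If the command is not on the PATH there is nothing to run.
    if shutil.which(java_cmd) is None:
        return False
    try:
        result = subprocess.run(
            [java_cmd, "-version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,  # java prints its version banner on stderr
            check=False,
        )
    except OSError:
        return False
    # Rely on the exit status of `java -version`, not on that of a bare `java`.
    return result.returncode == 0
```

Calling `java_available()` gives True or False depending on the machine it runs on, so the result is best treated as a simple yes/no gate before the extension is used.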

This week:

  • there was an issue with Greenstone 3’s tomcat server crashing on 64 bit Linux owing to a Java segmentation fault caused by an error in the JNI code. Dr Bainbridge found out that the type used to store pointers to data structures shared between Java and C++ code needed to be long rather than int, so MG’s and MGPP’s JNI code was updated. The error has not returned since, but debugging code has been left in in case further debugging is required.
  • Dr Te Taka hoped to update the Maori translations for Greenstone’s interface using Google’s Translator Toolkit (GTT), and suggested that Greenstone’s translation process be expanded to allow this so that other translators too could benefit from the toolkit for translation if they wanted. He found out that the toolkit accepted an open XML format called TMX, Translation Memory eXchange, and thus would need the strings that required translation to be converted into the TMX XML format (rather than into the usual spreadsheet version in the .excel.xml format which we currently generate). Two new XSLT files have been written which Te Taka may kindly be testing for us: the first generates the TMX translation files that translators can load into Google’s Translator Toolkit. The second XSLT takes translated TMX files and converts them into an intermediary format that can be processed in the usual manner when submitting new and updated translations back into Greenstone.
  • currently looking at usersDB in GS3 having the correct values on startup.
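
To show why the int versus long distinction in the JNI fix above matters, here is a small sketch in Python that mimics storing a 64-bit address in a 32-bit signed value (like a Java int) and in a 64-bit signed value (like a Java long). The address is made up, and the helper `as_signed` is hypothetical; the real fix is in the MG and MGPP JNI code, which is not reproduced here.

```python
def as_signed(value, bits):
    """Keep only the low `bits` bits of `value`, read as a signed integer."""
    low = value & ((1 << bits) - 1)
    sign_bit = 1 << (bits - 1)
    return low - (1 << bits) if low & sign_bit else low


# A made-up address of the kind a 64-bit process might hand back.
pointer = 0x00007F3A12345678

as_java_int = as_signed(pointer, 32)   # upper 32 bits are lost
as_java_long = as_signed(pointer, 64)  # whole address survives

print(hex(as_java_int))   # 0x12345678
print(hex(as_java_long))  # 0x7f3a12345678
```

Handing the truncated value back to native code as if it were the original address would point at the wrong memory, which is consistent with the segmentation fault only showing up on 64 bit Linux.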

Update: did not get much further with the GS3 usersDB as there was a lot more to be done with the translation files for GTT and their processing. The process became clearer thanks to Te Taka’s explanations and his testing at each stage. TMX files will only be needed the first time a translator migrates from GS’s usual translation procedure, which makes use of Excel spreadsheet files, to Google’s toolkit. The TMX file will start them off with all the up-to-date translated strings that are available so far in GS3 for the selected language. For the strings that need to be translated and updated, the translator will get a text file that contains the Unicode spreadsheet data (as comma separated values, but the file will have a .txt extension instead of .csv in order to preserve the Unicode). The translator will then copy the English and <Language> columns of the spreadsheet into the GTT. Once their translation work is done, they can send these same columns back by way of the same spreadsheet.
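
For readers who have not seen TMX before, here is a rough sketch of what such a file contains, built in Python rather than with the XSLT files we actually use. The function name `build_tmx`, the header values and the sample string pair are all made up for illustration, and the layout follows my understanding of TMX 1.4 (a header followed by translation units, each holding one segment per language).

```python
import xml.etree.ElementTree as ET

# ElementTree writes this attribute out as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, source_lang, target_lang):
    """Build a small TMX document from (source, target) string pairs.

    Hypothetical helper: header values below are placeholders.
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "unknown",
        "adminlang": source_lang,
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source_text, target_text in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per string
        for lang, text in ((source_lang, source_text), (target_lang, target_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Illustrative pair only, not taken from Greenstone's interface strings.
print(build_tmx([("Hello", "Kia ora")], "en", "mi"))
```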

Sam’s Greenstone Blog 23/3/2012

admin. Friday, March 23rd, 2012.

This week I have added the ability for users to register themselves for a Greenstone 3 site. To register, a user must provide their username, password and email address (we may add more fields or the ability for an administrator to add custom fields) as well as match two words from an image (to make sure they’re not a bot). Users can now also modify their account settings themselves, and next week we will probably look into adding the ability for Greenstone 3 to email the users (to confirm registration or to reset their passwords).

The paged-image widget I have discussed in the past also received several upgrades: pages can now be filtered based on their titles, and the widget now shows which page you are currently on. We are also planning to allow number ranges to be used as a filter (e.g. typing “24-37” will show all the pages from page 24 to page 37).
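
The filtering idea can be sketched roughly as follows. This is Python for illustration only, not the widget's own code, and the function name `filter_pages` and the (number, title) page representation are assumptions made for the example.

```python
import re

# Matches queries such as "24-37" (optional spaces allowed around the dash).
_RANGE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*$")


def filter_pages(pages, query):
    """Filter a list of (page_number, title) tuples.

    A query like "24-37" selects by page number; anything else is treated
    as a case-insensitive match on the page title.
    """
    match = _RANGE.match(query)
    if match:
        # sorted() means "37-24" behaves the same as "24-37".
        first, last = sorted((int(match.group(1)), int(match.group(2))))
        return [page for page in pages if first <= page[0] <= last]
    needle = query.strip().lower()
    return [page for page in pages if needle in page[1].lower()]


pages = [(n, "Page %d" % n) for n in range(1, 51)]
selected = filter_pages(pages, "24-37")
print(selected[0], selected[-1])  # (24, 'Page 24') (37, 'Page 37')
```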

I have made a list of all the things we need to complete before we can release Greenstone 3. It’s gradually getting smaller, so hopefully it won’t be much longer.

Sam’s Greenstone Blog 16/3/2012

admin. Friday, March 16th, 2012.

With the new user-login capability of Greenstone3 I have been creating and improving various features that relate to this capability. For example, the previous administration capabilities (the ability for admin users to add/edit/remove users) were not very secure and have now had an overhaul to properly connect with the servlet security method I described in a previous post. By itself, however, trying to use the current administration capabilities to manage a large-scale Greenstone 3 installation with many collections and many users would be a difficult task, as each user would need to be added/modified by an administrator. To help with this we will create the ability for users to register themselves and to change their own basic settings (password and details). The more powerful options such as group assignment will still be the job of an administrator.

We have also made significant progress on the RESTful URL feature. Servlets can have filters that requests are sent to before they reach the servlet itself, and this is what we use to provide this functionality. The filter examines the URL before it reaches the servlet and digs out any parameters that have been written in the RESTful form (for example, it will set c=demo from
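
The general idea of pulling parameters out of a URL path can be sketched as below. This is Python rather than a Java servlet filter, and the URL layout is purely hypothetical: the `/collection/<name>` pattern, the `PATH_KEYWORDS` mapping and the function name `extract_params` are assumptions for illustration, not the form Greenstone 3 actually uses.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical mapping from a path keyword to the short parameter name.
PATH_KEYWORDS = {"collection": "c"}


def extract_params(url):
    """Merge ordinary query parameters with ones written into the path."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    segments = [segment for segment in parts.path.split("/") if segment]
    # Walk adjacent path segments as (keyword, value) pairs.
    for keyword, value in zip(segments, segments[1:]):
        if keyword in PATH_KEYWORDS:
            # In this sketch an explicit query parameter wins over the path.
            params.setdefault(PATH_KEYWORDS[keyword], value)
    return params


print(extract_params("http://localhost:8080/library/collection/demo?a=q"))
# {'a': 'q', 'c': 'demo'}
```

Doing this in a filter means the servlet itself can keep reading parameters as it always has, whichever way the URL was written.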

Sam’s Greenstone Blog 2/3/2012

admin. Friday, March 2nd, 2012.

This week has had a rather exciting development that several people have been wanting for quite a long time. The 64-bit compatible versions of MG, MGPP and GDBM have been added to the main code, meaning that Greenstone 2 and 3 can now compile successfully on 64-bit systems. The reason this has taken a long time is that the 32-bit and 64-bit versions of MG and MGPP produced seemingly different files when run over the same documents. This was concerning for us, as people might want to move their 32-bit MG/MGPP collections over to a 64-bit Greenstone installation, and we suspected that this might not work given the different files. This week we discovered the cause of the difference and are now reassured that files from 32-bit and 64-bit installations can be interchanged without issue.

This week has seen more upgrades to Greenstone 3 as well. One of the features we have been working on for the Pei Jones collection is the ability to zoom “screen” images by using the mouse like a magnifying glass. We have added this into the default Greenstone 3 capabilities. For this to work, however, there needs to be both a “screen” (small) and a “source” (usually larger) version of the same image.

In general Greenstone 3 now handles paged-images much better. They are now properly displayed at the top of their specific sections. There is also an option to change between text-only, image-only and the default text and image modes, which is available in both the paged style collections as well as normal hierarchy style collections.

Next week will most likely involve more improvements like this as we continue to prepare Greenstone 3 for release.