
Archive for the ‘Greenstone2’ Category

Anu’s entry for the week of 7-11 May 2012

ak19. Saturday, May 12th, 2012.

Over the week, have been working on the activate.pl script (and things that it needs). The details are at http://trac.greenstone.org/ticket/825

The latest changes made today still need to be retested against GS3 on Windows.

Still need to test the entire process on Linux.

Anu’s entry for the weeks of 23 Apr - 4 May 2012

ak19. Friday, May 4th, 2012.
  • At the start of last week, finished off the task of the GS3 “debuginfo” button that now appears next to the login button.
  • The Greenstone tutorial xml files can now include a MajorVersion element with number attribute to specify if the instructions are for GS3 or GS2 and will get processed by the XSLT to display or hide such elements depending on the active version.
  • Joshua Scarsbrook discovered two bugs compiling GS3 on a Mac and has helped us fix these (but one of the fixes still needs to be tested on his machine). Unfortunately there were some issues with setting the Java preferences on my account on the Mac here. At present, GS3 can’t be compiled there because it requires Java 1.6.
  • After Dr Bainbridge fixed error handling and display of the PDFBox Extension, it became easier to debug a PDFBox Extension bug discovered by a member on the mailing list. She helped us to track it down and it turned out that the PDFBox extension did not try to first look for and use any JRE included in a GS2 binary when running the java -version test.
  • While trying to work out why searching 3 digit numbers crashed the server (when Diego wanted to try the ifl=1 parameter to the GS2 URL), I first found and tracked down a very troublesome bug that I had accidentally introduced into GS2. The documents in browse or search results would not display and their URLs looked strange (with the word handle in their path). It turned out that in January, I’d committed the -DDOCHANDLE option to CXXFLAGS in a win32.mak file that was meant for the experimental work Dr Bainbridge and Diego had been doing with REST URLs. I meant to commit only the RSS support code they had written. Dr Bainbridge then fixed the bug Diego had originally noticed to do with the ifl parameter.
  • Did some translation work and looked at a few mailing list questions.
  • Have now started work on activate.pl, which should perform in Perl the task that GLI currently does: stopping the GS2 or GS3 server, moving the building folder to index, and restarting the server again.
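The activation step in the last bullet can be sketched roughly as follows. This is a Python stand-in for illustration only, not the activate.pl code itself: the function name, the stop/start callbacks and the folder handling are all assumptions, and the real script has to deal with GS2/GS3 differences that are ignored here.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def activate_collection(collection_dir: Path,
                        stop_server: Callable[[], None],
                        start_server: Callable[[], None]) -> None:
    """Swap a freshly built 'building' folder into place as 'index'.

    stop_server and start_server are hypothetical callbacks standing in
    for whatever actually stops and restarts the GS2 or GS3 server.
    """
    building = collection_dir / "building"
    index = collection_dir / "index"
    if not building.is_dir():
        raise FileNotFoundError(f"no building folder in {collection_dir}")

    stop_server()  # so the server is not reading the old index during the swap
    try:
        if index.exists():
            shutil.rmtree(index)  # discard the previous index
        shutil.move(str(building), str(index))
    finally:
        start_server()  # bring the server back up even if the move failed


if __name__ == "__main__":
    # Tiny demonstration on a throwaway folder.
    demo = Path(tempfile.mkdtemp())
    (demo / "building").mkdir()
    activate_collection(demo,
                        lambda: print("stopping server"),
                        lambda: print("starting server"))
```

Passing the stop and start actions in as callbacks keeps the folder swap itself independent of which server (GS2 or GS3) is running.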

Anu’s blog entry for 5 March - 23 March

ak19. Friday, March 23rd, 2012.

The first two weeks involved:

  • generating some files for translation of the Greenstone interface (Mongolian, Bhutanese) and committing changes translators had submitted (Laotian)
  • fixing up the GS2 CORBA code, including bringing it up to speed with the rest of GS2’s runtime code, so that CORBA works again: it can now compile once more, and the corbaserver and corbarecptldd client program run well against each other when on the same machine. Running the server against the client in a remote situation does not yet work, but it did not work in the demo/hello-1 example of the now-updated MICO package either.
  • there was still a small error in the way the PDFBox extension tests for Java when Java is version 1.7 that made the extension not work with JDK1.7. The test for the presence of Java now has to run java -version rather than just java, since the return value in Java 7 is different from that in Java 6.
  • when testing the PowerPoint plugin, it was found that the OpenOffice extension needed to be corrected to make jodconverter use the same port as the one OpenOffice is run on. It was moreover discovered that users can’t already have the graphical user interface of OO running in the background, nor can they start it, while Greenstone is processing documents using the OO extension.
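The "java -version" test mentioned in the list above could look roughly like this. It is a Python sketch of the idea, not the extension's actual code: the function names are made up for illustration, and it relies on the fact that Java prints its version banner (typically to stderr) when run with -version.

```python
import re
import subprocess
from typing import Optional, Tuple

# Matches banners such as: java version "1.7.0_03"  or  openjdk version "11.0.2"
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def parse_java_version(banner: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from a 'java -version' banner, or None."""
    match = _VERSION_RE.search(banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def java_available(java_cmd: str = "java") -> bool:
    """Check for Java by running '<java_cmd> -version' instead of bare 'java'.

    java_cmd is a parameter so that a JRE bundled with a Greenstone binary
    could be tried first by passing its full path (an assumption about how
    a caller would use this sketch).
    """
    try:
        result = subprocess.run([java_cmd, "-version"],
                                capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    banner = result.stderr + result.stdout
    return result.returncode == 0 and parse_java_version(banner) is not None
```

Checking the banner as well as the exit status avoids depending on what a bare "java" call returns, which is the part that differed between Java 6 and Java 7.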

This week:

  • there was some issue with Greenstone 3’s tomcat server crashing on 64 bit Linux owing to a Java segmentation fault created by an error in the JNI code. Dr Bainbridge found out that the number of bytes to store pointers to data structures shared between Java and C++ code needed to be long rather than int, so MG’s and MGPP’s JNI code was updated. The error has not returned since, but debugging code has been left in for future debugging if required.
  • Dr Te Taka hoped to update the Maori translations for Greenstone’s interface using Google’s Translator Toolkit (GTT), and suggested that Greenstone’s translation process be expanded to allow this, so that other translators could also benefit from the toolkit if they wanted. He found out that the toolkit accepts an open XML format called TMX (Translation Memory eXchange), so the strings that require translation would need to be converted into TMX (rather than into the usual spreadsheet versions in the .excel.xml format which we currently generate). Two new XSLT files have been written which Te Taka may kindly be testing for us: the first generates the TMX translation files that translators can load into Google’s Translator Toolkit. The second XSLT takes translated TMX files and converts them into an intermediary format that can be processed in the usual manner when submitting new and updated translations back into Greenstone.
  • currently looking at usersDB in GS3 having the correct values on startup.
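The int versus long problem in the JNI bullet above comes down to pointer width: on 64 bit Linux a native pointer is 8 bytes, while a Java int holds only 4. A small Python illustration follows; the address is a made-up example value, and the helper names are hypothetical.

```python
import ctypes


def as_jint(pointer_value: int) -> int:
    """Store a pointer in a 32-bit signed integer, as a Java int would."""
    return ctypes.c_int32(pointer_value).value


def as_jlong(pointer_value: int) -> int:
    """Store a pointer in a 64-bit signed integer, as a Java long would."""
    return ctypes.c_int64(pointer_value).value


if __name__ == "__main__":
    address = 0x00007F3A1C2B4D50  # made-up 64-bit address for illustration
    print(hex(as_jint(address)))   # upper 32 bits are lost
    print(hex(as_jlong(address)))  # full address survives
```

Once the upper half of the address is lost, the C++ side dereferences the wrong location when the value is handed back, which is consistent with the segmentation fault described.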

Update: did not get much further with the GS3 usersDB as there was a lot more to be done with the translation files for GTT and their processing. The process became clearer thanks to Te Taka’s explanations and his testing at each stage. TMX files will only be needed the first time a translator migrates from GS’s usual translation procedure, which makes use of Excel spreadsheet files, to Google’s toolkit. The TMX file will start them off with all the up-to-date translated strings that are available so far in GS3 for the selected language. For the strings that need to be translated and updated, the translator will get a text file that contains the Unicode spreadsheet data (as comma separated values, but the file will have a .txt extension instead of .csv in order to preserve the Unicode). The translator will then copy the English and <Language> columns of the spreadsheet into the GTT. Once their translation work is done, they can send these same columns back by way of the same spreadsheet.
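To give an idea of what a TMX file contains, here is a minimal sketch that builds one from pairs of English and translated strings. It uses Python in place of the XSLT described above, so it is not the stylesheet itself; the header values (such as the creationtool name) and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this attribute out as xml:lang
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, target_lang, source_lang="en"):
    """Build a TMX document from (source_text, target_text) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "greenstone-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",         # hypothetical version
        "segtype": "sentence",
        "o-tmf": "unknown",
        "adminlang": source_lang,
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source_text, target_text in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per string
        for lang, text in ((source_lang, source_text), (target_lang, target_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    print(build_tmx([("Search", "Rechercher")], target_lang="fr"))
```

Each tu element pairs the English string with its existing translation, which is what lets the toolkit start a translator off with the strings already translated.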

Sam’s Greenstone Blog 2/3/2012

sjm84. Friday, March 2nd, 2012.

This week has had a rather exciting development that several people have been wanting for quite a long time. The 64-bit compatible versions of MG, MGPP and GDBM have been added to the main code, meaning that Greenstone 2 and 3 can now compile successfully on 64-bit systems. The reason this has taken a long time to be done is that the 32-bit and 64-bit versions of MG and MGPP produced seemingly different files when run over the same documents. This was concerning for us, as people might want to move their 32-bit MG/MGPP collections over to a 64-bit Greenstone installation and we suspected that this might not work given the different files. This week we discovered the cause of the difference and are now reassured that files from 32-bit and 64-bit installations can be interchanged without issue.

This week has seen more upgrades to Greenstone 3 as well. One of the features we have been working on for the Pei Jones collection is the ability to zoom “screen” images by using the mouse like a magnifying glass. We have added this into the default Greenstone 3 capabilities. In order for this to work however there needs to be a “screen” (small) and “source” (usually larger) version of the same image.

In general Greenstone 3 now handles paged-images much better. They are now properly displayed at the top of their specific sections. There is also an option to change between text-only, image-only and the default text and image modes, which is available in both the paged style collections as well as normal hierarchy style collections.

Next week will most likely involve more improvements like this as we continue to prepare Greenstone 3 for release.

Anu’s entry for the week ending 2 Dec 2011

ak19. Friday, December 2nd, 2011.

Continued on the problem that I thought had been almost resolved last week: getting the batch files in GS2 to handle not only spaces but also brackets in the Greenstone filepaths. The batch files were done, but the perl code needed some correcting too. After inspecting many files in order to see whether they needed correcting, the GS2 code seems to work well on Windows even where Greenstone is installed in a path containing brackets.

This week, I was finally able to return to the problem of jodconverter not interacting well with LibreOffice on Ubuntu 11, whereas the same worked perfectly against OpenOffice on the CentOS machine. We decided that perhaps OpenOffice had different behaviour for the signals sent by jodconverter. Installing OpenOffice turned out harder than expected and I think I botched it. I ended up having to uninstall all openoffice files and libreoffice files and then reinstalled all of libreoffice. At this stage, upon trying jodconverter again, it was found to work fine each time. This seemed to confirm the suspicion that some updates to Ubuntu may have messed up some libraries or something, breaking LibreOffice a little.

However, despite things now working again, Sam wondered, very correctly, whether a user’s experience would be this convoluted or whether it would work straight away for them. He suggested trying out a VM of Ubuntu 11, which is what I did. It was my first VM installation, and after installing an Ubuntu 11.10 VM (which comes with LibreOffice) on Sam’s Windows 7 machine, Greenstone with the open-office extension fortunately worked fine on a sequence of Word documents.

On Friday, got round to Diego’s long-standing question at last: about the possibility of a single metadata.xml at the import level which defines the metadata for all files in import’s subfolders. Dr Bainbridge had already confirmed earlier that this was indeed possible, but the question was how the metadata.xml ought to specify the path to the files in the subfolders, especially if there were spaces in the path. After a series of incremental tests, it was found to be possible even then, and the solution rather straightforward. Hopefully it will work for Diego also.
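As a rough illustration of what such a top-level metadata.xml might contain, the sketch below generates one from a mapping of relative file paths to metadata. It assumes the usual DirectoryMetadata / FileSet / FileName / Description layout, assumes that FileName is matched as a regular expression against the path relative to the import folder, and uses made-up paths and values; it is not necessarily the solution arrived at in the tests above.

```python
import re
import xml.etree.ElementTree as ET


def build_metadata_xml(entries):
    """entries maps a path relative to import/ to a {metadata name: value} dict."""
    root = ET.Element("DirectoryMetadata")
    for rel_path, metadata in entries.items():
        fileset = ET.SubElement(root, "FileSet")
        # Assumption: FileName is a regular expression, so characters such as
        # '.', spaces and brackets in the path are escaped to match literally.
        ET.SubElement(fileset, "FileName").text = re.escape(rel_path)
        description = ET.SubElement(fileset, "Description")
        for name, value in metadata.items():
            ET.SubElement(description, "Metadata", name=name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical subfolder and file names, with spaces in the path.
    print(build_metadata_xml({
        "annual reports/report 2011.pdf": {"dc.Title": "Annual Report 2011"},
    }))
```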

There was some translation work, and a few further questions on the mailing list to look at, before I finally got round to considering Michael Goodwin’s complex question on the setup.exe generated by an Export To CD-ROM operation failing on Windows 7 on 64 bit. A preliminary successful test on a Windows 7 machine turned out to be misleading: I had assumed it was a 64 bit machine but it turned out to be 32 bit after all. I will have to get back to trying this out next week. All this fine-tuning is bound to pay off in the upcoming perfected release of Greenstone 2: version 2.86.

Anu’s entry for week ending 26 Nov 2011

ak19. Monday, November 28th, 2011.

For the last two weeks, I was mainly learning the practical side of how to handle the Greenstone translations: mainly how to generate the spreadsheets for translators to use, though there was also the opportunity to learn how to handle translated spreadsheets. Besides that, I had a go at answering some questions on the mailing list and uploaded the updates to the ACKU and AREU collections.

On the final 3 days, got round to working on getting the batch files in GS2 to handle not only spaces but also brackets in the Greenstone filepaths. There is still a final problem to resolve before the changes can be committed, but the Greenstone web server is now back to working again, despite Greenstone being installed in a path with brackets (and spaces). There’s even some allowance made in the makegs2.bat script–which is used to compile up GS2–to get apache to compile up even in those instances of there being spaces or brackets in the filepaths it works with. Fortunately, the change could be made in the makegs2.bat itself: it sets the command prompt in which Greenstone is being compiled up to be in short-filenames mode. This then is the situation that the apache compile scripts inherit also, making any space/bracket in the long pathname irrelevant.

Anu’s blog entry for the week ending 11 Nov 2011

ak19. Friday, November 11th, 2011.

As several people had encountered issues in the recent 2.85 release, a lot of this week was spent looking at them so that we can get 2.86 out as soon as possible.

The bugs and oversights are not fatal and work-arounds are possible:

1) If you don’t have the PDF-box extension for Greenstone installed already, GLI will suggest where it can be obtained from. However, the URL it provides points to an older version of the PDF-box extension, which happens to be one that’s not functional. If you want the version of PDF-Box that works with 2.85, get it from

http://trac.greenstone.org/browser/main/tags/2.85/gs2-extensions/pdf-box/trunk/pdf-box-java.tar.gz

or http://trac.greenstone.org/browser/main/tags/2.85/gs2-extensions/pdf-box/trunk/pdf-box-java.zip

2)  The Greenstone demo collection in 2.85 contains HTML files that can’t get converted into XML properly enough to work well with the flash file generated by the Realistic Book feature. So if you’re thinking of testing out the realistic book option of the HTMLPlugin against the HTML files included in the Greenstone demo collection, rather than against your own HTML files, get the improved demo collection from SVN at http://svn.greenstone.org/main/trunk/greenstone2/collect/demo

3) On Vista, if your Greenstone is installed in a path containing brackets, such as “Program Files (x86)” as can happen on Windows 7 machines, then launching Greenstone is likely to fail. On Windows, spaces in Greenstone’s installation path are okay, but brackets aren’t handled well enough yet. This will be fixed in a future release of Greenstone 2.

4) The fourth bug is more serious in that there is no work-around. It was found by a member on the mailing list when he was using the Datelist Classifier and discovered that references to [ex.srclink] or [srclink] in his Format statements did not get resolved to the URL of the source file. (However, the default browsing classifiers had no problem with such Format statements and would display the correct URL.) This has now been fixed by Dr Bainbridge and will be present in the next release of Greenstone.

5) Another discovery is that Ubuntu now seems to have a problem with the open-office extension. This was not the case some two months back when, after a bugfix, the extension was tested on Ubuntu both here and by another dedicated member of the Greenstone family on his own Ubuntu machine. However, the new problem has been confirmed to exist, including when run from the command line, and even older versions of the Greenstone extension are performing similarly despite having worked at one point. Perhaps this has something to do with updates on Ubuntu, but we’ll be investigating it further.

Official Greenstone 2.85 released!

ak19. Friday, November 4th, 2011.

At last, we did it. After a lot of testing, bug discovery and fixing, we’ve finally released Greenstone 2.85. It should be much improved from 2.84. There were also some last minute changes from release candidate version 2.

Please do grab a binary for your operating system by visiting the download page at http://www.greenstone.org/download and start using it!

The Release Notes can be found at http://wiki.greenstone.org/wiki/index.php/2.85_Release_Notes

Greenstone 2.85rc2 (release candidate 2) released

ak19. Friday, October 28th, 2011.

There was a lot of testing going on in the last 2 months, and I forgot all about writing blog entries.

The first stage of testing was to go through the Greenstone tutorials on Windows (Vista), Linux (Ubuntu) and Mac (Leopard). Some bugs were discovered and fixed, and after that RC1 of GS2.85 could be released.

Thereafter, further tests were conducted on all three OS: combinations of the 3 indexers and 3 database types; processing of a range of file types, including the use of Greenstone’s PDFBox and OpenOffice extensions; filenames with different encodings and HTML files that interlink with each other using different encodings; the remote Greenstone server and the GLI applet; and spaces in the filepath on Windows. This time, the tests were conducted on Windows XP, Linux CentOS as well as Mac Leopard again. A lot of bugs had still got through the net after the first stage of testing, but were caught this time around and fixed for the release of GS2.85 RC2.

Greenstone 2.85 RC2 was finally released on Wednesday 26 October 2011. The Greenstone Team invites all those interested to please test the new release binaries out, which can be obtained from http://www.greenstone.org/snapshots, and write back on any bugs or issues encountered. The updated release notes are at http://wiki.greenstone.org/wiki/index.php/2.85_Release_Notes

The release notes already contain instructions on a patch for a minor issue that Diego discovered in the earlier release and which had persisted into the current one.

Sam’s Greenstone Blog 3/10/2011

sjm84. Monday, October 3rd, 2011.

Those who are eagerly awaiting the release of the final version of 2.85 will not have to wait much longer. Anu has been working hard testing it on each of the platforms we support and for the most part things are looking good. Any assistance in testing is always greatly appreciated and if you would like to help us out then please download the 2.85 release candidate which is available here. If you find any problems then email us at greenstone_team@cs.waikato.ac.nz and let us know. The more you can tell us about the issue the better.

Work on the Document Basket functionality continues to go well. I am in the initial stages of connecting the front-end JavaScript to the Java back-end. To transmit the operations we are using JSON (rather than XML) as it is very simple to write in JavaScript and we have found a good Java library (gson) that converts JSON back into an object. So hopefully this week we will start seeing some promising results.
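The JSON round trip described above can be pictured with a small sketch. It is written in Python purely for illustration (the real code is JavaScript on the front end and Java with gson on the back end), and the operation fields shown are hypothetical, not the actual Document Basket message format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BasketOperation:
    """A hypothetical Document Basket operation; field names are assumptions."""
    action: str
    doc_id: str
    collection: str


def encode_operation(op: BasketOperation) -> str:
    """What the front end would send: the operation as a JSON string."""
    return json.dumps(asdict(op))


def decode_operation(payload: str) -> BasketOperation:
    """What the back end would do: turn the JSON back into an object."""
    return BasketOperation(**json.loads(payload))


if __name__ == "__main__":
    sent = encode_operation(BasketOperation("add", "HASH0123", "demo"))
    print(sent)
    print(decode_operation(sent))
```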
