Archive for December, 2011

Anu’s entry for the week ending 2 Dec 2011

ak19. Friday, December 2nd, 2011.

Continued on the problem that I thought had been almost resolved last week: getting the batch files in GS2 to handle not only spaces but also brackets in the Greenstone filepaths. The batch files were done, but the Perl code needed some correcting too. After I had inspected many files to see whether they needed correcting, the GS2 code now seems to work well on Windows even where Greenstone is installed in a path containing brackets.

This week, I was finally able to return to the problem of jodconverter not interacting well with LibreOffice on the Ubuntu 11 machine, whereas the same setup worked perfectly against OpenOffice on the CentOS machine. We decided that perhaps OpenOffice responded differently to the signals sent by jodconverter. Installing OpenOffice turned out to be harder than expected and I think I botched it. I ended up having to uninstall all OpenOffice and LibreOffice files and then reinstall all of LibreOffice. At this stage, upon trying jodconverter again, I found that it worked fine each time. This seemed to confirm the suspicion that some updates to Ubuntu may have messed up some libraries, breaking LibreOffice a little.

However, despite things now working again, Sam wondered, very correctly, whether a user’s experience would be this convoluted or whether it would work straight away for them. He suggested trying out a VM of Ubuntu 11, which is what I did. It was my first VM installation. After I installed an Ubuntu 11.10 VM (which comes with LibreOffice) on Sam’s Windows 7 machine, Greenstone with the open-office extension fortunately worked fine on a sequence of Word documents.

On Friday, I got round to Diego’s long-standing question at last: about the possibility of a single metadata.xml at the import level which defines the metadata for all files in import’s subfolders. Dr Bainbridge had already confirmed earlier that this was indeed possible, but the question was how the metadata.xml ought to specify the path to the files in the subfolders, especially if there were spaces in the path. After a series of incremental tests, I found that it was still possible and that the solution was rather straightforward. Hopefully it will work for Diego also.

There was some translation work, and a few further questions on the mailing list to look at, before I finally got round to considering Michael Goodwin’s complex question on the setup.exe generated by an Export To CD-ROM operation failing on 64-bit Windows 7. A preliminary successful test on a Windows 7 machine turned out to be misleading: I had assumed it was a 64-bit machine but it turned out to be 32-bit after all. I will have to get back to trying this out next week. All this fine-tuning is bound to pay off in the upcoming perfected release of Greenstone 2: version 2.86.

Sam’s Greenstone Blog 2/12/2011

admin. Friday, December 2nd, 2011.

This week I have been tidying up the new paged-image functionality so that it dynamically loads each page (rather than doing a full page reload each time). I also added functionality that allows the user to choose from “Text view” (which only shows the OCR’d text), “Image view” (which shows the original image) and “Default view” (which shows both the text and the image). The views are also switched dynamically, which is nice, and the chosen view is remembered if you leave a document page and go to a new one.

I also fixed up an annoying problem with GLI. One of the ways you can customise collections in Greenstone 3 is by writing Javascript in the collectionConfig.xml file, and those familiar with XML will know that you cannot put ‘&’ or ‘<’ directly into text nodes, and that ‘>’ is normally escaped as well (you replace them with &amp;, &lt; and &gt; respectively). These special characters are relatively common in Javascript, so each time they are used they have to be escaped. The problem we were having with GLI was that it would read in the file and replace the escaped sequences with their usual forms (&, < and >), but when it went to save the file it wouldn’t escape these characters again. So the next time this file was read in, GLI would produce an error because the file was no longer valid XML. We eventually tracked this problem down and fixed it.
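To illustrate the round trip described above, here is a minimal sketch in Python (not GLI's actual Java code). The `script` element name and the sample Javascript are made up for the example; only the standard library's `escape` and `ElementTree` are used. It shows that writing a text node back out without escaping produces a file that no longer parses, while escaping on save keeps the file well-formed and preserves the original text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical Javascript of the kind that might sit in a config file.
script = "if (a < b && b > 0) { run(); }"

def save_unescaped(text):
    """Buggy save: writes the text node verbatim (element name is made up)."""
    return "<script>" + text + "</script>"

def save_escaped(text):
    """Fixed save: escapes &, < and > before writing the text node."""
    return "<script>" + escape(text) + "</script>"

def is_well_formed(xml_text):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def round_trip(text):
    """Save with escaping, parse again, and return the text the parser yields."""
    return ET.fromstring(save_escaped(text)).text

print(is_well_formed(save_unescaped(script)))  # the unescaped save is rejected
print(is_well_formed(save_escaped(script)))    # the escaped save parses
print(round_trip(script) == script)            # the Javascript survives intact
```

The parser un-escapes on read, so the save step is the one place where escaping has to be reapplied.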

Next week I will continue to work on the paged-image functionality (specifically the “next page” and “previous page” buttons) as well as adding some new code to HTMLPlugin that will add any files referred to in CSS files (e.g. background-image) as associated files of the HTML page.
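As a rough idea of what picking out files referred to in CSS could involve, here is a small Python sketch. It is only an illustration of the general technique and not the HTMLPlugin code, which is written in Perl; the function name, the regular expression and the sample stylesheet are all assumptions made for this example. It collects the targets of `url(...)` tokens and skips web addresses and inline data.

```python
import re

# Matches url(...) with optional single or double quotes around the target.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)

def referenced_files(css_text):
    """Return local file references from url(...) tokens, in order, no duplicates."""
    found = []
    for match in CSS_URL.finditer(css_text):
        target = match.group(2).strip()
        # Skip empty targets, inline data and absolute web addresses.
        if not target or target.startswith(("data:", "http://", "https://", "//")):
            continue
        if target not in found:
            found.append(target)
    return found

# Made-up stylesheet for the example.
sample_css = """
body { background-image: url("images/bg.png"); }
h1 { background: url(images/header.gif) no-repeat; }
.icon { background-image: url('images/bg.png'); }
.remote { background: url(http://example.com/x.png); }
"""

print(referenced_files(sample_css))  # ['images/bg.png', 'images/header.gif']
```

A real implementation would also need to resolve each reference relative to the location of the CSS file, which this sketch leaves out.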