Sam’s Greenstone Blog 18/11/2011

admin. Friday, November 18th, 2011.

This week was mostly spent fixing bugs. One bug we discovered a while ago was that the code that highlights search terms in the document text also matched terms inside tags (e.g. it would find the word farming inside the href attribute of <a href="farming.html">farming</a>). The fix was to exclude the characters inside tags from the highlight search: when the code sees a < character, it ignores everything until the next > character. You may be thinking “But what if there is a < in the document text?” This isn’t an issue, because any < or > in the document text that doesn’t belong to a tag will have been escaped as &lt; or &gt;.
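As a rough illustration of the tag-skipping idea (this is not the actual Greenstone code; the function name and the <b> markers are made up for the example), a minimal Python sketch might look like this. It assumes case-sensitive matching and that no attribute value contains a raw > character.

```python
def highlight_outside_tags(html: str, term: str,
                           open_mark: str = "<b>",
                           close_mark: str = "</b>") -> str:
    """Wrap occurrences of `term` in markers, ignoring text inside <...> tags.

    Hypothetical helper for illustration only.
    """
    if not term:
        return html
    out = []
    i = 0
    n = len(html)
    while i < n:
        if html[i] == "<":
            # Copy the whole tag through unchanged, up to and including '>'.
            end = html.find(">", i)
            if end == -1:
                end = n - 1  # unterminated tag: treat the rest as tag text
            out.append(html[i:end + 1])
            i = end + 1
        elif html.startswith(term, i):
            # A match in the document text: wrap it in the markers.
            out.append(open_mark + term + close_mark)
            i += len(term)
        else:
            out.append(html[i])
            i += 1
    return "".join(out)
```

Called on the link from the example above with the term farming, this leaves the href attribute alone and only wraps the link text. Escaped characters such as &lt; never start with a literal <, so they are treated as ordinary text.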

Another bug I fixed was in the Document Structure Editor. It always wiped the contents of any images in the collection being built, leaving empty files, although the XML files were preserved fine. The underlying cause was that the index directory was not being deleted correctly: the server still had the collection loaded in the runtime system (so that it can be viewed) while it tried to delete the collection’s index. The fix was to briefly deactivate the collection in the runtime system so that the newly built index can replace the old one.
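The deactivate, replace, reactivate sequence can be sketched roughly as follows. This is only an illustration in Python, not Greenstone’s own code: the deactivate and activate callbacks stand in for whatever the runtime system really does, and the directory names building and index are assumptions made for the example.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def collection_offline(deactivate, activate):
    """Run a block with the collection unloaded, then load it again."""
    deactivate()
    try:
        yield
    finally:
        activate()  # reactivate even if the replacement fails


def replace_index(collection_dir: Path, deactivate, activate) -> None:
    """Swap the newly built index in while the collection is deactivated."""
    index = collection_dir / "index"        # assumed name of the live index
    building = collection_dir / "building"  # assumed name of the new build
    with collection_offline(deactivate, activate):
        if index.exists():
            shutil.rmtree(index)  # nothing holds the files open now
        building.rename(index)
```

Putting the reactivation in a finally block means the collection comes back online even when the deletion or rename fails, so a failed build does not leave it unviewable.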

Another problem was with displaying paged-image collections. The system would only ever show the root section and the top-level sections, and nothing below them. I tracked this down to the top-level sections being marked as “leaf” nodes instead of “internal” nodes. Next week I will try to figure out whether this is a bug or was done deliberately.
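To show why a wrong node type hides everything below it, here is a small Python sketch. It assumes a display routine that only descends into nodes marked “internal”; the Section class and visible_titles function are hypothetical and not taken from Greenstone.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    node_type: str  # "internal" or "leaf"
    children: list[Section] = field(default_factory=list)


def visible_titles(node: Section, depth: int = 0) -> list[str]:
    """List the sections a viewer would show, indented by depth."""
    titles = ["  " * depth + node.title]
    # Only internal nodes are expanded; a leaf's children are never visited.
    if node.node_type == "internal":
        for child in node.children:
            titles.extend(visible_titles(child, depth + 1))
    return titles


page = Section("Page 1", "leaf")
chapter = Section("Chapter 1", "leaf", [page])  # mis-marked as a leaf
book = Section("Book", "internal", [chapter])
```

With the chapter marked as a leaf, visible_titles(book) stops at the chapter and never reaches the page. Changing the chapter’s node_type to “internal” makes the page appear.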

Also next week I will do some work on enabling a basic form of spatial searching (searching by locations) in any collections that contain documents with latitude and longitude information.

2 Responses to “Sam’s Greenstone Blog 18/11/2011”

  1. Michael Goodwin Says:

    I have had an issue with the index directory not deleting when I build a collection on a Windows 7 ultimate machine. Now I always delete the contents of the index directory prior to a build. I have also gotten into the habit of closing any open collection before I shut down Greenstone. I know I shouldn’t need to do this but it takes 4 or 5 hours to build a collection and if the index does not wipe properly it takes another 5 hours rebuilding it again. It will be nice when you fix this index bug.

  2. Mark B. Folnari Says:

    Here’s a bug for you — the JAVA bug. I have every known version of JAVA — including 1.4 installed, and neither 2.85 nor 3 will complete an install. Why can’t someone put up a message somewhere how to resolve this dilemma so I can get on with the business of building my DL?