It has been a long time since I last blogged about the Nepomuk user-query parser that I developed during the summer. The reason is that university has taken up all my time. Here is a quick blog post summarizing what has happened over the last few months.
Merging the Query Parser
For those who missed the news, Nepomuk has been "abandoned" and renamed to Baloo. The reason is that the Nepomuk developers thought that using RDF (a complete and complex system of ontologies that requires specialized databases like Virtuoso) wasn't the right way to provide what users want, that is to say a powerful yet lightweight search engine. RDF allows many things, but most users just want to look for documents matching a specific string.
I cannot explain the reasons any better, but you can read this mailing-list thread if you want more accurate details.
What is important is that Nepomuk has been rewritten, and that breaks my user-query parser. Fortunately, the parser is pretty much self-contained and the algorithms it uses don't need to change, but the way the parser provides its results to Nepomuk has changed. In Baloo, queries seem to be simpler (from what I've seen) and are not built out of trees anymore.
In Nepomuk, a query was a big AND, containing ORs, other ANDs and actual comparisons. Getting every document containing "test" and tagged as "important" was done using a query like AND(tagged_as(important), containing(test)). This syntax is not used anywhere in Nepomuk; it stands for the in-memory tree of Nepomuk2::Term subclasses.
In Baloo, from what I've seen in the source code, a query is more "fixed". Every query consists of search terms ("test" in my example) and filters (date-time filters, maybe also tags, etc.). There does not seem to be any recursive data structure. Because of that, the user-query parser cannot simply convert its abstract syntax tree to a tree of terms (a very simple operation), but needs to carefully analyze the user query in order to produce the Baloo query (or queries) that will provide the expected results.
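To make the difference more concrete, here is a minimal sketch of one possible way to turn a recursive AND/OR tree into one or more flat queries. Everything in it is hypothetical: Node, FlatQuery, leaf, group and flatten are names made up for this illustration, and none of them are Nepomuk2::Term subclasses or Baloo classes. The idea is simply that an AND merges the terms and filters of its children into the same flat query, while an OR produces several flat queries whose results would have to be combined afterwards.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a node of the parser's syntax tree.
struct Node {
    enum Kind { And, Or, Contains, TaggedAs } kind;
    std::string value;           // used by Contains and TaggedAs leaves
    std::vector<Node> children;  // used by And and Or groups
};

// Hypothetical flat query: plain search terms plus tag filters, no nesting.
struct FlatQuery {
    std::vector<std::string> searchTerms;
    std::vector<std::string> tags;
};

Node leaf(Node::Kind kind, std::string value)
{
    return Node{kind, std::move(value), {}};
}

Node group(Node::Kind kind, std::vector<Node> children)
{
    return Node{kind, std::string(), std::move(children)};
}

// Concatenate the terms and filters of two flat queries.
FlatQuery merge(const FlatQuery &a, const FlatQuery &b)
{
    FlatQuery result = a;
    result.searchTerms.insert(result.searchTerms.end(),
                              b.searchTerms.begin(), b.searchTerms.end());
    result.tags.insert(result.tags.end(), b.tags.begin(), b.tags.end());
    return result;
}

// Returns the list of flat queries whose combined results match the tree.
std::vector<FlatQuery> flatten(const Node &node)
{
    switch (node.kind) {
    case Node::Contains: {
        FlatQuery q;
        q.searchTerms.push_back(node.value);
        return {q};
    }
    case Node::TaggedAs: {
        FlatQuery q;
        q.tags.push_back(node.value);
        return {q};
    }
    case Node::Or: {
        // Each alternative becomes its own query (or queries).
        std::vector<FlatQuery> result;
        for (const Node &child : node.children) {
            std::vector<FlatQuery> sub = flatten(child);
            result.insert(result.end(), sub.begin(), sub.end());
        }
        return result;
    }
    case Node::And: {
        // Combine every query built so far with every query of the child.
        std::vector<FlatQuery> result{FlatQuery{}};
        for (const Node &child : node.children) {
            std::vector<FlatQuery> sub = flatten(child);
            std::vector<FlatQuery> combined;
            for (const FlatQuery &left : result)
                for (const FlatQuery &right : sub)
                    combined.push_back(merge(left, right));
            result = std::move(combined);
        }
        return result;
    }
    }
    return {};
}
```

With this sketch, the tree for AND(tagged_as(important), containing(test)) flattens to a single query with one search term and one tag filter. Replacing containing(test) with an OR of two words yields two flat queries, which is why I wrote "query (or queries)" above. Whether Baloo would actually need this kind of splitting is something I still have to check.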
I'm still thinking about all that and I will contact the Nepomuk developers for advice, because porting the user-query parser to Baloo does not seem that complex and looks very interesting.
GroupedLineEdit gets used by Subsurface
Free Software and its communities are full of great surprises. One of them was when Aaron Seigo quoted me on Google+, and another was when the Nepomuk file indexer for MS Office 2003 files, which I developed in October, was mentioned in the KDE Commit Digest.
Several weeks ago, just before FOSDEM, the Subsurface team announced that the project had been ported to Qt. Subsurface is a dive log program started by Linus Torvalds and now developed by a fair number of contributors, some of them well-known kernel hackers. When Subsurface 4 was announced, I read the news (even though I don't dive, I wanted to see how a GTK+ program could be ported to Qt and what new or improved features that brings), and I discovered a very intriguing screenshot in the user manual:
Have you noticed that there is a "Tags" field? And that it contains tags highlighted in colors? When I noticed that, I thought that Linus (or someone else) had developed exactly the widget that I needed during the summer, the widget that is used to highlight query terms in user queries, as shown in this image:
I downloaded the source code of Subsurface and started to look for this widget that does exactly what I want. Finally, I found it, and I was very surprised to see that it was actually my GroupedLineEdit widget that Subsurface used! I'm very pleased to see that something that took me several weeks to develop (the widget is less than 200 lines long, but I needed a great deal of time to figure out the nicest way to highlight text and how to implement it properly) is used, and used in a program started by none other than Linus Torvalds.
So, if anyone wants to develop something: don't hesitate to do it! Even if it is a small widget or a toy application, what you develop may be of use to someone else, and it's always a great pleasure to see that code we have developed gets used by other developers.