SQL or RDF? Thoughts on Tellico’s Next Backend

One of the main goals of Tellico’s development has been to be a simple application. I wanted to be able to keep track of my books without having to configure an SQL database, or create a schema, or worry about system daemons. To that end, while I thought about SQLite at the time (several years ago), I ended up writing Tellico to just store all its data in memory. The images are stored on disk, but all the field values for each entry are maintained in simple object containers (vectors and hashes…). The XML format is used only for serializing the data to save and reload.
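As a rough illustration of that arrangement, here is a minimal Python sketch: entries live in plain in-memory containers, and XML is used only to save and reload them. This is not Tellico's actual code or file format; the element names (collection, entry) and the sample fields are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory model: each entry is a dict of field name -> value.
collection = [
    {"title": "Dune", "author": "Frank Herbert", "pub_year": "1965"},
    {"title": "Neuromancer", "author": "William Gibson", "pub_year": "1984"},
]


def to_xml(entries):
    """Serialize the in-memory entries to an XML string (for saving)."""
    root = ET.Element("collection")
    for fields in entries:
        entry = ET.SubElement(root, "entry")
        for name, value in fields.items():
            ET.SubElement(entry, name).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Rebuild the in-memory entries from an XML string (for reloading)."""
    root = ET.fromstring(text)
    return [
        {child.tag: child.text for child in entry}
        for entry in root.findall("entry")
    ]


xml_text = to_xml(collection)
restored = from_xml(xml_text)
```

Everything between saving and reloading happens on the in-memory list, which is why there is nothing to configure, and also why very large collections cost memory and load time.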

In practice, I believe that has worked rather well. While I have received emails from folks who try to store 10,000 books in their database and find the performance lacking, by and large, I’ve seen many reviews note favorably that Tellico is simple and flexible to use and can be useful for the majority of people.

I do want to expand Tellico’s capabilities, however. One large goal is to get away from treating each collection as a flat list of entries. I want to be able to have books and movies in the same database, for example, and I want to be able to track TV episodes and seasons equally well. I want to be able to add information about authors and actors.

To that end, I need to rewrite Tellico’s backend. And in considering how I want to do that, I’ve come to a decision point about SQL vs. RDF.

Many highly visible KDE applications use SQL, such as Amarok, Digikam, and Akonadi. I just read a blog post about using the same MySQL instance for all three of those applications.

On the other hand, the Nepomuk framework in KDE provides an interface to an RDF database. Bangarang and KMail 2 are both heavily using Nepomuk.

So I’m trying to work up a pro/con list.

Portability

I want to say that SQL wins. Embedding or linking against SQLite means a typical user would never need to worry about database permissions, daemon persistence, or username and port settings. At the same time, for power users, the added work to make MySQL or PostgreSQL an option would be reasonable. Akonadi and Digikam have taken this approach, and up until recent versions, so had Amarok.
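To make the "no daemon, no permissions" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema is purely hypothetical (the entry, person, and credit tables are invented for this example, not a planned Tellico design), but it shows how an embedded database could hold books and movies side by side and link them to people.

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; a real application would
# pass a file path instead. Either way, no server process is involved.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one table for entries of any kind, one for people,
# and a link table recording who did what.
conn.executescript("""
CREATE TABLE entry (
    id    INTEGER PRIMARY KEY,
    kind  TEXT NOT NULL,      -- 'book', 'movie', ...
    title TEXT NOT NULL
);
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE credit (
    entry_id  INTEGER NOT NULL REFERENCES entry(id),
    person_id INTEGER NOT NULL REFERENCES person(id),
    role      TEXT NOT NULL,  -- 'author', 'actor', ...
    PRIMARY KEY (entry_id, person_id, role)
);
""")

conn.execute("INSERT INTO entry (id, kind, title) VALUES (1, 'book', 'Dune')")
conn.execute("INSERT INTO entry (id, kind, title) VALUES (2, 'movie', 'Dune')")
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Frank Herbert')")
conn.execute("INSERT INTO credit VALUES (1, 1, 'author')")
conn.commit()

# Join entries to the people credited on them.
rows = conn.execute("""
    SELECT e.kind, e.title, p.name, c.role
    FROM entry e
    JOIN credit c ON c.entry_id = e.id
    JOIN person p ON p.id = c.person_id
    ORDER BY e.id
""").fetchall()
```

The whole database is a single file (or, here, a block of memory) opened by the application itself, so there are no usernames, ports, or background services to set up.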

Using Nepomuk, on the other hand, requires the full Soprano and Virtuoso tool chain. Most KDE desktops are running Virtuoso at this point, I guess, but I don’t want to shut out the GNOME users out there. And on my underpowered development box with 1 GB of RAM, I can’t even use Strigi and Nepomuk.

Development Maturity

Here again, I think SQL wins. SQL (and to some extent, SQLite) is used in so many places, I know a significant amount of work has gone into optimizing and improving its efficiency. In other words, if the database access is slow, it’s very likely that the problem is due to my poor programming knowledge rather than a fundamental flaw. I don’t have that reassurance with RDF/SPARQL and Nepomuk. I know Nepomuk is improving, but looking at the bug reports and development fits and starts in the KDE code, it still seems a bit rocky.

SPARQL also has some weird semantics (at least in the 1.0 specification), such as blank nodes, a need for custom Insert/Replace behavior, and a lack of aggregate functions. SPARQL is still rather immature, in that sense.
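For contrast, both of those come built in on the SQL side. A minimal sketch with Python's sqlite3 module follows; the book table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT, author TEXT)"
)

# Placeholder keys and sample data, invented for this example.
books = [
    ("isbn-1", "Dune", "Frank Herbert"),
    ("isbn-2", "Dune Messiah", "Frank Herbert"),
    ("isbn-3", "Neuromancer", "William Gibson"),
]
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", books)

# Replace the row with the same primary key instead of failing on a duplicate.
conn.execute(
    "INSERT OR REPLACE INTO book VALUES (?, ?, ?)",
    ("isbn-2", "Dune Messiah (2nd ed.)", "Frank Herbert"),
)

# Built-in aggregate: number of books per author.
counts = conn.execute(
    "SELECT author, COUNT(*) FROM book GROUP BY author ORDER BY author"
).fetchall()

updated_title = conn.execute(
    "SELECT title FROM book WHERE isbn = 'isbn-2'"
).fetchone()[0]
```

Note that INSERT OR REPLACE is SQLite's spelling; other SQL engines offer similar upsert behavior under different syntax.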

Interoperability

I feel like I should include this factor. RDF seems to be a bit of a buzzword with the semantic database push lately. A SQL schema would largely be opaque, while the RDF store, assuming the use of common ontologies, would allow for future interoperability with other databases. This is all rather fuzzy, though, and there’s nothing that says I can’t have some sort of RDF export or translation from the SQL.
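An export along those lines might look something like the following Python sketch, which walks a SQL table and writes N-Triples lines using Dublin Core element URIs. The table layout and the example.org subject URIs are assumptions made up for the example; only the Dublin Core namespace is a real, commonly used vocabulary.

```python
import sqlite3

DC = "http://purl.org/dc/elements/1.1/"     # Dublin Core elements namespace
BASE = "http://example.org/tellico/entry/"  # hypothetical subject URIs


def literal(value):
    """Escape a string as an N-Triples literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def export_ntriples(conn):
    """Yield one N-Triples line per exported field of each book row."""
    query = "SELECT id, title, author FROM book ORDER BY id"
    for entry_id, title, author in conn.execute(query):
        subject = f"<{BASE}{entry_id}>"
        yield f"{subject} <{DC}title> {literal(title)} ."
        yield f"{subject} <{DC}creator> {literal(author)} ."


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
)
conn.execute("INSERT INTO book VALUES (1, 'Dune', 'Frank Herbert')")

lines = list(export_ntriples(conn))
print("\n".join(lines))
```

The SQL schema stays private to the application, while the exported triples use shared property names that other tools can recognize.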

If I did use Nepomuk and RDF, I might even have to try to write some sort of abstraction layer to use Tracker on GNOME.

Developer Interest

I’d call this a tie! I’ve messed around with some limited SQL and RDF/SPARQL both, and I’m interested in learning more about both.

Conclusion

These are mostly just unordered thoughts bouncing around in my head. I’ll all but decide to take a shot at implementing a SQL backend, and then change my mind an hour later. Plus, who’s to say I can even figure out how to do any of this! I only impersonate a programmer on TV! 🙂

Linux Identity Review of Tellico

The October issue of Linux Identity, a “Duo Pack” with Ubuntu 10.10, included a review of Tellico and GCstar. It’s a French magazine, but through the power of the Interwebs, I ordered a copy and received it in the mail recently.

It’s always interesting to me to read what specific features or workflows each review mentions. This particular review gave step by step instructions for installing and running Tellico on Kubuntu, and then basic steps for adding information about a book to Tellico. It also included information for creating a custom collection (which I don’t usually think about most users doing) and using a filter for finding items in a collection.

It didn’t appear to have any comments about drawbacks or bugs (other than a quick aside about needing to register for an Amazon API key). The article includes a nearly identical review of GCstar, basically walking through installation, running, and adding a book.

One note of interest to me, as well, was in comparing the screenshots between Tellico and GCstar and looking at the translations of the user interface in French. I guess it doesn’t always register to me that there are so many ways of saying the same thing, especially in English and no different in French. Slightly different verb tenses or phrasing…

New Tellico Version

I put together a new version of Tellico, version 2.3, and threw it out for release this past weekend. It is dedicated to my lovely wife, Kim, and we could code-name it Hot Vegas!

The previous release was back in February, so this one has quite a few bug fixes and some new features. Freebase was added as a data source, which is rather useful. Freebase has tons of info available, and is constantly being updated. It even has some limited comic book information, too.

I messed up some of the links on the download page, so embarrassingly enough, for a few days after the release, the links were broken. Those should all be fixed now!

I also jumped off the deep end and downloaded Choqok and registered on identi.ca, which is something like the open source version of Twitter. Much less popular, so by definition, much cooler! Software nerds use it a good bit, I’ve heard. Anyways, I took the handle stephero. (Strike that, since changed to AstroRobby for uniformity with Twitter.)

One of the primary features I’d like to work on for the next version of Tellico is better statistics. I’m looking forward to working with the KDE plot widgets.

Ultimate Guide to Cataloging Software

Richard Hemby at the Online Education Blog has a comprehensive list of cataloging software and ends up giving high honors to GCstar for Linux and Windows. A little tongue-in-cheek, I imagine, he mentions cataloging mini vehicles:

Other collection management software allows management of these items but GCStar has jumped out ahead of competitors because the software allows you to catalog your favorite television shows directly from TVBD channels and allows you to catalog mini vehicles. Cataloging mini vehicles will require some manual efforts but the detailing offered is priceless.

GCstar is a fantastic bit of software. In my opinion, one of its biggest strengths is the sheer number of websites that it can scrape for info. Tian, the primary GCstar author, even added a feature for using GCstar as a standalone data fetcher. As a result, Tellico can use any of the GCstar data sources directly. The interface is a bit slower, but it works pretty well. I’d like to make Tellico as modular and useful in return, but haven’t been able to yet.

Congratulations to GCstar!

Tellico and Yaz 4.0

Tellico uses the Yaz library for accessing z39.50 servers, which are used by many libraries for bibliographic access. Yaz 4.0 was just released, so I wanted to check to see if Tellico still compiled with the new version.

I’m happy to report that no source code changes are necessary for Tellico. While the bump in the Yaz version means a break in library ABI compatibility, so Tellico has to be recompiled against the new library, it’s still source-compatible.

The best Linux collection managers compared | TuxRadar Linux

TuxRadar has an article comparing collection managers. Tellico comes out pretty good, with a grade of 8 out of 10, claiming second place to GCstar.

At first glance, Tellico seemed like the obvious winner of the bunch. It’s got built-in templates, it’s configurable and provides good documentation. The design is elegant, if not pretty, but it’s been superseded by a superior program, one that’s pushed the heights of what a collection manager can be.

Near as I can tell, Tellico loses out to GCstar’s shiny templates! Well, I can take that. It’s a pretty good article, though with not much in the way of substantial critiquing.

Even though you don’t have to fill in all, or even most, of the fields, the result is unappealing. The dialog boxes you use to fill in the information for an item are crowded, but there are also all the ugly empty spaces from fields you didn’t fill in.

I think the author is hitting two points there. I understand the crowded dialog complaint, though aside from doing in-place editing in the view, I can’t think of any other way to edit the data. The second point, about showing empty fields in the view, is easily fixed with some tweaks to the default. Maybe I should add a template for that as an option.

Tellico’s website provides a detailed illustrated guide in addition to the extensive documentation, but the drawback to having extensive built-in support is the in-your-face interface that comes with it, although this is more than offset by the program’s features. When filing our comic book collections, we honestly don’t want to enter the date we purchased the book, so we find it irritating that Tellico expects us to.

Definitely a valid point there at the end of that paragraph. That’s why Tellico 2.0 added a field for automatically storing the date that the comic book was first added to the collection as well as the date of the last modification of the comic book data. I decided not to remove the default field for year of purchase, though maybe I should have.
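The idea behind those automatic fields can be sketched in a few lines of Python. This is a hypothetical model, not Tellico's implementation: the Entry class, its attribute names, and the optional today parameter (included only so the example is repeatable) are all invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Entry:
    """Hypothetical entry that tracks its own creation and modification dates."""

    fields: dict = field(default_factory=dict)
    created: date = field(default_factory=date.today)
    modified: date = field(default_factory=date.today)

    def set_field(self, name, value, today=None):
        """Store a value and stamp the modification date automatically."""
        self.fields[name] = value
        self.modified = today or date.today()


# Dates are passed explicitly here only to make the example deterministic.
comic = Entry(created=date(2010, 1, 5), modified=date(2010, 1, 5))
comic.set_field("title", "Watchmen", today=date(2010, 3, 1))
```

The user never types either date; the creation date is fixed when the entry is made, and the modification date moves whenever a field changes.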

Compiling Tellico

I just added a page on KDE UserBase with basic instructions for compiling Tellico. I know several people have emailed me about that. Typically, I would recommend waiting for your distribution to provide pre-compiled packages, but since the transition to KDE4 and Tellico 2.0 is still going on, you may be impatient.

Feel free to edit or comment.

http://techbase.kde.org/Projects/Tellico/Compiling
http://userbase.kde.org/Tellico

Update: Compilation instructions are properly on KDE TechBase instead of UserBase so I moved them.

Changing Amazon terms hits LibraryThing

LibraryThing has had to modify the way they link to their sources of data for their book pages.

Everyone at LibraryThing disagrees with this decision. LibraryThing is not a social cataloging and social networking site for Amazon customers but for book lovers. Most of us are Amazon customers on Tuesday, and buy from a local bookstore or get from a library on Wednesday and Thursday! We recognize Amazon’s value, but we certainly value options.

Importantly, the decision is probably not even good for Amazon. Together with a new request-monitoring system, banning iPhone applications that use Amazon data, and much of their work on the Kindle, Amazon is retreating from its historic commitment to simplicity, flexibility and openness. They won through openness. Their data is all over the web, and with it millions of links to Amazon. They won’t benefit from a retreat here.

Tim Spalding and the gang at LibraryThing have always struck me as being very clear-minded in their goals and the best way to help their users. I agree with their assessment.

Tellico 2.0's first bug

OK, the prize for the first big goofball mistake in Tellico 2.0 is one that causes a crash when exporting to HTML in most cases.

The ironic thing is that I appear to have created the problem when I updated the export code to allow me to write unit tests for it. HTML export works fine in the test, but Tellico, the app, was setting the configuration wrong. All the articles I’ve read about there being no such thing as a harmless code change are true!

So, my apologies if you downloaded Tellico, created a collection, tried to export it to HTML, and promptly got a crash. We’ll hire new Quality Assurance people immediately.