jducoeur wrote in querki_project
It's been quite a long time since I've posted anything here, but that doesn't mean the project's stopped moving -- quite the opposite, really. The complication has been that we've been running on two separate tracks for the past several months, and they are *finally* starting to come together.


The big news is that Querki is finally beginning to transition to Cassandra. From a technical perspective, this is by *far* the biggest change to Querki's architecture, one that we've been talking about for about three years and working on for the past six months. More technical details below the cut.

Querki is, in a sense, an in-memory database system, but it is built on top of a "real" database under the hood. Historically, that's been MySQL, solely because that's what we have the most experience with, but it's always been an awkward fit -- we haven't been using MySQL in a conventionally "relational" way. Instead, each Querki Space is actually composed of about half a dozen distinct MySQL tables. And I don't mean there are half a dozen tables that cover the Spaces: I mean there are half a dozen tables per Space. This has long made me nervous: I've never found any documentation specifying an upper limit on the total number of tables you can have in MySQL -- but I've never heard of anyone building an application with potentially billions of them, either.

I've hypothesized since the early days of the project that Cassandra would be a better fit. Frankly, it *thinks* more like Querki: its key-centric approach suits our needs extremely well, it is designed for scale, and it is optimized for a steady stream of writes that you occasionally read back in, which is pretty much how we operate.

The final straw came in the past year, as Akka Persistence has become reasonably solid. Akka Persistence is a delightfully sensible architecture for building stateful Actors. It is built on the theory that, since your Actor is a bundle of state, and that state is precisely the sum of the events that occur to that Actor, you should persist the Actor by persisting that event stream. When you need to reload the Actor, you just replay the events. (I'm simplifying, but that's the core notion.) That suits Querki *extremely* well. And it also suits Cassandra extremely well, so that is the DB that is best-supported by Akka Persistence.
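To make the replay idea concrete, here is a toy sketch in Python. It is not Querki's code and it does not use Akka Persistence (which is a Scala/Java library); the event types, `ThingState`, and `Journal` are all hypothetical names invented for illustration, and the "journal" is just an in-memory list standing in for the database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertySet:
    """Hypothetical event: a property was given a value."""
    name: str
    value: str


@dataclass(frozen=True)
class PropertyRemoved:
    """Hypothetical event: a property was removed."""
    name: str


@dataclass
class ThingState:
    """In-memory state; it only ever changes by applying an event."""
    props: dict = field(default_factory=dict)

    def apply(self, event):
        if isinstance(event, PropertySet):
            self.props[event.name] = event.value
        elif isinstance(event, PropertyRemoved):
            self.props.pop(event.name, None)
        else:
            raise TypeError(f"unknown event: {event!r}")


class Journal:
    """Stand-in for the persistent, append-only event log."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def replay(self):
        return iter(self._events)


def handle(state, journal, event):
    """Persist the event first, then apply it to the live state."""
    journal.append(event)
    state.apply(event)


def recover(journal):
    """Rebuild state from scratch by replaying every stored event in order."""
    state = ThingState()
    for event in journal.replay():
        state.apply(event)
    return state


journal = Journal()
live = ThingState()
handle(live, journal, PropertySet("Name", "My Recipe"))
handle(live, journal, PropertySet("Rating", "5"))
handle(live, journal, PropertyRemoved("Rating"))

# "Reloading" needs nothing but the journal.
reloaded = recover(journal)
```

Note that nothing in the sketch stores the current state directly: the state is whatever falls out of replaying the log, which is why adding a new kind of event doesn't require changing a table layout.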

So as of this release, we're beginning to make use of that approach. This should make it much easier to add a lot of enhancements I've wanted to make for a long time: since the Akka Persistence architecture doesn't require tricky database schema changes when I make data upgrades, problems that used to be challenging become pretty straightforward. It will introduce its own complexities, of course, but I think I have a handle on them -- much of the past several months has been spent on getting used to this approach and building the necessary infrastructure.

All that said: we are now entering one of the most dangerous phases for Querki. I'm testing everything heavily, but a change of this magnitude is always risky. So if you come across new bugs, please bring them to my attention immediately.

And now, on to the user-visible stuff...


Probably the most important enhancement from a user perspective is the addition of complexity modes.

The goal of Querki is to make it much, much easier to build and run your own little Spaces and Apps. But there's still a lot of complexity to it, especially to support the various power-Querki scenarios. We're always dealing with that tension between trying to keep the UI as simple as possible for the end user who just wants to participate, versus the experienced engineer who wants to build cool sites twenty times more quickly and easily than they can with traditional tools.

To deal with this, there is a new concept of which "mode" you want to run Querki in. There are three modes, initially defined like this (quoting the UI):
  • Participant: Appropriate for most people, who want to be able to participate in other peoples' Spaces and create Spaces based on Apps, but don't want to design their own Spaces from scratch. In Participant Mode, Querki keeps the complexity to a minimum while still giving you the tools you need to add and edit data, and participate in conversations.

  • Builder: For those who want to build Spaces that don't yet exist as Apps, tweak existing Apps to better suit their needs, or manage their Spaces in more detail. Builder Mode adds the Model Designer, so that you can define exactly the sort of data you need, as well as more powerful security tools.

  • Programmer: For the Querki power user. Programmer Mode adds all the bells and whistles, so that you can customize Spaces in more detail, write your own code in QL, design custom Types, and lots more.

I still don't love the name "Participant", but it's the best suggestion I've heard so far; other suggestions are welcome. Also, note that the details of what is visible in which modes are still evolving, and this is just a first cut: observations and suggestions are welcome there as well.

Anyway, as of now everyone starts in Builder mode. Eventually, you'll choose when you're starting out, but we're not there yet. You can choose which mode you prefer from the login menu in the upper-right, and you can change it at any time -- it just takes a second. The differences are intentionally subtle; basically, it's just a matter of how many buttons and menu picks are available.
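For the technically curious, one way to picture the modes is as an ordered scale, where each UI feature has a minimum mode at which it appears. The sketch below is purely illustrative Python, not how Querki implements it; the feature names and the `FEATURE_MIN_MODE` table are made up, and it assumes each mode simply adds to the one before it.

```python
from enum import IntEnum


class Mode(IntEnum):
    """The three modes, ordered from least to most UI complexity."""
    PARTICIPANT = 1
    BUILDER = 2
    PROGRAMMER = 3


# Hypothetical table: the lowest mode in which each feature is shown.
FEATURE_MIN_MODE = {
    "edit_data": Mode.PARTICIPANT,
    "conversations": Mode.PARTICIPANT,
    "model_designer": Mode.BUILDER,
    "advanced_security": Mode.BUILDER,
    "write_ql": Mode.PROGRAMMER,
    "custom_types": Mode.PROGRAMMER,
}


def visible_features(mode):
    """Return the names of all features shown in the given mode, sorted."""
    return sorted(
        name for name, minimum in FEATURE_MIN_MODE.items() if mode >= minimum
    )


participant_view = visible_features(Mode.PARTICIPANT)
builder_view = visible_features(Mode.BUILDER)
```

Switching modes in this picture never touches your data; it only changes which buttons and menu picks pass the `mode >= minimum` check.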

A very minor change, but one some folks might notice: object IDs are no longer simply sequential. You know those IDs that each object has, like ".3y28csa"? They used to increase strictly one at a time. (In base 36, but whatever.) Allocating them bottlenecked on the MySQL server, so the very first feature of the new Cassandra world is that each Querki application server has its own pool of IDs and hands them out as needed. So they will mostly be larger than they used to be (the .3y2 series is now only being used for system-level objects like Users and Spaces), and they will jump around a lot. This is expected.
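Here is a rough Python sketch of why pooled IDs "jump around". It is not the real allocator: `BlockAllocator`, `AppServer`, and the block size of 1000 are assumptions made up for the example, and the shared counter is an in-process object rather than anything stored in Cassandra.

```python
import threading

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(n):
    """Render a non-negative integer in base 36 using digits 0-9 and a-z."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))


class BlockAllocator:
    """Stand-in for the shared store that hands out whole blocks of IDs."""

    def __init__(self, block_size):
        self.block_size = block_size
        self._next_start = 0
        self._lock = threading.Lock()

    def reserve(self):
        """Reserve the next unused block; this is the only shared step."""
        with self._lock:
            start = self._next_start
            self._next_start += self.block_size
        return range(start, start + self.block_size)


class AppServer:
    """Hands out IDs from a local pool, refilling it when it runs dry."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._pool = iter(())  # empty until the first ID is requested

    def next_id(self):
        try:
            n = next(self._pool)
        except StopIteration:
            self._pool = iter(self._allocator.reserve())
            n = next(self._pool)
        return "." + to_base36(n)


allocator = BlockAllocator(block_size=1000)
server_a = AppServer(allocator)
server_b = AppServer(allocator)

# Objects created alternately on two servers get IDs from different blocks.
ids = [server_a.next_id(), server_b.next_id(),
       server_a.next_id(), server_b.next_id()]
```

In this sketch the servers only contend with each other once per block instead of once per object, which is the point of the change; the price is that consecutive objects no longer get consecutive IDs.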

The Space and Page-loading experience has been slightly improved. Nothing dramatic, but we now provide some basic spinners. Most importantly, when loading a Space, you no longer spend a long time (it could be several seconds on mobile connections) staring at a blank white screen; you now quickly get *some* feedback that things are loading.

Notable Bugfixes

Photo Upload should now be more reliable: suffice it to say, the original implementation of Photo Upload wasn't terribly robust, and when we switched from a single machine to a cluster some months back it started becoming very erratic. This pipeline has been heavily rewritten and should now work reliably. Please tell me if you encounter problems with upload.

You can now dereference Properties through Tags: previously in QL, if you got to a Thing through a Tag, you couldn't immediately just refer to a Property on that Thing. Now that it's become clear that Tags should work just like Thing (Link) Properties, that seems silly -- I hit this bug myself and found it head-scratching that I couldn't do this. So now you can.

A slash at the end of a Space's URL is no longer required: several people got confused when they tried to go to a Querki Space and got an orange error page instead -- the problem was that they had forgotten the slash at the end of the Space's URL. This now works as expected.

Tabbing no longer gets messed up when you add a List element: previously, adding a new element resulted in all sorts of havoc in tabbing through the page. It now makes more sense. Note one other change that went in at the same time: the delete button is now intentionally omitted from the tab order. I had found it too easy to accidentally delete a list element by tabbing to the wrong place and hitting Enter. This is no longer possible. (This may require further examination from an accessibility POV -- we'll look into it in that context.)

You can now sort on Properties of Model Types: the _sort() function has always allowed you to sort a List of Things, but it hadn't allowed you to sort a list of complex values. This wasn't so much a bug as a feature that had never been implemented. That function has now been refactored to work properly here -- you should be able to sort a list of Model Values as expected.

Sorting of text is now properly case-insensitive: while fixing the previous bug, I discovered that in many cases Text fields were being sorted case-sensitively. As a rule, Querki always sorts case-insensitively, so I've fixed that.
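To illustrate the two sorting fixes together, here is a small Python analogy (not QL, and not the actual `_sort()` implementation). The `sort_by_property` helper and the address data are hypothetical; the "complex values" are modeled as plain dictionaries with text properties.

```python
def sort_by_property(values, prop):
    """Sort structured values by one text property, ignoring case."""
    return sorted(values, key=lambda v: v[prop].casefold())


# Hypothetical complex values, each with a couple of text properties.
addresses = [
    {"City": "boston", "Street": "Main St"},
    {"City": "Albany", "Street": "Elm St"},
    {"City": "Cambridge", "Street": "Broadway"},
]

# Case-insensitive: what you'd expect when reading the list.
by_city = [a["City"] for a in sort_by_property(addresses, "City")]

# Case-sensitive, for contrast: every capitalized name sorts before "boston".
naive = sorted(a["City"] for a in addresses)
```

The contrast between `by_city` and `naive` is the bug in miniature: a plain comparison puts all uppercase letters ahead of lowercase ones, so "boston" lands last instead of second.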

That's it for now, but expect major releases to start coming a lot more frequently, now that we have our Cassandra cluster up and running for real, so I can start making some enhancements I've been planning for years now...