QL Stuff, part 1: The Querki Data Model
jducoeur wrote in querki_project
This week's project is starting to actually parse and process QL, the programming language that will be used to do interesting stuff in Querki. QL is, to a large degree, what makes Querki more than just a wiki. So it's time to start describing the language. First, though, let's talk in a bit more detail about how Querki *thinks*.

This post is kind of a glossary -- it describes the main user-visible concepts in Querki. I've referred to these many times before, but I don't know if I've ever defined them properly, and they're important for understanding what's what. For the engineers, this is probably the most crucial post to date, and outlines most of what I've been programming so far.

(Caveat: this whole series of posts is going to be increasingly technical -- they're mostly aimed at the programmers in the crowd.)

At the top level, we have the Space. If you think of a Space as a database, you're not too far off -- the structure's a little odd, and it's optimized for very small Spaces, but the concept's about right. A user can own any number of Spaces. (Although there will be some pragmatic limits on the total number of Things you can have in those Spaces.)

A Space inherits from one or more Apps. "Inheriting" from an App is all about visibility -- it means that the Space can see and use all of the Things, Properties, Types and Collections in its Apps. All Spaces inherit from at least the root System Space, which is hard-coded into Querki; everything can count on all Things in System. (The development philosophy is that Things in System are not allowed to change in ways that break existing systems. Breaking changes will deprecate and replace instead. This is important, since I expect stuff to evolve quickly.)

Apps have multiple inheritance; due to the way the namespace works, we're not terribly subject to the namespace-collision problems that plague some OO languages. (Short version: each Thing has an optional Name and a globally unique OID. OIDs are what is mostly used internally, so inheriting from multiple Apps with the same name causes only relatively easy-to-resolve, surface-level collisions.) In practice, most Spaces will inherit from one App, although I can see a bunch of Mix-ins that we'll be providing for specialized enhancements. The result is fairly similar to Scala's Trait concept.
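
To make the Name-versus-OID point concrete, here is a minimal Python sketch. It is not Querki's actual code: the class shapes, the integer OIDs, and the `visible` and `by_name` helpers are all assumptions made for illustration (and, for brevity, a Space here is not itself a Thing).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional

_oids = count(1)  # stand-in for a real globally unique ID allocator


@dataclass
class Thing:
    name: Optional[str] = None  # the Name is optional...
    oid: int = field(default_factory=lambda: next(_oids))  # ...the OID is not


@dataclass
class Space:
    name: str
    apps: List[Space] = field(default_factory=list)  # multiple inheritance
    things: Dict[int, Thing] = field(default_factory=dict)

    def add(self, thing: Thing) -> Thing:
        self.things[thing.oid] = thing
        return thing

    def visible(self) -> Dict[int, Thing]:
        """Everything this Space can see: its Apps' Things plus its own."""
        seen: Dict[int, Thing] = {}
        for app in self.apps:
            seen.update(app.visible())
        seen.update(self.things)
        return seen

    def by_name(self, name: str) -> List[Thing]:
        """Name lookup may return several Things; OID lookup never does."""
        return [t for t in self.visible().values() if t.name == name]


cooking = Space("Cooking App")
larp = Space("LARP App")
a = cooking.add(Thing("Ingredient"))
b = larp.add(Thing("Ingredient"))
mine = Space("My Space", apps=[cooking, larp])

clashes = mine.by_name("Ingredient")  # two Things share the Name
```

Because everything is keyed by OID, the two "Ingredient" Things coexist without trouble; the clash only shows up when something asks for them by Name.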

Spaces are full of Things. Again, if you think of a Thing as being like a database row or an object in a program, that's not too far off. (The main reason I didn't go with the terms Object and Class is that Things and Models don't quite match up with some common assumptions about objects in OO, so I decided to invent some less loaded jargon.)

Note that absolutely *everything* is a Thing. Spaces, Properties, Types, Collections, Attachments, you name it -- they all inherit from the root Thing. This provides an enormous amount of conceptual consistency, and means that, e.g., editing a Property looks very much like editing an ordinary Thing. It means, for instance, that displaying the documentation for a Property isn't *like* displaying a Thing, it *is* displaying a Thing; this reduces a lot of duplicate code.

Things have Properties, as you would expect. Any Thing can use *any* Property. Note that this is a bit weird: despite the OO-ness, Things are essentially arbitrary property bags. This is one of Querki's major heresies, but it is deliberate, to allow a good deal of sloppiness in how you use it. One of Querki's fundamental design principles is that, instead of demanding rigorous thinking, it instead does its best to cope with sloppy and evolutionary design. This makes a lot of edge cases surprisingly trivial to deal with, and avoids a lot of the theoretical complexities of traditional OO.

Another heresy: Properties are absolutely *not* fields in the usual OO sense. Specifically, Properties aren't scoped by classes. Instead, each Property is a first-class object, declared at the *Space* level -- all Things in the Space share the same set of Properties. The Architect in me occasionally twitches about the way that limits the namespace, but it's an extremely pragmatic compromise -- since Querki is aimed at the average user, we're simply not permitting name collisions to exist. Properties share the same namespace as everything else, so any given name has at most one definition within a given Space; this simplifies all sorts of stuff, and makes the system easier to understand.

So a Thing contains a map of Property IDs to values of the appropriate Type. (Well, Collections of that Type -- we'll get to that in a second.) It winds up looking like ordinary OO at a casual glance, but is, again, very tolerant of sloppiness.

Each Thing inherits from exactly one Model, so there is a tree of Models leading up to the root UrThing. (Kind of like Scala's class hierarchy.) You can think of this as being a class, but it's more correct to think of it as a JavaScript-style prototype. It doesn't just define Properties that the child Things must instantiate, it defines default *values* for those Properties. These defaults aren't just required, they are definitionally available -- there is no such thing as declaring the usage of a Property without implicitly or explicitly declaring a default at the same time. In other words, there is no such concept as "null" or "undefined" in Querki, and that informs much of the type system. (And should make programming easier.)
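
The prototype-style lookup can be sketched in a few lines of Python. This is an illustration under assumptions, not the real implementation: Properties are keyed by plain strings rather than OIDs, the `get` method is a name I made up, and the `KeyError` only covers a Property that nobody on the chain has ever declared.

```python
from __future__ import annotations

from typing import Any, Optional


class Thing:
    def __init__(self, name: str, model: Optional[Thing] = None, **props: Any):
        self.name = name
        self.model = model        # exactly one Model (None only for the root)
        self.props = dict(props)  # values set directly on this Thing

    def get(self, prop: str) -> Any:
        """Walk up the Model chain until some ancestor supplies a value."""
        thing: Optional[Thing] = self
        while thing is not None:
            if prop in thing.props:
                return thing.props[prop]
            thing = thing.model
        raise KeyError(f"{prop} is not declared anywhere on the Model chain")


ur_thing = Thing("UrThing")
recipe = Thing("Recipe", model=ur_thing, servings=4)  # declares a default
chili = Thing("Chili", model=recipe)                  # inherits the default
stew = Thing("Stew", model=recipe, servings=8)        # overrides it
```

Here `chili.get("servings")` falls through to the value declared on Recipe, while `stew.get("servings")` finds its own override first. Declaring the Property on the Model and supplying its default are the same act, which is why no "undefined" case arises for declared Properties.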

However, remember what I said about Things as property bags. A Thing's Model declares a bunch of Properties the Thing has, but the Thing can add whatever Properties it feels like. (In principle, anyway. In practice, we will almost certainly allow you to declare a Model as rigorous, forbidding the addition of more Properties in child Things. This is necessary for many public-Space use cases.)
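
As a rough Python sketch of the property-bag idea, with the caveat that the `rigorous` flag is purely hypothetical (it stands in for the not-yet-designed feature mentioned above) and that the class and method names are mine:

```python
from typing import Any, Dict


class Model:
    def __init__(self, defaults: Dict[str, Any], rigorous: bool = False):
        self.defaults = dict(defaults)
        self.rigorous = rigorous  # hypothetical flag, assumed for this sketch


class Thing:
    def __init__(self, model: Model):
        self.model = model
        self.props: Dict[str, Any] = {}

    def set(self, prop: str, value: Any) -> None:
        # Only a rigorous Model restricts which Properties may be added.
        if self.model.rigorous and prop not in self.model.defaults:
            raise ValueError(f"{prop!r} is not declared by this rigorous Model")
        self.props[prop] = value

    def all_props(self) -> Dict[str, Any]:
        """Model defaults overlaid with whatever this Thing set itself."""
        return {**self.model.defaults, **self.props}


character = Thing(Model({"Display Name": "Unnamed"}))
character.set("Secret Identity", "The Baron")  # ad hoc Property: accepted

form_entry = Thing(Model({"Email": ""}, rigorous=True))
try:
    form_entry.set("Favorite Color", "teal")   # not declared by the Model
    rejected = False
except ValueError:
    rejected = True
```

The ordinary Thing accepts the extra Property without complaint; only the Thing whose Model opted into rigor refuses it.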

Each Property declares a Type and a Collection.

A Type is pretty much what you expect -- a low-level data type of some sort. By far the most important Type is Text -- a block of text that can contain Wikitext and QL Expressions. My expectation is that most Properties will simply contain Text. But the system is designed to let us add more Types, and I expect the list to grow rapidly, including all sorts of things from specialized kinds of numbers to geolocations to dates. (In the long run there *may* be user-defined Types -- I've left room for that in the architecture -- but I'm really not sure what that means or how it would work, so I'm leaving it until we have some good concrete use cases.)

The common Types defined so far are:
  • Text, and LargeText (just like Text but with a larger input field)

  • Whole Number (aka Int)

  • YesNo (aka Boolean)

  • Name (used mainly for the names of Things; a short String with a highly restricted character set)

  • Link (pointer to another Thing; can be restricted to particular Kinds or Models)

  • PlainText (a String that may *not* contain Wikitext or QL, used for Display Name and such)
There is also a specialized CSS type declared in the Stylesheet module -- this is yet another kind of text, that will have its own specialized restrictions.

However, a Property Value is *not* simply defined by a Type -- it also necessarily declares a Collection. (This is probably the term I'm least happy about, and I'm still looking for a better one, but the correct technical term is Functor, and I'm not going there.) The Collection is the *structure* of the Property -- how *many* of this Type there are, and how they relate to each other.

So far, there are three Collections: ExactlyOne, Optional and List. They mean more or less what they say:
  • ExactlyOne means that this Property must contain one value of the specified Type.

  • Optional means that this Property *may* contain a value of this Type, or may contain None. (This is mainly useful for declaring in a Model -- it *suggests* that children may want to put a value here, without requiring it.)

  • List is a list of values of this Type. It may be empty, and order is preserved.
There will undoubtedly be more Collections to come -- I think it's only a matter of time before I add Set, for example -- but in my experience these are the crucial ones, and will probably account for 90+% of use.
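
One way to picture a Property as "a Type plus a Collection" is the Python sketch below. It is only a model of the idea: Python types stand in for Querki Types, and the `Collection` enum, the `Property` class and its `make` method are names assumed for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Tuple


class Collection(Enum):
    EXACTLY_ONE = "ExactlyOne"
    OPTIONAL = "Optional"
    LIST = "List"

    def accepts(self, count: int) -> bool:
        """Does this Collection allow `count` elements?"""
        if self is Collection.EXACTLY_ONE:
            return count == 1
        if self is Collection.OPTIONAL:
            return count <= 1
        return True  # List: any length, including empty


@dataclass(frozen=True)
class Property:
    name: str
    type_: type             # a Python type standing in for a Querki Type
    collection: Collection  # the structure: how many values, and how arranged

    def make(self, *elems: Any) -> Tuple[Any, ...]:
        """Validate elements against both the Type and the Collection."""
        if not self.collection.accepts(len(elems)):
            raise ValueError(
                f"{self.name}: {self.collection.value} cannot hold {len(elems)} element(s)"
            )
        for elem in elems:
            if not isinstance(elem, self.type_):
                raise TypeError(f"{self.name}: {elem!r} is not a {self.type_.__name__}")
        return elems  # a tuple, so order is preserved


servings = Property("Servings", int, Collection.EXACTLY_ONE)
notes = Property("Notes", str, Collection.OPTIONAL)
tags = Property("Tags", str, Collection.LIST)

one = servings.make(4)
none_given = notes.make()            # Optional may be empty
several = tags.make("soup", "winter")
```

Adding a Set later would, in this sketch, just mean another enum member with its own rule.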

Mind, most of this won't be terribly obvious to the typical end user. ExactlyOne means that there is an input field; Optional means that there is a checkbox and an input field. It probably seems a bit excessive, but it's actually critical for understanding QL. The key point here is that every Property is (in Scala parlance) an Iterable; in all likelihood, it will turn out that every property is fully Monadic, although I'm not going to commit to that yet. I'm basically building a bit of modern type theory into the Querki environment, but baking it so deeply that it'll mostly be invisible unless you look for it. The key thing to keep in mind as you read further is that every Property can be thought of as a Collection, and can be iterated over.
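
The practical payoff of "every Property can be iterated over" is that code never needs to ask which Collection it was handed. A small Python sketch of that, assuming a made-up `PropValue` wrapper that simply holds zero or more elements:

```python
from typing import Callable, Generic, Iterator, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class PropValue(Generic[T]):
    """Hypothetical wrapper: zero or more elements of one Type."""

    def __init__(self, *elems: T):
        self.elems: Tuple[T, ...] = elems

    def __iter__(self) -> Iterator[T]:
        return iter(self.elems)

    def map(self, f: Callable[[T], U]) -> "PropValue[U]":
        """Apply f to each element, keeping the same shape."""
        return PropValue(*(f(e) for e in self.elems))


def render(value: PropValue[str]) -> str:
    # The same code handles ExactlyOne, an empty Optional, and a List.
    return ", ".join(e.upper() for e in value)


exactly_one = PropValue("chili")
empty_optional = PropValue()
a_list = PropValue("soup", "stew")

lengths = list(a_list.map(len))
```

`render` produces "CHILI", an empty string, and "SOUP, STEW" respectively, with no special-casing for the empty Optional. Whether the real thing ends up fully Monadic is left open above, so this sketch stops at iteration and `map`.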

Those are the key concepts; most of the rest is details. Put together, Querki is, under the hood, a slightly strange, slightly object-oriented, slightly functional, somewhat high-level database. The description sounds complicated, but it's actually pretty rigorous and consistent, and that consistency should (if I'm right) make for an easier-to-understand UI in the long run.

Next time: Querki Explorer, the motivation for QL's syntax

"One of Querki's fundamental design principles is that, instead of demanding rigorous thinking, it instead does its best to cope with sloppy and evolutionary design."

Huzzah! Ever since the Incompleteness Theorem, way too many programmers have focused on what is or isn't possible, as opposed to what was practically approximated.

"Whole Number (aka Int)"

Have you really got no use cases that need Floats? Or just none of your *early* use cases?

"Huzzah! Ever since the Incompleteness Theorem, way too many programmers have focused on what is or isn't possible, as opposed to what was practically approximated."

More importantly, I am fairly sure that 90% of the populace isn't *going* to think rigorously. (Some can and won't bother; some just aren't wired that way.) So if Querki's going to be successful, I need to encourage an experimental usage model, centered on "It's hard to break it", that makes it easy to play around until you get what you want.

(And frankly, that sort of sloppiness is just plain useful for representing the real world. This ability to add Properties ad hoc has been *enormously* useful for my LARP development, since models of the world don't follow nice neat schemas.)

"Have you really got no use cases that need Floats? Or just none of your *early* use cases?"

The latter, but the truth is, I don't yet have any obvious requirements for them. Closest I have is the Recipe Database, and what that really wants is *fractions* more than floats. (At least, for those using Imperial measurements.)

I figure they'll come in eventually, but like everything in Querki, it's going to happen when there's a clear need...

There are remarkably few use cases for floats as a user-visible type. Fixed point to two places is almost always sufficient, although people tend to misuse it for money when they should be counting integer pennies.

Yaas. I actually expect to wind up implementing Currency before I do Float...

Oh, but to my previous point: the one case I know of for user-visible float is *metric* cooking. I've observed a tendency for that to use decimals, where Imperial uses fractions.

I don't think I have identified any other use cases yet...

I wonder if it might make sense to create a sort of "Light Decimal" type that only allowed 2 places after the decimal point. And if someone tried to input a third, have an automatic message come up saying "Are you *sure* you need that much precision? The author of this App thinks that you don't..."

Possible, but it's all about the use cases. If what is desired is Currency (and my guess is that it usually is), I'd rather have a full-fledged Currency Type, with whatever semantics are appropriate for that.

(Of course, that gets into the whole Units of Measurement thing, which is a giant but important complication unto itself. The plan is to support a first-class notion of "Units" and "Systems", and to have conversion built into Querki itself. It should know that Volume is a unit of measure, that milliliter and quart are both representations of that unit in different systems, and automatically convert to my preferred system. Currency is also a unit of measure, but with a constantly-changing conversion scale, which should be amusing.)

Anyway, when we have use cases that need decimal points, we'll figure out the appropriate way to handle them. So far, there really isn't enough motivation there, so I suspect it'll be a ways down the line...
