Crowdsourcing and Communities
jducoeur wrote in querki_project
Today was a big long out-and-about -- I was driving a friend to a medical appointment, and decided to run All The Errands, so I had a lot of time in the car. And what do I do when I have a couple of hours of driving? I design Querki! (You didn't think that I am ever *actually* away from work, did you?) Today, I finally cracked the nut of Crowdsourcing, which I've been dancing around and trying to figure out for about nine months now.

Right now, that looks like a non sequitur for Querki, since we're so focused initially on building your own little Spaces by hand. But a little bit down the road we're going to be adding Apps. Now, an "App" in Querki isn't like it is in other systems -- it's sort of a template for building a Space, a common structural framework rather than a unitary hard-coded program. But still, by the end of this year I expect people to be sharing Apps frequently with their friends -- once you solve a problem in Querki, you might as well help your friends solve it too.

And the thing is, once you get there, it begins to look an awful lot like crowdsourcing -- you have lots of people, entering data in a common format. But most crowdsourcing systems violate Querki's ideals badly -- they tend to be gigantic buckets, where you put in your information and then lose all control of it, rather than Querki's ideal of you owning your own data. And all of them force you to follow a single consistent structure, quite in tension with Querki's "tweak to your heart's content" ethic. Not to mention that crowdsourcing is architecturally all wrong for Querki's everything-in-memory design -- keeping everything in memory works great for hundreds of records, but is totally inappropriate for millions of them.

Today, though, the light finally dawned. The right way to think about crowdsourcing in Querki isn't about one person or group controlling everything. It's about people working together to build a *community* of data. So here's a sketch of the design.

Querki will (and mind, we're talking a year from now) have App Communities. A "Community" is a collection of Spaces (and the Members of those Spaces) that voluntarily contribute Things from their personal Spaces to be part of the Community Space.

Any given App may have one or more Communities built around it. I suspect we'll wind up wanting the notion of a "default Community" for an App, set up and led by the App's author, which you can opt into with just a click, but there's no reason to limit it to that. Any user of the App should be able to set up a Community around it, and different Communities will often have different standards for inclusion.

As a member of the Community, I am saying that some or all Things in my Space will be contributed towards it -- I can either say that all of my Public Things go in, or only specific ones that I designate. Only Public Things may be contributed to Communities; otherwise, the access control gets too weird and complicated. (Possibly, we might have the option of limiting access to other members of the Community, but I'm unlikely to bother unless there is user demand.)
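To make the contribution rule concrete, here is a rough Python sketch. Nothing in it is Querki code: `Thing`, `Membership` and `ContributionMode` are hypothetical names invented for illustration, and the only rules it encodes are the two stated above (all Public Things or only designated ones, and never anything non-Public).

```python
from dataclasses import dataclass
from enum import Enum


class ContributionMode(Enum):
    ALL_PUBLIC = "all_public"    # every Public Thing in the Space goes in
    DESIGNATED = "designated"    # only Things the member picks explicitly


@dataclass(frozen=True)
class Thing:
    """Hypothetical stand-in for a Thing in a personal Space."""
    thing_id: str
    is_public: bool


@dataclass
class Membership:
    """Hypothetical record of one Space's participation in a Community."""
    mode: ContributionMode
    designated_ids: frozenset = frozenset()

    def contributed(self, things):
        """Return the Things this Space offers to the Community."""
        # Non-Public Things are never eligible, whatever the mode.
        public = [t for t in things if t.is_public]
        if self.mode is ContributionMode.ALL_PUBLIC:
            return public
        return [t for t in public if t.thing_id in self.designated_ids]
```

Note that designating a private Thing has no effect here: the Public filter runs first, so the access-control question never comes up.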

Communities are curated/moderated, the same as Spaces -- that's necessary for spam control. The owner of the Community (and there still must be a single clear owner) may designate any number of curators. (The default will probably be "all interested members of the Community", but you should be able to limit this to specific people.) When a contribution comes in, it gets sent to one or more of the curators for approval. (The number of approvals should probably be configurable by the owner.) The owner will probably be able to whitelist specific people as not needing curation, but this isn't something to be done casually.
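The approval flow described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class and function names are made up, and two behaviours are guesses rather than anything decided here, namely that a single rejection is final and that the owner only votes if listed as a curator.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Community:
    """Hypothetical Community settings relevant to curation."""
    owner: str
    curators: set
    required_approvals: int = 1                   # configurable by the owner
    whitelist: set = field(default_factory=set)   # contributors who skip curation


@dataclass
class Submission:
    thing_id: str
    contributor: str
    approved_by: set = field(default_factory=set)
    rejected_reason: Optional[str] = None


def _require_curator(community, curator):
    if curator not in community.curators:
        raise PermissionError(f"{curator} is not a curator")


def approve(community, sub, curator):
    _require_curator(community, curator)
    sub.approved_by.add(curator)    # a set, so repeat approvals count once


def reject(community, sub, curator, reason):
    _require_curator(community, curator)
    sub.rejected_reason = reason    # assumption: one rejection is final


def status(community, sub):
    if sub.rejected_reason is not None:
        return "rejected"
    if sub.contributor in community.whitelist:
        return "accepted"
    if len(sub.approved_by) >= community.required_approvals:
        return "accepted"
    return "pending"
```

Storing the rejection reason as free text leaves room for Community-specific reasons later, without changing the shape of the record.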

Initially, the curation process will probably be pretty simple, but I suspect we'll wind up making this *very* rich over time. The owner should be able to state official Community policies and standards, and ideally these should be reified into the UI. For example, while all Communities will probably have a one-click "This is Spam" mechanism to reject a submission, an SCA Merchant listing ought to have a button for "Not Period Enough", and like that. The idea is to give the curators good tools to make it easy to help manage the Community. (In the medium term, we'll also want tools to help manage things like dispute resolution, if the curators disagree, as well as auto-detection of Spaces that seem to be full of spam.)

The Community itself isn't precisely a Space -- and this is quite important -- it is an *index* into the constituent Spaces. That is, all of the Things still have clear owners who are in charge of them; the Community is a collection of indexes that allow you to find particular Things *in* those constituent Spaces. Duplication is entirely legitimate, and there should be mechanisms whereby the Community can clearly say, "Yes, these three Things are all talking about the same thing", but in general, we will allow (and even encourage) duplication, while providing tools to manage that duplication.
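One way to picture "an index, not a Space" is a structure that stores only references back into the owning Spaces, plus a side table recording which references are about the same thing. The Python below is a sketch under that reading; `ThingRef`, `CommunityIndex` and the keyword-based lookup are all hypothetical, and the real index could be keyed on anything.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ThingRef:
    """Pointer to a Thing that still lives in (and is owned by) its Space."""
    space_id: str
    thing_id: str


@dataclass
class CommunityIndex:
    # search term -> refs; the Things themselves are never copied in
    by_term: dict = field(default_factory=lambda: defaultdict(set))
    # ref -> label of the duplicate group it belongs to, if any
    same_as: dict = field(default_factory=dict)

    def add(self, ref, terms):
        for term in terms:
            self.by_term[term.lower()].add(ref)

    def mark_same(self, label, refs):
        """Record that these refs are all talking about the same thing."""
        for ref in refs:
            self.same_as[ref] = label

    def find(self, term):
        """Return matches grouped by duplicate label; others stand alone."""
        matches = sorted(self.by_term.get(term.lower(), ()),
                         key=lambda r: (r.space_id, r.thing_id))
        groups = defaultdict(list)
        for ref in matches:
            groups[self.same_as.get(ref, ref)].append(ref)
        return list(groups.values())
```

Duplicates stay in the index as separate entries; `mark_same` only groups them for display, so no owner's Thing is merged away or overwritten.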

The Community itself will probably allow meta-Properties on top of the actual Things. For example, a one-to-five-star Rating may be a non sequitur within a given Space, but quite relevant for navigating the amassed Community. So there will probably be Ratings, Discussions and things like that at the Community level; how this relates to those inside the Space is TBD.
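As a tiny illustration of a Community-level meta-Property, the Python sketch below keeps star Ratings in the Community, keyed by a reference to the Thing, so the Thing in its owner's Space is never touched. The names are hypothetical, and the one-vote-per-member rule is an assumption rather than part of the design.

```python
from collections import defaultdict
from statistics import mean


class CommunityRatings:
    """Hypothetical store for one-to-five-star Ratings held by the Community."""

    def __init__(self):
        # (space_id, thing_id) -> {member: stars}
        self._stars = defaultdict(dict)

    def rate(self, ref, member, stars):
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        # Assumption: one vote per member, so re-rating overwrites.
        self._stars[ref][member] = stars

    def average(self, ref):
        """Mean rating for a Thing, or None if nobody has rated it."""
        votes = self._stars.get(ref)
        return mean(votes.values()) if votes else None
```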

The result is that, architecturally, Communities probably look nothing whatsoever like Spaces. They won't be in-memory, they won't make *any* attempt to be real-time, and they won't have complete QL access. I suspect that the process of "contributing" to a Community will be asynchronous, and the Community will be represented by either a SQL table or maybe a NoSQL DB of some sort. I'm honestly unsure offhand which is the best way to build this -- we'll want indexing, but probably don't care at all about transactionality, so a MapReduce architecture may suit Communities rather well.

Since part of the idea is that a Community is, well, a community, there will probably be places for meta-discussions, and mechanisms for managing the membership of the Community. (There obviously need to be ways to detect and kick out spammers and other ill-behaved folks.)

Anyway -- that's a braindump of the idea. What do y'all think? I think this sounds pretty damned cool, but I wouldn't mind a sanity-check from the audience...

