Getting our ducks in a row

I have plenty of ideas of how to proceed, but I like to actually implement things in a fairly methodical fashion, especially if I’m doing it in public.

I spent some of the day before yesterday getting a GitHub project set up; you can now find all the code associated with this project at https://github.com/mdorman/couch-simple. I was pleased to find that TravisCI provides access to a CouchDB instance, which means that our testing story can be very clear—unwanted dependencies on the configuration of my local CouchDB server won’t be an issue.

With that in mind, I pushed a skeleton project up the night before last. Now real development can begin.

The core of the library

If you’re interacting with CouchDB, you have to speak HTTP+JSON.

Even if you send an Accept header that says you only read text/plain values (which, I should point out, the Couch documentation says is a valid option), it will send you back JSON…that is just marked as text/plain.

The HTTP requirement is even more inescapable—the only port open speaks HTTP, Full Stop.

With regard to the JSON requirement, I don’t think there’s any real competition: the aeson library is far and away the most popular JSON library, to the extent that one might be forgiven for not realizing that there were any others. Although I know the json package exists, I’ve never had occasion to so much as read its documentation.

What to use for HTTP is a little less clear, but not, to my eyes, much.

If we go back to my prior statement of requirements, I included:

  • choice in streaming library

While I am most familiar with the Conduit ecosystem (largely because I was using couchdb-conduit, so this might change!), others might prefer to work with Pipes, or, I suppose, io-streams.

Now that doesn’t necessarily mean anything. If I’m doing a request in some function being called as part of someone’s stream processing, I don’t necessarily need to be integrated with them; usually I just need to write a wrapper that waits for requests, makes them, then yields the responses.

And for the most part if I’m producing a streaming view, for instance, it’s the same thing—I just need to yield each result as I get it; I don’t need to be somehow intrinsically tied to the framework.
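A minimal sketch of that "just yield each result" idea, using only base. The names eachRow and collect are hypothetical, and the rows are canned rather than read from CouchDB. The producer only needs a callback in some monad, so in principle a Conduit or Pipes adapter could hand it its own yield:

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- | Hypothetical row producer: hands each row to the caller-supplied
-- callback as soon as it is available. The rows here are canned; a real
-- producer would also need MonadIO to read them off the wire.
eachRow :: Monad m => (String -> m ()) -> m ()
eachRow emit = mapM_ emit ["row-1", "row-2", "row-3"]

-- | One possible consumer: gather every emitted row into a list,
-- preserving the order in which the rows were emitted.
collect :: ((a -> IO ()) -> IO ()) -> IO [a]
collect producer = do
  ref <- newIORef []
  producer (\row -> modifyIORef' ref (row :))
  reverse <$> readIORef ref

main :: IO ()
main = do
  -- Process rows one at a time, as they are emitted.
  eachRow putStrLn
  -- Or gather them all, when streaming is not needed.
  rows <- collect eachRow
  print (length rows)
```

Nothing in eachRow mentions a streaming framework; the choice is made entirely by whoever supplies the callback.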

But still, the http-client package has streaming wrappers for all three of the mentioned libraries (which suggests that it’s easily compatible), it’s got baked-in support for connection pooling, and it has good support for incremental input (which is a lot of what we need at this low level).

So that is the basis upon which I’m going to build what I’m currently thinking I will call couchdb-simple.
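To make the http-client plus aeson pairing concrete, here is a rough sketch of a single request. It is not code from the library itself: the helper withJsonAccept is hypothetical, and the URL assumes a CouchDB server listening on localhost at the default port 5984.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Exception (try)
import Data.Aeson (Value, eitherDecode)
import Network.HTTP.Client
  ( HttpException, Request, defaultManagerSettings, httpLbs, newManager
  , parseRequest, requestHeaders, responseBody )

-- | Hypothetical helper: ask for JSON explicitly via the Accept header.
withJsonAccept :: Request -> Request
withJsonAccept req =
  req { requestHeaders = ("Accept", "application/json") : requestHeaders req }

main :: IO ()
main = do
  -- The manager owns the connection pool and can be shared across requests.
  manager <- newManager defaultManagerSettings
  -- Assumption: a CouchDB server on localhost, default port.
  request <- parseRequest "http://localhost:5984/"
  result <- try (httpLbs (withJsonAccept request) manager)
  case result of
    Left err -> putStrLn ("Request failed: " ++ show (err :: HttpException))
    Right response ->
      -- Decode the whole body as an arbitrary JSON value.
      case eitherDecode (responseBody response) :: Either String Value of
        Left msg -> putStrLn ("Could not decode JSON: " ++ msg)
        Right value -> print value
```

httpLbs reads the whole body into memory, which is fine for small responses; the incremental interface in http-client would be the one to reach for when streaming a view.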

Initiating a new project

I’ve been working on a personal project in my spare time for…an awfully long time now. Although a huge part of what it does is necessarily server-based, I want the UI of the project to be offline-mobile-first—that is, in addition to mobile first, it must be able to run offline seamlessly, etc.

I don’t have the time or the energy or expertise to seriously consider building my own infrastructure for doing such things, so I’ve elected to go with CouchDB+PouchDB as my data storage solution. I will admit to a tiny bit of concern about PouchDB—not so much whether it’s good, but whether it’s capable of handling the data storage needs I envision. Still, even if it’s not, it provides a starting point, and there are other options (TouchDB, or Couchbase-Lite or whatever it’s called right now) that I can consider.

On the server side—in the stuff I am definitely doing in Haskell, as opposed to the client-side where I would like to be able to use Haskell, but might compromise if necessary—I want, first and foremost, a well-maintained database library.

Unfortunately, of the five libraries to interface with CouchDB, four haven’t been updated in at least two years, and even the one that has is a major version behind on one of its primary dependencies. So none of these, IMHO, represent a well-maintained option.

I have, to this point, been using couchdb-conduit (with a bunch of patches I’ve maintained to keep it compatible with current libs), but I’ve recently run across an issue whose workaround is annoying enough—trying to handle exceptions when calling a routine from within a segment of a conduit—that I think I’m just going to write my own.

So, my first potentially-public Haskell library. It’s actually a little intimidating.

My first step, I think, is to identify what I want:

  • easy access to the CouchDB API

    It’s actually pretty important to me that this mirror the official API: it allows me to refer to it as documentation, it gives me (and others) a good guide to relative completeness, it makes tracking any changes easier, etc. It gives me a built-in structure.

  • good type guarantees for correctness

    Especially when some of these calls end up feeling like long strings of parameters, I want to make sure the compiler will tell me when I leave one out.

  • to process streaming outputs in a streaming fashion

    Most access to individual records and what-not doesn’t require actual streaming—really, just incremental processing of what is ultimately going to be one result.

    But when you want to handle a bunch of records coming out of a view, or you want to hook into the _changes feed? Streaming must be available.

  • choice in streaming library

    At this point, I am more acquainted with Conduit, no question, but I would like to make sure not to exclude users of the Pipes library from being able to provide their own streaming option on top of this.

  • implicit parameters (host, port, database) most of the time

    Most of the time, I just want to stuff all the connection stuff into a Reader instance and never have to mention it again…

  • explicit parameters when I need them

    …except sometimes, when I really need to do something odd in the middle of a bunch of other stuff.

  • well maintained

    Even if I have to do it myself.
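The implicit and explicit parameter items above, together with the wish for type guarantees, could be sketched roughly as follows. This is only an illustration under assumptions: every name here (Host, Port, Database, Connection, documentUrl, inDatabase) is hypothetical, and it uses Reader from the transformers package.

```haskell
import Control.Monad.Trans.Reader (Reader, asks, local, runReader)

-- Distinct newtypes, so a host cannot be passed where a database is expected.
newtype Host = Host { hostName :: String }
newtype Port = Port { portNumber :: Int }
newtype Database = Database { databaseName :: String }

-- | Everything that is implicit most of the time.
data Connection = Connection
  { connHost :: Host
  , connPort :: Port
  , connDatabase :: Database
  }

-- | Build a document URL from the ambient connection settings.
documentUrl :: String -> Reader Connection String
documentUrl docId = do
  h <- asks (hostName . connHost)
  p <- asks (portNumber . connPort)
  d <- asks (databaseName . connDatabase)
  pure ("http://" ++ h ++ ":" ++ show p ++ "/" ++ d ++ "/" ++ docId)

-- | The explicit escape hatch: run one action against another database,
-- leaving host and port untouched.
inDatabase :: Database -> Reader Connection a -> Reader Connection a
inDatabase db = local (\conn -> conn { connDatabase = db })

main :: IO ()
main = do
  let conn = Connection (Host "localhost") (Port 5984) (Database "notes")
  putStrLn (runReader (documentUrl "doc-1") conn)
  -- prints http://localhost:5984/notes/doc-1
  putStrLn (runReader (inDatabase (Database "archive") (documentUrl "doc-1")) conn)
  -- prints http://localhost:5984/archive/doc-1
```

Swapping the Host and Database arguments when building the Connection would be rejected by the compiler, which is the kind of guarantee the list asks for.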

So, that’s what I want to achieve. I think it is all achievable, but I’m going to start small and try and build up to it. The great thing is that I already have a body of code that’s currently using couchdb-conduit that I can use as a development testbed.