Choosing the API and defining some ADTs

As I said in the previous article, for wp2o2b, the plan is:

So, the task is to download all the articles in my existing sites, reformat them into org-mode files with appropriate metadata for org2blog, and store them locally in a hierarchy that mirrors the one on the server.

WordPress implements an XML-RPC interface for accessing your blog programmatically. It supports older legacy styles of access (Blogger, MovableType, and metaWeblog), but recommends that for new development you work with their new API which, incidentally, has the nice benefit of being well-documented.
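
To make the shape of that interface concrete, here is a minimal sketch of one call to the new API using the HaXR library's remote function. The endpoint and getPostRaw names, the example.com URL, and the blog ID of 1 are assumptions for illustration only; the argument order (blog ID, username, password, post ID) follows the WordPress documentation for wp.getPost.

```haskell
import Network.XmlRpc.Client (remote)
import Network.XmlRpc.Internals (Value)

-- | XML-RPC endpoint for a site. WordPress serves it from xmlrpc.php.
-- (Hypothetical helper, not part of any library.)
endpoint :: String -> String
endpoint site = site ++ "/xmlrpc.php"

-- | Fetch one post as an untyped XML-RPC value via wp.getPost.
-- The blog ID of 1 is a placeholder assumption.
getPostRaw :: String -> String -> String -> Int -> IO Value
getPostRaw site user pass postId =
  remote (endpoint site) "wp.getPost" (1 :: Int) user pass postId

main :: IO ()
main =
  -- Only prints the endpoint; calling getPostRaw needs a real site
  -- and real credentials.
  putStrLn (endpoint "https://example.com")
```

Getting back a bare Value like this is the "big wodge of XML" approach; a typed result needs an XmlRpcType instance for the record.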

(Much of this new API was introduced in version 3.4, released in June, 2012—so it’s only six months old at this point. Normally I would hesitate to depend on something that new, if I cared about wide applicability, but WordPress is one of those things where I think you should be keeping up with releases, if only for security reasons, so I don’t perceive it as too much of a limitation.)

The first thing we have to do is implement a data structure for holding a post. Where in a dynamic language you might just get back a big wodge of XML and pick at it as necessary, in Haskell you need to define a data type to hold your results.

So we’ll start there.

Working from the definition of a post in the API documentation, what we end up with is something like:

import System.Time (CalendarTime)  -- from the old-time package

data WPPost = WPPost {
  pPostId :: String,
  pPostTitle :: String,
  pPostDate :: CalendarTime,
  pPostDateGmt :: CalendarTime,
  pPostModified :: CalendarTime,
  pPostModifiedGmt :: CalendarTime,
  pPostStatus :: String,
  pPostType :: String,
  pPostFormat :: String,
  pPostName :: String,
  pPostAuthor :: String,
  pPostPassword :: String,
  pPostExcerpt :: String,
  pPostContent :: String,
  pPostParent :: String,
  pPostMimeType :: String,
  pLink :: String,
  pGuid :: String,
  pMenuOrder :: Int,
  pCommentStatus :: String,
  pPingStatus :: String,
  pSticky :: Bool,
  pPostThumbnail :: [WPMediaItem],
  pTerms :: [WPTerm],
  pCustomFields :: [WPCustomField]
} deriving Show

This refers to a few other structs that we’ve defined—the process is pretty straightforward, so I’m not going to go over it.
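
For a rough idea of what those supporting types might look like, here is a sketch of two of them. The field selection is a guess based on the term and custom-field structs in the WordPress XML-RPC documentation; the record field names (tTermId, cfKey, and so on) and the choice of String for most fields are assumptions, not the definitions actually used in wp2o2b.

```haskell
-- Hypothetical sketch of two supporting types; the real definitions
-- in wp2o2b may differ.

-- | A taxonomy term (category, tag, ...) attached to a post.
data WPTerm = WPTerm
  { tTermId   :: String
  , tName     :: String
  , tSlug     :: String
  , tTaxonomy :: String
  , tCount    :: Int
  } deriving Show

-- | A custom field: an id plus a key/value pair.
data WPCustomField = WPCustomField
  { cfId    :: String
  , cfKey   :: String
  , cfValue :: String
  } deriving Show

main :: IO ()
main = do
  print (WPTerm "3" "Haskell" "haskell" "category" 12)
  print (WPCustomField "7" "mood" "cheerful")
```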

Then we need to take this type and give Haskell a way to convert back and forth from it to XML-RPC. The HaXR page on the Haskell wiki links to some example code, and the Network.XmlRpc docs include a little explanation on how to do this.

If your XML-RPC structure has names that are sufficiently unique to map to record names without conflicts, and you’re comfortable with Template Haskell, you could just do:

$(asXmlRpcStruct ''WPPost)
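
For reference, a complete module using that splice might look like the sketch below. It assumes the Network.XmlRpc.THDeriveXmlRpcType module is available (depending on the version, it ships with haxr or with the separate haxr-th package). The PostLinks type is made up for illustration; note that its record field names have to match the XML-RPC member names exactly, which is why this only works when those names are safe to use as Haskell identifiers.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Network.XmlRpc.Internals
import Network.XmlRpc.THDeriveXmlRpcType (asXmlRpcStruct)

-- Hypothetical two-field record. The field names double as the
-- XML-RPC struct member names ("link" and "guid").
data PostLinks = PostLinks
  { link :: String
  , guid :: String
  } deriving Show

-- Generates the XmlRpcType instance for PostLinks.
$(asXmlRpcStruct ''PostLinks)

main :: IO ()
main = print (toValue (PostLinks "https://example.com/?p=1" "abc-123"))
```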

However, if you have an aversion to Template Haskell, or (perhaps more likely) you have field names that are generic enough to present significant conflicts (id, type, or some such), you will have to do it by hand by defining an XmlRpcType instance for your type. That ends up looking like:

import Network.XmlRpc.Internals  -- XmlRpcType, Type(TStruct), getField

instance XmlRpcType WPPost where
  toValue struct = toValue $ [("post_id", toValue (pPostId struct)),
                              ("post_title", toValue (pPostTitle struct)),
                              ("post_date", toValue (pPostDate struct)),
                              ("post_date_gmt", toValue (pPostDateGmt struct)),
                              ("post_modified", toValue (pPostModified struct)),
                              ("post_modified_gmt", toValue (pPostModifiedGmt struct)),
                              ("post_status", toValue (pPostStatus struct)),
                              ("post_type", toValue (pPostType struct)),
                              ("post_format", toValue (pPostFormat struct)),
                              ("post_name", toValue (pPostName struct)),
                              ("post_author", toValue (pPostAuthor struct)),
                              ("post_password", toValue (pPostPassword struct)),
                              ("post_excerpt", toValue (pPostExcerpt struct)),
                              ("post_content", toValue (pPostContent struct)),
                              ("post_parent", toValue (pPostParent struct)),
                              ("post_mime_type", toValue (pPostMimeType struct)),
                              ("link", toValue (pLink struct)),
                              ("guid", toValue (pGuid struct)),
                              ("menu_order", toValue (pMenuOrder struct)),
                              ("comment_status", toValue (pCommentStatus struct)),
                              ("ping_status", toValue (pPingStatus struct)),
                              ("sticky", toValue (pSticky struct)),
                              ("post_thumbnail", toValue (pPostThumbnail struct)),
                              ("terms", toValue (pTerms struct)),
                              ("custom_fields", toValue (pCustomFields struct))]
  fromValue val = do
    struct <- fromValue val
    a <- getField "post_id" struct
    b <- getField "post_title" struct
    c <- getField "post_date" struct
    d <- getField "post_date_gmt" struct
    e <- getField "post_modified" struct
    f <- getField "post_modified_gmt" struct
    g <- getField "post_status" struct
    h <- getField "post_type" struct
    i <- getField "post_format" struct
    j <- getField "post_name" struct
    k <- getField "post_author" struct
    l <- getField "post_password" struct
    m <- getField "post_excerpt" struct
    n <- getField "post_content" struct
    o <- getField "post_parent" struct
    p <- getField "post_mime_type" struct
    q <- getField "link" struct
    r <- getField "guid" struct
    s <- getField "menu_order" struct
    t <- getField "comment_status" struct
    u <- getField "ping_status" struct
    v <- getField "sticky" struct
    w <- getField "post_thumbnail" struct
    x <- getField "terms" struct
    y <- getField "custom_fields" struct
    return WPPost {
      pPostId = a,
      pPostTitle = b,
      pPostDate = c,
      pPostDateGmt = d,
      pPostModified = e,
      pPostModifiedGmt = f,
      pPostStatus = g,
      pPostType = h,
      pPostFormat = i,
      pPostName = j,
      pPostAuthor = k,
      pPostPassword = l,
      pPostExcerpt = m,
      pPostContent = n,
      pPostParent = o,
      pPostMimeType = p,
      pLink = q,
      pGuid = r,
      pMenuOrder = s,
      pCommentStatus = t,
      pPingStatus = u,
      pSticky = v,
      pPostThumbnail = w,
      pTerms = x,
      pCustomFields = y }
  getType _ = TStruct

Yeah, so that’s the obvious way to do it. And boy is it tedious. I need to figure out a better way to make this happen, because that’s a lot of pointless boilerplate.

It seems to me that I should somehow be able to define a small data structure and then pull the necessary bits out just once, rather than having to repeat everything at least twice. I guess that’s the purpose that the Template Haskell code serves, but I need more power.
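
One modest step in that direction, short of generating the whole instance, is to write fromValue in applicative style, which at least gets rid of the temporary variables and the second listing of record fields. The sketch below shows the idea on a small made-up WPCustomField type rather than the full WPPost; it is one possible cleanup, not the approach this series ends up with, and it still leaves the member names written out twice.

```haskell
import Network.XmlRpc.Internals
  (XmlRpcType (..), Type (TStruct), getField, handleError)

-- Hypothetical three-field record standing in for a larger type.
data WPCustomField = WPCustomField
  { cfId    :: String
  , cfKey   :: String
  , cfValue :: String
  } deriving (Show, Eq)

instance XmlRpcType WPCustomField where
  toValue cf = toValue
    [ ("id",    toValue (cfId cf))
    , ("key",   toValue (cfKey cf))
    , ("value", toValue (cfValue cf))
    ]
  -- The constructor is applied directly to the decoded fields, in
  -- declaration order, so no intermediate names are needed.
  fromValue val = do
    s <- fromValue val
    WPCustomField <$> getField "id" s
                  <*> getField "key" s
                  <*> getField "value" s
  getType _ = TStruct

main :: IO ()
main = do
  let original = WPCustomField "7" "mood" "cheerful"
  -- Round-trip through the XML-RPC representation.
  decoded <- handleError fail (fromValue (toValue original))
  print (decoded == original)
```

The catch is that the field order in fromValue now has to match the constructor's argument order, and the compiler will not notice if two String fields are swapped.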

Oh, well, it’s done for the moment.


I want to emphasize here that at this point, I’m just trying to get things done. I am intrigued by the theoretical underpinnings of Haskell (although my understanding of most of them is…shallow at the very least), but I’m also a working programmer—I need to be able to be productive. I want the benefits that I think Haskell has to provide—static typing to keep me from making as many dumb mistakes, good performance—but I have to be able to produce actual code for those things to be worth anything.

At the same time, I recognize that what I’ve just done probably represents a small chunk of technical debt. I’d love to learn enough to be able to pay it off.