How I would start out with Emacs now…

I’ve been toting more or less the same emacs config file around for most of 20 years. I have occasionally sliced it up into multiple files—and then inevitably reintegrated it back into one file—and I have certainly added to it, removed from it, etc. The fundamental way I approached Emacs’ configuration, though, always stayed the same—a bunch of require statements, a gigantic setq invocation, a few miscellaneous functions copied from elsewhere, etc.

For most of that time I’ve also used Debian, and even once Emacs’ own package facilities existed, I resisted using them in preference to Debian packages (of which I’ve made a couple, but it’s a non-trivial amount of work; someone should write dh-make-emacs). So something had to be really important to me before I would go to the trouble of getting it installed, and I didn’t install much.

What I’m trying to say is that I’ve led a fairly impoverished Emacs life.

About a month ago, I decided that that was stupid. The specific catalyst was that I wanted to take advantage of features in a newer version of haskell-mode, but packaging that—especially doing so in a way that I could easily take development snapshots—would have been an almighty amount of work; so I decided to capitulate, and move to using Emacs’ package facilities, and ELPA and MELPA and just go whole-hog.

This has made a huge difference in my productivity. I am taking such better advantage of Emacs now that it startles me, and, even more astonishing, my init.el file is far more coherent and comprehensible, so much so that I cannot imagine a reason to split it up at this point, even as I try out additional packages.

I’ll probably write more about specific stuff later—assuming I feel like I have something to contribute beyond, say, mere fanboyishness—but I wanted to lay out how I would approach bootstrapping from nothing, knowing what I know now.

My very first step would be to get use-package installed. This lives in MELPA, so it does require a true bootstrapping step. I would pop open a *scratch* buffer, type

(progn (require 'package) ; make sure package-archives is defined first
       (add-to-list 'package-archives '("melpa" . "http://melpa.org/packages/"))
       (package-list-packages))

then use C-x C-e to run it, and install use-package from the package list that comes up.

Having bootstrapped use-package, my initial init.el would start out with:

;; Using packages exclusively now
(package-initialize)

;; Easy correct loading and configuration support
(require 'use-package)

(use-package package
  :config
  (progn
    (add-to-list 'package-archives '("marmalade" . "https://marmalade-repo.org/packages/") t)
    (add-to-list 'package-archives '("melpa" . "http://melpa.org/packages/") t)))

Almost everything else will end up in a use-package invocation for its associated package.
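As a rough illustration of what such an invocation might look like, here is a sketch for haskell-mode. The choice of settings is purely an assumption made for the example (it is not taken from my actual init.el); :ensure and :mode are standard use-package keywords, and subword-mode ships with Emacs.

```elisp
;; Hypothetical init.el entry; the settings are only an illustration.
(use-package haskell-mode
  :ensure t                 ; install from the configured archives if missing
  :mode "\\.hs\\'"          ; load lazily, the first time a .hs file is opened
  :config
  ;; Anything package-specific lives here, next to the package it configures.
  (add-hook 'haskell-mode-hook #'subword-mode))
```

The appeal is that installation, loading and configuration for a package all sit in one form, which is most of why the file stays comprehensible.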

Moving to the 8.0 exporter

One of the TODO items I wanted to finish before making an org-blog 1.0 release was to convert to the new exporter present in org-mode 8.0.

A couple of weeks ago, I looked at turning a buffer into a post structure. I commented at the time that I wasn’t 100% certain why the code even worked. I also commented in a later post that there was some unfortunate redundancy in the way certain things were being done. I’m pleased to say that both issues are addressed.

First off, I created a blog back-end for the new exporter:

(org-export-define-derived-backend 'blog 'html
                                   :options-alist org-blog-buffer-options-alist)

The org-blog-buffer-options-alist that it references is a structure that’s created on the fly:

(defconst org-blog-buffer-options-alist
  (reduce
   (lambda (l i)
     (let ((field (plist-get (cdr i) :attr)))
       (if (string-prefix-p "POST_" field t)
           (cons (list (car i) field nil nil t) l)
         l)))
   org-blog-post-mapping
   :initial-value nil))

That’s basically plucking out all of the items in org-blog-post-mapping whose :attr names start with “POST_”.

(defconst org-blog-post-mapping
'((:blog :attr "POST_BLOG" :from-buffer org-blog-property-strip)
  (:category :attr "POST_CATEGORY" :from-buffer org-blog-property-split)
  (:date :attr "DATE" :from-buffer org-blog-date-format)
  (:description :attr "DESCRIPTION" :from-buffer org-blog-property-strip)
  (:id :attr "POST_ID" :from-buffer org-blog-property-strip)
  (:keywords :attr "KEYWORDS" :from-buffer org-blog-property-split)
  (:link :attr "POST_LINK" :from-buffer org-blog-property-strip)
  (:name :attr "POST_NAME" :from-buffer org-blog-property-strip)
  (:parent :attr "POST_PARENT" :from-buffer org-blog-property-strip)
  (:status :attr "POST_STATUS" :from-buffer org-blog-property-strip)
  (:title :attr "TITLE" :from-buffer (lambda (v i) (org-blog-property-strip (car v) i)))
  (:type :attr "POST_TYPE" :from-buffer org-blog-property-strip)))
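To make the plucking concrete, here is a small self-contained sketch run against a cut-down mapping. The my-demo- names are hypothetical and exist only for this example, and it uses cl-reduce from cl-lib in place of the older reduce.

```elisp
(require 'cl-lib)

;; Hypothetical three-entry mapping, standing in for org-blog-post-mapping.
(defconst my-demo-mapping
  '((:blog :attr "POST_BLOG")
    (:date :attr "DATE")
    (:id :attr "POST_ID")))

(defun my-demo-options-alist (mapping)
  "Collect exporter option entries for the POST_ items in MAPPING."
  (cl-reduce
   (lambda (l i)
     (let ((field (plist-get (cdr i) :attr)))
       (if (string-prefix-p "POST_" field t)
           ;; Entry shape: (KEY KEYWORD OPTION DEFAULT BEHAVIOR)
           (cons (list (car i) field nil nil t) l)
         l)))
   mapping
   :initial-value nil))

(my-demo-options-alist my-demo-mapping)
;; => ((:id "POST_ID" nil nil t) (:blog "POST_BLOG" nil nil t))
```

Note that DATE is skipped, and that the result comes out in reverse order because each match is consed onto the front of the accumulator.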

Then, to do the actual deed, we call org-export-as, and use a reduce loop to do any of the translations called out in the list above.

(defun org-blog-buffer-extract-post ()
  "Transform a buffer into a post.

We do as little processing as possible on individual items, to
retain the maximum flexibility for further transformation."
  (let ((content
         (org-export-as 'blog nil nil t '(:preserve-breaks nil
                                          :section-numbers nil
                                          :with-tags nil
                                          :with-toc nil
                                          :with-todo-keywords nil)))
        (attrs
         (org-export-get-environment 'blog)))
    (sort
     (reduce
      (lambda (l i)
        (let ((v (plist-get attrs (car i)))
              (filter (plist-get (cdr i) :from-buffer)))
          (if v
              (cons (cons (car i) (if filter
                                      (funcall filter v attrs)
                                    v)) l)
            l)))
      org-blog-post-mapping
      :initial-value (when content
                       (list (cons :content content))))
     (lambda (a b)
       (string< (car a) (car b))))))

Although the overall code hasn’t gotten significantly shorter, it’s much clearer: it uses more centralized structures to gather together information, and it applies more general transformations defined in structures rather than conditional statements.
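Here is a stripped-down sketch of that table-driven idea, runnable without org-mode. The attrs plist stands in for what org-export-get-environment would return, and every my-demo- name is hypothetical.

```elisp
(require 'cl-lib)

(defun my-demo-split (v _attrs)
  "Split the comma-separated string V into a list of trimmed items."
  (split-string v "," t "[ \t]+"))

;; Hypothetical mapping: only :category names a filter.
(defconst my-demo-post-mapping
  '((:title :attr "TITLE")
    (:category :attr "POST_CATEGORY" :from-buffer my-demo-split)
    (:id :attr "POST_ID")))

(defun my-demo-extract (attrs)
  "Build a sorted post alist from the plist ATTRS using the mapping."
  (sort
   (cl-reduce
    (lambda (l i)
      (let ((v (plist-get attrs (car i)))
            (filter (plist-get (cdr i) :from-buffer)))
        (if v
            (cons (cons (car i) (if filter (funcall filter v attrs) v)) l)
          l)))
    my-demo-post-mapping
    :initial-value nil)
   (lambda (a b) (string< (car a) (car b)))))

(my-demo-extract '(:title "Hello" :category "emacs, org"))
;; => ((:category "emacs" "org") (:title . "Hello"))
```

Adding a new field then means adding a row to the table rather than another branch to a conditional.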

I love it when a plan comes together

I’m basically done.

OK, that’s a total lie; there’s plenty more to do. But I am at the point where I feel like (except for one bug I’ll mention later) doing an actual release, say a 0.90 release, would make sense.

Today, we’ll implement org-blog-save, which is all about getting what you wrote onto the server, and the server’s output integrated back into your post. It starts out with a couple of boilerplate XML-RPC wrappers:

(defun org-blog-wp-post-retrieve (xmlrpc blog-id username password post-id &optional fields)
  "Fetches a single post from the weblog system."
  (let ((params (list xmlrpc
                      'wp.getPost
                      blog-id
                      username
                      password
                      post-id)))
    (when fields
      ;; `append' returns a new list, so the result has to be captured
      (setq params (append params (list fields))))
    (apply 'xml-rpc-method-call params)))

(defun org-blog-wp-post-update (xmlrpc blog-id username password post-id post)
  "Edit an existing post on the blog."
  (xml-rpc-method-call xmlrpc
                       'wp.editPost
                       blog-id
                       username
                       password
                       post-id
                       (org-blog-post-to-wp post))
  post-id)

Simple, straightforward, still a little boilerplate-y, but I’ll address that down the road.
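One detail worth keeping in mind when building a parameter list like the one in org-blog-wp-post-retrieve: append returns a new list and leaves its arguments alone, so an optional trailing argument only reaches the call if the result is stored. A minimal sketch, with hypothetical my-demo- names:

```elisp
(defun my-demo-params-discarded (fields)
  "Build a parameter list, but throw away the result of `append'."
  (let ((params (list 'wp.getPost "post-id")))
    (when fields
      (append params (list fields)))   ; new list is dropped on the floor
    params))

(defun my-demo-params-captured (fields)
  "Build a parameter list, keeping the result of `append'."
  (let ((params (list 'wp.getPost "post-id")))
    (when fields
      (setq params (append params (list fields))))
    params))

(my-demo-params-discarded '("post_title"))
;; => (wp\.getPost "post-id")
(my-demo-params-captured '("post_title"))
;; => (wp\.getPost "post-id" ("post_title"))
```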

Then there’s org-blog-save which, I’m happy to say, is also pretty simple and straightforward. That such an important chunk of code is this easy to understand makes me feel that I’ve probably got the design mostly right:

(defun org-blog-save ()
  "Publish an article to a server and save locally.

By default org-blog will try and save the article in a hierarchy
that mirrors the permalink structure for the blog in question."
  (interactive)
  (condition-case failure
      (let* ((post (org-blog-buffer-extract-post))
             (blog (org-blog-post-to-blog post))
             (post-id (if (cdr (assq :id post))
                          (org-blog-call blog "post-update" (cdr (assq :id post)) post)
                        (org-blog-call blog "post-create" post)))
             (post (org-blog-call blog "post-retrieve" post-id)))
        (org-blog-buffer-merge-post (org-blog-wp-to-post post)))
    (error (apply 'message (cdr failure)))))

It uses condition-case to make sure that it catches any exceptions that bubble up, and then goes about its business fairly straightforwardly:

  • Get a post structure from the buffer
  • Get a blog structure from the post
  • Look to see if the post already has a post-id
    • If it does, update the post
    • If it doesn’t, create a new post
  • Either way, return the post-id
  • Retrieve the post using the post-id
  • Merge the post into the buffer
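The error-handling wrapper around those steps can be tried out on its own. This is only a sketch with hypothetical names (my-demo-guarded, my-demo-missing); it mirrors the handler in org-blog-save, which hands the error's data to message.

```elisp
(defun my-demo-guarded (thunk)
  "Call THUNK, turning any error into an echo-area message."
  (condition-case failure
      (funcall thunk)
    ;; FAILURE looks like (error "formatted message"), so its cdr is a
    ;; ready-made argument list for `message'.
    (error (apply 'message (cdr failure)))))

(my-demo-guarded (lambda () (+ 1 2)))
;; => 3
(my-demo-guarded (lambda () (error "No such function: %s" 'my-demo-missing)))
;; echoes and returns "No such function: my-demo-missing"
```

One caveat with this form: if the error text itself contains a % character, message will treat it as a format directive. Writing the handler as (message "%s" (error-message-string failure)) avoids that.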

What could be easier? In fact, this is the code I’ve been using (with some small improvements) to write this series of posts, so I know it works.

Oh, except for the bug I mentioned earlier. Timestamp handling is a little wonky—WordPress has a tendency to not add TZ information to their timestamps, or XML-RPC doesn’t actually pay attention to them, so the structure I get back from the server has a time stamp that is off by (in my case) four hours. I’ve tried to use the GMT versions of timestamp fields, but I fear I haven’t gotten it quite right.
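For anyone wanting to poke at the skew, here is a small sketch for inspecting it. The two values are sample post_date and post_date_gmt fields in Emacs' (HIGH LOW ...) time format, and my-demo-time-gap is a hypothetical helper; this shows the symptom and is not a fix.

```elisp
(defun my-demo-time-gap (post-date post-date-gmt)
  "Return the gap in seconds between the two stamps, and the GMT one as text."
  (list (- (float-time post-date-gmt) (float-time post-date))
        ;; A non-nil third argument asks for Universal Time.
        (format-time-string "%Y-%m-%d %H:%M:%S" post-date-gmt t)))

(my-demo-time-gap '(20738 4432) '(20738 18832 0 0))
;; => (14400.0 "2013-01-25 09:00:00")
```

14400 seconds is exactly the four hours mentioned above, which suggests the non-GMT field is local wall-clock time that carries no zone information.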

The test I have for org-blog-save is still a little crude. It’s on github, and I intend to tweak it as well—that will probably be a necessary part of fixing the timestamp issue.

So, this weekend will probably be dedicated to cleanup and documentation, and then a release early next week. Whew.

The gateway to the server

Although I don’t have any immediate plans to implement a back-end for anything other than WordPress, I want to make sure it’s possible, so I have created a list of operations I want to be able to do on posts (well, OK, I haven’t actually written it down or anything, but it’s in my head. Somewhere). Each back-end needs to implement each call (even if it’s just a no-op).

If that was the only abstraction, though, we would end up with a lot of boilerplate code—or we would define some macros, which I’m not yet familiar with—because each operation would have to take care of pulling out whatever information it needed from the arguments it was handed—which will always include a blog structure as the first item.

Instead, I opted for a level of indirection. Rather than calling a back-end function directly, any time the back-end-agnostic code wants to perform an operation on an article, it goes through org-blog-call. org-blog-call figures out what engine the blog structure is using, constructs the name of the operation for the back-end associated with the blog, and then invokes it.

So it’s really pretty short—pulling out a single bit of info and checking for a function’s existence.

(defun org-blog-call (blog call &rest args)
  "Make the specified call to the appropriate blog engine.

This allows us to maintain multiple engines, with a set of
operations common to all, and call the appropriate function based
on the engine specification in the entry in `org-blog-alist'."
  (let ((entry (intern (concat "org-blog-" (cdr (assq :engine blog)) "-call"))))
    (if (fboundp entry)
        (apply entry blog call args)
      (error "Can't find function %s" entry))))

What this is invoking, in the case of a WordPress blog, is org-blog-wp-call. It doesn’t do a whole heck of a lot, either, other than pull out the parameters that are common to each XML-RPC request and invoke the actual function (whose name it has constructed, much like org-blog-call).

(defun org-blog-wp-call (blog call &rest args)
  "Easy calls to WordPress API functions.

This routine will take whatever information a user has available,
fill in the rest (if the user is willing to cooperate) and then
call the specified function and return the results."
  (let ((blog-id (cdr (assq :blog-id blog)))
        (func (intern (concat "org-blog-wp-" call)))
        (password (cdr (assq :password blog)))
        (username (cdr (assq :username blog)))
        (xmlrpc (cdr (assq :xmlrpc blog))))
    (if (fboundp func)
        (apply func xmlrpc blog-id username password args)
      (error "Can't find function %s" func))))

At this point, the actual XML-RPC call is pretty anticlimactic:

(defun org-blog-wp-post-create (xmlrpc blog-id username password wp)
  "Create a post on the blog."
  (xml-rpc-method-call xmlrpc
                       'wp.newPost
                       blog-id
                       username
                       password
                       (org-blog-post-to-wp wp)))
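To see the name-construction trick in isolation, here is a sketch with a made-up engine called "static". Nothing here is part of org-blog; every my-demo- name is hypothetical.

```elisp
(defun my-demo-call (blog call &rest args)
  "Dispatch CALL to the engine named in BLOG, passing ARGS along."
  (let ((entry (intern (concat "my-demo-" (cdr (assq :engine blog)) "-call"))))
    (if (fboundp entry)
        (apply entry blog call args)
      (error "No such function: %s" entry))))

(defun my-demo-static-call (blog call &rest args)
  "Pretend back-end that simply reports what it was asked to do."
  (list :engine (cdr (assq :engine blog)) :call call :args args))

(my-demo-call '((:engine . "static")) "post-create" '((:title . "Hi")))
;; => (:engine "static" :call "post-create" :args (((:title . "Hi"))))
```

Supporting another engine is then a matter of defining one more function whose name follows the pattern; the dispatcher itself never changes.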

Hmmm. Looking back on this, I can see that I’m going to end up changing this at some point. I feel certain there are more direct, cleverer ways to implement this pattern, ones that would even allow me to pull out the open-coded XML-RPC call in org-blog-wp-params. So I’ll be looking into this, though I still want to get through the initial implementation first.

Beginning to weave things together

This is really the last stop before the rubber starts to hit the road—we’ve got various small bits of functionality implemented, now we have to weave them together into actual operations.

The first thing we need to be able to do, given a buffer in which someone has written a post, is figure out where we might go posting it. Remember, in the buffer, the blog is identified with a simple #+POST_BLOG: <name> property. Its value is just going to be a text string. And it may not even be present!

So what we have here is a pair of functions that, given a post structure, will furnish us with a complete blog structure, or throw an exception indicating we should give up if the user declined to furnish us with all the info we needed.

(defun org-blog-post-to-blog (post)
  "Determine the blog to use for the given post.

It will ask for the blog name and blog engine if necessary, and
then hand off to the particular engine's `-params' function, so
it may make a number of interactive queries to the user."
  (let* ((name (org-blog-get-name post))
         (blog (cdr (assoc name org-blog-alist)))
         (engine (org-blog-blog-to-engine blog))
         (funcname (concat "org-blog-" engine "-params"))
         (func (intern funcname)))
    (unless (functionp func)
      (error "Can't find params function for %s engine" engine))
    (apply func blog nil)))

(defun org-blog-blog-to-engine (blog)
  "Get the blog engine name from the blog structure.

If it's not present, ask the user to choose from among those
available in `org-blog-engine-alist'."
  (let ((engine (or (cdr (assq :engine blog))
                    (empty-string-is-nil (completing-read
                                          "Blog software: "
                                          (mapcar 'car org-blog-engine-alist) nil t)))))
    (unless engine
      (error "No blog engine specified"))
    engine))

The important thing to note, I think, is that these functions don’t presume to “know” anything about the structure of a blog—that is, while a WordPress blog may require various bits of information (XML-RPC endpoint, username, password, blog-id), some other back-end might require an entirely different set of attributes…and these functions don’t care. By delegating all the work of making sure that all attributes are satisfied to functions that are written alongside the particular blogging back-end, we allow back-ends that have very different requirements. I could see this being used not just for WordPress-like server-oriented back-ends but also for static site generators and the like.
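As a thought experiment, a `-params' function for a static site generator might need nothing but an output directory. The sketch below is entirely hypothetical (no such back-end exists in org-blog, and the :output-dir key is invented for the example), but it shows how little the generic code has to know.

```elisp
(defun my-demo-static-params (blog)
  "Return a complete, sorted parameter alist for a made-up static engine."
  (let ((dir (cdr (assq :output-dir blog))))
    ;; A real back-end would probably prompt here before giving up.
    (unless dir
      (error "Posting cancelled"))
    (sort (list (cons :output-dir dir)
                (cons :engine "static"))
          (lambda (a b) (string< (car a) (car b))))))

(my-demo-static-params '((:output-dir . "~/site/posts/")))
;; => ((:engine . "static") (:output-dir . "~/site/posts/"))
```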

Testing is fairly straightforward:

(ert-deftest ob-test-org-blog-post-to-blog ()
    "Test getting the blog information from a blog post"
    (let* ((blog-passwd (read-passwd "Password for blog listing: "))
           (org-blog-alist `(("bar" . ((:engine . "wp")
                                       (:xmlrpc . "http://wordpress.com/xmlrpc.php")
                                       (:username . "mdorman@ironicdesign.com")
                                       (:password . ,blog-passwd)))))
           (final-blog-param `((:blog-id . "46183217")
                               (:engine . "wp")
                               (:password . ,blog-passwd)
                               (:username . "mdorman@ironicdesign.com")
                               (:xmlrpc . "https://orgblogtest.wordpress.com/xmlrpc.php"))))
      (org-blog-new)
      (should (equal (org-blog-post-to-blog (org-blog-buffer-extract-post)) final-blog-param))))

We mock up the org-blog-alist, create a new post (which will automatically be assigned to the one available blog), and then we go through the process to get our blog structure out.

I will admit, I could probably be a little more adventurous in testing this—testing failure modes, etc.—but that will have to wait for future days when I’m just looking for something small to do.

Tomorrow I think we’ll add the XML-RPC code for WordPress, as well as the generic machinery that calls it.

Everybody Lies

I miss House. What could be more fun than watching Hugh Laurie verbally abuse people for an hour each week?

Anyway, we’ve gotten to the point where we want to start pushing data to the server. Which means we’re going to have to start pulling data from the server. Specifically, configuration data for the blog, because some of what we need to have on hand to create or otherwise manipulate posts isn’t always obvious or easily discoverable—so we need to try and figure it out for ourselves.

For this post, I’m actually going to start off with the test (which, incidentally, I left off of the last post because it was already ginormously long—but rest assured, if you check out the github repository, everything is being tested):

(ert-deftest ob-test-wp-params ()
  "Test getting the blog-id (and correct xmlrpc URL) via xmlrpc"
  (let* ((blog-passwd (read-passwd "Password for blog listing: "))
         (initial-blog-param `((:xmlrpc . "http://wordpress.com/xmlrpc.php")
                               (:username . "mdorman@ironicdesign.com")
                               (:password . ,blog-passwd)))
         (final-blog-param `((:blog-id . "46183217")
                             (:engine . "wp")
                             (:password . ,blog-passwd)
                             (:username . "mdorman@ironicdesign.com")
                             (:xmlrpc . "https://orgblogtest.wordpress.com/xmlrpc.php"))))
    (should (equal (org-blog-wp-params initial-blog-param) final-blog-param))))

I want to start with the test because I think it’s pretty illustrative of the divergence between what people may know about their blog, and what is actually needed. For instance, if you’re on a big hosting service, do you actually know the ID of your blog? Yet this is a necessary component for creating posts, so we have to be able to discover it. And I only stumbled across it by accident, but I assumed that the XML-RPC end-point for a blog on wordpress.com was on wordpress.com…but it’s not.

Anyway, you can see how given very partial and even somewhat erroneous information, we expect org-blog-wp-params (or any equivalent function for another engine) to give us the right stuff to make actual posts.

org-blog-wp-params starts off simple enough, and then suddenly goes non-linear. The reason is simple: the url, username and password are things that we must get from the user before we have a chance to do anything else.

(Actually, that’s not true: for WordPress blogs, at least, you could get the URL of the blog, look for the <link rel="EditURI"> tag, follow that, parse the XML and get everything but the username and password; but since you need those anyway, it’s a lot less work this way. Perhaps some time in the future I’ll take advantage of the EditURI bit.)

So for each of those first three items, we look them up in an existing blog structure, and if there’s nothing, we ask the user for the information, and if there’s still nothing, we bail out, since there’s nothing more to do. And then we hit the blog-id, and things get interesting.

(defun org-blog-wp-params (blog)
  "Construct the basic paramlist for wordpress calls.

This starts with the information the user may have set for the
blog in their configuration, and then attempts to fill in any
holes so it can produce a list of necessarily generic
parameters.  `org-blog-wp-call' can then use the output of this
function to make other calls."
  (let ((complete (list (cons :engine "wp"))))
    (push (cons :xmlrpc (or (cdr (assq :xmlrpc blog))
                            (empty-string-is-nil (read-from-minibuffer "XML-RPC URL: "))
                            (error "Posting cancelled")))
          complete)

    (push (cons :username (or (cdr (assq :username blog))
                              (empty-string-is-nil (read-from-minibuffer "Username: "))
                              (error "Posting cancelled")))
          complete)
    (push (cons :password (or (cdr (assq :password blog))
                              (empty-string-is-nil (read-passwd "Password: "))
                              (error "Posting cancelled")))
          complete)
    (push (cons :blog-id (or (cdr (assq :blog-id blog))
                             (empty-string-is-nil (let ((userblogs (xml-rpc-method-call
                                                                    (cdr (assq :xmlrpc complete))
                                                                    'wp.getUsersBlogs
                                                                    (cdr (assq :username complete))
                                                                    (cdr (assq :password complete)))))
                                                    (cond
                                                     ;; If there's no blogs, fail
                                                     ((eq userblogs nil)
                                                      nil)
                                                     ;; If there's only one blog, use its blog-id (and xml-rpc) automatically
                                                     ((equal (length userblogs) 1)
                                                      (setcdr (assq :xmlrpc complete) (cdr (assoc "xmlrpc" (car userblogs))))
                                                      (cdr (assoc "blogid" (car userblogs))))
                                                     ;; If there's more than one blog, prompt the user to pick
                                                     ;; one by name, then shove its info into complete
                                                     (t
                                                      (let ((name (completing-read
                                                                   "Blog Name: "
                                                                   (mapcar #'(lambda (entry)
                                                                               (cdr (assoc "blogName" entry)))
                                                                           userblogs) nil t)))
                                                        (reduce
                                                         #'(lambda (found userblog)
                                                             (if (string= name (cdr (assoc "blogName" userblog)))
                                                                 (progn
                                                                   (setcdr (assq :xmlrpc complete) (cdr (assoc "xmlrpc" userblog)))
                                                                   (cdr (assoc "blogid" userblog)))
                                                               found))
                                                         userblogs
                                                         :initial-value nil))))))
                             (error "Posting cancelled")))
          complete)
    (sort
     complete
     #'(lambda (a b)
         (string< (car a) (car b))))))

If the blog-id is in the blog structure we’ve been handed, we assume it’s correct and move on. If it’s not present, though, we assume that the user has no idea what it might be, so we do an XML-RPC call to get the list of blogs belonging to the user.

When looking at the output of that call, there are three possible scenarios:

  1. There’s no blog available at that URL, in which case we’re done.
  2. There’s one blog available at that URL, in which case we grab it (and also make sure we pull out any XML-RPC endpoint information) and we’re done.
  3. There’s more than one blog available at that URL, in which case we prompt the user for a selection from among the list of blogs. If they choose one, we grab the blog-id (and the XML-RPC endpoint) and we’re done.

If they don’t opt for one of the above, again, we cancel. Otherwise, we sort the alist we’ve created (to make it easy to test), and we’re done.
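Stripped of the XML-RPC plumbing, the two ideas in that function are "look it up, else ask, else bail" and "pick a blog from zero, one, or many". The sketch below is an assumption-laden rewrite of those two ideas: all my-demo- names are hypothetical, and the prompting is passed in as a function so the logic can be exercised without a minibuffer.

```elisp
(require 'cl-lib)

(defun my-demo-empty-string-is-nil (s)
  "Return S, or nil when S is the empty string."
  (if (equal s "") nil s))

(defun my-demo-param (key blog ask)
  "Look KEY up in BLOG; failing that call ASK; failing that, give up."
  (or (cdr (assq key blog))
      (my-demo-empty-string-is-nil (funcall ask))
      (error "Posting cancelled")))

(defun my-demo-pick-blog (userblogs choose)
  "Return (BLOGID . XMLRPC) for one of USERBLOGS, or nil if there are none.
CHOOSE receives the list of blog names and returns the chosen one."
  (let* ((name-of (lambda (b) (cdr (assoc "blogName" b))))
         (blog (cond
                ((null userblogs) nil)                      ; scenario 1
                ((= (length userblogs) 1) (car userblogs))  ; scenario 2
                (t (cl-find (funcall choose (mapcar name-of userblogs))
                            userblogs
                            :key name-of :test #'equal))))) ; scenario 3
    (when blog
      (cons (cdr (assoc "blogid" blog)) (cdr (assoc "xmlrpc" blog))))))

(my-demo-pick-blog
 '((("blogid" . "1") ("blogName" . "One") ("xmlrpc" . "https://one.example/xmlrpc.php"))
   (("blogid" . "2") ("blogName" . "Two") ("xmlrpc" . "https://two.example/xmlrpc.php")))
 (lambda (_names) "Two"))
;; => ("2" . "https://two.example/xmlrpc.php")
```

In interactive use, the chooser would be something like (lambda (names) (completing-read "Blog Name: " names nil t)).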

One thing we don’t yet do, and that I would like to, is tell the user what they should put in their config to avoid all this consultation. That would lower the barrier to entry for org-blog even a little more: you would just fire up org-blog-new for the first time and give it a name for the blog; then, when you ran org-blog-save, it would prompt you for the information and spit back a configuration block once it was done saving.

But that’s the future.

Tomorrow, we’ll look at where this function fits into the bigger scheme of things.

And now I will turn this hat into a rabbit

Before my digression about optimization and more idiomatic structures, we had just implemented the conversion of a post structure into a structure suitable for handing to WordPress. In a system like this, though, transformations always come in pairs, so we know that the complementary WordPress to post structure operation has to be around here somewhere.

However, it’s important, I think, to realize that what WordPress returns to represent a post is a lot more than we put in. So here’s an example of what a WordPress post looks like, retrieved through XML-RPC and cast into sexps (automatically by the elisp xml-rpc module):

'(("post_id" . "1")
  ("post_title" . "Test 1 Title")
  ("post_date" :datetime
   (20738 4432))
  ("post_date_gmt" :datetime
   (20738 18832 0 0))
  ("post_modified" :datetime
   (20738 4432))
  ("post_modified_gmt" :datetime
   (20738 4432))
  ("post_status" . "publish")
  ("post_type" . "post")
  ("post_name" . "t1n")
  ("post_author" . "3075621")
  ("post_password")
  ("post_excerpt" . "t1e")
  ("post_content" . "\n<p>Test 1 Content\n</p>")
  ("post_parent" . "0")
  ("post_mime_type")
  ("link" . "http://example.com/")
  ("guid" . "http://example.com/")
  ("menu_order" . 0)
  ("comment_status" . "closed")
  ("ping_status" . "open")
  ("sticky")
  ("post_thumbnail")
  ("post_format" . "standard")
  ("terms"
   (("term_id" . "126039325")
    ("name" . "t1c1")
    ("slug" . "t1c1")
    ("term_group" . "0")
    ("term_taxonomy_id" . "4")
    ("taxonomy" . "category")
    ("description")
    ("parent" . "0")
    ("count" . 0))
   (("term_id" . "126039469")
    ("name" . "t1c2")
    ("slug" . "t1c2")
    ("term_group" . "0")
    ("term_taxonomy_id" . "5")
    ("taxonomy" . "category")
    ("description")
    ("parent" . "0")
    ("count" . 0))
   (("term_id" . "147991082")
    ("name" . "t1k1")
    ("slug" . "t1k1")
    ("term_group" . "0")
    ("term_taxonomy_id" . "6")
    ("taxonomy" . "post_tag")
    ("description")
    ("parent" . "0")
    ("count" . 0))
   (("term_id" . "147991085")
    ("name" . "t1k2")
    ("slug" . "t1k2")
    ("term_group" . "0")
    ("term_taxonomy_id" . "7")
    ("taxonomy" . "post_tag")
    ("description")
    ("parent" . "0")
    ("count" . 0))
   (("term_id" . "147991087")
    ("name" . "t1k3")
    ("slug" . "t1k3")
    ("term_group" . "0")
    ("term_taxonomy_id" . "8")
    ("taxonomy" . "post_tag")
    ("description")
    ("parent" . "0")
    ("count" . 0)))
  ("custom_fields"))

As you can see, there’s a lot of stuff in there that we don’t deal with in our post structure—the guid, the modification times, menu_order and more. Even more alarming is the sheer quantity of information we get back to describe categories and tags—they’re intermixed in the terms field, along with a lot of information we don’t intend to mess with.

We have a bit of a job condensing this stuff down.

I’m actually going to take a look at the bit of the code responsible for handling the terms field first. It takes the current post structure, as well as the list of terms entries, and updates the post structure to have appropriate :category and :tags fields, and is itself fairly straightforward:

(defun org-blog-wp-to-post-handle-taxonomy (post entries)
  "Handle mapping WordPress taxonomy info into a post struct.

We have to operate on all of the items in the taxonomy structure,
glomming them onto the existing post."
  (let* ((tlist (org-blog-wp-xml-terms-to-term-alist entries))
         (category (assoc "category" tlist))
         (tag (assoc "post_tag" tlist)))
    (when category
      (push (cons :category (cdr category)) post))
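    (when tag
      (push (cons :tags (cdr tag)) post))
    ;; Return the updated post explicitly; otherwise a post with no tags
    ;; would make the final `when' (and so this function) return nil.
    post))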

(You might wonder if I should replace those calls to push with cons as I discussed yesterday. The answer is no: this function, unfortunately, exists for its side-effects in modifying post, so that’s not an option. Though I will probably rewrite it.)

All that’s doing, though, is adding a category or tag to our post when it’s present—the real action is in the function that takes the flat list from terms and turns it into an alist:

(defun org-blog-wp-xml-terms-to-term-alist (terms)
  "Handle turning WordPress taxonomy lists into an alist.

From here we can extract just the bits we need."
  (reduce
   (lambda (lists term)
      (let ((name (cdr (assoc "name" term)))
            (taxonomy (cdr (assoc "taxonomy" term))))
        (cons (append (list taxonomy) (cdr (assoc taxonomy lists)) (list name)) lists)))
   terms :initial-value nil))

I had a much more convoluted version of this at one point, taking great care to remove the existing values for the attribute, because I lost sight of two complementary attributes of lists in Lisp, and alists in particular.

The first is that cons builds a new cons cell whose first slot (car) holds its first argument and whose “next item” pointer (cdr) points to its second argument. This is a constant-time operation, which is good, because you do it a lot in Lisp, and it means whatever you cons goes to the front of the list.

The second is that when querying an alist, whether using assoc or assq or anything that looks at the first item, all the functions stop at the first match.

So instead of having to alter a list as I add terms to it, I can just cons the fully updated list onto the beginning of the results, and any time you search for that item in the alist, you will find the most up-to-date one first.
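Here is that shadowing behaviour in a form you can evaluate, using a trimmed-down version of the terms data. The my-demo- name is hypothetical, and cl-reduce from cl-lib stands in for reduce.

```elisp
(require 'cl-lib)

(defun my-demo-terms-to-alist (terms)
  "Group term names by taxonomy, newest (most complete) entry first."
  (cl-reduce
   (lambda (lists term)
     (let ((name (cdr (assoc "name" term)))
           (taxonomy (cdr (assoc "taxonomy" term))))
       ;; Cons a fresh, extended entry onto the front; older entries for
       ;; the same taxonomy stay behind it and are never seen by `assoc'.
       (cons (append (list taxonomy) (cdr (assoc taxonomy lists)) (list name))
             lists)))
   terms :initial-value nil))

(let ((tlist (my-demo-terms-to-alist
              '((("name" . "t1c1") ("taxonomy" . "category"))
                (("name" . "t1c2") ("taxonomy" . "category"))
                (("name" . "t1k1") ("taxonomy" . "post_tag"))))))
  (list (assoc "category" tlist) (length tlist)))
;; => (("category" "t1c1" "t1c2") 3)
```

The length of 3 shows that the stale ("category" "t1c1") entry is still in the list; it just sits behind the complete one, so lookups never reach it.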

With all that term handling out of the way, the actual transformation function is kind of anticlimactic:

(defun org-blog-wp-to-post (wp)
  "Transform a WordPress struct into a post.

This is largely about mapping tag names, though the `terms'
structure benefits from a helper function to handle mapping it
properly.

For convenience in testing and inspection, the resulting alist is
sorted."
  (sort
   (reduce
    (lambda (post new)
       "Do key and value transformations."
       (let ((k (car new))
             (v (cdr new)))
         (cond ((eq v nil)
                post)
               ((string= "terms" k)
                (org-blog-wp-to-post-handle-taxonomy post v))
               ((string= "post_date_gmt" k)
                ;; Must be a better way to extract this value
                (cons (cons (car (rassoc k org-blog-wp-alist)) (time-add (cadr v) (seconds-to-time (car (current-time-zone))))) post))
               ((rassoc k org-blog-wp-alist)
                (cons (cons (car (rassoc k org-blog-wp-alist)) v) post))
               (t
                post))))
    wp :initial-value nil)
   '(lambda (a b)
      (string< (car a) (car b)))))

It’s just a variation on the existing transformation functions, doing the translation in a different direction. Which, again, argues for a more general implementation that’s using table-driven transformations for individual items, an idea I hope I’ll get to before too long.

And now a slight diversion…

So rather than continuing on with our next section, I’m going to take a moment and look at some refactoring I did to several bits of code.

I would say that some of the issues with the original code stem from me being relatively new to lisp—I don’t always recognize the potential idiomatic versions of things, or I don’t have a visceral notion of exactly what a particular function is doing.

The other source is probably just that this code changed over time, and I didn’t always take the time, or see the opportunity to refactor at each step. I will say, though, that I am inordinately happy to have a fairly comprehensive test suite that immediately identified problematic transformations in the code. Without that, I would have been much less confident of my results.

So, without further ado, here’s the first function we’ll refactor, in its original form:

(defun org-blog-post-to-wp (post)
  "Transform a post into a structure for submitting to WordPress.

This is largely about mapping tag names, though the handling of
`category' and `tags' is a little more complex as the WordPress API
now groups them as `taxonomies', and requires a hierarchical
structure to differentiate them.

For convenience in testing and inspection, the resulting alist is
sorted."
  (sort
   (reduce
    '(lambda (wp new)
       (let ((k (car new))
             (v (cdr new)))
         (when v
           (cond ((eq :category k)
                  (setq wp (org-blog-post-to-wp-add-taxonomy wp "category" v)))
                 ((eq :date k)
                  ;; Convert to GMT by adding seconds offset
                  (push (cons "post_date_gmt" (list :datetime
                                                    (time-add (car v)
                                                              (seconds-to-time (- (car (current-time-zone)))))))
                        wp))
                 ((eq :tags k)
                  (setq wp (org-blog-post-to-wp-add-taxonomy wp "post_tag" v)))
                 ((eq :title k)
                  (push (cons "post_title" (or v "No Title")) wp))
                 ((assq k org-blog-wp-alist)
                  (push (cons (cdr (assq k org-blog-wp-alist)) v) wp))
                 )))
       wp)
    post :initial-value nil)
   '(lambda (a b)
      (string< (car a) (car b)))))

So, my big “Aha!” moment was when I happened to go back and re-read the documentation for push, which has this observation embedded in it:

[…] and this macro is equivalent to (setq listname (cons element listname)).
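That equivalence is easy to check directly (a toy sketch; a and b are just scratch variables):

```elisp
;; `push' on a plain variable behaves like (setq var (cons elt var)),
;; and both forms return the new list.
(let ((a (list 2 3))
      (b (list 2 3)))
  (list (push 1 a)            ; value returned by `push'
        (setq b (cons 1 b))   ; value returned by the spelled-out version
        (equal a b)))         ; both variables end up holding the same list
;; => ((1 2 3) (1 2 3) t)
```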

Now this might not immediately suggest any particular refactoring, but I also noticed that the two cases that weren’t using push were using (setq wp), and in both of those cases (:category and :tags) we’re setting it to the output of a function, and we’re using the modified value of wp as the return value of our lambda.

Knowing that a lisp form generally returns the value of the last statement that it executes, and that both setq and push return their modified values, my first simplification was to simply remove the lonely little wp at the end of the lambda, and let the result of the cond be the return value from the lambda. Which promptly broke things—I was getting partial structures back.

What I hadn’t taken into account was that when there was no value in v, the when form would return nil, so any attribute that didn’t have a value would reset reduce’s accumulator to nil.
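The failure is easy to reproduce in isolation. This is a stripped-down sketch, not the org-blog code: the two my-keep-* functions and the sample fields are invented, and it uses cl-reduce, the cl-lib name for reduce.

```elisp
(require 'cl-lib)

;; Hypothetical reducer: `when' yields nil for an empty field,
;; and that nil becomes the new accumulator.
(defun my-keep-with-when (acc pair)
  (when (cdr pair)
    (cons pair acc)))

;; Hypothetical reducer: `if' hands back the accumulator untouched instead.
(defun my-keep-with-if (acc pair)
  (if (cdr pair)
      (cons pair acc)
    acc))

(let ((fields '((:title . "T") (:excerpt) (:id . "1"))))
  (list (cl-reduce #'my-keep-with-when fields :initial-value nil)
        (cl-reduce #'my-keep-with-if fields :initial-value nil)))
;; => (((:id . "1")) ((:id . "1") (:title . "T")))
```

With when, everything accumulated before the empty :excerpt field is thrown away, which matches the partial structures described above.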

That was easily enough solved by converting the when to an if that returned the unmodified wp if there was no value to be added. This got me:

(lambda (wp new)
  (let ((k (car new))
        (v (cdr new)))
    (if v
        (cond ((eq :category k)
               (setq wp (org-blog-post-to-wp-add-taxonomy wp "category" v)))
              ((eq :date k)
               ;; Convert to GMT by adding seconds offset
               (push (cons "post_date_gmt" (list :datetime
                                                 (time-add (car v)
                                                           (seconds-to-time (- (car (current-time-zone)))))))
                     wp))
              ((eq :tags k)
               (setq wp (org-blog-post-to-wp-add-taxonomy wp "post_tag" v)))
              ((eq :title k)
               (push (cons "post_title" (or v "No Title")) wp))
              ((assq k org-blog-wp-alist)
               (push (cons (cdr (assq k org-blog-wp-alist)) v) wp)))
      wp)))

That seemed like just moving the food around on the plate—hardly worth it. However, I then realized a further change—that everything could go in the cond, and the if could be eliminated entirely. This produces:

(lambda (wp new)
  (let ((k (car new))
        (v (cdr new)))
    (cond ((eq v nil)
           wp)
          ((eq :category k)
           (setq wp (org-blog-post-to-wp-add-taxonomy wp "category" v)))
          ((eq :date k)
           ;; Convert to GMT by adding seconds offset
           (push (cons "post_date_gmt" (list :datetime
                                             (time-add (car v)
                                                       (seconds-to-time (- (car (current-time-zone)))))))
                 wp))
          ((eq :tags k)
           (setq wp (org-blog-post-to-wp-add-taxonomy wp "post_tag" v)))
          ((eq :title k)
           (push (cons "post_title" (or v "No Title")) wp))
          ((assq k org-blog-wp-alist)
           (push (cons (cdr (assq k org-blog-wp-alist)) v) wp)))))

While it’s not really shorter, I think it’s clearer: all inspection of the item we’re working with is done in the cond, rather than in a combination of an initial conditional (our if / when) and then the cond.

There’s one last little bit of cleanup that’s worth doing. It circles back around to the initial observation about push. That is, if we’re just using the return value of push, which is really the return value of the implicit setq, why do we need push / setq at all? They’re just noise and extra operations. So we can transform any push to a simple cons, and eliminate our setq calls entirely.

(lambda (wp new)
  (let ((k (car new))
        (v (cdr new)))
    (cond ((eq v nil)
           wp)
          ((eq :category k)
           (org-blog-post-to-wp-add-taxonomy wp "category" v))
          ((eq :date k)
           ;; Convert to GMT by adding seconds offset
           (cons (cons "post_date_gmt" (list :datetime
                                             (time-add (car v)
                                                       (seconds-to-time (- (car (current-time-zone)))))))
                 wp))
          ((eq :tags k)
           (org-blog-post-to-wp-add-taxonomy wp "post_tag" v))
          ((eq :title k)
           (cons (cons "post_title" (or v "No Title")) wp))
          ((assq k org-blog-wp-alist)
           (cons (cons (cdr (assq k org-blog-wp-alist)) v) wp)))))

Again, it’s not dramatically shorter, but it is much more focused: it is doing what it intends with as little extra fuss as possible. And, again, it’s more about transformations than imperative operations. Where we used to be setting variables and pushing things onto them, now we’re running functions and returning their output, or creating new items by prepending a cons cell to an existing structure.

Even more interestingly, once I refactored this function, I started looking at a number of other places where I was able to make the same sort of changes—moving from a more imperative style to one where I was more willing to trust the output of my functions to carry me through.

And now to turn this rabbit into a hat…

It’s probably a consequence of my recent study of functional programming—first with Haskell, and then, to a lesser extent, with Emacs Lisp itself—that I structured most of the important bits of org-blog as data transformations.

Really, one of the fundamental insights I gained while teaching myself Haskell was that it is extremely empowering to be able to know that when you call a function, nothing should be altered—that immutability really does increase your confidence that your program is doing what you think.

Now Emacs Lisp doesn’t have Haskell’s immutability or purity or any of those things—in fact, I have been a little dismayed to discover how many basic operations in Emacs Lisp mutate their arguments in one way or another—but it has most of the facilities you need to be able to program in that fashion.
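The classic two-argument sort is a good example of the kind of mutation I mean: it is documented as destructive, so the usual defensive move is to sort a copy. A small sketch (the variable names are made up):

```elisp
;; Sorting a copy leaves the original list alone.
(let* ((original (list 3 1 2))
       (sorted (sort (copy-sequence original) #'<)))
  (list original sorted))
;; => ((3 1 2) (1 2 3))
```

If you sort the original directly instead, the only value you can rely on afterwards is the one sort returns.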

So, with that in mind, the first step in actually posting things to a WordPress blog is going to be to get our post structure into a format that we can feed to WordPress’s XML-RPC interface.

Step one in that process is to define a correspondence of some sort between a property in a post structure and a name in a WordPress structure:

(defconst org-blog-wp-alist
  (list (cons :category "category")
        (cons :content "post_content")
        (cons :date "post_date_gmt")
        (cons :excerpt "post_excerpt")
        (cons :id "post_id")
        (cons :link "link")
        (cons :name "post_name")
        (cons :parent "post_parent")
        (cons :status "post_status")
        (cons :tags "post_tag")
        (cons :title "post_title")
        (cons :type "post_type")))

You will have probably realized by now that I go back and forth between quoting things like this and constructing them with list and cons. I will go back and make things consistent eventually.

So this table is used by the actual transformation function, which looks like this:

(defun org-blog-post-to-wp (post)
  "Transform a post into a structure for submitting to WordPress.

This is largely about mapping tag names, though the handling of
`category' and `tags' is a little more complex as the WordPress API
now groups them as `taxonomies', and requires a hierarchical
structure to differentiate them.

For convenience in testing and inspection, the resulting alist is
sorted."
  (sort
   (reduce
    '(lambda (wp new)
       (let ((k (car new))
             (v (cdr new)))
         (when v
           (cond ((eq :category k)
                  (setq wp (org-blog-post-to-wp-add-taxonomy wp "category" v)))
                 ((eq :date k)
                  ;; Convert to GMT by adding seconds offset
                  (push (cons "post_date_gmt" (list :datetime
                                                    (time-add (car v)
                                                              (seconds-to-time (- (car (current-time-zone)))))))
                        wp))
                 ((eq :tags k)
                  (setq wp (org-blog-post-to-wp-add-taxonomy wp "post_tag" v)))
                 ((eq :title k)
                  (push (cons "post_title" (or v "No Title")) wp))
                 ((assq k org-blog-wp-alist)
                  (push (cons (cdr (assq k org-blog-wp-alist)) v) wp))
                 )))
       wp)
    post :initial-value nil)
   '(lambda (a b)
      (string< (car a) (car b)))))

Conceptually, this is simple—we’re running reduce over the list of fields in the post structure, and “accumulating” them into the wp parameter we declare for our lambda.

The unfortunate complexity comes from the fact that while many fields can simply be copied over, a few require significant munging (:date is the biggie, though we default our :title as well—which should probably happen in the post-to-blog transformation now that I think on it), and the :category and :tags fields require a significant chunk of code to handle because instead of having separate fields for each in its XML-RPC interface, WordPress places the two fields under a higher-level structure called taxonomies—and we don’t want to have an empty entry if a post is lacking either or both of the fields.

Thus we have the org-blog-post-to-wp-add-taxonomy function:

(defun org-blog-post-to-wp-add-taxonomy (wp taxonomy entries)
  "Handle adding taxonomy items to a WordPress struct.

The fiddly part is making sure that the sublists are sorted, for
convenience in testing and inspection."
  (let* ((terms (assoc "terms_names" wp))
         (existing (cdr terms))
         (struct (cons taxonomy entries)))
    (if existing
        (progn
          (push struct existing)
          (setcdr terms (sort
                         existing
                         '(lambda (a b)
                            (string< (car a) (car b))))))
      (push (list "terms_names" struct) wp))
    wp))

Simply put, if the terms_names field already exists, we have to add our new “taxonomy” entry to it, but if it doesn’t exist, we need to create it. This is fiddlier than I would like it to be. I actually posted a question on StackOverflow to see if there was a cleaner way; the consensus was that although there were other strategies, there wasn’t anything a whole lot cleaner.

Now this is the place I put on my mea culpa hat, because as I re-examine these two functions, I see one thing I should be doing to make it cleaner—attaching the transformation functions for each field to their entries in the org-blog-wp-alist. This would have several beneficial effects: the transformations would be closely associated with their related fields (thus easily kept up to date) and our reduce becomes much less cluttered—just a function invocation per field. Also, instead of doing all those setq invocations, I should let the result of push (and other functions) simply be the result of the cond, which becomes the result of the lambda, which is the same as having wp be the last sexp in the lambda.
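To make the idea concrete, here is a rough sketch of what a table-driven version might look like. Everything in it is hypothetical: the my-org-blog-* names, the three-element entry layout, and the downcase transform on :status are placeholders for illustration, not what org-blog actually does, and the taxonomy and date handling are left out entirely.

```elisp
(require 'cl-lib)

;; Hypothetical table: each entry is (POST-KEY WP-NAME TRANSFORM).
(defconst my-org-blog-field-table
  (list (list :excerpt "post_excerpt" #'identity)
        (list :status  "post_status"  #'downcase)
        (list :title   "post_title"   #'identity)))

(defun my-org-blog-post-to-wp (post)
  "Map POST to a WordPress-style alist using `my-org-blog-field-table'.
Fields with no value, or with no table entry, are skipped."
  (cl-reduce
   (lambda (wp new)
     (let ((entry (assq (car new) my-org-blog-field-table))
           (v (cdr new)))
       (if (and v entry)
           ;; Rename the key and run the field's own transform on the value.
           (cons (cons (nth 1 entry) (funcall (nth 2 entry) v)) wp)
         wp)))
   post :initial-value nil))

(my-org-blog-post-to-wp
 '((:title . "Hello") (:status . "Publish") (:blog . "b")))
;; => (("post_status" . "publish") ("post_title" . "Hello"))
```

The reducer no longer knows anything about individual fields; adding a field means adding a row to the table.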

But before I make that change, I want to add a test to make sure that I don’t break anything:

(ert-deftest ob-test-posts-and-wp ()
    "Transfer from buffers to posts and back again"
    (let ((post1-struct '((:blog . "t1b")
                          (:category "t1c1" "t1c2")
                          (:content . "\n<p>Test 1 Content\n</p>")
                          (:date (20738 4432 0 0))
                          (:excerpt . "t1e")
                          (:id . "1")
                          (:link . "http://example.com/")
                          (:name . "t1n")
                          (:parent . "0")
                          (:status . "publish")
                          (:tags "t1k1" "t1k2" "t1k3")
                          (:title . "Test 1 Title")
                          (:type . "post")))
          (post1-wp-input '(("link" . "http://example.com/")
                            ("post_content" . "\n<p>Test 1 Content\n</p>")
                            ("post_date_gmt" :datetime (20738 18832 0 0))
                            ("post_excerpt" . "t1e")
                            ("post_id" . "1")
                            ("post_name" . "t1n")
                            ("post_parent" . "0")
                            ("post_status" . "publish")
                            ("post_title" . "Test 1 Title")
                            ("post_type" . "post")
                            ("terms_names"
                             ("category" "t1c1" "t1c2")
                             ("post_tag" "t1k1" "t1k2" "t1k3")))))
        (should (equal (org-blog-post-to-wp post1-struct) post1-wp-input))))

And, with that done, in fact, I’m not going to make any changes right at the moment—I really want to figure out a sensible way to unify all of my post-transformation tables in one place, which is a somewhat more ambitious change. So for the moment I’ll note the desire in a FIXME comment, and move on. Tomorrow we’ll look at transforming from the structure WordPress outputs back into a post.

Creating a new post (redux)

Now that we’ve got a way to ask the user to choose a blog, as well as ways to extract a post structure from a buffer and merge a post structure back into a buffer, we can actually write our function for creating a new post.

It turns out it’s pretty simple:

(defun org-blog-new ()
  "Create a new buffer primed for a blog entry.

Get a name for the blog from the user, and create a new buffer
with the name of the blog, a timestamp reflecting the current
time, and a number of other empty fields that the user may wish
to fill in."
  (interactive)
  (let ((name (org-blog-get-name)))
    (switch-to-buffer (generate-new-buffer (format "*org-blog/%s*" name)))
    (org-mode)
    (org-blog-mode)
    (org-blog-buffer-merge-post `((:blog . ,name)
                                  (:category . "")
                                  (:date ,(current-time))
                                  (:excerpt . "")
                                  (:format . "post")
                                  (:status . "publish")
                                  (:tags . "")
                                  (:title . "")
                                  (:type . "post")))))

The docstring lays it all out, really—get a blog name, set up a buffer, turn on org-mode and org-blog-mode and merge in a basically-empty post structure.

Now some might wonder why I don’t just slap a default chunk of string content into the buffer, rather than merging a post structure that’s got a bunch of empty fields into the buffer, as it would probably be less cumbersome that way.

The answer is that down the road I would like for people to be able to set some of the fields in the default post structure in the definition of the blog, so we have a standard way to set initial defaults on a per-blog basis.

This might seem like overkill, but I’m thinking of this scenario: say you have a blog that’s aggregated on Planet Foo. However, in addition to your interest in Foo, you are also a follower of US politics—and when you post about that, there’s always people getting riled up. Most people in that situation will ask to have the planet pull a feed from a particular category or tag, so that only posts about Foo go to the planet. If you’re like me, though, you may forget to tag all Foo posts appropriately, and so some content would get missed. With this mechanism in place, it would be possible for you to have a “foo” blog definition that included (:category . "Foo") so that when you did org-blog-new, you could choose the “foo” blog and get a preloaded template that will make sure your post goes to the right place.

The code to do that obviously isn’t there yet—hence our hard-coded post structure—but I don’t think it will be too hard to implement down the road, and I think it could be a compelling feature.
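One way it could work, sketched under some loud assumptions: my-org-blog-default-post is a hypothetical helper, the blog-defaults argument stands in for whatever a blog definition would eventually carry, and the field list is trimmed for brevity. It leans on the same first-match-wins alist behaviour as before.

```elisp
;; Hypothetical helper: per-blog defaults go in front of the generic
;; ones, so alist lookups see them first.
(defun my-org-blog-default-post (name blog-defaults)
  "Return a new-post alist for blog NAME, with BLOG-DEFAULTS taking priority."
  (append blog-defaults
          `((:blog . ,name)
            (:category . "")
            (:status . "publish")
            (:title . ""))))

(let ((post (my-org-blog-default-post "foo" '((:category . "Foo")))))
  (list (cdr (assq :category post))   ; per-blog default wins
        (cdr (assq :status post))))   ; generic default still visible
;; => ("Foo" "publish")
```

Whether org-blog-buffer-merge-post would be happy with the shadowed duplicate entry is not something this sketch settles; a real implementation might need to drop the overridden keys first.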

Now, with all that out of the way, let’s look at one of the tests:

  (ert-deftest ob-test-org-blog-new-from-alist ()
    "Test creating a new blog post with an alist"
    (with-mock
     (stub current-time => '(20738 4432))
     (let ((org-blog-alist '(("bar")))
           (post-string "\
#+POST_BLOG: bar
#+POST_CATEGORY: 
#+DATE: [2013-01-25 Fri 00:00]
#+DESCRIPTION: 
#+POST_STATUS: publish
#+KEYWORDS: 
#+TITLE: 
#+POST_TYPE: post
"))
       (org-blog-new)
       (should (string= (org-no-properties (buffer-string)) post-string)))))

As a practical matter, it is simplest to stub out current-time so that we can have a predictable time string (in the actual source there are also variations that test with stubs for completing-read, just like in our org-blog-get-name tests). We call org-blog-new, and examine the resulting content for sanity. Pretty simple.

So, that’s it for setting up for a new post. The next installment will start to tackle the question of posting to a blog.