Creating a relationship between a buffer and a post (part 2)

Last time, we looked at what was involved in extracting the information we wanted from our buffer and putting it into a post structure. Today we’ll look at the complementary operation of merging the data from a post into a buffer.

This is a very important part of the process—there are certain things that we must record, like the id that the post is assigned by the blogging software, if we’re going to be able to really maintain the blog from within org-blog. While it would be very simple to skip this part, imagine if, once you had posted an entry for the first time, you had to log into your site in order to edit it? That would be a failing user experience. So, merge we must.

Really, though, the process is fairly straightforward. First, we want to establish a mapping between the key used within our post structure and the property name (in the org-mode sense) that is used within the buffer:

(defconst mapping (list (cons :blog "POST_BLOG")
                        (cons :category "POST_CATEGORY")
                        (cons :date "DATE")
                        (cons :excerpt "DESCRIPTION")
                        (cons :id "POST_ID")
                        (cons :link "POST_LINK")
                        (cons :name "POST_NAME")
                        (cons :parent "POST_PARENT")
                        (cons :status "POST_STATUS")
                        (cons :tags "KEYWORDS")
                        (cons :title "TITLE")
                        (cons :type "POST_TYPE")))

Looking at this again with fresh eyes makes me realize that this data structure is going to get refactored before too long. As an example of why, let’s look back at a piece of yesterday’s code:

'(("POST_BLOG" :blog)
  ("POST_CATEGORY" :category)
  ("POST_ID" :id)
  ("POST_LINK" :link)
  ("POST_NAME" :name)
  ("POST_PARENT" :parent)
  ("POST_STATUS" :status)
  ("POST_TYPE" :type))

The fact that these two bits of structure are largely redundant should be pretty obvious. And I do actually have a plan for refactoring this (and a few other things) in a way that I think will clean up a lot of code. But I want to get the first version out before I worry about that too much—what I have is working, and I’d rather have people using it.

Anyway, that mapping structure is used in the function that actually updates the buffer. The idea is pretty simple, really: pull out the current post values in the buffer, then iterate over the new values, making sure they’re formatted correctly, then examining the current value. If it’s nil, insert the new value at the head of the buffer; otherwise compare with the existing value, and when they differ, find the pertinent property definition and update it.
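The insert-or-update step described above can be tried in isolation. What follows is only a sketch under assumptions: my-set-keyword is a hypothetical helper, not part of org-blog, and it uses a hand-written regexp rather than org-make-options-regexp.

```elisp
;; Hypothetical helper, not part of org-blog: set one "#+NAME: value" line.
(defun my-set-keyword (name value)
  "Set the #+NAME: line in the current buffer to VALUE.
Insert a new line at the top of the buffer if no such line exists."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search t)
          (line (concat "#+" name ": " value)))
      (if (re-search-forward
           (concat "^#\\+" (regexp-quote name) ":.*$") nil t)
          ;; The keyword is already present: rewrite the line in place.
          (replace-match line t t)
        ;; The search failed, so point is still at the top: insert there.
        (insert line "\n")))))
```

Calling it twice with the same arguments leaves the buffer unchanged the second time, which is the property the round-trip test further down relies on.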

To make sure that stuff goes in with a semblance of order, we sort a copy of the list before we start iterating—though, in truth, this doesn’t work as well as I’d like because we’re sorting on the property name, but inserting the property string, which may have “POST_” included. But that’s for another refactoring.
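To make the ordering quirk concrete, here is a small sketch using three entries from the mapping. The helper names are hypothetical and exist only for this illustration.

```elisp
;; Hypothetical helpers, not part of org-blog: they only show how the two
;; sort orders differ for a few entries of the mapping.
(defun my-keywords-sorted-by-key (pairs)
  "Return the keyword strings of PAIRS, ordered by post key."
  (mapcar #'cdr
          (sort (copy-sequence pairs)
                (lambda (a b) (string< (car a) (car b))))))

(defun my-keywords-sorted-by-keyword (pairs)
  "Return the keyword strings of PAIRS, ordered by the strings themselves."
  (sort (mapcar #'cdr pairs) #'string<))

(defconst my-sample-pairs
  '((:title . "TITLE") (:status . "POST_STATUS") (:tags . "KEYWORDS"))
  "A few (KEY . KEYWORD) pairs copied from the mapping.")

(my-keywords-sorted-by-key my-sample-pairs)
;; => ("POST_STATUS" "KEYWORDS" "TITLE")
(my-keywords-sorted-by-keyword my-sample-pairs)
;; => ("KEYWORDS" "POST_STATUS" "TITLE")
```

Sorting on the key puts :status ahead of :tags, so POST_STATUS lands before KEYWORDS in the buffer.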

(defun org-blog-buffer-merge-post (merge)
  "Merge a post into a buffer.

Given a post structure (presumably returned from the server),
update the buffer to reflect the values it contains."
  (save-excursion
    (save-restriction
      ;; Get the current values
      (let ((current (org-blog-buffer-extract-post)))     ; (extract)
        (mapc                                             ; (iterate)
         (lambda (item)
           (let ((k (car item))
                 (v (cdr item))
                 val existing)
             (when (cdr (assq k mapping))
               (setq val (cond ((eq v nil)                ; (format)
                                (print "setting val to nil")
                                nil)
                               ((eq k :date)
                                (format-time-string "[%Y-%m-%d %a %H:%M]" (car v)))
                               ((listp v)
                                (mapconcat 'identity v ", "))
                               ((stringp v) 
                                v)
                               (t
                                "default")))
               (goto-char (point-min))
               ;; (print (format "Comparison for %s is %s against %s" k v (cdr (assq k current))))
               (cond
                ;; Inserting a new keyword
                ((eq (cdr (assq k current)) nil)          ; (new)
                 (when val
                   (insert (concat "#+" (cdr (assq k mapping)) ": " val "\n"))))
                ;; Updating an existing keyword
                ((not (equal (cdr (assq k current)) val)) ; (update)
                 (let ((re (org-make-options-regexp (list (cdr (assq k mapping))) nil))
                       (case-fold-search t))
                   (when (re-search-forward re nil t)
                     (replace-match (concat "#+" (cdr (assq k mapping)) ": " val) t t))))))))
         ;; Reverse sort fields to insert alphabetically
         (sort                                            ; (sort)
          (copy-alist merge)
          (lambda (a b)
             (string< (car b) (car a)))))))))

When you come down to it, the process really is simple enough. The refactoring I envision is to create a table of all our post properties and the processes that need to be run to convert from post to buffer and back again, so that this routine becomes much more straightforward—map over each item, apply its formatter, see if it’s new and/or matches and behave appropriately. This could also be used in the extraction routine.
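As a rough idea of what such a table might look like, here is a minimal sketch. All of the names are hypothetical, only three fields are shown, and it covers just the post-to-buffer direction; the eventual refactoring may well look different.

```elisp
;; Hypothetical sketch of a table-driven formatter; not the actual refactoring.
(defun my-org-blog-join (items)
  "Join the list of strings ITEMS with commas."
  (mapconcat #'identity items ", "))

(defconst my-org-blog-fields
  '((:category "POST_CATEGORY" my-org-blog-join)
    (:tags     "KEYWORDS"      my-org-blog-join)
    (:title    "TITLE"         identity))
  "Each entry is (KEY KEYWORD FORMATTER) for one post field.")

(defun my-org-blog-format-field (key value)
  "Return the \"#+KEYWORD: value\" line for KEY and VALUE.
Return nil when KEY is not in the table or VALUE is nil."
  (let ((spec (assq key my-org-blog-fields)))
    (when (and spec value)
      (format "#+%s: %s" (nth 1 spec) (funcall (nth 2 spec) value)))))

(my-org-blog-format-field :tags '("t2k1" "t2k2"))
;; => "#+KEYWORDS: t2k1, t2k2"
```

A second function slot per entry could hold the buffer-to-post conversion, so the extraction routine could share the same table.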

Anyway, I’m only going to show the last test, where we actually round-trip our structure. We create a temporary buffer, merge in a post structure, then extract a post from the resulting buffer, compare it against what we expect, then merge it back in a second time, and make sure that it matches again.

(ert-deftest ob-test-merge-round-trip ()
  "Try merging a full post into a full buffer, and make sure
you get the same thing out."
  (with-temp-buffer
    (let ((post-string "#+POST_BLOG: t2b
#+POST_CATEGORY: t2c1, t2c2
#+DATE: [2013-01-25 Fri 00:00]
#+DESCRIPTION: t2e
#+POST_ID: 1
#+POST_LINK: http://example.com/
#+POST_NAME: t2n
#+POST_PARENT: 0
#+POST_STATUS: publish
#+KEYWORDS: t2k1, t2k2, t2k3
#+TITLE: Test 2 Title
#+POST_TYPE: post
")
          (post-struct '((:blog . "t2b")
                         (:category "t2c1" "t2c2")
                         (:content . "\n")
                         (:date (20738 4432))
                         (:excerpt . "t2e")
                         (:id . "1")
                         (:link . "http://example.com/")
                         (:name . "t2n")
                         (:parent . "0")
                         (:status . "publish")
                         (:tags "t2k1" "t2k2" "t2k3")
                         (:title . "Test 2 Title")
                         (:type . "post"))))
      (org-blog-buffer-merge-post post-struct)
      (should (equal (buffer-string) post-string))
      (should (equal (org-blog-buffer-extract-post) post-struct))
      (org-blog-buffer-merge-post post-struct)
      (should (equal (buffer-string) post-string)))))

Creating a relationship between a buffer and a post (part 1)

In order to support multiple blogging back-ends, it is necessary that we work at some level of abstraction. One piece of blog software’s notion of tags isn’t necessarily going to line up with another’s, etc. So we introduce the notion of a post:

A post is an alist consisting of the fields:

:blog (#+POST_BLOG)
    A string naming an entry in org-blog-alist
:category (#+POST_CATEGORY)
    A list of strings naming categories to which the post belongs
:content (body after export)
    A string containing HTML-formatted content
:date (#+DATE)
    A date and time for the post
:excerpt (#+DESCRIPTION)
    A string containing an optional excerpt of the post
:id (#+POST_ID)
    A string containing a unique ID (generally numeric) for the post
:link (#+POST_LINK)
    A string containing a link to the permanent location of the post
:name (#+POST_NAME)
    A string containing the canonical name for the post
:parent (#+POST_PARENT)
    A string containing a unique ID (generally numeric) for the parent of the post
:status (#+POST_STATUS)
    A string denoting the status (`draft’, `published’) of the post
:tags (#+KEYWORDS)
    A list of strings representing the names of tags
:title (#+TITLE)
    A string containing the title of the post
:type (#+POST_TYPE)
    A string containing an optional format for the post

It’s not absolutely essential that every field be present; parent and excerpt, for instance, are pretty thoroughly optional. Some fields are really intended to be filled in by the blogging software, like id and link. Whenever it seemed to make sense, I used a standard org-mode property name, so :date is derived from #+DATE, for instance. Whenever I “make up” a property name, I keep it in the #+POST_ namespace, to try and avoid collisions.
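For anyone unfamiliar with alists, here is a small, hypothetical post and a few lookups. The values are made up; only the shape matters. Note that single-valued fields are dotted pairs, while multi-valued fields such as :tags carry a list.

```elisp
;; A hypothetical, partial post; the values are invented for illustration.
(defconst my-example-post
  '((:blog . "example")
    (:tags "emacs" "org-mode")
    (:title . "Hello, world"))
  "A partial post structure used only for this example.")

(cdr (assq :title my-example-post))  ; => "Hello, world"
(cdr (assq :tags my-example-post))   ; => ("emacs" "org-mode")
(cdr (assq :id my-example-post))     ; => nil, the field is absent
```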

So, given a buffer, how do we get to a post? The answer is: the org-mode exporter.

Now the code I’m presenting here works with org-mode < 8.0. I’m hoping, once I’ve gotten this initial round of development all worked out, that I’ll be able to convert over to using the new exporter interface introduced in 8.0, which, based on my light reading, should be somewhat nicer to work with. We’ll probably end up with our own org-blog-post export format that will work in a fairly standard fashion. But that’s for later. For now:

(defun org-blog-buffer-extract-post ()
  "Transform a buffer into a post.

We do as little processing as possible on individual items, to
retain the maximum flexibility for further transformation."
  (save-excursion
    (save-restriction
      (let ((org-export-inbuffer-options-extra '(("POST_BLOG" :blog)
                                                 ("POST_CATEGORY" :category)
                                                 ("POST_ID" :id)
                                                 ("POST_LINK" :link)
                                                 ("POST_NAME" :name)
                                                 ("POST_PARENT" :parent)
                                                 ("POST_STATUS" :status)
                                                 ("POST_TYPE" :type)))
            (org-export-date-timestamp-format "%Y%m%dT%T%z")
            (org-export-with-preserve-breaks nil)
            (org-export-with-priority nil)
            (org-export-with-section-numbers nil)
            (org-export-with-sub-superscripts nil)
            (org-export-with-tags nil)
            (org-export-with-toc nil)
            (org-export-with-todo-keywords nil))
        (sort
         (list (cons :blog (property-trim :blog))
               (cons :category (property-split :category))
               (cons :date (let ((timestamp (property-trim :date)))
                             (when timestamp
                               (list (date-to-time timestamp)))))
               (cons :excerpt (property-trim :description))
               (cons :id (property-trim :id))
               (cons :link (property-trim :link))
               (cons :name (property-trim :name))
               (cons :parent (property-trim :parent))
               (cons :status (property-trim :status))
               (cons :tags (property-split :keywords))
               (cons :title (property-trim :title))
               (cons :type (property-trim :type))
               (cons :content (org-no-properties (condition-case nil
                                                     (org-export-as-html nil nil nil 'string t nil)
                                                   (wrong-number-of-arguments
                                                    (org-export-as-html nil nil 'string t nil))))))
         (lambda (a b)
            (string< (car a) (car b))))))))

org-blog-buffer-extract-post starts off with what may actually be a bit of superfluous code—I know that org-export-as-html calls save-excursion, so it might not actually be necessary for us to do it. But I’d rather be safe. The same is true for the save-restriction.

We then make sure that the exporter will pick up our custom properties by adding them to org-export-inbuffer-options-extra, and we set a number of items that describe things about what the export will end up including and/or how particular items will look. In fact, these should all be override-able for an individual post by using the #+OPTIONS property—these are just the defaults that I think are sane.

Then the magic happens.

If you’re not used to a very functional style of programming, this code may be a little confusing—all the action is really happening down at the bottom of the function, where org-export-as-html is being called. In fact, if I’m truthful, I’m vaguely amazed it works at all.

See, when org-export-as-html gets run, in addition to returning the document transformed into HTML, it gathers a bunch of meta-data, which we read back through org-infile-export-plist. Our function property-trim is a wrapper for pulling values out of that list and removing any leading spaces:

(defun property-trim (k)
  "Get a property value trimmed of leading spaces."
  (let ((v (plist-get (org-infile-export-plist) k)))
    (when v
      (replace-regexp-in-string "^[[:space:]]+" "" v))))

We run that across most of the property items to get a good value. We also have a variant, property-split, that will split a value on commas, returning a list:

(defun property-split (k)
  "Get a property value trimmed of leading spaces and split on commas."
  (let ((v (property-trim k)))
    (when v
      (split-string v "\\( *, *\\)" t))))

This is used in possibly multi-valued fields, as for tags or categories.
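To see the trimming and splitting without involving the exporter, here is a sketch of the same two steps applied to a plain string. The my- functions are hypothetical stand-ins that take the string directly instead of a property key.

```elisp
;; Hypothetical stand-ins for property-trim and property-split that operate
;; on a string argument rather than looking up a property.
(defun my-trim-leading (s)
  "Return S with leading whitespace removed."
  (replace-regexp-in-string "^[[:space:]]+" "" s))

(defun my-split-commas (s)
  "Trim S, then split it on commas, ignoring spaces around each comma."
  (split-string (my-trim-leading s) " *, *" t))

(my-split-commas " t1k1, t1k2 ,t1k3")
;; => ("t1k1" "t1k2" "t1k3")
```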

If you look closely, you can see org-export-as-html getting run in order to provide the value for the :content field. But looking at the code again—and this is some of the first code I wrote—I don’t know how that is guaranteed to happen before everything else starts looking at the property list items.
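One thing that can be checked independently of org-mode: Emacs Lisp evaluates the arguments of a function call from left to right. The sketch below (with a hypothetical function name) records the order in which two arguments to list are evaluated.

```elisp
;; Hypothetical demonstration of left-to-right argument evaluation.
(defun my-evaluation-order ()
  "Return the order in which two arguments to `list' were evaluated."
  (let (order)
    (list (progn (push 'first order) 1)
          (progn (push 'second order) 2))
    (nreverse order)))

(my-evaluation-order)
;; => (first second)
```

By the same rule, the :content entry, being the last argument to list, would be computed after all the property lookups. That suggests the lookups get their values without waiting for the export to run, though I have not verified how org-infile-export-plist behaves across org-mode versions.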

Perhaps it will all become clearer (and less side-effect-y) with the new exporter.

Anyway, time to write a test or two. We’ll begin by extracting a post structure from an empty buffer:

(ert-deftest ob-test-extract-from-empty ()
  "Try extracting a post from an empty buffer."
  (with-temp-buffer
    (should (equal (org-blog-buffer-extract-post) '((:blog)
                                                    (:category)
                                                    (:content . "\n")
                                                    (:date)
                                                    (:excerpt)
                                                    (:id)
                                                    (:link)
                                                    (:name)
                                                    (:parent)
                                                    (:status)
                                                    (:tags)
                                                    (:title)
                                                    (:type))))))

As we would expect, we end up with an alist that is basically devoid of values, except for the content, which is pretty darn bare. In fact, down the road, we will probably do some more massaging of content that will change even that, but we test against what we have now.

Then we build a test that actually extracts some content:

(ert-deftest ob-test-extract-from-buffer ()
  "Try extracting a post from a buffer with content."
  (with-temp-buffer
    (insert "\
#+POST_BLOG: t1b
#+POST_CATEGORY: t1c1, t1c2
#+DATE: [2013-01-25 Fri 00:00]
#+DESCRIPTION: t1e
#+POST_ID: 1
#+POST_LINK: http://example.com/
#+POST_NAME: t1n
#+KEYWORDS: t1k1, t1k2, t1k3
#+TITLE: Test 1 Title
#+POST_TYPE: post

Just a little bit of content.")
    (should (equal (org-blog-buffer-extract-post) '((:blog . "t1b")
                                                    (:category "t1c1" "t1c2")
                                                    (:content . "\n\n<p>Just a little bit of content.\n</p>")
                                                    (:date (20738 4432))
                                                    (:excerpt . "t1e")
                                                    (:id . "1")
                                                    (:link . "http://example.com/")
                                                    (:name . "t1n")
                                                    (:parent)
                                                    (:status)
                                                    (:tags "t1k1" "t1k2" "t1k3")
                                                    (:title . "Test 1 Title")
                                                    (:type . "post"))))))

And that’s it for now. Next time we’ll look at the process of merging a post structure back into a buffer. Once we have our two-way transformation capability, the world is our mollusk. Well, once we have that and a little XML-RPC code.

Creating a new post

Picking up from where we left off yesterday, let’s think about what we want our work-flow to be.

In terms of how the user uses org-blog, I want it to get out of the way as much as possible, so I’m going to try and keep the majority of the user’s interaction focused on two actions: starting a new post and saving it.

I have a couple of different blogs I post to, so I also want to make sure that org-blog seamlessly supports managing content for more than one blog.

If we’re not going to have to type everything that describes a blog in each time we create or save a post, we have to have that configured somewhere. So right off the bat, we need to create a place to put our config info:

(defcustom org-blog-alist nil
  "An alist for specifying blog information.

There are a number of parameters.  Some day I will enumerate
them.")

OK, the docstring is a little bit of a cop-out, but we don’t yet know what parameters will be pertinent. With a place to list the blogs we work with regularly, let’s look at creating a new post.

The first step will be to figure out what blog the user wants to write the post for. If there are no blogs configured, we can accept any name the user wants to give us. If there’s only one blog configured, we can reasonably assume that’s it. If there’s more than one, we should prompt with the available choices.

(defun org-blog-get-name (&optional post)
  "Get a name of a blog, perhaps working from a post.

If we're given a post structure, we will extract the blog name from it.
Otherwise, if there's only one entry in the `org-blog-alist', we
will use that entry by default, but will accept anything, as long
as the user confirms it, and if they don't enter anything at all,
we default to unknown."
  (or (cdr (assoc :blog post))
      (and (equal (length org-blog-alist) 1)
           (caar org-blog-alist))
      (empty-string-is-nil (completing-read
                            "Blog to post to: "
                            (mapcar 'car org-blog-alist) nil 'confirm))
      "unknown"))

You might be confused about the optional post parameter. My crystal ball tells me that we will also want this function to be able to tell us what blog is associated with an already-existing post, so we have the option of passing in a post structure that will be consulted for a :blog entry, which we will prefer to anything else. Other than that the code is pretty much what I laid out above.

There is a reference to a small supplementary function that I was a little surprised I needed—it turns out that completing-read will return the empty string if the user just hits enter. This doesn’t get us any useful information, so I wrote this short function to coerce the empty string to nil, so the or will fall through in that event:

(defun empty-string-is-nil (string)
  "Return any string except the empty string, which is coerced to nil."
  (unless (= 0 (length string))
    string))

Now to write some tests. We want to test getting the blog name in all the ways that are available. For the last two tests, where we’re testing code paths that depend on the output of completing-read, we take advantage of the el-mock library to sub in a version that returns a constant that represents what we want to hear.

(ert-deftest ob-test-get-name-from-blog ()
  "Test getting the blog name from a blog spec"
  (should (string= (org-blog-get-name '((:blog . "foo"))) "foo")))
(ert-deftest ob-test-get-name-from-alist ()
  "Test getting the blog name from the alist"
  (let ((org-blog-alist '(("bar"))))
    (should (string= (org-blog-get-name) "bar"))))
(ert-deftest ob-test-get-name-from-completing-read ()
  "Test getting the blog name from completing-read"
  (with-mock
   (stub completing-read => "baz")
   (should (string= (org-blog-get-name) "baz"))))
(ert-deftest ob-test-get-name-from-default ()
  "Test getting the blog name from default"
  (with-mock
   (stub completing-read => "")
   (should (string= (org-blog-get-name) "unknown"))))

The design and implementation of org-blog

So, after some quantity of hacking, I have produced a first, basic, working version of org-blog (currently available on the dirty branch on github).

However, its git history reflects all the twists and turns and dead-ends and unfortunate omnibus-commits I did along the way. I really don’t want to have that be the basis for future development, etc., so I’m going to start filtering things into the master branch on github in logical, digestible, clean chunks.

Having made that choice, I figured, “What the hell?”, I might as well get some blog content out of it as well.

So, for the next however long it takes, I’ll be refactoring the code as it stands into logical sets of changes, and posting articles discussing them, as well as all the other issues surrounding the implementation.

My ambition, of course, is to get a post written each day, but don’t hold me to that.

So, to start us out, here’s the beginning of our mode:

(provide 'org-blog)

(define-minor-mode org-blog-mode
  "Toggle org-blog mode.

With no argument, the mode is toggled on/off.
Non-nil argument turns mode on.
Nil argument turns mode off.

Commands:

\\{org-blog-mode-map}"
  :init-value nil
  :lighter " org-blog")

This is about as minimalist a minor mode as you can get—it does nothing at all, and even uses the define-minor-mode macro to avoid doing most of the work itself. But it’s a start.

One other thing I really believe in, though, is testing. So we’re going to try and keep everything reasonably tested as we go along. So here’s the test to make sure that enabling org-blog-mode goes as expected:

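(ert-deftest ob-test-enable-org-blog-mode ()
  "Test turning on the org-blog minor mode"
  (with-temp-buffer
    ;; Pass 1 to enable explicitly instead of toggling.
    (org-blog-mode 1)
    ;; Without `should', the test would pass whatever the result.
    (should org-blog-mode)))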

That’s it until next time.

The return of ‘Do You Even Lisp?’

I didn’t expect to drop off the face of the planet for nearly two
months.

But I decided to go off and learn lisp “in my spare time”—which took
a while. And then I decided that the best way to do that was to write
a minor mode for blogging to complement org-mode. And then I decided
that to give myself incentive, I wouldn’t do any blogging until I had
it working.

It took a little longer than I had hoped—in part because I think I
was overly ambitious; I was going to learn a new language well enough
to write an extension in it, and make it as functionally-pure as I
could, and make sure that it’s well tested, etc., etc. I wasn’t
just going to hack something out.

So it took a while. And there are still rough edges that I will
continue to file off. But if you can see this post, you know it’s
working.

Keeping what I need to know at finger’s length

I’ve been trying to cram in so much information so quickly, I’m starting (hah!) to realize that it’s not all sticking.

So, what better tool to use than Emacs to solve my problem with not remembering Emacs commands?

My solution is simple—create a cheat-sheet. The great thing about Emacs is that this doesn’t have to be a piece of paper, it can be a file that I can maintain in org-mode, just like this blog. In fact, I can also maintain it as a page on the blog. And with the most trivial bit of elisp, I can make sure that I can get to it with no more than two easy-to-remember keystrokes:

(define-key global-map [?\C-h ?\C-h]
  (lambda ()
     (interactive)
     (view-file "~/org/doyouevenlisp.com/cheatsheet.org")))

The joy of an integrated environment

Most of my GNU Emacs learning time has continued to be spent writing elisp. I am very slowly starting to retrain myself on a couple of basic keystrokes: M-g M-g (goto-line) and M-% (query-replace) are two biggies, because my use of ido-ubiquitous mode actually means that my default use of M-x to get to both of those is disrupted. The minibuffer no longer autocompletes in the same way, so if I’m going to have to relearn how to get to them, I should really relearn the short versions.

Anyway, I did want to take a moment to sing the praises of the GNU Emacs Help facility, because it has been invaluable. Specifically C-h f (describe-function) and C-h v (describe-variable). While the help they give is necessarily brief, it’s often enough for someone like me, who just needs a gentle prod about something now and again. I even used C-h f to look up the documentation for if a few minutes ago, because I couldn’t remember if you had to do progn to do a multi-statement else block.
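For the record, the answer is that the else part of if takes any number of forms, while the then part is a single form and needs progn if you want several. A small sketch, with a hypothetical function name:

```elisp
;; Hypothetical example: the then branch needs `progn' for several forms,
;; while every form after it belongs to the else branch.
(defun my-classify (n)
  "Return the symbol `positive' or `not-positive' for the number N."
  (if (> n 0)
      (progn
        (message "%d is positive" n)
        'positive)
    (message "%d is not positive" n)
    'not-positive))

(my-classify 3)  ; => positive
(my-classify 0)  ; => not-positive
```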

So that’s it for today. Learn the rich set of commands that let you get right to what you need in the documentation; it will be invaluable for exploring Emacs.

Weekly Wrap-Up #3

So this is the two-days-late Weekly Wrap-Up #3. I spent the last two days doing my first real batch of elisp hacking of any significance, so I don’t feel too bad about missing the normal rhythm of things.

My big winner for this week is C-M-@ (mark-sexp), which made several moments in elisp hacking bearable—there’s inevitably a moment in any lisp code where the right parens start to pile up, and good luck with trying to figure out what to mark by hand.

I’m still not habituated to using TRAMP for editing files using sudo or ssh, which is unfortunate since I do that a lot. And the mark ring stuff still eludes me.

I think this week I’m going to take on registers and rectangles, and perhaps start building up a cheat-sheet I can use to keep the things I’ve discovered a little more present.

Refining org2blog

No updates to be made, because I spent the entire day working on org2blog. I actually kinda liked it. A lot.

I’ve now rewritten the XML-RPC back-end to use the new, documented, WordPress API. Right at the moment, this is just running in place, but I hope to use the new code to simplify things further.

No Weekly Wrapup today

Instead, I spent the time I would normally allocate to writing something for “Do you even lisp?” to enhancing org2blog, the software I’m using to manage this blog.

Well, enhancing might be saying a lot. Since I’ve been doing a little WordPress hacking in other contexts, I’ve become aware of WordPress’ new (released with 3.4, so only six months old at this time) “native” XML-RPC API, and I chose to start moving org2blog to use that, and move it away from the hodge-podge of Blogger, Movable Type, and metaWeblog APIs that are currently in use.

I hope that over time this will simplify the API, and perhaps result in even better possibilities for interaction—it would, of course, also be great if we could start to abstract away the specifics of a blog’s back-end requirements into a well-defined API so we could easily use whatever is the most featureful back-end for a given blog.

Others might see it differently; we’ll know when I start sending in merge requests.

I also want to automate the process of storing local copies of articles in a hierarchy that mirrors their permalinks and a couple of other things. That I’m also effectively learning elisp at the same time will make all of that very interesting.