Recently for a client project I’ve been studying operational transformation, which sounds significantly more sci-fi than it actually is.
If you aren’t familiar with OT, the premise is this: if you and I are editing a document in real time then we can’t just keep sending up our drafts to the server because we’ll overwrite each other (and also not get each others’ changes). Instead, we send up operations.
But if you and I send up, say, edits
B (respectively) to a server and the server arbitrarily applies mine first, it can’t simply apply
B was an edit to an earlier version of the document and it may not make sense anymore.
While real-time collaborative editing has a lot of moving parts, the heart of it is a transformation algorithm which, in the scenario described above, transforms
B into something which can be applied after
I decided I wanted to implement this transformation algorithm myself to better understand how it works, but for this to be meaningful (and to know if it actually worked) I need to first come up with a way to represent documents and edits, and apply said edits to said documents.
Thus this first article will explain how I created a domain specific language (DSL) for document operations as well as an accompanying interpreter which computes a new document from an original, based on the operations. The techniques used here give me several nice things:
- My DSL takes advantage of Haskell’s type checker automatically;
- The syntax of the language is separate from the semantics, meaning I may write several different interpreters for the same language; and
- Because of that separation it’s easy to transform the
Operationexpressions before ever caring what they do.
You can view the entire code listing here and play along at home!
1 Free falling
In OT, there are three kinds of edits: insertions, deletions, and skips (also called “retains”), and those will be the three commands in our operation language.
OT assumes that you have a document and a cursor at some position in that document. Insertion, uh, inserts characters and moves the cursor forward. Deletion keeps the cursor still but deletes some number of characters after the cursor. Retaining just skips the cursor ahead without affecting the document.
For our purposes, we can define documents like so:
This makes the type signatures a little bit more expressive.
free package on Hackage defines what is called the free monad. If you’re not comfortable with the term “monad” that’s actually great. What you need to know is: monads define imperative languages and side effects they may perform, and how they may perform them.
Free type exported from this package is the common structure of all monads distilled into a “just add water” data type. All you have to do is provide the grammar of the language you want to create.
Here is what
Free looks like, with some naming changes to make it more intuitive:
At heart, all monads define either a finished computation of some value
a, or declare there is more to be done. That’s really it. The
f represents your grammar, limiting how expressions in your language interact with one another.
Without further ado, then, here is our
data OperationGrammar k = Insert String k | Delete Int k | Retain Int k deriving (Show, Functor) type Operation = Free OperationGrammar () insert :: String -> Operation insert str = liftF $ Insert str () delete :: Int -> Operation delete n = liftF $ Delete n () retain :: Int -> Operation retain n = liftF $ Retain n ()
And with that you have created an embedded language. You can define operations like so:
Note the definition of
Operation. Monads in general are computations of some base type and the result of the computation may often be important. In our case however the language is only useful for its side effects. Because of this we can safely “plug the hole” with
2 A cursory overview
We have said that operations essentially dictate how to edit a document at a given cursor position, which may result in changes to both the document and the cursor position.
Thus we will say that an operation is interpreted by a cursor representing a document and a position.
Just as there is a
Free monad there is a
Cofree comonad. Calm down. Breathe.
Monads define languages with side effects and comonads define interpreters with internal environments. That’s really it.
Cofree definition is this:
You can think of
a as the internal state of the comonad. At any given point in a comonad computation you are in a state and you have the possibility of moving to a different state. In the case of cofree comonads, your options are defined by - you guessed it - some functor type.
We want an interpreter which can process the
OperationGrammar commands we wrote. So, here goes:
Cursor a is an interpreter which modifies an internal value of type
a using one of three different handler functions. The handlers will accept the arguments from their corresponding operations and then produce (ultimately) a new
Cursor a. The
Cofree type handles tying all the particular knots.
Now, we might want different kinds of interpreters for different situations. One might pretty-print
Operation expressions to a log. One might apply them to a document. One might send them over the network and then apply edits from other people first.
We want a document editor, specifically, which keeps track of a document string and a cursor position. So let’s define it:
Now we need to create an
free package provides the
coiter function, which is a bit of an opaque name. However, refer back to the
Cofree definition. A cofree comonad at all times has a current internal value and a way to get to a “next” value. This makes it a bit like a stream of values that lets you pick which branch you want to explore.
coiter is a way of building a comonad stream in a way that is not nearly as complicated as it sounds.
Let’s start at the top level:
We provide a starting state (most likely
(0, "") but it could be anything) and a function
next which takes a cofree comonad and produces an interpreter value that contains the next comonad. I know, I know. Let it sink in.
Let’s define these auxiliary functions.
Data.List in the base libraries.)
coInsert accepts a state value (in this case,
(Int, Document)) and a string to insert, and produces a new state value. That’s really it. This really showcases the elegance of the (co)free approach to DSLs: the behavior of a command is defined separately from the declaration of the command, and indeed it may not be the only behavior.
coRetain are similar:
And with that, we can create a new
newEditor (0, "") and … well, actually, we still need to feed
Operations into the
Editor to get the result document. Hm.
3 Run away!!!
We have a grammar for document operations and an interpreter for those operations and how they affect a document and cursor position. We need to define how these may be combined.
The tricky thing is, the language and interpreter are each composites of two types. The
Free layer of an operation tells the
Cofree layer of the interpreter whether to keep going or yield the latest state value; and
OperationGrammar dictates which
CursorF function will be used to modify the state value.
The following type class, with a different name, is stolen from David Laing:
run method will accept the final monad and comonad computations as well as a function saying what to do with them.
We will define instances for both
OperationGrammar and let the compiler decide what we mean when we say
instance Run f g => Run (Cofree f) (Free g) where run p (a :< _) (Pure x) = p a x run p (_ :< fs) (Free gs) = run (run p) fs gs instance Run CursorF OperationGrammar where run f (CursorF i _ _) (Insert s k) = f (i s) k run f (CursorF _ d _) (Delete n k) = f (d n) k run f (CursorF _ _ r) (Retain n k) = f (r n) k
The idea behind all three is we give the operation argument to the appropriate handler, which will produce the correct next
Cursor. Both that and the next
Operation are then given to
With this we may define the
edit function to transform a
Document with an
Since the base value of every
() we use
const to basically throw away the monadic result.
4 Let’s try it out!
And there you have it! We created a language for document operations and an interpreter which could actually transform documents with it.
My next article will show how to actually do the OT part of this. Until then!