<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US"><generator uri="https://jekyllrb.com/" version="4.2.2">Jekyll</generator><link href="https://blag.bcc32.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blag.bcc32.com/" rel="alternate" type="text/html" hreflang="en-US" /><updated>2022-07-02T17:21:59-04:00</updated><id>https://blag.bcc32.com/feed.xml</id><title type="html">bcc32</title><subtitle>A blog mostly about OCaml, named after a C/C++ compiler.</subtitle><author><name>Aaron L. Zeng</name></author><entry><title type="html">Open Recursion with Modules</title><link href="https://blag.bcc32.com/shenanigans/2019/07/15/open-recursion-with-modules/" rel="alternate" type="text/html" title="Open Recursion with Modules" /><published>2019-07-15T22:40:00-04:00</published><updated>2019-07-15T22:40:00-04:00</updated><id>https://blag.bcc32.com/shenanigans/2019/07/15/open-recursion-with-modules</id><content type="html" xml:base="https://blag.bcc32.com/shenanigans/2019/07/15/open-recursion-with-modules/"><![CDATA[<p>
This post is about implementing open recursion with OCaml modules as
opposed to classes.  It&rsquo;s a complete hack and a terrible idea&#x2014;but
it&rsquo;s fun!
</p>

<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#orgd038663">1. What&rsquo;s open recursion?</a></li>
<li><a href="#org2787491">2. Use cases</a></li>
<li><a href="#org77d1d7c">3. Translating to OCaml</a></li>
<li><a href="#org812e8cb">4. Module time</a>
<ul>
<li><a href="#org25adca3">4.1. It doesn&rsquo;t work!</a></li>
<li><a href="#orged43eff">4.2. Recursive module shenanigans</a></li>
</ul>
</li>
</ul>
</div>
</div>

<div id="outline-container-orgd038663" class="outline-2">
<h2 id="orgd038663"><span class="section-number-2">1</span> What&rsquo;s open recursion?</h2>
<div class="outline-text-2" id="text-1">
<p>
<i>Open recursion</i>, in the context of object-oriented programming (OOP),
refers to the ability of a method on an object to call another method
on the same object (&ldquo;self&rdquo;), with the implementation of the second
method not being fixed.  So, for example:
</p>

<div class="org-src-container">
<pre class="src src-python"><span class="org-keyword">class</span> <span class="org-type">Greeter</span>:
    <span class="org-keyword">def</span> <span class="org-function-name">greet</span><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">self</span><span class="org-rainbow-delimiters-depth-1">)</span>:
        <span class="org-keyword">return</span> <span class="org-string">'Hello, '</span> + <span class="org-keyword">self</span>.addressee<span class="org-rainbow-delimiters-depth-1">()</span>

    <span class="org-keyword">def</span> <span class="org-function-name">addressee</span><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">self</span><span class="org-rainbow-delimiters-depth-1">)</span>:
        <span class="org-keyword">return</span> <span class="org-string">'World'</span>

<span class="org-variable-name">g</span> = Greeter<span class="org-rainbow-delimiters-depth-1">()</span>
<span class="org-keyword">print</span><span class="org-rainbow-delimiters-depth-1">(</span>g.greet<span class="org-rainbow-delimiters-depth-2">()</span><span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>

<pre class="example">
Hello, World
</pre>


<p>
Now, if we derive a subclass from <code>Greeter</code> and override one of its
methods:
</p>

<div class="org-src-container">
<pre class="src src-python"><span class="org-keyword">class</span> <span class="org-type">Greeter</span>:
    <span class="org-keyword">def</span> <span class="org-function-name">greet</span><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">self</span><span class="org-rainbow-delimiters-depth-1">)</span>:
        <span class="org-keyword">return</span> <span class="org-string">'Hello, '</span> + <span class="org-keyword">self</span>.addressee<span class="org-rainbow-delimiters-depth-1">()</span>

    <span class="org-keyword">def</span> <span class="org-function-name">addressee</span><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">self</span><span class="org-rainbow-delimiters-depth-1">)</span>:
        <span class="org-keyword">return</span> <span class="org-string">'World'</span>

<span class="org-keyword">class</span> <span class="org-type">NameGreeter</span><span class="org-rainbow-delimiters-depth-1">(</span>Greeter<span class="org-rainbow-delimiters-depth-1">)</span>:
    <span class="org-keyword">def</span> <span class="org-function-name">__init__</span><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">self</span>, name<span class="org-rainbow-delimiters-depth-1">)</span>:
        <span class="org-keyword">self</span>.name = name

    <span class="org-keyword">def</span> <span class="org-function-name">addressee</span><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">self</span><span class="org-rainbow-delimiters-depth-1">)</span>:
        <span class="org-keyword">return</span> <span class="org-keyword">self</span>.name

<span class="org-variable-name">g</span> = NameGreeter<span class="org-rainbow-delimiters-depth-1">(</span><span class="org-string">'Bob'</span><span class="org-rainbow-delimiters-depth-1">)</span>
<span class="org-keyword">print</span><span class="org-rainbow-delimiters-depth-1">(</span>g.greet<span class="org-rainbow-delimiters-depth-2">()</span><span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>

<pre class="example">
Hello, Bob
</pre>


<p>
So, overriding the <code>addressee</code> method changed the behavior of the
<code>greet</code> method.  In other words, the <code>greet</code> method from the parent
class called the <code>addressee</code> method of the subclass, even though that
method doesn&rsquo;t even &ldquo;know&rdquo; that the <code>NameGreeter</code> class exists.
</p>
</div>
</div>

<div id="outline-container-org2787491" class="outline-2">
<h2 id="org2787491"><span class="section-number-2">2</span> Use cases</h2>
<div class="outline-text-2" id="text-2">
<p>
One salient use case for open recursion is in Abstract Syntax Tree
(AST) transformers.  This arises in the land of OCaml preprocessors,
where syntax extensions are implemented by defining a class whose
methods correspond to the mapping functions over different kinds of
OCaml AST nodes.
</p>

<p>
The object system supports open recursion, which allows a transformer
to be defined by inheriting from a default base class, whose methods
are no-ops (they return the input AST node).  Overriding just the
method for the type of AST node the syntax extension is interested in
works correctly because the inherited methods will call the new
implementations instead.
</p>

<p>
To read more about how open recursion facilitates writing AST
transformers, and more generally about writing ppx&rsquo;s, see whitequark&rsquo;s
excellent <a href="https://whitequark.org/blog/2014/04/16/a-guide-to-extension-points-in-ocaml/">blog post</a>.
</p>
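
<p>
As a rough analogy outside OCaml, Python&rsquo;s standard <code>ast.NodeTransformer</code>
has the same shape: the base class supplies a default traversal, and a
subclass overrides only the handlers it cares about.  The sketch below is an
illustration, not the OCaml ppx API; the class name <code>DoubleInts</code> and
the sample expression are made up for the example.
</p>

```python
import ast


class DoubleInts(ast.NodeTransformer):
    """Hypothetical transformer: overrides only the handler for constants."""

    def visit_Constant(self, node):
        # Double integer literals; leave every other constant untouched.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node


# The inherited traversal for expressions and binary operators reaches our
# override through self.visit, which is open recursion at work.
tree = ast.parse("1 + (2 * x)", mode="eval")
new_tree = ast.fix_missing_locations(DoubleInts().visit(tree))
result = eval(compile(new_tree, "ast-demo", "eval"), {"x": 10})
print(result)
```

<p>
With <code>x</code> bound to 10, the rewritten expression is <code>2 + (4 * x)</code>,
so this prints 42, even though the subclass never mentions binary operators.
</p>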
</div>
</div>

<div id="outline-container-org77d1d7c" class="outline-2">
<h2 id="org77d1d7c"><span class="section-number-2">3</span> Translating to OCaml</h2>
<div class="outline-text-2" id="text-3">
<p>
Here&rsquo;s the Python <code>Greeter</code> example from above translated into OCaml:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">class</span> <span class="org-function-name">greeter</span> = <span class="org-tuareg-font-lock-governing">object</span> <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-variable-name">self</span><span class="org-rainbow-delimiters-depth-1">)</span>
  <span class="org-tuareg-font-lock-governing">method</span> <span class="org-function-name">greet</span> = <span class="org-string">"Hello, "</span> <span class="org-tuareg-font-lock-operator">^</span> self#addressee
  <span class="org-tuareg-font-lock-governing">method</span> <span class="org-function-name">addressee</span> = <span class="org-string">"World"</span>
<span class="org-tuareg-font-lock-governing">end</span>

<span class="org-tuareg-font-lock-governing">class</span> <span class="org-function-name">name_greeter</span> <span class="org-variable-name">name</span> = <span class="org-tuareg-font-lock-governing">object</span> <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-variable-name">self</span><span class="org-rainbow-delimiters-depth-1">)</span>
  <span class="org-tuareg-font-lock-governing">inherit</span> greeter <span class="org-keyword">as</span> super
  <span class="org-tuareg-font-lock-governing">method</span><span class="org-tuareg-font-lock-operator">!</span> <span class="org-function-name">addressee</span> = name
<span class="org-tuareg-font-lock-governing">end</span>

<span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">g</span> = <span class="org-keyword">new</span> name_greeter <span class="org-string">"Bob"</span>
<span class="org-tuareg-font-lock-governing">let</span> <span class="org-rainbow-delimiters-depth-1">()</span> = print_endline g#greet
</pre>
</div>

<pre class="example">
Hello, Bob
</pre>

<p>
Note the use of <code>method!</code> instead of <code>method</code> in the subclass.  Akin
to the <code>@Override</code> annotation in Java, it asks the compiler to check
that you are overriding an existing method.  If you are not, the
compiler rejects the definition with an error, which catches mistakes
such as a misspelled method name.
</p>
</div>
</div>

<div id="outline-container-org812e8cb" class="outline-2">
<h2 id="org812e8cb"><span class="section-number-2">4</span> Module time</h2>
<div class="outline-text-2" id="text-4">
</div>
<div id="outline-container-org25adca3" class="outline-3">
<h3 id="org25adca3"><span class="section-number-3">4.1</span> It doesn&rsquo;t work!</h3>
<div class="outline-text-3" id="text-4-1">
<p>
Now, there&rsquo;s a common feeling in the OCaml community that the OOP part
of the language is largely subsumed by the <a href="https://ocaml.org/learn/tutorials/modules.html">module system</a>, which is a
powerful and versatile abstraction tool.  One of the main claims that
OOP still has over modules is that open recursion is not possible with
modules.  For example, the analogous code written using modules does
not have the desired behavior:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr"> 1: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Greeter</span> = <span class="org-tuareg-font-lock-governing">struct</span>
<span class="linenr"> 2: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> = unit
<span class="linenr"> 3: </span>
<span class="linenr"> 4: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">create</span> <span class="org-rainbow-delimiters-depth-1">()</span> = <span class="org-rainbow-delimiters-depth-1">()</span>
<span id="coderef-lexical-binding" class="coderef-off"><span class="linenr"> 5: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">addressee</span> <span class="org-variable-name">_t</span> = <span class="org-string">"World"</span></span>
<span class="linenr"> 6: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">greet</span> <span class="org-variable-name">_t</span> = <span class="org-string">"Hello, "</span> <span class="org-tuareg-font-lock-operator">^</span> addressee _t
<span class="linenr"> 7: </span><span class="org-tuareg-font-lock-governing">end</span>
<span class="linenr"> 8: </span>
<span class="linenr"> 9: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Name_greeter</span> = <span class="org-tuareg-font-lock-governing">struct</span>
<span class="linenr">10: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> = <span class="org-rainbow-delimiters-depth-1">{</span> name : string <span class="org-rainbow-delimiters-depth-1">}</span>
<span class="linenr">11: </span>  <span class="org-tuareg-font-lock-governing">include</span> <span class="org-tuareg-font-lock-module"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Greeter :</span><span class="org-type"> module type of Greeter with type t := Greeter.t</span><span class="org-tuareg-font-lock-module"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">12: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">create</span> ~<span class="org-variable-name">name</span> = <span class="org-rainbow-delimiters-depth-1">{</span> name <span class="org-rainbow-delimiters-depth-1">}</span>
<span class="linenr">13: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">addressee</span> <span class="org-variable-name">t</span> = t.name
<span class="linenr">14: </span><span class="org-tuareg-font-lock-governing">end</span>
<span class="linenr">15: </span>
<span class="linenr">16: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">g</span> = <span class="org-tuareg-font-lock-module">Name_greeter.</span>create <span class="org-tuareg-font-lock-label">~name</span>:<span class="org-string">"Bob"</span>
<span class="linenr">17: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-rainbow-delimiters-depth-1">()</span> = print_endline <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-module">Name_greeter.</span>greet g<span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>

<pre class="example">
Hello, World
</pre>

<p>
This is because the definition of <code>greet</code> refers to the lexical binding of
<code>addressee</code> in force at the location of the definition, i.e., line
<a href="#coderef-lexical-binding" class="coderef" onmouseover="CodeHighlightOn(this, 'coderef-lexical-binding');" onmouseout="CodeHighlightOff(this, 'coderef-lexical-binding');">5</a>.  That means that overriding the definition of
<code>addressee</code> has no effect except on the behavior of calling
<code>addressee</code> itself.
</p>
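
<p>
The same effect can be reproduced in Python by giving up method dispatch and
using plain closures instead.  This is only a sketch of the lexical-binding
problem; <code>make_greeter</code> and the dictionary layout are hypothetical
stand-ins for the OCaml modules, not a translation of them.
</p>

```python
def make_greeter():
    """Hypothetical stand-in for the Greeter module: a record of closures."""

    def addressee():
        return "World"

    def greet():
        # `addressee` is resolved lexically: it is always the function above.
        return "Hello, " + addressee()

    return {"addressee": addressee, "greet": greet}


# "Include" the base record, then shadow one field, as Name_greeter does.
name_greeter = dict(make_greeter())
name_greeter["addressee"] = lambda: "Bob"

print(name_greeter["addressee"]())
print(name_greeter["greet"]())
```

<p>
This prints <code>Bob</code> and then <code>Hello, World</code>: replacing the entry
changes what callers of <code>addressee</code> see, but <code>greet</code> still
holds on to the original closure.
</p>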
</div>
</div>

<div id="outline-container-orged43eff" class="outline-3">
<h3 id="orged43eff"><span class="section-number-3">4.2</span> Recursive module shenanigans</h3>
<div class="outline-text-3" id="text-4-2">
<p>
It turns out we can accomplish open recursion using the module system,
but we need to structure our types a bit differently from the
straightforward classes-to-modules translation.
</p>

<p>
First, we define a <code>Greeter</code> module type that describes the interface of
the class.  Then we define the base &ldquo;class&rdquo; <code>Greeter</code> as a
functor that takes a module with its own output signature, <code>Self</code>, as
input.  That is to say,
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">module type</span> <span class="org-tuareg-font-lock-module">Greeter</span> = <span class="org-tuareg-font-lock-governing">sig</span>
  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span>
  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">ctor</span>

  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">create</span> : ctor -&gt; t
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">addressee</span> : t -&gt; string
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">greet</span> : t -&gt; string
<span class="org-tuareg-font-lock-governing">end</span>

<span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Greeter</span> <span class="org-variable-name"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-variable-name">Self :</span><span class="org-type"> Greeter</span><span class="org-variable-name"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-variable-name"> </span>:
  <span class="org-tuareg-font-lock-module">Greeter</span> <span class="org-tuareg-font-lock-governing">with type</span> <span class="org-type">t</span> = <span class="org-tuareg-font-lock-module">Self.</span>t <span class="org-tuareg-font-lock-governing">and type</span> <span class="org-type">ctor</span> = <span class="org-tuareg-font-lock-module">Self.</span>ctor = <span class="org-tuareg-font-lock-governing">struct</span>
  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> = <span class="org-tuareg-font-lock-module">Self.</span>t
  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">ctor</span> = <span class="org-tuareg-font-lock-module">Self.</span>ctor

  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">create</span> = <span class="org-tuareg-font-lock-module">Self.</span>create
  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">addressee</span> <span class="org-variable-name">_t</span> = <span class="org-string">"World"</span>
  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">greet</span> <span class="org-variable-name">t</span> = <span class="org-string">"Hello, "</span> <span class="org-tuareg-font-lock-operator">^</span> <span class="org-tuareg-font-lock-module">Self.</span>addressee t
<span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>
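
<p>
To see what the <code>Self</code> parameter buys us, here is a loose Python
sketch in which a &ldquo;functor&rdquo; is just a function from a dictionary of
methods to a dictionary of methods.  The names <code>greeter</code> and
<code>self_module</code> are assumptions made for the illustration.
</p>

```python
def greeter(self_module):
    """Hypothetical stand-in for the Greeter functor.

    `self_module` plays the role of Self: a dict with an "addressee" entry.
    """
    return {
        "addressee": lambda: "World",
        # greet goes through self_module rather than the local definition.
        "greet": lambda: "Hello, " + self_module["addressee"](),
    }


world = greeter({"addressee": lambda: "World"})
bob = greeter({"addressee": lambda: "Bob"})

print(world["greet"]())
print(bob["greet"]())
print(bob["addressee"]())
```

<p>
The three lines printed are <code>Hello, World</code>, <code>Hello, Bob</code>, and
<code>World</code>: <code>greet</code> follows whatever <code>Self</code> says, while the
functor&rsquo;s own <code>addressee</code> is unchanged.
</p>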

<p>
And now we can define the <code>Greeter</code> &ldquo;class&rdquo; as the <code>Greeter</code> functor
instantiated on itself:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">module rec</span> <span class="org-tuareg-font-lock-module">G</span> : <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-constructor">Greeter</span> <span class="org-tuareg-font-lock-governing">with type</span> <span class="org-type">t</span> = unit <span class="org-tuareg-font-lock-governing">and type</span> <span class="org-type">ctor</span> = unit<span class="org-rainbow-delimiters-depth-1">)</span> = <span class="org-tuareg-font-lock-constructor">Greeter</span> <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-governing">struct</span>
    <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> = unit
    <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">ctor</span> = unit

    <span class="org-tuareg-font-lock-governing">include</span> <span class="org-tuareg-font-lock-module"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-tuareg-font-lock-module">G :</span><span class="org-type"> Greeter with type t := unit and type ctor :</span>=<span class="org-type"> unit</span><span class="org-tuareg-font-lock-module"><span class="org-rainbow-delimiters-depth-2">)</span></span>

    <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">create</span> <span class="org-rainbow-delimiters-depth-2">()</span> = <span class="org-rainbow-delimiters-depth-2">()</span>
  <span class="org-tuareg-font-lock-governing">end</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-tuareg-font-lock-governing">let</span> <span class="org-rainbow-delimiters-depth-1">()</span> =
  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">g</span> = <span class="org-tuareg-font-lock-module">G.</span>create <span class="org-rainbow-delimiters-depth-1">()</span> <span class="org-tuareg-font-lock-governing">in</span>
  print_endline <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-module">G.</span>greet g<span class="org-rainbow-delimiters-depth-1">)</span>
<span class="org-tuareg-font-double-semicolon">;;</span>
</pre>
</div>

<pre class="example">
Hello, World
</pre>

<p>
And now, we define the &ldquo;subclass&rdquo; <code>Name_greeter</code>:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">'super name_greeter</span> =
  <span class="org-rainbow-delimiters-depth-1">{</span> super : 'super
  ; name : string
  <span class="org-rainbow-delimiters-depth-1">}</span>

<span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Name_greeter</span> <span class="org-variable-name"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-variable-name">Self :</span><span class="org-type"> Greeter</span><span class="org-variable-name"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-variable-name"> </span>:
  <span class="org-tuareg-font-lock-module">Greeter</span> <span class="org-tuareg-font-lock-governing">with type</span> <span class="org-type">t</span> = <span class="org-tuareg-font-lock-module">Self.</span>t <span class="org-tuareg-font-lock-governing">and type</span> <span class="org-type">ctor</span> = <span class="org-tuareg-font-lock-module">Self.</span>ctor = <span class="org-tuareg-font-lock-governing">struct</span>
  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> = <span class="org-tuareg-font-lock-module">Self.</span>t
  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">ctor</span> = <span class="org-tuareg-font-lock-module">Self.</span>ctor

  <span class="org-tuareg-font-lock-governing">module rec</span> <span class="org-tuareg-font-lock-module">Super</span> : <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-constructor">Greeter</span> <span class="org-tuareg-font-lock-governing">with type</span> <span class="org-type">t</span> := t <span class="org-tuareg-font-lock-governing">and type</span> <span class="org-type">ctor</span> := ctor<span class="org-rainbow-delimiters-depth-1">)</span> = <span class="org-tuareg-font-lock-constructor">Greeter</span> <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-constructor">Self</span><span class="org-rainbow-delimiters-depth-1">)</span>

  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">create</span> = <span class="org-tuareg-font-lock-module">Self.</span>create
  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">addressee</span> = <span class="org-tuareg-font-lock-module">Self.</span>addressee
  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">greet</span> <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-variable-name">t</span> :<span class="org-type"> t</span><span class="org-rainbow-delimiters-depth-1">)</span> = <span class="org-tuareg-font-lock-module">Super.</span>greet t
<span class="org-tuareg-font-lock-governing">end</span>

<span class="org-tuareg-font-lock-governing">module rec</span> <span class="org-tuareg-font-lock-module">NG</span> : <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-constructor">Greeter</span> <span class="org-tuareg-font-lock-governing">with type</span> <span class="org-type">t</span> = unit name_greeter <span class="org-tuareg-font-lock-governing">and type</span> <span class="org-type">ctor</span> = unit * string<span class="org-rainbow-delimiters-depth-1">)</span> =
  <span class="org-tuareg-font-lock-constructor">Name_greeter</span> <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-governing">struct</span>
    <span class="org-tuareg-font-lock-governing">include</span> <span class="org-tuareg-font-lock-module">NG</span>

    <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">create</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-variable-name">super</span>, <span class="org-variable-name">name</span><span class="org-rainbow-delimiters-depth-2">)</span> = <span class="org-rainbow-delimiters-depth-2">{</span> super; name <span class="org-rainbow-delimiters-depth-2">}</span>
    <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">addressee</span> <span class="org-variable-name">t</span> = t.name
  <span class="org-tuareg-font-lock-governing">end</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">g</span> = <span class="org-tuareg-font-lock-module">NG.</span>create <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-rainbow-delimiters-depth-2">()</span>, <span class="org-string">"Bob"</span><span class="org-rainbow-delimiters-depth-1">)</span>
<span class="org-tuareg-font-lock-governing">let</span> <span class="org-rainbow-delimiters-depth-1">()</span> = print_endline <span class="org-rainbow-delimiters-depth-1">(</span><span class="org-tuareg-font-lock-module">NG.</span>greet g<span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>

<p>
Which prints:
</p>

<pre class="example">
Hello, Bob
</pre>

<p>
Success!  <i>But at what cost&#x2026;</i>
</p>
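
<p>
For comparison, the whole construction can be sketched in Python with
dictionaries of closures.  This is a rough analogy under stated assumptions,
not a faithful model of recursive modules: <code>tie_knot</code>,
<code>greeter</code>, and <code>name_greeter</code> are hypothetical helpers, and
the constructor and type plumbing from the OCaml version is left out.
</p>

```python
def greeter(self_module):
    """Hypothetical stand-in for the Greeter functor."""
    return {
        "addressee": lambda: "World",
        "greet": lambda: "Hello, " + self_module["addressee"](),
    }


def name_greeter(name):
    """Hypothetical stand-in for Name_greeter: Greeter with one override."""

    def functor(self_module):
        module = greeter(self_module)  # plays the role of Super
        module["addressee"] = lambda: name
        return module

    return functor


def tie_knot(functor):
    """Apply a functor to its own result, loosely like `module rec`."""
    module = {}
    # Entries are looked up only when a method is called, so handing the
    # still-empty dict to the functor is safe.
    module.update(functor(module))
    return module


g = tie_knot(greeter)
ng = tie_knot(name_greeter("Bob"))

print(g["greet"]())
print(ng["greet"]())
```

<p>
This prints <code>Hello, World</code> followed by <code>Hello, Bob</code>.  The
inherited <code>greet</code> picks up the override because it looks up
<code>addressee</code> in the finished module at call time.
</p>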
</div>
</div>
</div>]]></content><author><name>Aaron L. Zeng</name></author><category term="shenanigans" /><category term="ocaml" /><summary type="html"><![CDATA[This post is about implementing open recursion with OCaml modules as opposed to classes. It&rsquo;s a complete hack and a terrible idea&#x2014;but it&rsquo;s fun!]]></summary></entry><entry><title type="html">Irregular Expressions</title><link href="https://blag.bcc32.com/uncategorized/2017/12/02/irregular-expressions/" rel="alternate" type="text/html" title="Irregular Expressions" /><published>2017-12-02T02:05:02-05:00</published><updated>2017-12-02T02:05:02-05:00</updated><id>https://blag.bcc32.com/uncategorized/2017/12/02/irregular-expressions</id><content type="html" xml:base="https://blag.bcc32.com/uncategorized/2017/12/02/irregular-expressions/"><![CDATA[<p>
If you&rsquo;ve worked extensively with strings, you&rsquo;ve probably encountered regular
expressions (abbreviated &ldquo;regex&rdquo; or &ldquo;regexp&rdquo;, depending on your political party
and the current moon phase&#x2014;I&rsquo;ll use regexp below).
</p>

<p>
Regular expressions are a popular and powerful tool for manipulating,
validating, and parsing strings. You can use regular expressions to extract
phone numbers, validate email addresses, and inflict pain and suffering on
future maintainers of your code (including you!).
</p>

<p>
But regular expressions are a lot more powerful than you think! This post
describes one way you can use regexps to perform <i>computation</i>: in
particular, determining whether a number is prime<sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup>. (This post assumes
familiarity with basic regular expression concepts. If you&rsquo;re not acquainted
with regexps, a decent introduction to them lives <a href="https://www.regexone.com/">here</a>.)
</p>

<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#primality-testing">1. Primality testing</a></li>
<li><a href="#explanation">2. Explanation</a>
<ul>
<li><a href="#regexp-constructs">2.1. Regexp constructs used</a></li>
<li><a href="#intuition">2.2. Intuition</a></li>
<li><a href="#examples">2.3. Examples</a>
<ul>
<li><a href="#example-1">2.3.1. n = 1</a></li>
<li><a href="#example-9">2.3.2. n = 9 (composite)</a></li>
<li><a href="#example-7">2.3.3. n = 7 (prime)</a></li>
</ul>
</li>
<li><a href="#performance-enhancements">2.4. Performance enhancements</a></li>
</ul>
</li>
<li><a href="#closing-remarks">3. Closing remarks</a></li>
</ul>
</div>
</div>
<div id="outline-container-orgf63e7d7" class="outline-2">
<h2 id="primality-testing"><a id="orgf63e7d7"></a><span class="section-number-2">1</span> Primality testing</h2>
<div class="outline-text-2" id="text-primality-testing">
<p>
I recently came across a regexp-based test for prime numbers<sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup>, which is
what inspired this blog post.
</p>

<p>
In Perl:
</p>

<div class="org-src-container">
<pre class="src src-perl"><span class="org-keyword">sub</span> <span class="org-function-name">is_prime</span> <span class="org-rainbow-delimiters-depth-1">{</span>
  <span class="org-keyword">my</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-variable-name">$number</span><span class="org-rainbow-delimiters-depth-2">)</span> = <span class="org-cperl-array">@_</span>;
  <span class="org-keyword">return</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-highlight-numbers-number">1</span> <span class="org-type">x</span> $number<span class="org-rainbow-delimiters-depth-2">)</span> <span class="org-negation-char">!</span>~ <span class="org-cperl-nonoverridable">m</span><span class="org-constant">/</span><span class="org-builtin">\A</span><span class="org-string"> </span><span class="org-builtin">(?:</span><span class="org-string"> 1</span><span class="org-builtin">?</span><span class="org-string"> </span><span class="org-keyword">|</span><span class="org-string"> </span><span class="org-keyword">(</span><span class="org-string">11</span><span class="org-builtin">+?</span><span class="org-keyword">)</span><span class="org-string"> </span><span class="org-builtin">(?&gt;</span><span class="org-string"> </span><span class="org-builtin">\</span><span class="org-type">1</span><span class="org-builtin">+</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">\z</span><span class="org-constant">/</span><span class="org-cperl-nonoverridable">xms</span>;
<span class="org-rainbow-delimiters-depth-1">}</span>

<span class="org-cperl-nonoverridable">print</span> <span class="org-type">join</span> <span class="org-string">' '</span>, <span class="org-cperl-nonoverridable">grep</span> <span class="org-rainbow-delimiters-depth-1">{</span> is_prime<span class="org-rainbow-delimiters-depth-2">(</span>$_<span class="org-rainbow-delimiters-depth-2">)</span> <span class="org-rainbow-delimiters-depth-1">}</span> <span class="org-highlight-numbers-number">0</span>..<span class="org-highlight-numbers-number">100</span>;
</pre>
</div>

<p>
The above program produces a list of primes up to 100, as expected:
</p>

<pre class="example">
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
</pre>
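
<p>
The trick is not specific to Perl.  Here is a sketch of the same test using
Python&rsquo;s <code>re</code> module.  It assumes the atomic group can be dropped
without changing which strings match (older versions of <code>re</code> do not
support atomic groups), and the name <code>COMPOSITE_OR_SMALL</code> is made up
for the example.
</p>

```python
import re

# Same pattern as the Perl version, minus the atomic group.  fullmatch
# anchors the pattern at both ends, standing in for \A and \z.
COMPOSITE_OR_SMALL = re.compile(r"1?|(11+?)\1+")


def is_prime(number):
    # Unary encoding: "1" repeated `number` times.
    return COMPOSITE_OR_SMALL.fullmatch("1" * number) is None


primes = [n for n in range(101) if is_prime(n)]
print(" ".join(map(str, primes)))
```

<p>
Run as is, this prints the same 25 primes as the Perl program.
</p>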
</div>
</div>
<div id="outline-container-org50c5c40" class="outline-2">
<h2 id="explanation"><a id="org50c5c40"></a><span class="section-number-2">2</span> Explanation</h2>
<div class="outline-text-2" id="text-explanation">
<p>
How the hell does a regular expression, which simply matches some string
pattern, calculate whether or not a number is prime? Let&rsquo;s dissect the code,
and we&rsquo;ll see that it&rsquo;s equivalent to a naive <a href="https://en.wikipedia.org/wiki/Trial_division">trial division</a> algorithm.
</p>

<p>
Here&rsquo;s a simplified and annotated version of the Perl code above.
</p>

<div class="org-src-container">
<pre class="src src-perl"><span class="org-keyword">sub</span> <span class="org-function-name">is_prime</span> <span class="org-rainbow-delimiters-depth-1">{</span>
  <span class="org-keyword">my</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-variable-name">$number</span><span class="org-rainbow-delimiters-depth-2">)</span> = <span class="org-cperl-array">@_</span>;
  <span class="org-keyword">return</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-highlight-numbers-number">1</span> <span class="org-type">x</span> $number<span class="org-rainbow-delimiters-depth-2">)</span>          <span class="org-comment-delimiter"># </span><span class="org-comment">a string with '1' repeated $number times</span>
    <span class="org-negation-char">!</span>~                          <span class="org-comment-delimiter"># </span><span class="org-comment">negated regexp match operator</span>
    <span class="org-cperl-nonoverridable">m</span><span class="org-constant">{</span>
<span class="org-string">       </span><span class="org-builtin">\A</span><span class="org-string">                       </span><span class="org-comment"># beginning of string</span>
<span class="org-string">       </span><span class="org-builtin">(?:</span><span class="org-string">                      </span><span class="org-comment"># non-capturing group</span>
<span class="org-string">         1</span><span class="org-builtin">?</span><span class="org-string">                     </span><span class="org-comment"># zero or one '1'</span>
<span class="org-string">       </span><span class="org-keyword">|</span>
<span class="org-string">         </span><span class="org-keyword">(</span><span class="org-string">11</span><span class="org-builtin">+</span><span class="org-keyword">)</span><span class="org-string">                  </span><span class="org-comment"># two or more '1's</span>
<span class="org-string">         </span><span class="org-builtin">\</span><span class="org-type">1</span><span class="org-builtin">+</span><span class="org-string">                    </span><span class="org-comment"># one or more repetitions of previous group</span>
<span class="org-string">       </span><span class="org-builtin">)</span>
<span class="org-string">       </span><span class="org-builtin">\z</span><span class="org-string">                       </span><span class="org-comment"># end of string</span>
<span class="org-string">   </span><span class="org-constant">}</span><span class="org-cperl-nonoverridable">xms</span>;                        <span class="org-comment-delimiter"># </span><span class="org-comment">eXtended syntax, allows comments and spaces</span>
<span class="org-rainbow-delimiters-depth-1">}</span>

<span class="org-cperl-nonoverridable">print</span> <span class="org-type">join</span> <span class="org-string">' '</span>, <span class="org-cperl-nonoverridable">grep</span> <span class="org-rainbow-delimiters-depth-1">{</span> is_prime<span class="org-rainbow-delimiters-depth-2">(</span>$_<span class="org-rainbow-delimiters-depth-2">)</span> <span class="org-rainbow-delimiters-depth-1">}</span> <span class="org-highlight-numbers-number">0</span>..<span class="org-highlight-numbers-number">100</span>;
</pre>
</div>

<p>
First, we construct a string consisting of <code>$number</code> occurrences of the digit
<code>1</code>. We could have selected any other character instead of <code>1</code>; what&rsquo;s
important is that the string has length <code>$number</code>.
</p>
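
<p>
As a quick check of that claim, here is a sketch of the same test built from the letter <code>a</code> instead. The name <code>is_prime_a</code> is made up for this illustration; the only requirement is that the pattern uses the same character as the string.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

# Hypothetical variant of is_prime: the unary string is built from 'a' instead of '1'.
sub is_prime_a {
  my ($number) = @_;
  return ('a' x $number) !~ m/\A (?: a? | (aa+) \1+ ) \z/xms;
}

print join(' ', grep { is_prime_a($_) } 0..20), "\n";
# prints: 2 3 5 7 11 13 17 19
</pre>
</div>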

<p>
The regexp itself matches any string of <code>1</code>&rsquo;s whose length is zero, one, or a
composite number. Therefore, we use Perl&rsquo;s <code>!~</code> operator, which returns a true
value if the input string does <i>not</i> match the pattern. That&rsquo;ll tell us
whether or not <code>$number</code> is <i>prime</i>.
</p>
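
<p>
To see what the pattern itself accepts, we can swap <code>!~</code> for <code>=~</code> and list the numbers whose unary strings match. This is only a sanity check on a small range, not part of the original program.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

# Numbers whose unary string the pattern accepts: 0, 1, and the composites.
my @matched = grep { (1 x $_) =~ m/\A (?: 1? | (11+) \1+ ) \z/xms } 0..20;
print "@matched\n";
# prints: 0 1 4 6 8 9 10 12 14 15 16 18 20
</pre>
</div>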
</div>
<div id="outline-container-org4f61a99" class="outline-3">
<h3 id="regexp-constructs"><a id="org4f61a99"></a><span class="section-number-3">2.1</span> Regexp constructs used</h3>
<div class="outline-text-3" id="text-regexp-constructs">
<p>
<code>\A</code> and <code>\z</code> specify that the match operator should return true only when
the entire string fits the pattern, instead of the default behavior, which is
to return true if any <i>substring</i> of the input fits the pattern.
</p>
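
<p>
A small sketch of why the anchors matter: without them, the composite branch happily matches a substring of a prime-length input. Here the input is seven <code>1</code>&rsquo;s, and the variable name is just for illustration.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

my $seven = 1 x 7;

# Without anchors, the pattern may match a substring, e.g. the first six 1's (111 + 111).
print "unanchored: ", ($seven =~ m/(11+) \1+/xms       ? "match" : "no match"), "\n";
# With \A and \z, the whole string must fit, and no divisor splits 7 evenly.
print "anchored:   ", ($seven =~ m/\A (11+) \1+ \z/xms ? "match" : "no match"), "\n";
# prints:
# unanchored: match
# anchored:   no match
</pre>
</div>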

<p>
<code>1?</code> is straightforward; it matches exactly zero or one occurrences of <code>1</code>.
</p>

<p>
<code>(11+)</code> matches two or more occurrences of <code>1</code>. Because of the parentheses,
it also <i>captures</i> the matching substring, which can then be referred to
later in the pattern using the <code>\1</code> syntax (1 because it&rsquo;s the first
capturing group).
</p>

<p>
<code>\1+</code> then matches one or more occurrences of whatever was matched by
<code>(11+)</code>.
</p>
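
<p>
The key point about a backreference is that it matches the same <i>text</i> the group captured, not merely the same subpattern. The toy strings below are made up to show the difference; the last line shows the captured text from our pattern on an input of six <code>1</code>&rsquo;s.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

# \1 must repeat the exact text captured by group 1.
print "abab: ", ('abab' =~ m/\A (\w\w) \1 \z/xms ? "match" : "no match"), "\n";
print "abcd: ", ('abcd' =~ m/\A (\w\w) \1 \z/xms ? "match" : "no match"), "\n";

# After a successful match, the captured text is also available as $1.
if ((1 x 6) =~ m/\A (11+) \1+ \z/xms) {
  print "captured: $1\n";
}
# prints:
# abab: match
# abcd: no match
# captured: 111
</pre>
</div>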
</div>
</div>
<div id="outline-container-org07db3cd" class="outline-3">
<h3 id="intuition"><a id="org07db3cd"></a><span class="section-number-3">2.2</span> Intuition</h3>
<div class="outline-text-3" id="text-intuition">
<p>
The left-hand side of the alternation bar <code>|</code> is fairly simple. It takes care
of the edge cases of 0 and 1, which are neither prime nor composite.
</p>

<p>
<code>(11+)</code> acts as a trial divisor. It matches a substring of two or more <code>1</code>&rsquo;s,
capturing however many it matched into group 1. The length of that substring
is divided into the rest of the string.
</p>

<p>
<code>\1+</code> tries to match a substring whose length is some whole number multiple
of the trial divisor. If it reaches exactly the end of the string (of length
<code>$number</code>), then <code>$number</code> must be divisible by the trial divisor.
</p>

<p>
In total, the right-hand side of the alternation bar <code>|</code> ends up matching two
or more occurrences of the trial divisor string. Therefore, if it matches the
whole string, then <code>$number</code> must be composite.
</p>
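
<p>
To make the correspondence with trial division concrete, here is a rough sketch of the same logic as an ordinary loop, checked against the simplified regexp. The names <code>is_prime_loop</code> and <code>is_prime_regexp</code> are invented for this comparison.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

# Plain trial division, mirroring the structure of the regexp (illustrative only).
sub is_prime_loop {
  my ($number) = @_;
  return 0 if $number &lt; 2;                 # the 1? branch: 0 and 1 are not prime
  for my $divisor (2 .. $number - 1) {     # candidate lengths for (11+)
    return 0 if $number % $divisor == 0;   # \1+ can reach \z only if $divisor divides $number
  }
  return 1;
}

sub is_prime_regexp {
  my ($number) = @_;
  return (1 x $number) !~ m/\A (?: 1? | (11+) \1+ ) \z/xms;
}

# Compare the two as booleans over a small range.
my @disagree = grep { !!is_prime_loop($_) ne !!is_prime_regexp($_) } 0..100;
print @disagree ? "disagree on: @disagree\n" : "both agree on 0..100\n";
# prints: both agree on 0..100
</pre>
</div>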
</div>
</div>
<div id="outline-container-org8c70b7c" class="outline-3">
<h3 id="examples"><a id="org8c70b7c"></a><span class="section-number-3">2.3</span> Examples</h3>
<div class="outline-text-3" id="text-examples">
<p>
Let&rsquo;s see how the regexp would work on a couple of example inputs.
</p>
</div>
<div id="outline-container-org6f68cbd" class="outline-4">
<h4 id="example-1"><a id="org6f68cbd"></a><span class="section-number-4">2.3.1</span> n = 1</h4>
<div class="outline-text-4" id="text-example-1">
<div class="org-src-container">
<pre class="src src-perl"><span class="org-string">'1'</span> <span class="org-negation-char">!</span>~ <span class="org-cperl-nonoverridable">m</span><span class="org-constant">/</span><span class="org-builtin">\A</span><span class="org-string"> </span><span class="org-builtin">(?:</span><span class="org-string"> 1</span><span class="org-builtin">?</span><span class="org-string"> </span><span class="org-keyword">|</span><span class="org-string"> </span><span class="org-keyword">(</span><span class="org-string">11</span><span class="org-builtin">+</span><span class="org-keyword">)</span><span class="org-string"> </span><span class="org-builtin">\</span><span class="org-type">1</span><span class="org-builtin">+</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">\z</span><span class="org-constant">/</span><span class="org-cperl-nonoverridable">xms</span>;
</pre>
</div>

<p>
The string <code>'1'</code>, as well as the empty string, matches the left side of the
alternation bar, <code>1?</code>. This is the special case (neither prime nor
composite), so the match succeeds and <code>!~</code> returns a false value.
</p>
</div>
</div>
<div id="outline-container-org4bb8f53" class="outline-4">
<h4 id="example-9"><a id="org4bb8f53"></a><span class="section-number-4">2.3.2</span> n = 9 (composite)</h4>
<div class="outline-text-4" id="text-example-9">
<div class="org-src-container">
<pre class="src src-perl"><span class="org-string">'111111111'</span> <span class="org-negation-char">!</span>~ <span class="org-cperl-nonoverridable">m</span><span class="org-constant">/</span><span class="org-builtin">\A</span><span class="org-string"> </span><span class="org-builtin">(?:</span><span class="org-string"> 1</span><span class="org-builtin">?</span><span class="org-string"> </span><span class="org-keyword">|</span><span class="org-string"> </span><span class="org-keyword">(</span><span class="org-string">11</span><span class="org-builtin">+</span><span class="org-keyword">)</span><span class="org-string"> </span><span class="org-builtin">\</span><span class="org-type">1</span><span class="org-builtin">+</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">\z</span><span class="org-constant">/</span><span class="org-cperl-nonoverridable">xms</span>;
</pre>
</div>

<p>
We know the left-hand side of the alternation bar won&rsquo;t match, so let&rsquo;s
focus on the right-hand side: <code>(11+)\1+</code>.
</p>

<p>
For the capture group, <code>(11+)</code>, the regexp engine will try every prefix of
two or more <code>1</code>&rsquo;s: the first two, the first three, the first four, etc. (The
order in which it tries them doesn&rsquo;t change the result, so we&rsquo;ll go from
shortest to longest here.) For each prefix of the string matched by
<code>(11+)</code>, the engine will attempt to match the remainder of the string
against <code>\1+</code>.
</p>

<p>
When the engine attempts to match the pattern with <code>(11+)</code> corresponding to
<code>11</code>, it will then try to match the rest of the string with <code>(11)+</code>, i.e.,
some positive even number of <code>1</code>&rsquo;s. It will not be able to do so; after
consuming the first 8 characters of the input, the regexp engine will not be
able to match another occurrence of <code>(11)</code>. Since it isn&rsquo;t at the end
of the string yet, the <code>\z</code> assertion fails and the match must backtrack.
</p>

<p>
When the engine attempts to match the pattern with <code>(11+)</code> corresponding to
<code>111</code>, it will try to match the remaining 6 characters against <code>(111)+</code>, and
will succeed because 3 divides 6 evenly. Therefore, the regular expression
matches and the <code>!~</code> operator returns a false value, telling us that the
number 9 is composite.
</p>
</div>
</div>
<div id="outline-container-orgcefe0f7" class="outline-4">
<h4 id="example-7"><a id="orgcefe0f7"></a><span class="section-number-4">2.3.3</span> n = 7 (prime)</h4>
<div class="outline-text-4" id="text-example-7">
<div class="org-src-container">
<pre class="src src-perl"><span class="org-string">'1111111'</span> <span class="org-negation-char">!</span>~ <span class="org-cperl-nonoverridable">m</span><span class="org-constant">/</span><span class="org-builtin">\A</span><span class="org-string"> </span><span class="org-builtin">(?:</span><span class="org-string"> 1</span><span class="org-builtin">?</span><span class="org-string"> </span><span class="org-keyword">|</span><span class="org-string"> </span><span class="org-keyword">(</span><span class="org-string">11</span><span class="org-builtin">+</span><span class="org-keyword">)</span><span class="org-string"> </span><span class="org-builtin">\</span><span class="org-type">1</span><span class="org-builtin">+</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">\z</span><span class="org-constant">/</span><span class="org-cperl-nonoverridable">xms</span>;
</pre>
</div>

<p>
Again, we&rsquo;ll focus on the right-hand side of the alternation bar.
</p>

<p>
<code>(11+)</code> matches prefixes of length 2 or greater. For each of these
&ldquo;divisors&rdquo;, the engine will attempt to match one or more additional
occurrences of the prefix until the end of the string. Since none of (2, 3,
&#x2026;, 6) divide evenly into 7, they cannot match the rest of the pattern
successfully.
</p>

<p>
How about the prefix of length 7, i.e., the entire string being matched by
<code>(11+)</code>? Well, since <code>\1+</code> requires <b>at least one</b> more occurrence of the
captured group to appear before the end of the string, the engine will
reject the input even though 7 divides itself evenly. With no way left to
match, the <code>!~</code> operator returns a true value, telling us that 7 is prime.
</p>
</div>
</div>
</div>
<div id="outline-container-orgf37a7d5" class="outline-3">
<h3 id="performance-enhancements"><a id="orgf37a7d5"></a><span class="section-number-3">2.4</span> Performance enhancements</h3>
<div class="outline-text-3" id="text-performance-enhancements">
<p>
Let&rsquo;s return to the original version of the code:
</p>

<div class="org-src-container">
<pre class="src src-perl"><span class="org-keyword">sub</span> <span class="org-function-name">is_prime</span> <span class="org-rainbow-delimiters-depth-1">{</span>
  <span class="org-keyword">my</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-variable-name">$number</span><span class="org-rainbow-delimiters-depth-2">)</span> = <span class="org-cperl-array">@_</span>;
  <span class="org-keyword">return</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-highlight-numbers-number">1</span> <span class="org-type">x</span> $number<span class="org-rainbow-delimiters-depth-2">)</span> <span class="org-negation-char">!</span>~ <span class="org-cperl-nonoverridable">m</span><span class="org-constant">/</span><span class="org-builtin">\A</span><span class="org-string"> </span><span class="org-builtin">(?:</span><span class="org-string"> 1</span><span class="org-builtin">?</span><span class="org-string"> </span><span class="org-keyword">|</span><span class="org-string"> </span><span class="org-keyword">(</span><span class="org-string">11</span><span class="org-builtin">+?</span><span class="org-keyword">)</span><span class="org-string"> </span><span class="org-builtin">(?&gt;</span><span class="org-string"> </span><span class="org-builtin">\</span><span class="org-type">1</span><span class="org-builtin">+</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">)</span><span class="org-string"> </span><span class="org-builtin">\z</span><span class="org-constant">/</span><span class="org-cperl-nonoverridable">xms</span>;
<span class="org-rainbow-delimiters-depth-1">}</span>

<span class="org-cperl-nonoverridable">print</span> <span class="org-type">join</span> <span class="org-string">' '</span>, <span class="org-cperl-nonoverridable">grep</span> <span class="org-rainbow-delimiters-depth-1">{</span> is_prime<span class="org-rainbow-delimiters-depth-2">(</span>$_<span class="org-rainbow-delimiters-depth-2">)</span> <span class="org-rainbow-delimiters-depth-1">}</span> <span class="org-highlight-numbers-number">0</span>..<span class="org-highlight-numbers-number">100</span>;
</pre>
</div>

<p>
In <code>(11+?)</code>, the <code>?</code> modifies the <code>+</code> quantifier, causing it to become
&ldquo;non-greedy&rdquo;. This means the regexp engine will try to match as few
occurrences of the preceding pattern as possible. With this modifier, the
engine will try using a divisor of two, then three, then four, and so on
upwards, rather than starting from <code>$number</code> and working down to two (which is
the default behavior for <code>+</code>). Since a composite number is much more likely to be
divisible by 2 than by some large divisor, it is much more cost-effective to
start small and work upwards.
</p>
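
<p>
One way to observe the difference is to look at which divisor ends up in <code>$1</code>. The sketch below uses an input of twelve <code>1</code>&rsquo;s and leaves out the atomic group so that only the quantifier changes.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

my $twelve = 1 x 12;

# Greedy: the largest workable divisor is found first.
print "greedy: ", length($1), "\n" if $twelve =~ m/\A (11+)  \1+ \z/xms;
# Non-greedy: the smallest divisor is found first.
print "lazy:   ", length($1), "\n" if $twelve =~ m/\A (11+?) \1+ \z/xms;
# prints:
# greedy: 6
# lazy:   2
</pre>
</div>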

<p>
<code>(?&gt; )</code> prevents the subpattern contained within from backtracking. For
example, in <code>(?&gt; \1+ )</code>, the regexp engine will match as many occurrences of
<code>\1</code> as possible. If it fails to match the remainder of the pattern, it won&rsquo;t
retry the match with fewer occurrences, since that would be futile&#x2013;it
definitely won&rsquo;t reach the end of the string (<code>\z</code>) after matching <i>fewer</i>
characters than before!
</p>
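
<p>
The effect of <code>(?&gt; )</code> is easier to see on a toy pattern where backtracking actually matters. The strings and patterns below are made up for illustration and have nothing to do with primes.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

# Ordinary quantifier: a+ gives back one 'a' so that the literal "ab" can match.
print "a+ab:     ", ('aaab' =~ m/a+ab/        ? "match" : "no match"), "\n";
# Atomic group: once (?&gt;a+) has taken every 'a', it never gives any back.
print "(?&gt;a+)ab: ", ('aaab' =~ m/(?&gt;a+)ab/ ? "match" : "no match"), "\n";
# prints:
# a+ab:     match
# (?&gt;a+)ab: no match
</pre>
</div>

<p>
In the primality regexp, giving characters back could never help, so making <code>\1+</code> atomic changes only how much work the engine does, not the result.
</p>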

<p>
With these performance enhancements in place, the regexp-based primality test
behaves like a sensible implementation of a naïve trial-division algorithm,
though it is still no match for a more sophisticated primality testing
algorithm.
</p>
</div>
</div>
</div>
<div id="outline-container-org36f4a2a" class="outline-2">
<h2 id="closing-remarks"><a id="org36f4a2a"></a><span class="section-number-2">3</span> Closing remarks</h2>
<div class="outline-text-2" id="text-closing-remarks">
<p>
Regular expressions are more powerful than you think. In fact, they&rsquo;re
strictly more powerful than what mathematicians formally define as &ldquo;regular
expressions&rdquo;, since they include features like backreferences (<code>\1</code>), which
can describe languages that no formal regular expression can.
</p>
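
<p>
As a small illustration, &ldquo;some text followed by an exact copy of itself&rdquo; is a textbook example of a non-regular language, yet one backreference handles it. The sample strings are arbitrary.
</p>

<div class="org-src-container">
<pre class="src src-perl">use strict;
use warnings;

# Accept only strings of the form "w followed by w again".
for my $s (qw(abcabc abcabd)) {
  print "$s: ", ($s =~ m/\A (.+) \1 \z/xms ? "match" : "no match"), "\n";
}
# prints:
# abcabc: match
# abcabd: no match
</pre>
</div>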

<p>
That being said, never use regular expressions to test primality.
</p>
</div>
</div>
<div id="footnotes">
<h2 class="footnotes">Footnotes: </h2>
<div id="text-footnotes">

<div class="footdef"><sup><a id="fn.1" class="footnum" href="#fnr.1">1</a></sup> <div class="footpara"><p class="footpara">
This material is provided for entertainment purposes only. Never ever do
this, except to show off. Please.
</p></div></div>

<div class="footdef"><sup><a id="fn.2" class="footnum" href="#fnr.2">2</a></sup> <div class="footpara"><p class="footpara">
Damian Conway, <i>Perl Best Practices</i>. Page 235.
</p></div></div>


</div>
</div>]]></content><author><name>Aaron L. Zeng</name></author><category term="uncategorized" /><category term="perl" /><summary type="html"><![CDATA[If you&rsquo;ve worked extensively with strings, you&rsquo;ve probably encountered regular expressions (abbreviated &ldquo;regex&rdquo; or &ldquo;regexp&rdquo;, depending on your political party and the current moon phase&#x2014;I&rsquo;ll use regexp below).]]></summary></entry><entry><title type="html">Emacs Plugins in OCaml: Putting It All Together (part 4)</title><link href="https://blag.bcc32.com/ecaml-getting-started/2017/11/19/emacs-plugins-in-ocaml-4/" rel="alternate" type="text/html" title="Emacs Plugins in OCaml: Putting It All Together (part 4)" /><published>2017-11-19T16:00:00-05:00</published><updated>2017-11-19T16:00:00-05:00</updated><id>https://blag.bcc32.com/ecaml-getting-started/2017/11/19/emacs-plugins-in-ocaml-4</id><content type="html" xml:base="https://blag.bcc32.com/ecaml-getting-started/2017/11/19/emacs-plugins-in-ocaml-4/"><![CDATA[<p class="alert alert-info">
This post is part 4 of a
<a class="alert-link" href="/categories/ecaml-getting-started/">series</a>
(<a class="alert-link" href="/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-3/">prev</a>).
The full code is available on
<a class="alert-link" href="https://github.com/bcc32/ecaml-bf">GitHub</a>.
</p>

<p>
In previous parts, we:
</p>

<ol class="org-ol">
<li>figured out how to compile OCaml code into an Emacs plugin</li>
<li>wrote a brainfuck interpreter library</li>
<li>explored features of the Ecaml library</li>
</ol>

<p>
Now, it&rsquo;s time to put everything together. Our final plugin will give us a
way to execute brainfuck code in a buffer without leaving Emacs, with the
type safety and performance of OCaml.
</p>

<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#setup">1. Setup</a></li>
<li><a href="#plugin">2. Plugin</a>
<ul>
<li><a href="#basic-function">2.1. Basic function</a></li>
<li><a href="#interacting-with-emacs">2.2. Interacting with Emacs</a></li>
<li><a href="#major-mode-and-key-bindings">2.3. Major mode and key bindings</a></li>
<li><a href="#automatically-starting">2.4. Automatically starting <code>bf-mode</code></a></li>
<li><a href="#finishing-up">2.5. Finishing up</a></li>
</ul>
</li>
<li><a href="#demo">3. Demo</a></li>
<li><a href="#orgf7f213d">4. Hooray!!!</a></li>
</ul>
</div>
</div>
<div id="outline-container-orgcf956b4" class="outline-2">
<h2 id="setup"><a id="orgcf956b4"></a><span class="section-number-2">1</span> Setup</h2>
<div class="outline-text-2" id="text-setup">
<p>
First, we need to modify our <code>jbuild</code> file from <a href="/ecaml-getting-started/2017/11/05/emacs-plugins-in-ocaml-1/#compiling-our-plugin">part 1</a>.
</p>

<div class="org-src-container">
<pre class="src src-tuareg-jbuild"><span class="linenr"> 1: </span>(jbuild_version 1)
<span class="linenr"> 2: </span>
<span class="linenr"> 3: </span>(executables
<span class="linenr"> 4: </span> ((names     (plugin))
<span class="linenr"> 5: </span>  (libraries (bf_lib ecaml))
<span class="linenr"> 6: </span>  (preprocess (pps (ppx_here)))))
<span class="linenr"> 7: </span>
<span class="linenr"> 8: </span>(rule (copy plugin.exe ecaml-bf.so))
<span class="linenr"> 9: </span>
<span class="linenr">10: </span>(alias
<span class="linenr">11: </span> ((name plugin)
<span class="linenr">12: </span>  (deps (ecaml-bf.so))))
<span class="linenr">13: </span>
<span class="linenr">14: </span>(alias
<span class="linenr">15: </span> ((name runtest)
<span class="linenr">16: </span>  (deps ((alias plugin)))
<span class="linenr">17: </span>  (action (run emacs -Q -L . --batch --eval "(require 'ecaml-bf)"))))
</pre>
</div>

<p>
Only the <code>executables</code> rule was modified, adding <code>bf_lib</code> (our interpreter
library) to the list of required libraries and adding a <code>preprocess</code> rule for
<code>ppx_here</code>, which provides the <code>[%here]</code> syntax we&rsquo;ll be using with <code>defun</code>.
</p>
</div>
</div>
<div id="outline-container-orgb9c3913" class="outline-2">
<h2 id="plugin"><a id="orgb9c3913"></a><span class="section-number-2">2</span> Plugin</h2>
<div class="outline-text-2" id="text-plugin">
</div>
<div id="outline-container-org2747fe7" class="outline-3">
<h3 id="basic-function"><a id="org2747fe7"></a><span class="section-number-3">2.1</span> Basic function</h3>
<div class="outline-text-3" id="text-basic-function">
<p>
To begin with, we&rsquo;re just going to allow Emacs to evaluate brainfuck code
through our interpreter, non-interactively (i.e., from Lisp code). We&rsquo;ll
define a function named <code>bf-eval</code> that just wraps our library code:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr"> 1: </span><span class="org-tuareg-font-lock-governing">open </span><span class="org-tuareg-font-lock-module">Ecaml</span>
<span class="linenr"> 2: </span>
<span class="linenr"> 3: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">eval_program</span><span class="org-variable-name"> program input</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr"> 4: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">program</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Value.</span>to_utf8_bytes_exn program <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 5: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">input</span>   <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Value.</span>to_utf8_bytes_exn input   <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 6: </span>  <span class="org-tuareg-font-lock-module">Bf_lib.Program.</span>run' <span class="org-tuareg-font-lock-operator">~</span>program <span class="org-tuareg-font-lock-operator">~</span>input
<span class="linenr"> 7: </span>  <span class="org-tuareg-font-lock-operator">|&gt;</span> <span class="org-tuareg-font-lock-module">Value.</span>of_utf8_bytes
<span class="linenr"> 8: </span><span class="org-tuareg-font-double-colon">;;</span>
<span class="linenr"> 9: </span>
<span class="linenr">10: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">11: </span>  defun <span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">[</span></span><span class="org-tuareg-font-lock-extension-node">%here</span><span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">]</span></span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"bf-eval"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">12: </span>    <span class="org-tuareg-font-lock-label">~docstring</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">"evaluate [program], a brainfuck program, given [input]"</span>
<span class="linenr">13: </span>    <span class="org-tuareg-font-lock-label">~args</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[</span></span> <span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"program"</span>
<span class="linenr">14: </span>          <span class="org-tuareg-font-lock-operator">;</span> <span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"input"</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">]</span></span>
<span class="linenr">15: </span>    <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-keyword">function</span>
<span class="linenr">16: </span>      <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">[</span></span><span class="org-tuareg-font-lock-operator">|</span> program<span class="org-tuareg-font-lock-operator">;</span> input <span class="org-tuareg-font-lock-operator">|</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">]</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> eval_program program input
<span class="linenr">17: </span>      <span class="org-tuareg-font-lock-operator">|</span> _ <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-builtin">invalid_arg</span> <span class="org-string">"wrong arity"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">18: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
I split up the definition into two parts. <code>eval_program</code> wraps
<code>Bf_lib.Program.run'</code> and does the work of converting Emacs values (in this
case, strings) into OCaml values and vice versa.
</p>

<p>
The second part calls <code>defun</code> to register the function with Emacs, providing
a docstring and argument names and checking the number of arguments passed.
</p>

<p>
We can try running it:
</p>

<div class="org-src-container">
<pre class="src src-sh">jbuilder build @plugin
emacs -Q -L _build/default/src --batch --eval <span class="org-string">"(require 'ecaml-bf)"</span> <span class="org-sh-escaped-newline">\</span>
      --eval <span class="org-string">'(print (bf-eval ",+." "A"))'</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">B</span>
</pre>
</div>

<p>
The program <code>,+.</code> simply inputs a byte, increments it, and prints the result,
so it transforms <code>"A"</code> into <code>"B"</code>.
</p>

<p>
If we loaded the plugin into a normal (non-batch mode) Emacs, we wouldn&rsquo;t be
able to call the function as an <kbd>M-x</kbd> command, since we
haven&rsquo;t marked it as an interactive command yet.
</p>
</div>
</div>
<div id="outline-container-org17bcafd" class="outline-3">
<h3 id="interacting-with-emacs"><a id="org17bcafd"></a><span class="section-number-3">2.2</span> Interacting with Emacs</h3>
<div class="outline-text-3" id="text-interacting-with-emacs">
<p>
Next, let&rsquo;s write a function that evaluates the brainfuck code in the current
buffer. <code>bf-eval-buffer</code> will be an interactive command, so it can be called
via <kbd>M-x bf-eval-buffer</kbd> and will prompt the user for a
string to serve as the input to the program.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">20: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">eval_current_buffer</span><span class="org-variable-name"> input</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">21: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">program</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">22: </span>    <span class="org-tuareg-font-lock-module">Current_buffer.</span>contents <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span>
<span class="linenr">23: </span>    <span class="org-tuareg-font-lock-operator">|&gt;</span> <span class="org-tuareg-font-lock-module">Text.</span>to_utf8_bytes
<span class="linenr">24: </span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">25: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">input</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Value.</span>to_utf8_bytes_exn input <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">26: </span>  <span class="org-tuareg-font-lock-module">Bf_lib.Program.</span>run' <span class="org-tuareg-font-lock-operator">~</span>program <span class="org-tuareg-font-lock-operator">~</span>input
<span class="linenr">27: </span><span class="org-tuareg-font-double-colon">;;</span>
<span class="linenr">28: </span>
<span class="linenr">29: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">30: </span>  defun <span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">[</span></span><span class="org-tuareg-font-lock-extension-node">%here</span><span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">]</span></span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"bf-eval-buffer"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">31: </span>    <span class="org-tuareg-font-lock-label">~docstring</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">"evaluate the current buffer as brainfuck code"</span>
<span class="linenr">32: </span>    <span class="org-tuareg-font-lock-label">~interactive</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">"sInput: "</span>
<span class="linenr">33: </span>    <span class="org-tuareg-font-lock-label">~args</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[</span></span> <span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"input"</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">]</span></span>
<span class="linenr">34: </span>    <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-keyword">function</span>
<span class="linenr">35: </span>      <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">[</span></span><span class="org-tuareg-font-lock-operator">|</span> input <span class="org-tuareg-font-lock-operator">|</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">]</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">36: </span>        <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">output</span> <span class="org-tuareg-font-lock-operator">=</span> eval_current_buffer input <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">37: </span>        messagef <span class="org-string">"Output: %s"</span> output<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">38: </span>        <span class="org-tuareg-font-lock-module">Value.</span>nil
<span class="linenr">39: </span>      <span class="org-tuareg-font-lock-operator">|</span> _ <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-builtin">invalid_arg</span> <span class="org-string">"wrong arity"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">40: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
The argument to <code>interactive</code>, <code>"sInput: "</code>, causes the command to prompt the
user for a string as its first argument (the minibuffer will contain the
prompt <code>Input:</code>).
</p>
</div>
</div>
<div id="outline-container-orge1a5b55" class="outline-3">
<h3 id="major-mode-and-key-bindings"><a id="orge1a5b55"></a><span class="section-number-3">2.3</span> Major mode and key bindings</h3>
<div class="outline-text-3" id="text-major-mode-and-key-bindings">
<p>
Next, we&rsquo;ll define a major mode for this new feature, as well as a key
binding so that users can easily access the command without using
<kbd>M-x</kbd>.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">42: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">mode_name</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"bf-mode"</span>
<span class="linenr">43: </span>
<span class="linenr">44: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">45: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">define [bf-mode] as a major mode </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">46: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">mode</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">47: </span>    <span class="org-tuareg-font-lock-module">Major_mode.</span>define_derived_mode <span class="org-tuareg-font-lock-label">~parent</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Major_mode.</span>fundamental
<span class="linenr">48: </span>      <span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">[</span></span><span class="org-tuareg-font-lock-extension-node">%here</span><span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">]</span></span>
<span class="linenr">49: </span>      <span class="org-tuareg-font-lock-label">~change_command</span><span class="org-tuareg-font-lock-operator">:</span>mode_name
<span class="linenr">50: </span>      <span class="org-tuareg-font-lock-label">~docstring</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">"Major mode for interacting with brainfuck code."</span>
<span class="linenr">51: </span>      <span class="org-tuareg-font-lock-label">~initialize</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-builtin">ignore</span>
<span class="linenr">52: </span>      <span class="org-tuareg-font-lock-label">~mode_line</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">"brainfuck"</span>
<span class="linenr">53: </span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">54: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">bind [C-x C-e] to [bf-eval-buffer] </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">55: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">keymap</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Major_mode.</span>keymap mode <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">56: </span>  <span class="org-tuareg-font-lock-module">Keymap.</span>define_key keymap <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Key_sequence.</span>create_exn <span class="org-string">"C-x C-e"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">57: </span>    <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-constructor">Command</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-tuareg-font-lock-module">Command.</span>of_value_exn <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-3">(</span></span><span class="org-tuareg-font-lock-module">Value.</span>intern <span class="org-string">"bf-eval-buffer"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-3">)</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">58: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
Our major mode doesn&rsquo;t need anything to be initialized, so we simply pass
<code>ignore</code> to that argument. <code>change_command</code> determines the name of the mode,
i.e., the name of the command that sets the major mode, in this case,
<code>bf-mode</code>. <code>docstring</code> provides the documentation that is displayed by
<code>describe-mode</code> (<kbd>C-h m</kbd>), while <code>mode_line</code> sets the
display name of the mode that appears, unsurprisingly, in the mode line.
</p>

<p>
Additionally, we bind the sequence <kbd>C-x C-e</kbd> to
<code>bf-eval-buffer</code> in the keymap of our major mode, so that when the mode is
activated, pressing that key sequence will trigger the command
<code>bf-eval-buffer</code>.
</p>
</div>
</div>
<div id="outline-container-org91f97ba" class="outline-3">
<h3 id="automatically-starting"><a id="org91f97ba"></a><span class="section-number-3">2.4</span> Automatically starting <code>bf-mode</code></h3>
<div class="outline-text-3" id="text-automatically-starting">
<p>
Ecaml&rsquo;s <code>Auto_mode_alist</code> module provides a clean interface for manipulating
the <code>auto-mode-alist</code> variable, which determines how Emacs decides which
major mode to use when you open a file, like starting <code>c-mode</code> when you open
a file named <code>foo.c</code>. We&rsquo;ll register <code>bf-mode</code> for <code>*.b</code> files:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">60: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">61: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">automatically start [bf-mode] upon opening a [*.b] file </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">62: </span>  <span class="org-tuareg-font-lock-module">Auto_mode_alist.</span>add <span class="org-tuareg-font-lock-module">Auto_mode_alist.Entry.</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>
<span class="linenr">63: </span>    <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">[</span></span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-3">{</span></span> delete_suffix_and_recur <span class="org-tuareg-font-lock-operator">=</span> <span class="org-constant">false</span>
<span class="linenr">64: </span>      <span class="org-tuareg-font-lock-operator">;</span> filename_match <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Regexp.</span>of_pattern <span class="org-string">"\\.b\\'"</span>
<span class="linenr">65: </span>      <span class="org-tuareg-font-lock-operator">;</span> function_ <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-constructor">Some</span> mode_name <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-3">}</span></span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">]</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">66: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
The <code>delete_suffix_and_recur</code> field allows an entry to defer to another mode
when the file may have more than one extension.<sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup> <code>filename_match</code>
specifies the regular expression that determines which filenames activate the
major mode. Finally, <code>function_</code> specifies what mode to activate when the
file name matches <code>filename_match</code><sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup>.
</p>
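
<p>
As a rough illustration of which filenames that pattern selects, here is a
plain-OCaml sketch that uses the standard library&rsquo;s <code>Filename.check_suffix</code>
instead of Emacs regexps. The helper name <code>would_use_bf_mode</code> is made up for
this example, and it only approximates what Emacs does with the entry above.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* Hypothetical helper, not part of Ecaml: true when the name ends in ".b",
   which is roughly what the auto-mode-alist entry above selects. *)
let would_use_bf_mode filename = Filename.check_suffix filename ".b"

let () =
  assert (would_use_bf_mode "hello.b");
  (* A longer extension that merely starts with "b" is not selected. *)
  assert (not (would_use_bf_mode "hello.bf"));
  (* Neither is a name with a further suffix after ".b". *)
  assert (not (would_use_bf_mode "hello.b.gz"))
</pre>
</div>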
</div>
</div>
<div id="outline-container-org40f5e92" class="outline-3">
<h3 id="finishing-up"><a id="org40f5e92"></a><span class="section-number-3">2.5</span> Finishing up</h3>
<div class="outline-text-3" id="text-finishing-up">
<p>
Last but not least, we declare to Emacs that our plugin is done initializing
and provides the feature <code>ecaml-bf</code> that it so kindly requested:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">68: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span> provide <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"ecaml-bf"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
</pre>
</div>
</div>
</div>
</div>
<div id="outline-container-orgea4aec1" class="outline-2">
<h2 id="demo"><a id="orgea4aec1"></a><span class="section-number-2">3</span> Demo</h2>
<div class="outline-text-2" id="text-demo">
<p>
(You may need to open the video in full screen in order to read anything,
sorry!)
</p>

<video controls>
    <source type="video/mp4"       src="/assets/videos/ecaml-bf-demo.mp4">
    <source type="video/webm"      src="/assets/videos/ecaml-bf-demo.webm">
    <source type="video/quicktime" src="/assets/videos/ecaml-bf-demo.mov">
</video>
</div>
</div>
<div id="outline-container-orgf7f213d" class="outline-2">
<h2 id="orgf7f213d"><span class="section-number-2">4</span> Hooray!!!</h2>
<div class="outline-text-2" id="text-4">
<div class="org-src-container">
<pre class="src src-sh">jbuilder build @plugin
emacs -Q -L _build/default/src --eval <span class="org-string">"(require 'ecaml-bf)"</span>
</pre>
</div>

<p>
Try it out for yourself!
</p>

<p>
All of the code&rsquo;s on <a href="https://github.com/bcc32/ecaml-bf">GitHub</a>. Bug reports and patches welcome.
</p>
</div>
</div>
<div id="footnotes">
<h2 class="footnotes">Footnotes: </h2>
<div id="text-footnotes">

<div class="footdef"><sup><a id="fn.1" class="footnum" href="#fnr.1">1</a></sup> <div class="footpara"><p class="footpara">
For example, you could have an entry that is activated by opening a file
with a <code>.gz</code> suffix, such as <code>foo.c.gz</code>. It might decompress the file and then
recursively activate the correct major mode based on the remainder of the
filename.
</p></div></div>

<div class="footdef"><sup><a id="fn.2" class="footnum" href="#fnr.2">2</a></sup> <div class="footpara"><p class="footpara">
It actually doesn&rsquo;t have to be a mode change command; it can just be any
function.
</p></div></div>


</div>
</div>]]></content><author><name>Aaron L. Zeng</name></author><category term="ecaml-getting-started" /><category term="ecaml" /><category term="emacs" /><category term="ocaml" /><summary type="html"><![CDATA[This post is part 4 of a series (prev). The full code is available on GitHub.]]></summary></entry><entry><title type="html">Emacs Plugins in OCaml: Brain-what?! (part 2)</title><link href="https://blag.bcc32.com/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-2/" rel="alternate" type="text/html" title="Emacs Plugins in OCaml: Brain-what?! (part 2)" /><published>2017-11-12T16:00:00-05:00</published><updated>2017-11-12T16:00:00-05:00</updated><id>https://blag.bcc32.com/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-2</id><content type="html" xml:base="https://blag.bcc32.com/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-2/"><![CDATA[<p class="alert alert-info">
This post is part 2 of a
<a class="alert-link" href="/categories/ecaml-getting-started/">series</a>
(<a class="alert-link" href="/ecaml-getting-started/2017/11/05/emacs-plugins-in-ocaml-1/">prev</a>/<a
class="alert-link" href="/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-3/">next</a>).
The full code is available on
<a class="alert-link" href="https://github.com/bcc32/ecaml-bf">GitHub</a>.
</p>

<p>
Let&rsquo;s write the core of our plugin, a <a href="https://en.wikipedia.org/wiki/Brainfuck">brainfuck</a> interpreter. We&rsquo;ll make it a
library so it can be easily compiled into our plugin later, but this step
actually doesn&rsquo;t involve Emacs or Ecaml at all.
</p>

<p>
If you&rsquo;re not super interested in the interpreter code and just want to get to
the Emacs/Ecaml part, skim <a href="#intfc">the interface</a> briefly and then move on to the <a href="/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-3/">next</a>
post.
</p>

<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#what-the-brainfuck">1. What the brainfuck?</a>
<ul>
<li><a href="#commands">1.1. Commands</a></li>
<li><a href="#lets-c-that-again">1.2. Let&rsquo;s C that again?</a></li>
</ul>
</li>
<li><a href="#an-interpreter">2. An interpreter</a>
<ul>
<li><a href="#intfc">2.1. The interface</a></li>
<li><a href="#impl-mem">2.2. Memory</a></li>
<li><a href="#impl-io">2.3. Input and output</a></li>
<li><a href="#impl-lexer">2.4. Lexer</a></li>
<li><a href="#impl-repr">2.5. Program representation</a></li>
<li><a href="#impl-parser">2.6. Parser</a></li>
<li><a href="#impl-runner">2.7. Runner</a></li>
<li><a href="#impl-wrapper">2.8. Wrapper</a></li>
</ul>
</li>
<li><a href="#running">3. Trial run</a></li>
<li><a href="#to-be-continued">4. To be continued</a></li>
</ul>
</div>
</div>

<div id="outline-container-org16a103a" class="outline-2">
<h2 id="what-the-brainfuck"><a id="org16a103a"></a><span class="section-number-2">1</span> What the brainfuck?</h2>
<div class="outline-text-2" id="text-what-the-brainfuck">
<blockquote>
<p>
Brainfuck is an esoteric programming language created in 1993 by Urban Müller,
and notable for its extreme minimalism.
</p>

<p>
&#x2013; Wikipedia
</p>
</blockquote>

<p>
(The rest of this section is also based on the <a href="https://en.wikipedia.org/wiki/Brainfuck">brainfuck</a> Wikipedia page.)
</p>

<p>
The language has only eight commands, each denoted by a single character. All
commands operate on a single, infinite array of zero-initialized bytes. There
is also a single data pointer that points to some position in the array, plus
an instruction pointer that keeps track of where we are in the program.
</p>
</div>

<div id="outline-container-orgae84d94" class="outline-3">
<h3 id="commands"><a id="orgae84d94"></a><span class="section-number-3">1.1</span> Commands</h3>
<div class="outline-text-3" id="text-commands">
<p>
The commands are shown in the table below. Any characters in the source file
other than the 8 commands are treated as comments and ignored.
</p>

<p>
(&ldquo;Array element&rdquo; refers to the element currently pointed to by the data
pointer.)
</p>

<table border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides">


<colgroup>
<col  class="org-left" />

<col  class="org-left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="org-left">character</th>
<th scope="col" class="org-left">meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td class="org-left"><code>&gt;</code></td>
<td class="org-left">increment the data pointer</td>
</tr>

<tr>
<td class="org-left"><code>&lt;</code></td>
<td class="org-left">decrement the data pointer</td>
</tr>

<tr>
<td class="org-left"><code>+</code></td>
<td class="org-left">increment the array element</td>
</tr>

<tr>
<td class="org-left"><code>-</code></td>
<td class="org-left">decrement the array element</td>
</tr>

<tr>
<td class="org-left"><code>.</code></td>
<td class="org-left">print out the array element (byte)</td>
</tr>

<tr>
<td class="org-left"><code>,</code></td>
<td class="org-left">read into the array element (byte)</td>
</tr>

<tr>
<td class="org-left"><code>[</code></td>
<td class="org-left">skip to <code>]</code> if byte is zero</td>
</tr>

<tr>
<td class="org-left"><code>]</code></td>
<td class="org-left">jump back to <code>[</code> if byte is nonzero</td>
</tr>
</tbody>
</table>

<p>
Reading and printing are based on bytes. For example, if the data pointer
currently points to an array element of (decimal) value 97, and the program
executes command <code>.</code>, the byte 0x61 will be printed, and it will show up in
your terminal as <i>LATIN SMALL LETTER A</i>.
</p>
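
<p>
To make the byte arithmetic concrete, here is a tiny OCaml check of that
example using only the standard library. The names <code>element</code> and
<code>rendered</code> are made up for illustration and are not part of the
interpreter.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* The array element holds decimal 97; the "." command emits it as one raw byte. *)
let element = 97
let rendered = Char.chr element

let () =
  (* Decimal 97 is hexadecimal 0x61, the byte for a lowercase "a". *)
  assert (rendered = 'a');
  assert (Printf.sprintf "0x%02X" element = "0x61");
  print_char rendered
</pre>
</div>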
</div>
</div>
<div id="outline-container-org5a9e177" class="outline-3">
<h3 id="lets-c-that-again"><a id="org5a9e177"></a><span class="section-number-3">1.2</span> Let&rsquo;s C that again?</h3>
<div class="outline-text-3" id="text-lets-c-that-again">
<p>
If you&rsquo;re familiar with C, you may find the following equivalence between
brainfuck and C helpful.
</p>

<p>
We&rsquo;ll assume the following <code>main()</code>:
</p>

<div class="org-src-container">
<pre class="src src-c"><span class="org-type">int</span> <span class="org-function-name">main</span><span class="org-rainbow-delimiters-depth-1">()</span> <span class="org-rainbow-delimiters-depth-1">{</span>
  <span class="org-type">uint8_t</span> <span class="org-variable-name">memory</span><span class="org-rainbow-delimiters-depth-2">[</span>QUITE_BIG<span class="org-rainbow-delimiters-depth-2">]</span> = <span class="org-rainbow-delimiters-depth-2">{</span><span class="org-highlight-numbers-number">0</span><span class="org-rainbow-delimiters-depth-2">}</span>;
  <span class="org-type">uint8_t</span> *<span class="org-variable-name">ptr</span> = &amp;memory<span class="org-rainbow-delimiters-depth-2">[</span><span class="org-highlight-numbers-number">0</span><span class="org-rainbow-delimiters-depth-2">]</span>;

  <span class="org-comment-delimiter">// </span><span class="org-comment">brainfuck code goes here</span>
<span class="org-rainbow-delimiters-depth-1">}</span>
</pre>
</div>

<table border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides">


<colgroup>
<col  class="org-left" />

<col  class="org-left" />
</colgroup>
<thead>
<tr>
<th scope="col" class="org-left">brainfuck</th>
<th scope="col" class="org-left">C</th>
</tr>
</thead>
<tbody>
<tr>
<td class="org-left"><code>&gt;</code></td>
<td class="org-left"><code>++ptr;</code></td>
</tr>

<tr>
<td class="org-left"><code>&lt;</code></td>
<td class="org-left"><code>--ptr;</code></td>
</tr>

<tr>
<td class="org-left"><code>+</code></td>
<td class="org-left"><code>++*ptr;</code></td>
</tr>

<tr>
<td class="org-left"><code>-</code></td>
<td class="org-left"><code>--*ptr;</code></td>
</tr>

<tr>
<td class="org-left"><code>.</code></td>
<td class="org-left"><code>putchar(*ptr);</code></td>
</tr>

<tr>
<td class="org-left"><code>,</code></td>
<td class="org-left"><code>*ptr = getchar();</code></td>
</tr>

<tr>
<td class="org-left"><code>[</code></td>
<td class="org-left"><code>while (*ptr) {</code></td>
</tr>

<tr>
<td class="org-left"><code>]</code></td>
<td class="org-left"><code>}</code></td>
</tr>
</tbody>
</table>
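
<p>
The same mapping carries over to OCaml. As a sketch only (it is not the
interpreter built below), here is a hand translation of the small program
<code>+++[&gt;++&lt;-]</code>, which I picked as an example: it sets the first cell to 3,
then adds 2 to the second cell once per loop iteration. The 8-byte memory and
the helpers <code>get</code>, <code>set</code>, and <code>run_example</code> are assumptions made for
illustration.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* Hand translation of the brainfuck program "+++[&gt;++&lt;-]" (assumed example).
   Returns the values of the first two cells after the program finishes. *)
let run_example () =
  let memory = Bytes.make 8 '\000' in (* small stand-in for the big array *)
  let ptr = ref 0 in
  let get () = Char.code (Bytes.get memory !ptr) in
  (* Cells wrap around modulo 256, like uint8_t in the C version. *)
  let set v = Bytes.set memory !ptr (Char.chr (v land 0xff)) in
  set (get () + 1); set (get () + 1); set (get () + 1); (* +++ *)
  while not (get () = 0) do                             (* [ *)
    incr ptr;                                           (* &gt; *)
    set (get () + 1); set (get () + 1);                 (* ++ *)
    decr ptr;                                           (* &lt; *)
    set (get () - 1)                                    (* - *)
  done;                                                 (* ] *)
  (Char.code (Bytes.get memory 0), Char.code (Bytes.get memory 1))

let () = assert (run_example () = (0, 6))
</pre>
</div>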
</div>
</div>
</div>
<div id="outline-container-orgd8707a5" class="outline-2">
<h2 id="an-interpreter"><a id="orgd8707a5"></a><span class="section-number-2">2</span> An interpreter</h2>
<div class="outline-text-2" id="text-an-interpreter">
<p>
For our simple implementation, we&rsquo;ll assume that our array has a size of 10<sup>6</sup>
bytes (1MB), and that the data pointer starts on the leftmost side of the
array.
</p>

<p>
Let&rsquo;s set up our workspace&#x2026;
</p>

<div class="org-src-container">
<pre class="src src-sh">mkdir -p ~/ecaml-bf/lib/bf_lang
<span class="org-builtin">cd</span> ~/ecaml-bf/lib/bf_lang
touch bf_lang.ml<span class="org-rainbow-delimiters-depth-1">{</span>,i<span class="org-rainbow-delimiters-depth-1">}</span>
vi jbuild
</pre>
</div>

<p>
I&rsquo;m using Jane Street&rsquo;s <a href="https://github.com/janestreet/core_kernel"><code>Core_kernel</code></a> standard library overlay since I&rsquo;m
familiar with its idioms, and it&rsquo;ll already be installed since <a href="https://github.com/janestreet/ecaml">Ecaml</a> depends
on it. Here&rsquo;s our <code>jbuild</code>:
</p>

<div class="org-src-container">
<pre class="src src-tuareg-jbuild"><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">jbuild_version</span> <span class="org-highlight-numbers-number">1</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">library</span>
 <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">name</span>      bf_lib<span class="org-rainbow-delimiters-depth-3">)</span>
  <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">synopsis</span>  <span class="org-string">"An interpreter library for brainfuck."</span><span class="org-rainbow-delimiters-depth-3">)</span>
  <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">libraries</span> <span class="org-rainbow-delimiters-depth-4">(</span>core_kernel<span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span><span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>
</div>
<div id="outline-container-org7c84eb3" class="outline-3">
<h3 id="intfc"><a id="org7c84eb3"></a><span class="section-number-3">2.1</span> The interface</h3>
<div class="outline-text-3" id="text-intfc">
<p>
Here&rsquo;s what our library&rsquo;s interface will look like:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr"> 1: </span><span class="org-tuareg-font-lock-governing">open! </span><span class="org-tuareg-font-lock-module">Core_kernel</span>
<span class="linenr"> 2: </span>
<span class="linenr"> 3: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Input</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-governing">sig</span>
<span class="linenr"> 4: </span>  <span class="org-doc">(** Stateful input source. *)</span>
<span class="linenr"> 5: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span>
<span class="linenr"> 6: </span>
<span class="linenr"> 7: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">of_string</span> <span class="org-tuareg-font-lock-operator">:</span> string <span class="org-tuareg-font-lock-operator">-&gt;</span> t
<span class="linenr"> 8: </span>
<span class="linenr"> 9: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">of_chan</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">In_channel.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> t
<span class="linenr">10: </span><span class="org-tuareg-font-lock-governing">end</span>
<span class="linenr">11: </span>
<span class="linenr">12: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Output</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-governing">sig</span>
<span class="linenr">13: </span>  <span class="org-doc">(** Stateful output sink. *)</span>
<span class="linenr">14: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span>
<span class="linenr">15: </span>
<span class="linenr">16: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">of_buffer</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Buffer.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> t
<span class="linenr">17: </span>
<span class="linenr">18: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">of_chan</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Out_channel.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> t
<span class="linenr">19: </span><span class="org-tuareg-font-lock-governing">end</span>
<span class="linenr">20: </span>
<span class="linenr">21: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Memory</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-governing">sig</span>
<span class="linenr">22: </span>  <span class="org-doc">(** A big byte array. *)</span>
<span class="linenr">23: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Bigstring.</span>t
<span class="linenr">24: </span>
<span class="linenr">25: </span>  <span class="org-doc">(** [create_fresh n] creates a zero-initialized memory of size [n] bytes. *)</span>
<span class="linenr">26: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">create_fresh</span> <span class="org-tuareg-font-lock-operator">:</span> int <span class="org-tuareg-font-lock-operator">-&gt;</span> t
<span class="linenr">27: </span><span class="org-tuareg-font-lock-governing">end</span>
<span class="linenr">28: </span>
<span class="linenr">29: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Program</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-governing">sig</span>
<span class="linenr">30: </span>  <span class="org-doc">(** An immutable representation of a brainfuck program. *)</span>
<span class="linenr">31: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span>
<span class="linenr">32: </span>
<span class="linenr">33: </span>  <span class="org-doc">(** "Compile" a program from its source code. *)</span>
<span class="linenr">34: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">parse</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Input.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> t
<span class="linenr">35: </span>
<span class="linenr">36: </span>  <span class="org-doc">(** [run] mutates [memory], consumes [input], and writes to [output]. *)</span>
<span class="linenr">37: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">run</span>
<span class="linenr">38: </span>    <span class="org-tuareg-font-lock-operator">:</span>  t
<span class="linenr">39: </span>    <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">memory</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Memory.</span>t
<span class="linenr">40: </span>    <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">input</span>  <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Input.</span>t
<span class="linenr">41: </span>    <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">output</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Output.</span>t
<span class="linenr">42: </span>    <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
<span class="linenr">43: </span>
<span class="linenr">44: </span>  <span class="org-doc">(** A simple interface for [run]. *)</span>
<span class="linenr">45: </span>  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">run'</span>
<span class="linenr">46: </span>    <span class="org-tuareg-font-lock-operator">:</span>  <span class="org-tuareg-font-lock-label">program</span> <span class="org-tuareg-font-lock-operator">:</span> string
<span class="linenr">47: </span>    <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">input</span>   <span class="org-tuareg-font-lock-operator">:</span> string
<span class="linenr">48: </span>    <span class="org-tuareg-font-lock-operator">-&gt;</span> string                   <span class="org-comment-delimiter">(* </span><span class="org-comment">output </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">49: </span><span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>

<p>
<code>Input.t</code> and <code>Output.t</code> represent possible sources of input and output for
our programs (and for the source of the program text). The <code>of_*</code> functions
convert other types into things our program can read from or write to. For
example, <code>Input.of_string s</code> returns a value that keeps track of how far into
the string we&rsquo;ve read.
</p>
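
<p>
To illustrate what &ldquo;keeps track of how far into the string we&rsquo;ve read&rdquo; could
mean, here is a minimal standalone sketch of a string-backed input source. It
is an assumption about one possible shape, not the library&rsquo;s actual
implementation: the record fields and the <code>read_char</code> function are
hypothetical, and the real <code>Input.t</code> is abstract.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* Hypothetical sketch of a stateful, string-backed input source. *)
type t =
  { data : string   (* the text being read *)
  ; pos : int ref   (* index of the next unread character *)
  }

let of_string data = { data; pos = ref 0 }

(* Return the next character and advance, or None once the string is exhausted. *)
let read_char t =
  if !(t.pos) = String.length t.data
  then None
  else begin
    let c = t.data.[!(t.pos)] in
    incr t.pos;
    Some c
  end

let () =
  let input = of_string "hi" in
  assert (read_char input = Some 'h');
  assert (read_char input = Some 'i');
  assert (read_char input = None)
</pre>
</div>

<p>
Because the position lives inside the value, each call to <code>read_char</code>
picks up where the previous one stopped, which is the sense in which the input
is stateful.
</p>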

<p>
<code>Memory.t</code> represents the working memory of the programs, i.e., a very large
byte array.
</p>

<p>
Finally, <code>Program</code>, the most exciting module of our interpreter, can be used
in one of two ways:
</p>

<ol class="org-ol">
<li><code>run</code>, the lower-level interface, gives the user full control over the
input source and output sink of the program. It also allows the user to parse
the program just once and reuse the result multiple times.</li>
<li><code>run'</code>, a simplified interface, accepts the program and its input as
strings and returns the output as a string.</li>
</ol>
</div>
</div>
<div id="outline-container-orgf91352c" class="outline-3">
<h3 id="impl-mem"><a id="orgf91352c"></a><span class="section-number-3">2.2</span> Memory</h3>
<div class="outline-text-3" id="text-impl-mem">
<p>
Memory needs to be a large array of bytes. While this sounds like a job for
type <code>bytes</code>, we&rsquo;ll actually use <code>Bigstring.t</code>, which doesn&rsquo;t have the same
size limitations<sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup> as <code>bytes</code>.
</p>

<p>
<code>Bigstring.t</code> is a type alias for <code>(char, Bigarray.int8_unsigned_elt,
   Bigarray.c_layout) Bigarray.Array1.t</code> from the <code>bigarray</code> library. You can
think of it as basically being a pointer to a big contiguous area of
<code>malloc()</code>&rsquo;d memory.
</p>

<p>
We&rsquo;ll also add a new function for creating a zero-initialized memory of a
given length. <code>Bigstring.create</code> by itself just <i>allocates</i> the memory, but
doesn&rsquo;t <i>initialize</i> it, so it could have arbitrary contents.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">35: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Memory</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-governing">struct</span>
<span class="linenr">36: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Bigstring.</span>t
<span class="linenr">37: </span>
<span class="linenr">38: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">create_fresh</span><span class="org-variable-name"> n</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">39: </span>    <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">memory</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Bigstring.</span>create n <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">40: </span>    <span class="org-tuareg-font-lock-module">Bigarray.Array1.</span>fill memory <span class="org-string">'\000'</span><span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">41: </span>    memory
<span class="linenr">42: </span>  <span class="org-tuareg-font-double-colon">;;</span>
<span class="linenr">43: </span><span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>
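<p>
To see the zero-initialization in action without pulling in Core, here is a
minimal sketch that assumes the standard library&rsquo;s <code>Bigarray.Array1.create</code>
as a stand-in for <code>Bigstring.create</code> (the size of 30,000 cells is arbitrary):
</p>

```ocaml
(* Stdlib-only sketch: [Bigarray.Array1.create] stands in for
   [Bigstring.create]; it allocates the memory without initializing it. *)
let create_fresh n =
  let memory = Bigarray.Array1.create Bigarray.char Bigarray.c_layout n in
  Bigarray.Array1.fill memory '\000';
  memory

let () =
  let memory = create_fresh 30_000 in
  assert (Bigarray.Array1.dim memory = 30_000);
  (* Every cell reads back as zero after the fill. *)
  for i = 0 to Bigarray.Array1.dim memory - 1 do
    assert (Bigarray.Array1.get memory i = '\000')
  done
```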
</div>
</div>
<div id="outline-container-org24c5a2b" class="outline-3">
<h3 id="impl-io"><a id="org24c5a2b"></a><span class="section-number-3">2.3</span> Input and output</h3>
<div class="outline-text-3" id="text-impl-io">
<p>
Since we want our library to be fairly general-purpose, we&rsquo;ll read input
from and write output to a generic interface, with adapters for common
input sources and output sinks such as <code>in_channel</code> or <code>Buffer.t</code>.
</p>

<p>
Input will be a closure that returns a <code>char option</code> (it returns <code>None</code> at
the end of input). We&rsquo;ll also define an <code>iter</code> function that repeatedly calls
its argument with the next character from the input until the input is
exhausted.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr"> 3: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Input</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-governing">struct</span>
<span class="linenr"> 4: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">A [t] returns [None] at the end of input. </span><span class="org-comment-delimiter">*)</span>
<span class="linenr"> 5: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> <span class="org-tuareg-font-lock-operator">=</span> unit <span class="org-tuareg-font-lock-operator">-&gt;</span> char option
<span class="linenr"> 6: </span>
<span class="linenr"> 7: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">of_string</span><span class="org-variable-name"> str</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr"> 8: </span>    <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">i</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-builtin">ref</span> <span class="org-highlight-numbers-number">0</span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 9: </span>    <span class="org-keyword">fun</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">10: </span>      <span class="org-keyword">if</span> <span class="org-tuareg-font-lock-operator">!</span>i <span class="org-tuareg-font-lock-operator">&gt;=</span> <span class="org-tuareg-font-lock-module">String.</span>length str
<span class="linenr">11: </span>      <span class="org-keyword">then</span> <span class="org-tuareg-font-lock-constructor">None</span>
<span class="linenr">12: </span>      <span class="org-keyword">else</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>
<span class="linenr">13: </span>        <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">char</span> <span class="org-tuareg-font-lock-operator">=</span> str.<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">[</span></span> <span class="org-tuareg-font-lock-operator">!</span>i <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">]</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">14: </span>        incr i<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">15: </span>        <span class="org-tuareg-font-lock-constructor">Some</span> char<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">16: </span>  <span class="org-tuareg-font-double-colon">;;</span>
<span class="linenr">17: </span>
<span class="linenr">18: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">of_chan</span><span class="org-variable-name"> chan</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-keyword">fun</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">In_channel.</span>input_char chan
<span class="linenr">19: </span>
<span class="linenr">20: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-governing">rec</span> <span class="org-function-name">iter</span><span class="org-variable-name"> t </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">f</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">21: </span>    <span class="org-keyword">match</span> t <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-keyword">with</span>
<span class="linenr">22: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">None</span>   <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span>
<span class="linenr">23: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Some</span> c <span class="org-tuareg-font-lock-operator">-&gt;</span> f c<span class="org-tuareg-font-lock-operator">;</span> iter t <span class="org-tuareg-font-lock-operator">~</span>f
<span class="linenr">24: </span>  <span class="org-tuareg-font-double-colon">;;</span>
<span class="linenr">25: </span><span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>
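<p>
A short usage sketch may help here. It repeats the string adapter and <code>iter</code>
using only the standard library, and adds a hypothetical <code>drain</code> helper (not
part of the interpreter) to show that the closure remembers its position
between calls:
</p>

```ocaml
(* Stdlib-only copy of the string adapter and [iter] above. *)
module Input = struct
  type t = unit -> char option

  let of_string str : t =
    let i = ref 0 in
    fun () ->
      if !i >= String.length str
      then None
      else (
        let char = str.[!i] in
        incr i;
        Some char)

  let rec iter (t : t) ~f =
    match t () with
    | None -> ()
    | Some c -> f c; iter t ~f
end

(* Hypothetical helper: read whatever is left of an input into a string. *)
let drain input =
  let buf = Buffer.create 16 in
  Input.iter input ~f:(Buffer.add_char buf);
  Buffer.contents buf

let () =
  let input = Input.of_string "abc" in
  assert (input () = Some 'a');
  (* The first character is already consumed, so only the rest is left. *)
  assert (drain input = "bc");
  (* Once exhausted, the input keeps returning [None]. *)
  assert (input () = None)
```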

<p>
Output will be represented as a function that accepts a <code>char</code>. It&rsquo;s a bit
simpler since we don&rsquo;t have to keep track of any state (<code>Buffer</code> and
<code>Out_channel</code> do it for us).
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">27: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Output</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-governing">struct</span>
<span class="linenr">28: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> <span class="org-tuareg-font-lock-operator">=</span> char <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
<span class="linenr">29: </span>
<span class="linenr">30: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">of_buffer</span><span class="org-variable-name"> buf</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-keyword">fun</span> <span class="org-variable-name">char</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Buffer.</span>add_char buf char
<span class="linenr">31: </span>
<span class="linenr">32: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">of_chan</span><span class="org-variable-name"> chan</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Out_channel.</span>output_char chan
<span class="linenr">33: </span><span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>

<p>
(<code>of_buffer</code> could have been written as simply <code>Buffer.add_char</code>, and
similarly for <code>of_chan</code>, but I wrote them out all the way<sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup> for clarity.)
</p>
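<p>
Because an output is just a function, code that writes to it does not care
where the characters end up. The sketch below assumes the standard library&rsquo;s
<code>output_char</code> for the channel adapter, and <code>write_string</code> is a hypothetical
helper introduced only for this illustration:
</p>

```ocaml
(* Stdlib-only sketch of the two output adapters. *)
module Output = struct
  type t = char -> unit

  let of_buffer buf : t = fun char -> Buffer.add_char buf char
  let of_chan chan : t = fun char -> output_char chan char
end

(* Hypothetical helper: send every character of [s] to an output. *)
let write_string (output : Output.t) s = String.iter output s

let () =
  let buf = Buffer.create 16 in
  write_string (Output.of_buffer buf) "hello";
  assert (Buffer.contents buf = "hello");
  (* The same helper works unchanged with a channel-backed output. *)
  write_string (Output.of_chan stdout) "hello\n"
```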
</div>
</div>
<div id="outline-container-org6d501cc" class="outline-3">
<h3 id="impl-lexer"><a id="org6d501cc"></a><span class="section-number-3">2.4</span> Lexer</h3>
<div class="outline-text-3" id="text-impl-lexer">
<p>
Writing the lexer is straightforward. We can simply scan each character and
produce either a command token or <code>None</code> to ignore a non-command character.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">46: </span><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Command</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-governing">struct</span>
<span class="linenr">47: </span>  <span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">48: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Left</span>
<span class="linenr">49: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Right</span>
<span class="linenr">50: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Decrement</span>
<span class="linenr">51: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Increment</span>
<span class="linenr">52: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Input</span>
<span class="linenr">53: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Output</span>
<span class="linenr">54: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Loop_begin</span>
<span class="linenr">55: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Loop_end</span>
<span class="linenr">56: </span>
<span class="linenr">57: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">of_char</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-keyword">function</span>
<span class="linenr">58: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">'&lt;'</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Left</span>
<span class="linenr">59: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">'&gt;'</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Right</span>
<span class="linenr">60: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">'-'</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Decrement</span>
<span class="linenr">61: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">'+'</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Increment</span>
<span class="linenr">62: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">','</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Input</span>
<span class="linenr">63: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">'.'</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Output</span>
<span class="linenr">64: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">'['</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Loop_begin</span>
<span class="linenr">65: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-string">']'</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Loop_end</span>
<span class="linenr">66: </span>    <span class="org-tuareg-font-lock-operator">|</span> _   <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-constructor">None</span>
<span class="linenr">67: </span>  <span class="org-tuareg-font-double-colon">;;</span>
<span class="linenr">68: </span><span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>
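<p>
As a quick check of the &ldquo;ignore everything else&rdquo; behaviour, the sketch below
lexes a whole string at once. The <code>lex</code> helper is hypothetical (the
interpreter itself feeds characters to <code>of_char</code> one at a time), and it uses
only the standard library:
</p>

```ocaml
module Command = struct
  type t =
    | Left
    | Right
    | Decrement
    | Increment
    | Input
    | Output
    | Loop_begin
    | Loop_end

  let of_char = function
    | '<' -> Some Left
    | '>' -> Some Right
    | '-' -> Some Decrement
    | '+' -> Some Increment
    | ',' -> Some Input
    | '.' -> Some Output
    | '[' -> Some Loop_begin
    | ']' -> Some Loop_end
    | _ -> None
end

(* Hypothetical helper: lex a string, dropping non-command characters. *)
let lex source =
  String.to_seq source |> Seq.filter_map Command.of_char |> List.of_seq

let () =
  (* The spaces and the letter x are silently skipped. *)
  assert (
    lex "+ [x-] ."
    = Command.[ Increment; Loop_begin; Decrement; Loop_end; Output ])
```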
</div>
</div>
<div id="outline-container-org652d213" class="outline-3">
<h3 id="impl-repr"><a id="org652d213"></a><span class="section-number-3">2.5</span> Program representation</h3>
<div class="outline-text-3" id="text-impl-repr">
<p>
A brainfuck program is simply a sequence of commands. Since the looping
commands require the interpreter to jump to the matching bracket, we&rsquo;ll
record the positions of corresponding left and right brackets in a <b>jump
table</b>. This is a hash table that maps the position of a bracket to the
position of its partner, and vice versa.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">70: </span><span class="org-tuareg-font-lock-governing">type</span> <span class="org-type">t</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">71: </span>  <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">{</span></span> commands   <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Command.</span>t array   <span class="org-comment-delimiter">(* </span><span class="org-comment">just the sequence of commands </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">72: </span>  <span class="org-tuareg-font-lock-operator">;</span> jump_table <span class="org-tuareg-font-lock-operator">:</span> int <span class="org-tuareg-font-lock-module">Int.Table.</span>t <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">}</span></span> <span class="org-comment-delimiter">(* </span><span class="org-comment">maps each bracket's position to its</span>
<span class="linenr">73: </span><span class="org-comment">                                      partner's </span><span class="org-comment-delimiter">*)</span>
</pre>
</div>
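<p>
To make the jump table concrete, here is a hand-built table for the program
<code>+[-[&gt;]&lt;].</code>. This is only an illustration: it assumes the standard library&rsquo;s
<code>Hashtbl</code> in place of <code>Int.Table</code>, and the entries are written out by hand
rather than computed.
</p>

```ocaml
(* Stdlib [Hashtbl] stands in for [Int.Table] in this sketch. *)
let jump_table : (int, int) Hashtbl.t = Hashtbl.create 4

let () =
  (* Program: + [ - [ > ] < ] .
     Index:   0 1 2 3 4 5 6 7 8 *)
  List.iter
    (fun (left, right) ->
      (* Each pair is stored in both directions. *)
      Hashtbl.replace jump_table left right;
      Hashtbl.replace jump_table right left)
    [ (1, 7); (3, 5) ]

let () =
  assert (Hashtbl.find jump_table 1 = 7);
  assert (Hashtbl.find jump_table 7 = 1);
  assert (Hashtbl.find jump_table 5 = 3);
  (* Positions that do not hold a bracket have no entry. *)
  assert (Hashtbl.find_opt jump_table 0 = None)
```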
</div>
</div>
<div id="outline-container-orga676944" class="outline-3">
<h3 id="impl-parser"><a id="orga676944"></a><span class="section-number-3">2.6</span> Parser</h3>
<div class="outline-text-3" id="text-impl-parser">
<p>
The only job of the parser is to determine the positions of matching pairs of
brackets. It should croak if it finds an unmatched bracket in the program
text. We&rsquo;ll define two functions to report those errors:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">75: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">extra_left</span> <span class="org-tuareg-font-lock-operator">=</span> invalid_argf <span class="org-string">"unmatched left bracket at command index %d"</span>
<span class="linenr">76: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">extra_right</span> <span class="org-tuareg-font-lock-operator">=</span> invalid_argf <span class="org-string">"unmatched right bracket at command index %d"</span>
</pre>
</div>

<p>
<code>invalid_argf</code> is a <code>printf</code>-like function; in this case, <code>extra_left</code> and
<code>extra_right</code> have types <code>int -&gt; unit -&gt; 'a</code>, i.e., they accept the current
position and a unit value, then raise an exception that reports the location
of the syntax error. (&ldquo;Command index&rdquo; means the position of the unmatched
bracket in the program text, counting only command characters.)
</p>
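<p>
If the <code>int -&gt; unit -&gt; 'a</code> shape looks odd, the following sketch may help. It
is a standard-library stand-in built on <code>Printf.ksprintf</code>, written on the
assumption that Core&rsquo;s <code>invalid_argf</code> behaves the same way: the message is
formatted first, and the exception is only raised once the final unit
argument arrives.
</p>

```ocaml
(* Stdlib-only stand-in for Core's [invalid_argf]: format a message, then
   wait for a final unit argument before raising [Invalid_argument]. *)
let invalid_argf fmt = Printf.ksprintf (fun msg () -> invalid_arg msg) fmt

let extra_right pos =
  invalid_argf "unmatched right bracket at command index %d" pos

let () =
  (* Supplying only the position builds a thunk; nothing is raised yet. *)
  let fail = extra_right 3 in
  match fail () with
  | () -> assert false
  | exception Invalid_argument msg ->
    assert (msg = "unmatched right bracket at command index 3")
```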

<p>
Here&rsquo;s the whole parser:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr"> 78: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">parse</span><span class="org-variable-name"> source_code</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr"> 79: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">a stack </span><span class="org-comment-delimiter">*)</span>
<span class="linenr"> 80: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">open_brackets</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-builtin">ref</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[]</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 81: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">commands</span>      <span class="org-tuareg-font-lock-operator">=</span> <span class="org-builtin">ref</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[]</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 82: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">current_pos</span>   <span class="org-tuareg-font-lock-operator">=</span> <span class="org-builtin">ref</span> <span class="org-highlight-numbers-number">0</span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 83: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">jump_table</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Int.Table.</span>create <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 84: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">add_command</span><span class="org-variable-name"> cmd</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr"> 85: </span>    commands <span class="org-tuareg-font-lock-operator">:=</span> cmd <span class="org-tuareg-font-lock-operator">::</span> <span class="org-tuareg-font-lock-operator">!</span>commands<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr"> 86: </span>    incr current_pos
<span class="linenr"> 87: </span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 88: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">push_left_bracket</span><span class="org-variable-name"> </span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr"> 89: </span>    open_brackets <span class="org-tuareg-font-lock-operator">:=</span> <span class="org-tuareg-font-lock-operator">!</span>current_pos <span class="org-tuareg-font-lock-operator">::</span> <span class="org-tuareg-font-lock-operator">!</span>open_brackets<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr"> 90: </span>    add_command <span class="org-tuareg-font-lock-module">Command.</span><span class="org-tuareg-font-lock-constructor">Loop_begin</span>
<span class="linenr"> 91: </span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr"> 92: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">pop_left_bracket</span><span class="org-variable-name"> </span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr"> 93: </span>    <span class="org-keyword">match</span> <span class="org-tuareg-font-lock-operator">!</span>open_brackets <span class="org-keyword">with</span>
<span class="linenr"> 94: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[]</span></span>     <span class="org-tuareg-font-lock-operator">-&gt;</span> extra_right <span class="org-tuareg-font-lock-operator">!</span>current_pos <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span>
<span class="linenr"> 95: </span>    <span class="org-tuareg-font-lock-operator">|</span> hd<span class="org-tuareg-font-lock-operator">::</span>tl <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr"> 96: </span>      <span class="org-tuareg-font-lock-module">Hashtbl.</span>add_exn jump_table <span class="org-tuareg-font-lock-label">~key</span><span class="org-tuareg-font-lock-operator">:!</span>current_pos <span class="org-tuareg-font-lock-label">~data</span><span class="org-tuareg-font-lock-operator">:</span>hd<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr"> 97: </span>      <span class="org-tuareg-font-lock-module">Hashtbl.</span>add_exn jump_table <span class="org-tuareg-font-lock-label">~key</span><span class="org-tuareg-font-lock-operator">:</span>hd <span class="org-tuareg-font-lock-label">~data</span><span class="org-tuareg-font-lock-operator">:!</span>current_pos<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr"> 98: </span>      open_brackets <span class="org-tuareg-font-lock-operator">:=</span> tl<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr"> 99: </span>      add_command <span class="org-tuareg-font-lock-module">Command.</span><span class="org-tuareg-font-lock-constructor">Loop_end</span>
<span class="linenr">100: </span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">101: </span>  <span class="org-tuareg-font-lock-module">Input.</span>iter source_code <span class="org-tuareg-font-lock-label">~f</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-keyword">fun</span> <span class="org-variable-name">char</span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">102: </span>    <span class="org-keyword">match</span> <span class="org-tuareg-font-lock-module">Command.</span>of_char char <span class="org-keyword">with</span>
<span class="linenr">103: </span>    <span class="org-comment-delimiter">(* </span><span class="org-comment">ignore non-command characters </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">104: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">None</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span>
<span class="linenr">105: </span>
<span class="linenr">106: </span>    <span class="org-comment-delimiter">(* </span><span class="org-comment">regular commands just get appended </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">107: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-tuareg-font-lock-constructor">Left</span> <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Right</span> <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Decrement</span> <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Increment</span> <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Input</span> <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Output</span> <span class="org-keyword">as</span> cmd<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span>
<span class="linenr">108: </span>      <span class="org-tuareg-font-lock-operator">-&gt;</span> add_command cmd
<span class="linenr">109: </span>
<span class="linenr">110: </span>    <span class="org-comment-delimiter">(* </span><span class="org-comment">record loop beginning positions </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">111: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Loop_begin</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> push_left_bracket <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span>
<span class="linenr">112: </span>
<span class="linenr">113: </span>    <span class="org-comment-delimiter">(* </span><span class="org-comment">record entries in jump table </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">114: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Some</span> <span class="org-tuareg-font-lock-constructor">Loop_end</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> pop_left_bracket <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">115: </span>
<span class="linenr">116: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">check for any remaining open left brackets </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">117: </span>  <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-keyword">match</span> <span class="org-tuareg-font-lock-operator">!</span>open_brackets <span class="org-keyword">with</span>
<span class="linenr">118: </span>   <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">[]</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span>
<span class="linenr">119: </span>   <span class="org-tuareg-font-lock-operator">|</span> hd <span class="org-tuareg-font-lock-operator">::</span> _ <span class="org-tuareg-font-lock-operator">-&gt;</span> extra_left hd <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">120: </span>  <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">{</span></span> commands <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Array.</span>of_list_rev <span class="org-tuareg-font-lock-operator">!</span>commands
<span class="linenr">121: </span>  <span class="org-tuareg-font-lock-operator">;</span> jump_table <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">}</span></span>
<span class="linenr">122: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
The code is fairly straightforward: most of it is dedicated to matching
bracket pairs, and other commands are just passed straight through.
</p>
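<p>
The bracket-matching part can be tried in isolation. The sketch below is a
simplified, standard-library version rather than the parser above:
<code>match_brackets</code> is a hypothetical name, and it assumes its argument
contains only command characters, so string indices double as command
indices.
</p>

```ocaml
(* Simplified, stdlib-only sketch of the bracket matching done by [parse]. *)
let match_brackets commands =
  let jump_table = Hashtbl.create 16 in
  let open_brackets = Stack.create () in
  String.iteri
    (fun pos char ->
      match char with
      | '[' -> Stack.push pos open_brackets
      | ']' ->
        (match Stack.pop_opt open_brackets with
         | None ->
           invalid_arg
             (Printf.sprintf "unmatched right bracket at command index %d" pos)
         | Some left ->
           Hashtbl.replace jump_table left pos;
           Hashtbl.replace jump_table pos left)
      | _ -> ())
    commands;
  (* Anything still on the stack never found its partner. *)
  (match Stack.top_opt open_brackets with
   | None -> ()
   | Some left ->
     invalid_arg
       (Printf.sprintf "unmatched left bracket at command index %d" left));
  jump_table

let () =
  (* Program: [ - [ + ] ]
     Index:   0 1 2 3 4 5 *)
  let table = match_brackets "[-[+]]" in
  assert (Hashtbl.find table 0 = 5);
  assert (Hashtbl.find table 4 = 2);
  (* An extra right bracket is reported at its own position. *)
  (match match_brackets "+]" with
   | _ -> assert false
   | exception Invalid_argument msg ->
     assert (msg = "unmatched right bracket at command index 1"));
  (* An unclosed left bracket is reported at the position where it opened. *)
  match match_brackets "[[]" with
  | _ -> assert false
  | exception Invalid_argument msg ->
    assert (msg = "unmatched left bracket at command index 0")
```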
</div>
</div>
<div id="outline-container-org202c2a4" class="outline-3">
<h3 id="impl-runner"><a id="org202c2a4"></a><span class="section-number-3">2.7</span> Runner</h3>
<div class="outline-text-3" id="text-impl-runner">
<p>
The runner steps through the program text, executing commands on the given
memory, input, and output. It basically simulates the abstract state machine
that brainfuck code is meant to run on.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">124: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">run</span><span class="org-variable-name"> t </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">memory </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">input </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">output</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">125: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">get/set bytes in memory; bytes are unsigned, wrap around on overflow </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">126: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">get</span><span class="org-variable-name"> pos</span>   <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Bigstring.</span>get_uint8 memory <span class="org-tuareg-font-lock-operator">~</span>pos           <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">127: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">set</span><span class="org-variable-name"> pos x</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Bigstring.</span>set_uint8 memory <span class="org-tuareg-font-lock-operator">~</span>pos <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>x <span class="org-tuareg-font-lock-operator">%</span> <span class="org-highlight-numbers-number">256</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">128: </span>
<span class="linenr">129: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">we can actually always jump one past the matching bracket </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">130: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">jump</span><span class="org-variable-name"> pc</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-highlight-numbers-number">1</span> <span class="org-tuareg-font-lock-operator">+</span> <span class="org-tuareg-font-lock-module">Hashtbl.</span>find_exn t.jump_table pc <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">131: </span>
<span class="linenr">132: </span>  <span class="org-comment-delimiter">(* </span><span class="org-comment">pc  :: program counter</span>
<span class="linenr">133: </span><span class="org-comment">     pos :: data pointer </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">134: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-governing">rec</span> <span class="org-function-name">loop</span><span class="org-variable-name"> </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">pc </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">pos</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">135: </span>    <span class="org-keyword">match</span> t.commands.<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span> pc <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-keyword">with</span>
<span class="linenr">136: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Left</span>  <span class="org-tuareg-font-lock-operator">-&gt;</span> loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-label">~pos</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pos <span class="org-tuareg-font-lock-operator">-</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">137: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Right</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-label">~pos</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pos <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">138: </span>
<span class="linenr">139: </span>    <span class="org-comment-delimiter">(* </span><span class="org-comment">parenthesized for clarity </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">140: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Decrement</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> set pos <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>get pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span> <span class="org-tuareg-font-lock-operator">-</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-tuareg-font-lock-operator">;</span> loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">~</span>pos
<span class="linenr">141: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Increment</span> <span class="org-tuareg-font-lock-operator">-&gt;</span> set pos <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>get pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span> <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-tuareg-font-lock-operator">;</span> loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">~</span>pos
<span class="linenr">142: </span>
<span class="linenr">143: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Input</span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">144: </span>      <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-keyword">match</span> input <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span> <span class="org-keyword">with</span>
<span class="linenr">145: </span>       <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">None</span>   <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">()</span></span>         <span class="org-comment-delimiter">(* </span><span class="org-comment">do nothing if we run [,] at EOF </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">146: </span>       <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Some</span> c <span class="org-tuareg-font-lock-operator">-&gt;</span> set pos <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-tuareg-font-lock-module">Char.</span>to_int c<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">147: </span>      loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">~</span>pos
<span class="linenr">148: </span>
<span class="linenr">149: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Output</span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">150: </span>      output <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Char.</span>of_int_exn <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>get pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span><span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">151: </span>      loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">~</span>pos
<span class="linenr">152: </span>
<span class="linenr">153: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Loop_begin</span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">154: </span>      <span class="org-keyword">if</span> get pos <span class="org-tuareg-font-lock-operator">=</span> <span class="org-highlight-numbers-number">0</span>
<span class="linenr">155: </span>      <span class="org-keyword">then</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>jump pc<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span> <span class="org-tuareg-font-lock-operator">~</span>pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">156: </span>      <span class="org-keyword">else</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span>  <span class="org-tuareg-font-lock-operator">~</span>pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">157: </span>
<span class="linenr">158: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-constructor">Loop_end</span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
<span class="linenr">159: </span>      <span class="org-keyword">if</span> get pos <span class="org-tuareg-font-lock-operator">&lt;&gt;</span> <span class="org-highlight-numbers-number">0</span>
<span class="linenr">160: </span>      <span class="org-keyword">then</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>jump pc<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span> <span class="org-tuareg-font-lock-operator">~</span>pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">161: </span>      <span class="org-keyword">else</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span>pc <span class="org-tuareg-font-lock-operator">+</span> <span class="org-highlight-numbers-number">1</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span>  <span class="org-tuareg-font-lock-operator">~</span>pos<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="linenr">162: </span>
<span class="linenr">163: </span>    <span class="org-tuareg-font-lock-operator">|</span> <span class="org-keyword">exception</span> _ <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span>       <span class="org-comment-delimiter">(* </span><span class="org-comment">reached end of commands, so we terminate </span><span class="org-comment-delimiter">*)</span>
<span class="linenr">164: </span>  <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">165: </span>  loop <span class="org-tuareg-font-lock-label">~pc</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-highlight-numbers-number">0</span> <span class="org-tuareg-font-lock-label">~pos</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-highlight-numbers-number">0</span>
<span class="linenr">166: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
A couple of decisions we&rsquo;ve made above about edge cases:
</p>

<ol class="org-ol">
<li>We represent bytes as unsigned. This doesn&rsquo;t make that much of a
difference, since programs can&rsquo;t observe that fact, given that&#x2026;</li>
<li>Increment and decrement operations that would underflow or overflow an
unsigned byte wrap around instead, modulo 256.</li>
<li>If we encounter the <code>,</code> (input) command when we have no remaining input,
we simply ignore it (i.e., do nothing). Other common choices include
setting the current cell to <code>0</code> or <code>-1</code>.</li>
</ol>
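<p>
To make the wrap-around rule concrete, here is a minimal sketch that uses only
the OCaml standard library. It assumes that the <code>%</code> operator used in <code>set</code>
comes from Core and never returns a negative result; the standard library&rsquo;s
<code>mod</code> can, so the hypothetical helper <code>wrap</code> below (not part of the
interpreter) adds 256 before reducing again.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* Stdlib-only sketch. [wrap] is a hypothetical helper, not part of the
   interpreter above: it maps any int into the range 0..255. *)
let wrap x = ((x mod 256) + 256) mod 256

let () =
  (* Plain [mod] keeps the sign of the dividend, hence the extra [+ 256]. *)
  Printf.printf "%d %d\n" ((0 - 1) mod 256) (wrap (0 - 1));
  (* Incrementing 255 wraps around to 0. *)
  Printf.printf "%d\n" (wrap (255 + 1))
</pre>
</div>

<p>
Running this prints <code>-1 255</code> followed by <code>0</code>: decrementing a cell that holds 0
yields 255, and incrementing a cell that holds 255 yields 0.
</p>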
</div>
</div>
<div id="outline-container-orgfdcb185" class="outline-3">
<h3 id="impl-wrapper"><a id="orgfdcb185"></a><span class="section-number-3">2.8</span> Wrapper</h3>
<div class="outline-text-3" id="text-impl-wrapper">
<p>
We also define a simplified interface for running programs, which is less
awkward to use in simple cases. <code>create_memory</code> allocates an array of 10<sup>6</sup>
bytes and fills it with zero bytes. <code>run'</code> is simply a wrapper around
<code>run</code> that creates a fresh memory, takes the program and its input as
strings, and returns the output as a string.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">69: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">create_memory</span><span class="org-variable-name"> </span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">70: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">memory_size</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-highlight-numbers-number">1_000_000</span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">71: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">memory</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Bigstring.</span>create memory_size <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">72: </span>  <span class="org-tuareg-font-lock-module">Bigarray.Array1.</span>fill memory <span class="org-string">'\000'</span><span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">73: </span>  memory
<span class="linenr">74: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="linenr">168: </span><span class="org-tuareg-font-lock-governing">let</span> <span class="org-function-name">run'</span><span class="org-variable-name"> </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">program </span><span class="org-tuareg-font-lock-operator">~</span><span class="org-variable-name">input</span> <span class="org-tuareg-font-lock-operator">=</span>
<span class="linenr">169: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">t</span> <span class="org-tuareg-font-lock-operator">=</span> parse <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Input.</span>of_string program<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">170: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">buffer</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Buffer.</span>create <span class="org-highlight-numbers-number">16</span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">171: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">memory</span> <span class="org-tuareg-font-lock-operator">=</span> create_memory <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">172: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">input</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Input.</span>of_string input <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">173: </span>  <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">output</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Output.</span>of_buffer buffer <span class="org-tuareg-font-lock-governing">in</span>
<span class="linenr">174: </span>  run t <span class="org-tuareg-font-lock-operator">~</span>memory <span class="org-tuareg-font-lock-operator">~</span>input <span class="org-tuareg-font-lock-operator">~</span>output<span class="org-tuareg-font-lock-operator">;</span>
<span class="linenr">175: </span>  <span class="org-tuareg-font-lock-module">Buffer.</span>contents buffer
<span class="linenr">176: </span><span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>
</div>
</div>
</div>
<div id="outline-container-org52da2e0" class="outline-2">
<h2 id="running"><a id="org52da2e0"></a><span class="section-number-2">3</span> Trial run</h2>
<div class="outline-text-2" id="text-running">
<p>
Let&rsquo;s try out our interpreter. <code>jbuilder</code> has an awesome <code>utop</code> command which
starts a toplevel<sup><a id="fnr.3" class="footref" href="#fn.3">3</a></sup> with access to the library we&rsquo;ve just built (<code>--dev</code>
enables stricter compilation flags).
</p>

<div class="org-src-container">
<pre class="src src-sh"><span class="org-comment-delimiter"># </span><span class="org-comment">chdir to project root</span>
jbuilder utop lib/bf_lib --dev
</pre>
</div>

<div class="org-src-container">
<pre class="src src-ocaml">utop <span class="org-tuareg-font-lock-operator">#</span> <span class="org-tuareg-font-lock-module">Bf_lib.Program.</span>run' <span class="org-tuareg-font-lock-label">~input</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">""</span> <span class="org-tuareg-font-lock-label">~program</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-string">"++++++++[&gt;++++[&gt;++&gt;+++&gt;+++&gt;+&lt;&lt;&lt;&lt;-]&gt;+&gt;+&gt;-&gt;&gt;+[&lt;]&lt;-]&gt;&gt;.&gt;---.+++++++..+++.&gt;&gt;.&lt;-.&lt;.+++.------.--------.&gt;&gt;+.&gt;++."</span><span class="org-tuareg-font-double-colon">;;</span>
<span class="org-tuareg-font-lock-operator">-</span> <span class="org-tuareg-font-lock-operator">:</span> string <span class="org-tuareg-font-lock-operator">=</span> <span class="org-string">"Hello World!\n"</span>
</pre>
</div>

<p>
Awesome. We have a working brainfuck interpreter library! See <a href="https://en.wikipedia.org/wiki/Brainfuck#Examples">Wikipedia</a> for
more example programs to try out.
</p>
</div>
</div>
<div id="outline-container-org60cbabc" class="outline-2">
<h2 id="to-be-continued"><a id="org60cbabc"></a><span class="section-number-2">4</span> To be continued</h2>
<div class="outline-text-2" id="text-to-be-continued">
<p>
In the next post, we&rsquo;ll see how to use Ecaml&rsquo;s API to interact with various
components of Emacs, such as buffers, files, key bindings, and more. Stay
tuned!
</p>
</div>
</div>
<div id="footnotes">
<h2 class="footnotes">Footnotes: </h2>
<div id="text-footnotes">

<div class="footdef"><sup><a id="fn.1" class="footnum" href="#fnr.1">1</a></sup> <div class="footpara"><p class="footpara">
The maximum length of a <code>string</code> or <code>bytes</code> is given in OCaml by
<code>Sys.max_string_length</code>. On 64-bit systems, it&rsquo;s 144115188075855863, which is
surely big enough, but on 32-bit systems it&rsquo;s 16777211 (about 16 MB), and we
might want to give a program a bigger memory than that in the future.
</p></div></div>

<div class="footdef"><sup><a id="fn.2" class="footnum" href="#fnr.2">2</a></sup> <div class="footpara"><p class="footpara">
Or, if you prefer, <b>eta-expanded</b>.
</p></div></div>

<div class="footdef"><sup><a id="fn.3" class="footnum" href="#fnr.3">3</a></sup> <div class="footpara"><p class="footpara">
OCaml&rsquo;s term for a REPL (Read-Eval-Print Loop), i.e., interactive prompt.
Similar to <code>python</code> or <code>irb</code>.
</p></div></div>


</div>
</div>]]></content><author><name>Aaron L. Zeng</name></author><category term="ecaml-getting-started" /><category term="ecaml" /><category term="emacs" /><category term="ocaml" /><summary type="html"><![CDATA[This post is part 2 of a series (prev/next). The full code is available on GitHub.]]></summary></entry><entry><title type="html">Emacs Plugins in OCaml: Ecaml API Overview (part 3)</title><link href="https://blag.bcc32.com/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-3/" rel="alternate" type="text/html" title="Emacs Plugins in OCaml: Ecaml API Overview (part 3)" /><published>2017-11-12T16:00:00-05:00</published><updated>2017-11-12T16:00:00-05:00</updated><id>https://blag.bcc32.com/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-3</id><content type="html" xml:base="https://blag.bcc32.com/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-3/"><![CDATA[<p class="alert alert-info">
This post is part 3 of a
<a class="alert-link" href="/categories/ecaml-getting-started/">series</a>
(<a class="alert-link" href="/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-2/">prev</a>/<a
class="alert-link" href="/ecaml-getting-started/2017/11/19/emacs-plugins-in-ocaml-4/">next</a>).
The full code is available on
<a class="alert-link" href="https://github.com/bcc32/ecaml-bf">GitHub</a>.
</p>

<p>
This time, we&rsquo;ll take a look at the Ecaml modules used to interact with
different components of Emacs. These modules define type-safe interfaces that
represent buffers, font faces, and various Elisp data structures like hash
tables and vectors.
</p>

<p>
This post reflects the state of the Ecaml library as of version v0.9.115.24+69
(git commit <a href="https://github.com/janestreet/ecaml/commit/25a9f825f18d722e73b0cfc0050de74dd9811a09">25a9f825</a>). The library is, of course, being actively developed, and
some details may have changed between when this post was written and when you
are reading it.
</p>

<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#intro">1. Intro</a></li>
<li><a href="#module-ecaml">2. The <code>Ecaml</code> module</a>
<ul>
<li><a href="#val-defun">2.1. <code>defun</code></a></li>
<li><a href="#orgbc85d0b">2.2. <code>defvar</code> and <code>defcustom</code></a></li>
<li><a href="#val-message">2.3. <code>message</code> and co.</a></li>
<li><a href="#val-provide">2.4. <code>provide</code></a></li>
</ul>
</li>
<li><a href="#other-modules">3. Feature rundown</a></li>
<li><a href="#to-be-continued">4. Phew!</a></li>
</ul>
</div>
</div>

<div id="outline-container-org57b24e9" class="outline-2">
<h2 id="intro"><a id="org57b24e9"></a><span class="section-number-2">1</span> Intro</h2>
<div class="outline-text-2" id="text-intro">
<p>
This post won&rsquo;t cover everything in the Ecaml library, since it&rsquo;s quite big<sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup>,
but it should help give you a general idea of how it&rsquo;s organized. Most of the
individual functions correspond to one or two Elisp functions for which more
information can be found by reading the corresponding section of the Emacs
Lisp manual.
</p>

<p>
For example, Merlin reports the following type and documentation comment for
<code>Ecaml.Buffer.find</code>, which links to the corresponding Emacs Manual section:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-comment-delimiter">(* </span><span class="org-comment">[find ~name] returns the live buffer whose name is [name], if any.</span>
<span class="org-comment">   [(describe-function 'get-buffer)]. </span><span class="org-comment-delimiter">*)</span>
<span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">find</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-label">name</span><span class="org-tuareg-font-lock-operator">:</span>string <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Buffer.</span>t option
</pre>
</div>

<p>
Evaluating <code>(describe-function 'get-buffer)</code>, either manually or by invoking
<code>describe-function</code> interactively (<kbd>C-h f</kbd> or
<kbd>SPC h d f</kbd> for Spacemacs), brings you to the corresponding
documentation:
</p>

<blockquote>
<p>
get-buffer is a built-in function in ‘C source code’.
</p>

<p>
(get-buffer <i>BUFFER-OR-NAME</i>)
</p>

<p>
Return the buffer named <i>BUFFER-OR-NAME</i>.
<i>BUFFER-OR-NAME</i> must be either a string or a buffer.  If <i>BUFFER-OR-NAME</i>
is a string and there is no buffer with that name, return nil.  If
<i>BUFFER-OR-NAME</i> is a buffer, return it as given.
</p>
</blockquote>

<p>
The correspondence between the behavior of <code>Ecaml.Buffer.find</code> and
<code>get-buffer</code> should be apparent: <code>find</code> returns <code>Some buffer</code> if there exists
a buffer with the given name, otherwise <code>None</code>.
</p>
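<p>
As a rough illustration of why the <code>option</code> return type is convenient, the
sketch below abstracts the quoted signature into a module type so that it
compiles without Ecaml or a running Emacs. <code>Buffer_lookup</code>, <code>Fake_buffer</code>,
and <code>Describe</code> are hypothetical names made up for this example; only the shape
of <code>find</code> is taken from the signature above.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* The part of the interface quoted above, abstracted so that this sketch
   does not depend on Ecaml. *)
module type Buffer_lookup = sig
  type t
  val find : name:string -&gt; t option
end

(* Hypothetical stand-in for Emacs: a fixed list of "live" buffer names. *)
module Fake_buffer : Buffer_lookup = struct
  type t = string
  let live = [ "*scratch*"; "*Messages*" ]
  let find ~name = if List.mem name live then Some name else None
end

module Describe (B : Buffer_lookup) = struct
  (* [None] plays the role of Elisp's [nil]; the type forces us to handle it. *)
  let describe name =
    match B.find ~name with
    | Some _ -&gt; name ^ " is live"
    | None -&gt; name ^ " does not exist"
end

module D = Describe (Fake_buffer)

let () =
  print_endline (D.describe "*scratch*");
  print_endline (D.describe "nope")
</pre>
</div>

<p>
With Ecaml available, one would expect the same pattern match to apply directly
to the result of <code>Ecaml.Buffer.find</code>, at least for the version described in
this post.
</p>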
</div>
</div>
<div id="outline-container-orgd14bcb8" class="outline-2">
<h2 id="module-ecaml"><a id="orgd14bcb8"></a><span class="section-number-2">2</span> The <code>Ecaml</code> module</h2>
<div class="outline-text-2" id="text-module-ecaml">
<p>
The top-level module of the Ecaml library is named, unsurprisingly, <code>Ecaml</code>.
It contains all of the other modules for interacting with more specific areas
of Emacs functionality, as well as some commonly used functions directly in
the <code>Ecaml</code> module:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Ecaml</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-governing">sig</span>
  <span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Advice</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Ecaml__.Advice</span>
  <span class="org-comment-delimiter">(* </span><span class="org-comment">snip </span><span class="org-comment-delimiter">*)</span>
  <span class="org-tuareg-font-lock-governing">module</span> <span class="org-tuareg-font-lock-module">Working_directory</span> <span class="org-tuareg-font-lock-operator">=</span> <span class="org-tuareg-font-lock-module">Ecaml__.Working_directory</span>
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">defadvice</span> <span class="org-tuareg-font-lock-operator">:</span>
    <span class="org-tuareg-font-lock-label">?docstring</span><span class="org-tuareg-font-lock-operator">:</span>string <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">?position</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Advice.Position.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-module">Lexing.</span>position <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">advice_name</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Advice.Name.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">for_function</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Symbol.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>args<span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Value.</span><span class="org-type">t list </span><span class="org-tuareg-font-lock-operator">-&gt;</span><span class="org-type"> </span><span class="org-tuareg-font-lock-label">inner</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-tuareg-font-lock-module">Value.</span><span class="org-type">t list </span><span class="org-tuareg-font-lock-operator">-&gt;</span><span class="org-type"> </span><span class="org-tuareg-font-lock-module">Value.</span><span class="org-type">t</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span><span class="org-type"> </span><span class="org-tuareg-font-lock-operator">-&gt;</span><span class="org-type"> </span><span class="org-tuareg-font-lock-module">Value.</span><span class="org-type">t</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">defun</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Symbol.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Function.Fn.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> unit<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-module">Function.</span>with_spec
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">defcustom</span> <span class="org-tuareg-font-lock-operator">:</span>
    <span class="org-tuareg-font-lock-module">Lexing.</span>position <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-module">Symbol.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-module">Ecaml.Customization.Type.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">docstring</span><span class="org-tuareg-font-lock-operator">:</span>string <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">group</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Ecaml.Customization.Group.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">standard_value</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Value.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">defvar</span> <span class="org-tuareg-font-lock-operator">:</span>
    <span class="org-tuareg-font-lock-module">Lexing.</span>position <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Symbol.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Value.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">docstring</span><span class="org-tuareg-font-lock-operator">:</span>string <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">define_derived_mode</span> <span class="org-tuareg-font-lock-operator">:</span>
    <span class="org-tuareg-font-lock-label">?parent</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Ecaml.Major_mode.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-module">Lexing.</span>position <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">change_command</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Symbol.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">docstring</span><span class="org-tuareg-font-lock-operator">:</span>string <span class="org-tuareg-font-lock-operator">-&gt;</span>
    <span class="org-tuareg-font-lock-label">initialize</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>unit <span class="org-tuareg-font-lock-operator">-&gt;</span> unit<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">mode_line</span><span class="org-tuareg-font-lock-operator">:</span>string <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Ecaml.Major_mode.</span>t
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">inhibit_messages</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>unit <span class="org-tuareg-font-lock-operator">-&gt;</span> 'a<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> 'a
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">message</span> <span class="org-tuareg-font-lock-operator">:</span> string <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">message_s</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Core_kernel.Sexp.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">messagef</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>'a<span class="org-tuareg-font-lock-operator">,</span> unit<span class="org-tuareg-font-lock-operator">,</span> string<span class="org-tuareg-font-lock-operator">,</span> unit<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-module">Core_kernel.</span>format4 <span class="org-tuareg-font-lock-operator">-&gt;</span> 'a
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">provide</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Symbol.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
  <span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">inhibit_read_only</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>unit <span class="org-tuareg-font-lock-operator">-&gt;</span> 'a<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span> 'a
<span class="org-tuareg-font-lock-governing">end</span>
</pre>
</div>

<p>
Let&rsquo;s go through a couple of the most important functions directly under
<code>Ecaml</code>:
</p>
</div>
<div id="outline-container-org2ef8a10" class="outline-3">
<h3 id="val-defun"><a id="org2ef8a10"></a><span class="section-number-3">2.1</span> <code>defun</code></h3>
<div class="outline-text-3" id="text-val-defun">
<p>
<code>defun</code> allows you to define a named function, visible to any Elisp code, but
whose body consists of an implementation in OCaml.
</p>

<p>
Its full signature, with the <code>Function.with_spec</code> type alias spelled out, is:
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">defun</span>
  <span class="org-tuareg-font-lock-operator">:</span>  <span class="org-tuareg-font-lock-label">?docstring</span><span class="org-tuareg-font-lock-operator">:</span>string
  <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">?interactive</span><span class="org-tuareg-font-lock-operator">:</span>string
  <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">?optional_args</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Symbol.</span>t list
  <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">?rest_arg</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Symbol.</span>t
  <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Lexing.</span>position <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-label">args</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-module">Symbol.</span>t list <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Symbol.</span>t
  <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-tuareg-font-lock-module">Function.Fn.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
</pre>
</div>

<p>
That&rsquo;s a lot to unpack, so we&rsquo;ll take the arguments one at a time:
</p>

<dl class="org-dl">
<dt><code>docstring</code></dt><dd>a string that is treated as a documentation comment (hence
the name) for the function you&rsquo;re defining. In Lisps, this
is a string literal that comes directly after the argument
list in a function definition, and basically describes what
the function does. It&rsquo;s a string instead of a comment
because Lisp systems usually allow you to retrieve these
strings at runtime. Docstrings appear in the text of
<code>describe-function</code> when you do <kbd>C-h f</kbd> in
Emacs, so they&rsquo;re very useful for functions that are
intended to be invoked by a user and not a program.</dd>
<dt><code>interactive</code></dt><dd><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Using-Interactive.html"><code>interactive</code></a> is a special form in Elisp that turns a
function into a command. It&rsquo;s really complicated so I
shan&rsquo;t describe it here, but it basically does some magic
so that you can prompt the user for input and your
function receives that input as arguments.</dd>
<dt><a id="org1094f20"></a> <code>optional_args</code> and <code>rest_arg</code></dt><dd>names for
the optional arguments and for the argument that will slurp up the rest of
the arguments passed to the function, when applicable. They don&rsquo;t actually
change the behavior of the function, but they will appear in the output of
<code>describe-function</code> so you should still pick good names.</dd>
<dt><code>Lexing.position</code></dt><dd><p>
<code>defun</code> requires a <code>Lexing.position</code> argument, which
describes where in your OCaml code the function was defined. This shows
up in the <code>describe-function</code> output so that you can easily figure out
where the OCaml implementation lives.
</p>

<p>
<a href="https://github.com/janestreet/ppx_here"><code>ppx_here</code></a>, a syntax extension, provides a handy shortcut for specifying
the current source location: simply write <code>[%here]</code> where you need a
<code>Lexing.position</code>.<sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup><sup>, </sup><sup><a id="fnr.3" class="footref" href="#fn.3">3</a></sup>
</p></dd>
<dt><code>args</code></dt><dd>like <code>optional_args</code> and <code>rest_arg</code>, just provides names for the
formal parameters for your function. See <a href="#org1094f20">above</a>.</dd>
<dt><code>Symbol.t</code></dt><dd>the name of your function. This is what it&rsquo;ll be referred to
as by Emacs, so if you pass the symbol <code>foo</code>, any Elisp code
that contains, say, <code>(foo)</code>, will call your function.</dd>
<dt><code>Function.Fn.t</code></dt><dd>the body of your function. It&rsquo;s the OCaml code that does
the heavy lifting. It must be a value of type
<code>Ecaml.Value.t array -&gt; Ecaml.Value.t</code>, i.e., it should accept an
array of arguments and return a valid Elisp value. Unfortunately, there&rsquo;s
no way for the type-checker to guarantee that you&rsquo;ll get the right number
of arguments (this is basically a Lisp function, after all), so you will
have to handle any arity errors yourself.</dd>
</dl>

<p>
Here&rsquo;s an example of a simple function being defined using <code>defun</code>.
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">open </span><span class="org-tuareg-font-lock-module">Ecaml</span>

<span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
  defun <span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">[</span></span><span class="org-tuareg-font-lock-extension-node">%here</span><span class="org-tuareg-font-lock-extension-node"><span class="org-rainbow-delimiters-depth-1">]</span></span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"say-hello"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
    <span class="org-tuareg-font-lock-label">~optional_args</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[</span></span> <span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"name"</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">]</span></span>
    <span class="org-tuareg-font-lock-label">~args</span><span class="org-tuareg-font-lock-operator">:</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">[]</span></span>
    <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-keyword">function</span>
      <span class="org-tuareg-font-lock-operator">|</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">[</span></span><span class="org-tuareg-font-lock-operator">|</span> name <span class="org-tuareg-font-lock-operator">|</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">]</span></span> <span class="org-tuareg-font-lock-operator">-&gt;</span>
        <span class="org-tuareg-font-lock-governing">let</span> <span class="org-variable-name">name</span> <span class="org-tuareg-font-lock-operator">=</span>
          <span class="org-keyword">if</span> <span class="org-tuareg-font-lock-module">Value.</span>is_nil name
          <span class="org-keyword">then</span> <span class="org-string">"World"</span>
          <span class="org-keyword">else</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-tuareg-font-lock-module">Value.</span>to_utf8_bytes_exn name<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span>
        <span class="org-tuareg-font-lock-governing">in</span>
        <span class="org-tuareg-font-lock-module">Value.</span>of_utf8_bytes <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">(</span></span><span class="org-string">"Hello, "</span> <span class="org-tuareg-font-lock-operator">^</span> name <span class="org-tuareg-font-lock-operator">^</span> <span class="org-string">"!"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-2">)</span></span>
      <span class="org-tuareg-font-lock-operator">|</span> _ <span class="org-tuareg-font-lock-operator">-&gt;</span> <span class="org-builtin">invalid_arg</span> <span class="org-string">"wrong arity"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="org-tuareg-font-double-colon">;;</span>

<span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span> provide <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Symbol.</span>intern <span class="org-string">"ecaml-bf"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
</pre>
</div>

<p>
Elisp does not treat an optional argument the way you might guess, with the
function receiving either zero or one argument. Instead, the function always
receives <code>name</code>, which is <code>nil</code> if no argument was provided. We check
for <code>nil</code> to decide whether to use a default value, and then return an
appropriate greeting.
</p>

<p>
We can test it like so:
</p>

<div class="org-src-container">
<pre class="src src-sh"><span class="org-builtin">alias</span> <span class="org-variable-name">eb</span>=<span class="org-string">"emacs -Q -L _build/default/src --batch"</span>
eb --eval <span class="org-string">"(require 'ecaml-bf)"</span> --eval <span class="org-string">'(print (say-hello))'</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">"Loaded Ecaml."</span>
<span class="org-comment-delimiter">#</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">"Hello, World!"</span>
eb --eval <span class="org-string">"(require 'ecaml-bf)"</span> --eval <span class="org-string">'(print (say-hello "Emacs"))'</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">"Loaded Ecaml."</span>
<span class="org-comment-delimiter">#</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">"Hello, Emacs!"</span>
</pre>
</div>

<p>
If you&rsquo;re familiar with Elisp, it might be helpful to see how corresponding
Ecaml and Elisp code compare, so here&rsquo;s the above example written in Elisp.
</p>

<div class="org-src-container">
<pre class="src src-elisp"><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">defun</span> <span class="org-function-name">say-hello</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-type">&amp;optional</span> name<span class="org-rainbow-delimiters-depth-2">)</span>
  <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-keyword">let</span> <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-rainbow-delimiters-depth-4">(</span>name <span class="org-rainbow-delimiters-depth-5">(</span><span class="org-keyword">or</span> name <span class="org-string">"World"</span><span class="org-rainbow-delimiters-depth-5">)</span><span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span>
    <span class="org-rainbow-delimiters-depth-3">(</span>concat <span class="org-string">"Hello, "</span> name <span class="org-string">"!"</span><span class="org-rainbow-delimiters-depth-3">)</span><span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>

<p>
Okay, so it&rsquo;s a lot shorter &#x1F613;. But that&rsquo;s okay, because most of the
code we write won&rsquo;t be just tons of boilerplate wrapping trivial OCaml
functions. Instead, the point of Ecaml is to let OCaml do the heavy lifting,
so we only need to define the <i>interface</i> for our plugin using <code>defun</code> and
then we can hack on whatever interesting functionality we want to provide.
</p>
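
<p>
To see how required and optional arguments combine, here is a rough sketch of
a second function. The names <code>say-greeting</code> and <code>string_or</code> are
hypothetical, and the sketch assumes, following the description of optional
arguments above, that the OCaml body receives one array slot per declared
argument, with <code>nil</code> standing in for an omitted optional one. It only
uses functions that already appear in this post.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">open Ecaml

(* Hypothetical helper: read an optional Elisp string, falling back to
   [default] when the caller omitted it (i.e., the value is nil). *)
let string_or ~default value =
  if Value.is_nil value then default else Value.to_utf8_bytes_exn value

let () =
  defun [%here] (Symbol.intern "say-greeting")
    ~docstring:"Return a greeting for NAME, using GREETING if given."
    ~args:[ Symbol.intern "name" ]
    ~optional_args:[ Symbol.intern "greeting" ]
    (function
      | [| name; greeting |] -&gt;
        (* [name] is required, so it is converted directly. *)
        let name = Value.to_utf8_bytes_exn name in
        let greeting = string_or ~default:"Hello" greeting in
        Value.of_utf8_bytes (greeting ^ ", " ^ name ^ "!")
      | _ -&gt; invalid_arg "say-greeting: wrong arity")
;;
</pre>
</div>

<p>
The docstring is what <code>describe-function</code> would show for
<code>say-greeting</code>, and the catch-all case is the manual arity check that
the type-checker can&rsquo;t do for us.
</p>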
</div>
</div>
<div id="outline-container-orgbc85d0b" class="outline-3">
<h3 id="orgbc85d0b"><span class="section-number-3">2.2</span> <code>defvar</code> and <code>defcustom</code></h3>
<div class="outline-text-3" id="text-2-2">
<p>
These functions allow you to define Elisp variables<sup><a id="fnr.4" class="footref" href="#fn.4">4</a></sup>, similarly to what
<code>defun</code> does for functions. The difference between them is that <code>defcustom</code>
defines a <b>customizable</b> variable. Quoting the Elisp manual, section 14.3:
</p>

<blockquote>
<p>
“Customizable variables”, also called “user options”, are global Lisp
variables whose values can be set through the Customize interface. Unlike
other global variables, which are defined with ‘defvar’ (see Defining
Variables), customizable variables are defined using the ‘defcustom’ macro.
In addition to calling ‘defvar’ as a subroutine, ‘defcustom’ states how the
variable should be displayed in the Customize interface, the values it is
allowed to take, etc.
</p>
</blockquote>

<p>
So, what does that mean for your plugin? Well, basically it just means that
the variable can be set and queried interactively by the user using the Emacs
Customize interface.
</p>
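
<p>
As a minimal sketch, based only on the <code>defvar</code> signature shown in the
<code>Ecaml</code> module listing, defining a plain variable might look like
this. The variable name <code>ecaml-bf-greeting</code> is made up for
illustration. <code>defcustom</code> additionally needs a
<code>Customization.Type.t</code> and a <code>Customization.Group.t</code>, whose
constructors aren&rsquo;t covered in this post, so it is left out here.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">open Ecaml

let () =
  (* Arguments, in order: source position, variable name, initial value,
     and a docstring. *)
  defvar [%here] (Symbol.intern "ecaml-bf-greeting")
    (Value.of_utf8_bytes "Hello")
    ~docstring:"Greeting used by functions in the ecaml-bf plugin."
;;
</pre>
</div>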
</div>
</div>
<div id="outline-container-orga16f43a" class="outline-3">
<h3 id="val-message"><a id="orga16f43a"></a><span class="section-number-3">2.3</span> <code>message</code> and co.</h3>
<div class="outline-text-3" id="text-val-message">
<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">message</span> <span class="org-tuareg-font-lock-operator">:</span> string <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
<span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">message_s</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-module">Core_kernel.Sexp.</span>t <span class="org-tuareg-font-lock-operator">-&gt;</span> unit
<span class="org-tuareg-font-lock-governing">val</span> <span class="org-function-name">messagef</span> <span class="org-tuareg-font-lock-operator">:</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span>'a<span class="org-tuareg-font-lock-operator">,</span> unit<span class="org-tuareg-font-lock-operator">,</span> string<span class="org-tuareg-font-lock-operator">,</span> unit<span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span> <span class="org-tuareg-font-lock-module">Core_kernel.</span>format4 <span class="org-tuareg-font-lock-operator">-&gt;</span> 'a
</pre>
</div>

<p>
These three functions are simply various ways of printing messages to the
echo area. The <b>echo area</b> is the tiny area at the very bottom of the Emacs
window, where messages will appear, such as &ldquo;(No changes needed to be saved)&rdquo;
when you try to save a file you have just saved.
</p>

<p>
The Elisp function <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Displaying-Messages.html#Displaying-Messages"><code>message</code></a> displays a new message in the echo area.
Messages are also appended to the special <code>*Messages*</code> buffer, so you can
view that buffer to see any messages you may have accidentally dismissed.
</p>

<p>
<code>message</code> and <code>message_s</code> accept a <code>string</code> and a <code>Sexp.t</code><sup><a id="fnr.5" class="footref" href="#fn.5">5</a></sup>,
respectively. (<code>Sexp.t</code> is from Jane Street&rsquo;s <a href="https://github.com/janestreet/sexplib">sexplib</a>.)
</p>

<p>
<code>messagef</code> also prints to the echo area, but it accepts a format string
and arguments the same way <code>printf</code> does, e.g.,
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span> messagef <span class="org-string">"the answer is %d"</span> <span class="org-highlight-numbers-number">42</span>
</pre>
</div>
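
<p>
For comparison, here is a small sketch that uses all three variants together
with <code>inhibit_messages</code> from the module listing. It assumes
<code>core_kernel</code> is available as a dependency (the signatures above
already refer to it), and that <code>inhibit_messages</code> keeps output out
of the echo area while its callback runs. The message texts are arbitrary.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">open Ecaml

let () =
  (* A plain OCaml string. *)
  message "Loaded ecaml-bf.";
  (* A structured value, built by hand from the Sexp constructors. *)
  message_s
    (Core_kernel.Sexp.List
       [ Core_kernel.Sexp.Atom "loaded"; Core_kernel.Sexp.Atom "ecaml-bf" ]);
  (* printf-style formatting, run while echo-area output is inhibited. *)
  inhibit_messages (fun () -&gt; messagef "defined %d function(s)" 1)
;;
</pre>
</div>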
</div>
</div>
<div id="outline-container-orgc0591d4" class="outline-3">
<h3 id="val-provide"><a id="orgc0591d4"></a><span class="section-number-3">2.4</span> <code>provide</code></h3>
<div class="outline-text-3" id="text-val-provide">
<p>
<a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Named-Features.html#index-provide-1052"><code>provide</code></a> registers a <b>feature</b> with Emacs, basically telling Emacs that the
plugin was successfully loaded. This is needed in order to load the plugin
using <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Named-Features.html#index-require-1053"><code>require</code></a> (or Emacs will complain). It also prevents Emacs from trying
to load the same plugin again, since <code>require</code> checks to see if its argument
has been registered as a feature before loading a plugin. Call <code>provide</code> when
your plugin is finished setting up.
</p>
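
<p>
One way to follow that advice is to put the definitions and the call to
<code>provide</code> in a single initialization block, roughly as sketched
below. The names <code>my-plugin</code> and <code>my-plugin-version</code> are
hypothetical. Because OCaml evaluates the sequence in order, an exception
raised by a definition would stop <code>provide</code> from being reached.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">open Ecaml

let () =
  (* Definitions come first... *)
  defun [%here] (Symbol.intern "my-plugin-version")
    ~args:[]
    (function
      | [||] -&gt; Value.of_utf8_bytes "0.1"
      | _ -&gt; invalid_arg "my-plugin-version: wrong arity");
  (* ...and the feature is registered only after they have all succeeded. *)
  provide (Symbol.intern "my-plugin")
;;
</pre>
</div>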
</div>
</div>
</div>
<div id="outline-container-org6d8dfc1" class="outline-2">
<h2 id="other-modules"><a id="org6d8dfc1"></a><span class="section-number-2">3</span> Feature rundown</h2>
<div class="outline-text-2" id="text-other-modules">
<p>
What follows is a listing of the current modules in Ecaml (as of version
v0.9.115.24+69), each with a brief summary of the functionality provided
within.
</p>

<p>
Many of the modules correspond one-to-one with Emacs concepts that the manual
documents better than I can here. Where possible, module names are linked to
the appropriate section of the Emacs or Emacs Lisp manuals.
</p>

<dl class="org-dl">
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Advising-Functions.html#Advising-Functions">Advice</a></dt><dd>Elisp has a system for &ldquo;advising&rdquo; functions: instead of
completely redefining a function, you add to or modify its behavior by
writing a new function that is called before, after, or in place of the
old one. Ecaml currently only supports <b>around</b> advice.</dd>
<dt>Ansi_color</dt><dd>Contains functions for interpreting ANSI color escape
sequences in Elisp strings and translating them into Elisp
<b>string properties</b> which encode equivalent colors. Such
strings might often be the result of running terminal-oriented
version control or diff programs.<sup><a id="fnr.6" class="footref" href="#fn.6">6</a></sup></dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Auto-Major-Mode.html#index-auto_002dmode_002dalist-1885">Auto_mode_alist</a></dt><dd>Manages the variable <code>auto-mode-alist</code>, which determines
how Emacs decides what major mode to open a file in, based on the
filename.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Backups-and-Auto_002dSaving.html#Backups-and-Auto_002dSaving">Backup</a></dt><dd>Manages the variable <code>make-backup-files</code>, which controls whether
Emacs will make backup files (like <code>foo.c~</code>) when you edit files.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Buffers.html#Buffers">Buffer</a></dt><dd>Everything to do with buffers, including killing buffers,
displaying them, and finding out what files they&rsquo;re visiting.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Character-Codes.html#Character-Codes">Char_code</a></dt><dd>Internally, Emacs represents characters as just a code point
(integer). Technically, these are a superset of Unicode code points.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Color-Names.html#Color-Names">Color</a></dt><dd>Deals with colors and <b>Color Names</b>, which are various forms of
plain English and RGB strings that Emacs can interpret as colors and
display. See <kbd>M-x list-colors-display</kbd> for a list of
available colors.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Defining-Commands.html#Defining-Commands">Command</a></dt><dd>A <b>command</b> is simply a function that can be called
interactively, e.g., through <kbd>M-x</kbd>. It should specify a
way of receiving its arguments interactively, such as by prompting the
user for input, rather than through the normal function call mechanism.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/emacs/Options-for-Comments.html">Comment</a></dt><dd>Manages the active comment syntax.</dd>
<dt>Compilation</dt><dd>Not sure what this does.</dd>
<dt>Current_buffer</dt><dd>Emacs has a notion of the <b>current buffer</b>, which many
Elisp functions operate on by default instead of accepting
a buffer or buffer name as an argument. This module allows
you to set the current buffer and use those functions.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Customization.html#Customization">Customization</a></dt><dd>Manages the definition and organization of <b>customization
items</b>, which allow users to customize variables and (font) faces through
an organized, interactive user interface.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Files.html#Files">Directory</a></dt><dd>Functions for managing file directories.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/The-Echo-Area.html#The-Echo-Area">Echo_area</a></dt><dd><code>message</code> and co. print their messages here, at the bottom of
Emacs&rsquo;s frame. Also allows you to temporarily inhibit messages in the
echo area (but they&rsquo;re still logged to <code>*Messages*</code>).</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Faces.html#Faces">Face</a></dt><dd>Emacs&rsquo;s notion of fonts, including font families, sizes, weights,
styles, and decorations.<sup><a id="fnr.7" class="footref" href="#fn.7">7</a></sup></dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Named-Features.html#Named-Features">Feature</a></dt><dd>Emacs records a set of named features <i>provided</i> by packages
(they&rsquo;re just symbols). Code that depends on a given feature can
<i>require</i> it, which causes Emacs to load the package that provides
it&#x2014;unless it has already been loaded.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Files.html#Files">File</a></dt><dd>Functions for managing files (renaming, writing, permissions, etc.).</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Files.html#Files">Filename</a></dt><dd>Functions for managing filenames (extensions, relative/absolute
paths, etc.).</dd>
<dt>Find_function</dt><dd><code>find-function</code> jumps to the source code where a function
is defined, given its name. It&rsquo;s not built-in as part of
Emacs, but rather is defined by the <a href="https://www.emacswiki.org/emacs/FindFunc">Find Func</a> package.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Evaluation.html#Evaluation">Form</a></dt><dd>Lisp treats all code as plain old data (cons cells and symbols and
numbers&#x2014;oh my!). Module <code>Form</code> contains functions specifically related
to treating Lisp values as code (e.g., <code>eval</code> and <code>quote</code>).</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Frames.html#Frames">Frame</a></dt><dd>Manages Emacs <b>frames</b>, which you probably call &ldquo;windows&rdquo; if you
run GUI Emacs (as opposed to in the terminal). For Emacs&rsquo;s <b>windows</b>, see
below.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Functions.html#Functions">Function</a></dt><dd>Manages Elisp functions. <code>Function.Fn.t</code> is the common signature
for all OCaml functions that are to be called from within Emacs.</dd>
<dt><a href="https://www.emacswiki.org/emacs/GrepMode">Grep</a></dt><dd>Facility for running <code>grep</code> from within Emacs. <code>grep</code> is run
asynchronously, and the results are collected in the <code>*grep*</code> buffer.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Hash-Tables.html#Hash-Tables">Hash_table</a></dt><dd>Elisp&rsquo;s built-in hash table data structure.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Hooks.html#Hooks">Hook</a></dt><dd>Hooks identify logical places where you can register functions to be
called. For example, you could attach a function to <code>before-save-hook</code> to
remove trailing whitespace from a file you&rsquo;re editing before you save it.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Input-Events.html#Input-Events">Input_event</a></dt><dd>Deals with user input events, such as mouse clicks and key
presses. You could use <code>Input_event.modifiers</code> to determine whether the
Control key was held down while a key was pressed.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Key-Sequences.html#Key-Sequences">Key_sequence</a></dt><dd>A sequence of one or more key presses that form a unit, such
as <kbd>C-x C-f</kbd>. This module can be used to read key
sequences from user input or simulate key presses using
<code>execute-kbd-macro</code>.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Keymaps.html#Keymaps">Keymap</a></dt><dd>Keymaps relate input events to commands or other keymaps (allowing
multiple input events to correspond to a single command). This is how
keys are bound to commands, and we can use these to provide key bindings
for any commands we provide in our plugin.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Loading.html#Loading">Load</a></dt><dd>Load another plugin, usually a Lisp file. Also contains <code>path</code>,
which allows you to examine the load path, which is a list of directories
where Emacs will look for the file name you pass to <code>load</code>.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Where-Defined.html#index-load_002dhistory-1059">Load_history</a></dt><dd><p>
Best described by an excerpt from the doc comments:
</p>
<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-comment-delimiter">(* </span><span class="org-comment">[update_emacs_with_entries] updates [load-history] with the information supplied to</span>
<span class="org-comment">   [add_entry], which make it possible to, within Emacs, jump from a symbol defined by</span>
<span class="org-comment">   Ecaml to the Ecaml source. </span><span class="org-comment-delimiter">*)</span>
</pre>
</div>

<p>
             Used behind-the-scenes by <code>defcustom</code>, <code>defun</code>, <code>defvar</code>,
and so on.
</p></dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/emacs/Major-Modes.html">Major_mode</a></dt><dd>Manages major modes, major mode keymaps, derived modes, etc.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Markers.html#Markers">Marker</a></dt><dd>Markers keep track of a certain position in a buffer. They are
automatically adjusted when the buffer is edited, so that they maintain
the same logical position if not the same offset.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Minibuffers.html#Minibuffers">Minibuffer</a></dt><dd>Often confused with the Echo Area (I used this term wrong in
the first post of this series). Used for reading input from the user. For
example, when you run an interactive command through
<kbd>M-x</kbd>, the minibuffer is where you type the name of the
command you want to run.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Minor-Modes.html#Minor-Modes">Minor_mode</a></dt><dd>Secondary modes you can enable and disable that provide
additional functionality on top of the major mode.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Creating-Symbols.html#Creating-Symbols">Obarray</a></dt><dd>An internal Emacs data structure that stores a set of symbols,
for use with <code>intern</code> and <code>read</code>. Normally there is only one, stored in
the variable <code>obarray</code>, so there is only one symbol with a given name.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Point.html#Point">Point</a></dt><dd>The location of the cursor in a buffer. Module <code>Point</code> contains
functions for setting the point and searching forward for a given string,
for example.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Positions.html#Positions">Position</a></dt><dd>A number that describes the position of a character or cursor
within a buffer. Starts at 1. Doesn&rsquo;t automatically move when the buffer
is edited, unlike a marker.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Processes.html#Processes">Process</a></dt><dd>Manages child processes of Emacs, such as shells, <code>ispell</code>, or
Merlin. This also includes the Emacs server.</dd>
<dt>Q</dt><dd>Contains a large number of symbols, including keyword symbols (module
<code>Q.K</code>) and ampersand-symbols like <code>&amp;optional</code> (module <code>Q.A</code>). These are
all stored here to avoid unnecessarily allocating OCaml data structures
to refer to these symbols whenever they&rsquo;re needed.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Regular-Expressions.html#Regular-Expressions">Regexp</a></dt><dd>Elisp regular expressions. They have slightly different syntax
from regular expressions you might use in Perl or Perl-compatible
regexps.<sup><a id="fnr.8" class="footref" href="#fn.8">8</a></sup></dd>
<dt>Selected_window</dt><dd>Similar to <code>Current_buffer</code>, but for Emacs&rsquo;s &ldquo;windows&rdquo;.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Symbols.html#Symbols">Symbol</a></dt><dd><p>
Identifiers in Elisp are converted to <b>symbols</b>, which are
&ldquo;interned&rdquo;. This means that two symbols with the same name<sup><a id="fnr.9" class="footref" href="#fn.9">9</a></sup> are
physically the same object, and so can be compared by pointer
equality. Module <code>Symbol</code> also contains functions for calling Elisp
functions from OCaml, with convenience functions for different arities.
</p>

<p>
       For example, <code>Symbol.funcall1</code> accepts a symbol, which should
denote the name of a function, and a single argument, which is passed to
the function. Functions in this module whose names end in <code>_i</code> ignore the
return value and return <code>unit</code> instead of <code>Value.t</code>. These are useful
since a lot of Elisp functions with side effects don&rsquo;t return anything
useful.
</p>

<p>
       Most of the other modules&rsquo; functionality is built on top of
<code>Symbol</code> and its function-calling functions.
</p></dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-Tables.html#Syntax-Tables">Syntax_table</a></dt><dd>Syntax tables are used to provide language-specific
functionality, e.g., syntax highlighting.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/System-Interface.html#System-Interface">System</a></dt><dd>Interacts with the operating system. Currently contains functions
for setting and querying environment variables.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Strings-and-Characters.html#Strings-and-Characters">Text</a></dt><dd>Represents Emacs&rsquo;s strings, which contain not only characters but
also <b>text properties</b>, which enhance text to provide everything from
(font) faces to read-only status to text-based &ldquo;buttons&rdquo;. See the <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Properties.html#Text-Properties">manual</a>
for more on text properties.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Timers.html#Timers">Timer</a></dt><dd>Timers allow you to schedule a function to be run at a future time,
possibly more than once.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/User-Identification.html#User-Identification">User</a></dt><dd>Manages operating system users, such as their login names and UIDs.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Lisp-Data-Types.html#Lisp-Data-Types">Value</a></dt><dd>Lisps are dynamically typed, and every type of value is ultimately
a subtype of <code>Value.t</code>. Contains a lot of type predicates that allow you
to test the type of an arbitrary value.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Variables.html#Variables">Var</a></dt><dd>Special features available to variables, such as setting a
buffer-local value or a default value.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Vectors.html#Vectors">Vector</a></dt><dd>Elisp&rsquo;s built-in vector data structure.</dd>
<dt><a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Windows.html#Windows">Window</a></dt><dd>Confusingly, Emacs uses the term <b>window</b> to refer to what most
people might call window panes. A window <i>displays</i> a buffer.</dd>
<dt>Working_directory</dt><dd>Each buffer, as well as Emacs itself, has a working
directory. Relative paths are resolved relative to this directory.
<code>Working_directory.within</code> runs a function with the current working
directory set to a given value.</dd>
</dl>
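<p>
The interning behavior described under <code>Symbol</code> and <code>Obarray</code> can be sketched
with a toy model in plain OCaml. This is only an illustration, not how Emacs or
Ecaml implement it: the <code>symbol</code> record, the <code>obarray</code> table, and this <code>intern</code>
function are all made up for the example.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* Toy model of symbol interning; NOT Ecaml's or Emacs's implementation. *)
type symbol =
  { name : string
  ; mutable plist : (string * string) list (* hypothetical property list *)
  }

(* A hypothetical "obarray": maps each name to its one canonical symbol. *)
let obarray : (string, symbol) Hashtbl.t = Hashtbl.create 16

(* Return the canonical symbol for [name], creating it on first use. *)
let intern name =
  match Hashtbl.find_opt obarray name with
  | Some sym -&gt; sym
  | None -&gt;
    let sym = { name; plist = [] } in
    Hashtbl.add obarray name sym;
    sym

let () =
  let a = intern "before-save-hook" in
  let b = intern "before-save-hook" in
  let fresh = { name = "before-save-hook"; plist = [] } in
  (* Interned twice under the same name: physically the same object. *)
  assert (a == b);
  (* Same contents but never interned: structurally equal ... *)
  assert (fresh = a);
  (* ... yet a physically distinct object. *)
  assert (not (fresh == a))
;;
</pre>
</div>

<p>
Because every lookup of a given name returns the same object, comparing two
interned symbols needs only a physical-equality check (<code>==</code>) rather than a
string comparison.
</p>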
</div>
</div>
<div id="outline-container-org56e8521" class="outline-2">
<h2 id="to-be-continued"><a id="org56e8521"></a><span class="section-number-2">4</span> Phew!</h2>
<div class="outline-text-2" id="text-to-be-continued">
<p>
That was quite a few modules. We&rsquo;ll use some of them next week in building our
interpreter plugin. Catch you next time.
</p>
</div>
</div>
<div id="footnotes">
<h2 class="footnotes">Footnotes: </h2>
<div id="text-footnotes">

<div class="footdef"><sup><a id="fn.1" class="footnum" href="#fnr.1">1</a></sup> <div class="footpara"><p class="footpara">
Plus, I have no idea what many of the individual pieces do.
</p></div></div>

<div class="footdef"><sup><a id="fn.2" class="footnum" href="#fnr.2">2</a></sup> <div class="footpara"><p class="footpara">
<code>ppx_jane</code>, Jane Street&rsquo;s set of ppx rewriters, includes <code>ppx_here</code>, so
if you already have that installed, you can use it to provide <code>[%here]</code>. You&rsquo;ll
need to add the following to your <code>jbuild</code> file:
</p>

<div class="org-src-container">
<pre class="src src-tuareg-jbuild">(jbuild_version 1)

(executables
 ((names     (main))
  (libraries (ecaml))
  (preprocess (pps (ppx_jane))))) ; add this line
</pre>
</div></div></div>

<div class="footdef"><sup><a id="fn.3" class="footnum" href="#fnr.3">3</a></sup> <div class="footpara"><p class="footpara">
If you&rsquo;re not keen on using syntax extensions, you can use OCaml&rsquo;s
built-in <code>__POS__</code> macro. Unfortunately, this macro returns a tuple instead of a
record, but since all the fields are in the right order, you can do <code>Obj.magic
__POS__</code> to get a <code>Lexing.position</code>. But don&rsquo;t tell anyone I told you that!
</p></div></div>

<div class="footdef"><sup><a id="fn.4" class="footnum" href="#fnr.4">4</a></sup> <div class="footpara"><p class="footpara">
Elisp is a Lisp-2, meaning that functions and variables live in two
different and non-overlapping namespaces. That&rsquo;s why defining functions and
variables uses different mechanisms (unlike in, say, OCaml!).
</p></div></div>

<div class="footdef"><sup><a id="fn.5" class="footnum" href="#fnr.5">5</a></sup> <div class="footpara"><p class="footpara">
See? Interfacing OCaml with Lisp isn&rsquo;t that weird after all!
</p></div></div>

<div class="footdef"><sup><a id="fn.6" class="footnum" href="#fnr.6">6</a></sup> <div class="footpara"><p class="footpara">
In fact, this was one of the key original motivations for Ecaml.
</p></div></div>

<div class="footdef"><sup><a id="fn.7" class="footnum" href="#fnr.7">7</a></sup> <div class="footpara"><p class="footpara">
Surprisingly, faces in Elisp live in a completely separate namespace from
variables and functions. So perhaps Elisp is a Lisp-3?
</p></div></div>

<div class="footdef"><sup><a id="fn.8" class="footnum" href="#fnr.8">8</a></sup> <div class="footpara"><p class="footpara">
For example, the grouping operator is <code>\( ... \)</code>, not <code>( ... )</code>. Plain
old parentheses just match literal parentheses in the target string.
</p></div></div>

<div class="footdef"><sup><a id="fn.9" class="footnum" href="#fnr.9">9</a></sup> <div class="footpara"><p class="footpara">
in the same obarray.
</p></div></div>


</div>
</div>]]></content><author><name>Aaron L. Zeng</name></author><category term="ecaml-getting-started" /><category term="ecaml" /><category term="emacs" /><category term="ocaml" /><summary type="html"><![CDATA[This post is part 3 of a series (prev/next). The full code is available on GitHub.]]></summary></entry><entry><title type="html">App::WatchLater: over-engineering YouTube consumption</title><link href="https://blag.bcc32.com/uncategorized/2017/11/12/app-watchlater/" rel="alternate" type="text/html" title="App::WatchLater: over-engineering YouTube consumption" /><published>2017-11-12T01:38:05-05:00</published><updated>2017-11-12T01:38:05-05:00</updated><id>https://blag.bcc32.com/uncategorized/2017/11/12/app-watchlater</id><content type="html" xml:base="https://blag.bcc32.com/uncategorized/2017/11/12/app-watchlater/"><![CDATA[<p>I recently uploaded my first <a href="https://metacpan.org/pod/distribution/App-WatchLater/bin/watch-later">distribution</a> to CPAN, the Comprehensive
Perl Archive Network. Named App-WatchLater, it consists of a couple of scripts
that manage a queue of YouTube videos to be watched later. (I know, real First
World Problem right there.)</p>

<p>Why on Earth would I write such a thing when there’s literally a <strong>Watch Later</strong>
feature built into YouTube itself? Well, I guess it must be my weird consumption
habits.</p>

<p>At one point I had over 2800 videos on my YouTube Watch Later playlist, which
became super unwieldy. YouTube playlists only load 100 videos at a time, and my
Watch Later list was in a totally arbitrary order and completely unsearchable.</p>

<p>The main modules are <code class="language-plaintext highlighter-rouge">App::WatchLater</code>, which controls most of the database and
command-line options, and <code class="language-plaintext highlighter-rouge">App::WatchLater::YouTube</code>, which is an interface to
the YouTube Data API.</p>

<h2 id="watch-later">watch-later</h2>

<p>The script <code class="language-plaintext highlighter-rouge">watch-later</code> maintains a SQLite database of videos, including the
video ID, title, and channel name and ID. It also keeps track of which have been
watched and which have not. (More fields may be added in the future, e.g., video
duration.)</p>

<p>Then, using the SQLite command-line tool, it is easy to query for videos I’ve
bookmarked from a given channel or with a given keyword in the title, so that I
can watch something appropriate for the mood I’m in.</p>

<h2 id="migrating">Migrating</h2>

<p>Since the YouTube Data API no longer allows access to the user’s <code class="language-plaintext highlighter-rouge">WL</code> (Watch
Later) playlist items, it was not possible for me to directly retrieve all of
the videos in my queue and then clear out my queue.</p>

<p>Instead, I had to write another script to scrape the playlist page for video
IDs, 100 at a time, and then another bit of JavaScript to simulate clicking the
“remove video from playlist” button for those 100 videos. Lather, rinse, repeat.</p>

<p>Now, just 2420 to go. If only I had the willpower to take care of the root
cause…</p>]]></content><author><name>Aaron L. Zeng</name></author><category term="uncategorized" /><category term="life-scripts" /><category term="perl" /><summary type="html"><![CDATA[I recently uploaded my first distribution to CPAN, the Comprehensive Perl Archive Network. Named App-WatchLater, it consists of a couple of scripts that manage a queue of YouTube videos to be watched later. (I know, real First World Problem right there.)]]></summary></entry><entry><title type="html">Emacs Plugins in OCaml: Hello, Ecaml! (part 1)</title><link href="https://blag.bcc32.com/ecaml-getting-started/2017/11/05/emacs-plugins-in-ocaml-1/" rel="alternate" type="text/html" title="Emacs Plugins in OCaml: Hello, Ecaml! (part 1)" /><published>2017-11-05T16:00:00-05:00</published><updated>2017-11-05T16:00:00-05:00</updated><id>https://blag.bcc32.com/ecaml-getting-started/2017/11/05/emacs-plugins-in-ocaml-1</id><content type="html" xml:base="https://blag.bcc32.com/ecaml-getting-started/2017/11/05/emacs-plugins-in-ocaml-1/"><![CDATA[<p class="alert alert-info">
This post is part 1 of a
<a class="alert-link" href="/categories/ecaml-getting-started/">series</a>
(<a class="alert-link" href="/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-2/">next</a>).
The full code is available on
<a class="alert-link" href="https://github.com/bcc32/ecaml-bf">GitHub</a>.
</p>

<p>
There was recently a <a href="https://discuss.ocaml.org/t/sample-emacs-plugin-using-ecaml/1016">thread</a> on the OCaml Discourse <a href="https://discuss.ocaml.org/">forum</a> about sample Emacs
plugins using the <a href="https://github.com/janestreet/ecaml">Ecaml</a> library. Since the library doesn&rsquo;t seem to have been
getting that much use, and since I have some personal interest<sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup> in seeing
it used more, I decided to help by writing a Getting Started guide. Here goes.
</p>

<p>
This is part 1 of a <a href="/categories/ecaml-getting-started/">series</a> (<a href="/ecaml-getting-started/2017/11/12/emacs-plugins-in-ocaml-2/">next</a>). We&rsquo;ll see why you might use Ecaml to write a plugin,
and get it set up and running with a simple Hello World. All of the code for
this series is available on <a href="https://github.com/bcc32/ecaml-bf">GitHub</a>.
</p>

<p>
(This guide has only been tested on macOS 10.12, but probably works on other
Unices like Linux or OS X. Unfortunately I have no idea about Windows.)
</p>

<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#emacs">1. Emacs</a>
<ul>
<li><a href="#elisp">1.1. Elisp</a></li>
<li><a href="#native-plugins">1.2. Native Plugins</a></li>
</ul>
</li>
<li><a href="#ocaml-plugins">2. OCaml plugins?</a></li>
<li><a href="#lets-write-a-plugin">3. Let&rsquo;s write a plugin</a>
<ul>
<li><a href="#obtaining-ecaml">3.1. Obtaining Ecaml</a></li>
<li><a href="#hello-from-ocaml">3.2. Hello from OCaml</a></li>
<li><a href="#compiling-our-plugin">3.3. Compiling our plugin</a></li>
</ul>
</li>
</ul>
</div>
</div>

<div id="outline-container-orgf869342" class="outline-2">
<h2 id="emacs"><a id="orgf869342"></a><span class="section-number-2">1</span> Emacs</h2>
<div class="outline-text-2" id="text-emacs">
<p>
Emacs is a great <del>text editor</del> <del>operating system</del> habitat. Emacs&rsquo;s power and
popularity arguably stem from its ability to be endlessly customized and
enhanced by plugins (i.e., extensions, packages, modules)&#x2014;for example, I&rsquo;m
writing this post in <code>org-mode</code> while editing its YAML metadata (in the same
file) in <code>yaml-mode</code>. These two editing modes are able to play nicely in the
same buffer thanks to a package called <code>mmm-mode</code> (&ldquo;multiple major modes&rdquo;&#x2026;
mode).
</p>

<p>
In this series, we&rsquo;re going to write an interpreter for a minimalist
Turing-complete programming <a href="https://en.wikipedia.org/wiki/Brainfuck">language</a>. We want to be able to run the code we&rsquo;re
editing directly within Emacs, instead of switching to our shell and running
it from there, so we&rsquo;ll embed our interpreter as a plugin in Emacs.
</p>
</div>
<div id="outline-container-orgab7c3aa" class="outline-3">
<h3 id="elisp"><a id="orgab7c3aa"></a><span class="section-number-3">1.1</span> Elisp</h3>
<div class="outline-text-3" id="text-elisp">
<p>
In the past, Emacs plugins have all been written in a language called Emacs
Lisp (or Elisp), which, as its name suggests, is a Lisp. Elisp is a perfectly
nice language, especially for writing the sort of code you might find in a
plugin, but it has a couple of drawbacks. The main one I&rsquo;m interested in is
speed.
</p>

<p>
Ya see, Emacs&rsquo;s Lisp interpreter ain&rsquo;t exactly the fastest. A naïve speed
comparison between Elisp and Ruby, for dramatic effect:
</p>

<div class="org-src-container">
<pre class="src src-elisp"><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">benchmark-run</span> <span class="org-rainbow-delimiters-depth-2">(</span>print <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-keyword">loop</span> for i below <span class="org-highlight-numbers-number">1000000</span> sum i<span class="org-rainbow-delimiters-depth-3">)</span><span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>
<span class="org-comment-delimiter">;; </span><span class="org-comment">499999500000</span>
<span class="org-comment-delimiter">;; </span><span class="org-comment">(0.389676 0 0.0)</span>
</pre>
</div>

<div class="org-src-container">
<pre class="src src-sh"><span class="org-builtin">time</span> ruby -e <span class="org-string">'p (0...1000000).reduce(&amp;:+)'</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">499999500000</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">ruby -e 'p (0...1000000).reduce(&amp;:+)'  0.17s user 0.03s system 81% cpu 0.236 total</span>
</pre>
</div>

<p>
Both of these snippets just iterate from 0 to 999,999, adding up the numbers as they go.
</p>

<p>
The Elisp code takes quite a bit longer to add up the natural numbers less
than one million, and that&rsquo;s including the startup time of the <code>ruby</code>
interpreter! Now, to be fair, when compiled, the above Elisp code only takes
around 0.125 seconds, but that&rsquo;s still substantially slower than compiled
languages like C or Java (or OCaml).
</p>
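<p>
For reference, here is a rough sketch of the same summation in OCaml. The
function name <code>sum_below</code> is made up for this example, and no timing is claimed
for it; it assumes a 64-bit system, where OCaml&rsquo;s native <code>int</code> is wide enough to
hold the result.
</p>

<div class="org-src-container">
<pre class="src src-ocaml">(* Sum the natural numbers below [n], like the Elisp and Ruby one-liners. *)
let sum_below n =
  let total = ref 0 in
  for i = 0 to n - 1 do
    total := !total + i
  done;
  !total

let () = Printf.printf "%d\n" (sum_below 1_000_000)
</pre>
</div>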
</div>
</div>
<div id="outline-container-orgde58825" class="outline-3">
<h3 id="native-plugins"><a id="orgde58825"></a><span class="section-number-3">1.2</span> Native Plugins</h3>
<div class="outline-text-3" id="text-native-plugins">
<p>
Emacs 25 added the ability to load <b>native code</b> plugins. Native code refers
to code that runs directly on the processor, instead of being interpreted by
another program, like Emacs&rsquo;s Lisp interpreter or Ruby&rsquo;s virtual machine.
</p>

<p>
There&rsquo;s quite a good introduction to the basics of writing native plugins
<a href="http://diobla.info/blog-archive/modules-tut.html">here</a>, but the gist of it is that Emacs exposes a small C API which you can
write modules against. Compiling them to shared objects (i.e., dynamically
linked libraries, <code>.so</code>&rsquo;s, <code>.dll</code>&rsquo;s, etc.) allows Emacs to dynamically load
the plugin at runtime.
</p>
</div>
</div>
</div>
<div id="outline-container-orgdd09a67" class="outline-2">
<h2 id="ocaml-plugins"><a id="orgdd09a67"></a><span class="section-number-2">2</span> OCaml plugins?</h2>
<div class="outline-text-2" id="text-ocaml-plugins">
<p>
OCaml is a statically-typed functional programming language with a great deal
of versatility, contrary to what you might have first expected when you just
read the words &ldquo;statically-typed&rdquo; and &ldquo;functional&rdquo;. Of course, since you&rsquo;re
reading this blog post, I assume you are familiar with OCaml and won&rsquo;t belabor
singing its praises.
</p>

<p>
I&rsquo;ll just point out that OCaml has a quite nice way to interface with C code
using its foreign function interface (FFI), which means we can use OCaml to
write native plugins for Emacs, as long as we have some library to provide the
interface between the Emacs C API and our OCaml plugin. That&rsquo;s where Ecaml
comes in.
</p>
</div>
</div>
<div id="outline-container-orge83f9d6" class="outline-2">
<h2 id="lets-write-a-plugin"><a id="orge83f9d6"></a><span class="section-number-2">3</span> Let&rsquo;s write a plugin</h2>
<div class="outline-text-2" id="text-lets-write-a-plugin">
<p>
<a href="https://github.com/janestreet/ecaml">Ecaml</a> is an open-source OCaml library for writing native Emacs plugins,
available on OPAM or Jane Street&rsquo;s GitHub page. We&rsquo;ll use Ecaml to write our
Emacs plugin, in OCaml instead of Elisp.
</p>

<div class="org-src-container">
<pre class="src src-sh">mkdir ~/ecaml-bf
<span class="org-builtin">cd</span> ~/ecaml-bf
git init           <span class="org-comment-delimiter"># </span><span class="org-comment">or hg, if you prefer</span>
touch jbuild-workspace
</pre>
</div>
</div>
<div id="outline-container-orgeaaabb8" class="outline-3">
<h3 id="obtaining-ecaml"><a id="orgeaaabb8"></a><span class="section-number-3">3.1</span> Obtaining Ecaml</h3>
<div class="outline-text-3" id="text-obtaining-ecaml">
<p>
We&rsquo;ll use the development version of Ecaml, since the version in the OPAM
repository is significantly outdated. Jane Street hosts an OPAM repo with the
development versions of their open-source libraries, so let&rsquo;s add the repo to
OPAM:
</p>

<div class="org-src-container">
<pre class="src src-sh">opam switch <span class="org-highlight-numbers-number">4.05.0</span>
opam repo add janestreet-bleeding https://ocaml.janestreet.com/opam-repository
opam install ecaml
</pre>
</div>
</div>
</div>
<div id="outline-container-orgac28961" class="outline-3">
<h3 id="hello-from-ocaml"><a id="orgac28961"></a><span class="section-number-3">3.2</span> Hello from OCaml</h3>
<div class="outline-text-3" id="text-hello-from-ocaml">
<p>
We&rsquo;ll start with Hello World for now (<code>src/plugin.ml</code>):
</p>

<div class="org-src-container">
<pre class="src src-ocaml"><span class="org-tuareg-font-lock-governing">let</span> <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">()</span></span> <span class="org-tuareg-font-lock-operator">=</span>
  <span class="org-tuareg-font-lock-module">Ecaml.</span>message <span class="org-string">"Hello from OCaml"</span><span class="org-tuareg-font-lock-operator">;</span>
  <span class="org-tuareg-font-lock-module">Ecaml.</span>provide <span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">(</span></span><span class="org-tuareg-font-lock-module">Ecaml.Symbol.</span>intern <span class="org-string">"ecaml-bf"</span><span class="org-tuareg-font-lock-operator"><span class="org-rainbow-delimiters-depth-1">)</span></span>
<span class="org-tuareg-font-double-colon">;;</span>
</pre>
</div>

<p>
Our simple plugin just writes a message to the echo area (or standard error
if Emacs is run in <code>--batch</code> mode). Emacs plugins need to <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Named-Features.html#index-provide-1052"><code>provide</code></a> the name
of the plugin in order to be loaded successfully by <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Named-Features.html#index-require-1053"><code>require</code></a>, so we do that
just before our plugin exits.
</p>

<p>
Note that OCaml&rsquo;s module initialization (i.e., the stuff that runs when you
compile and execute a normal OCaml program) serves as the code that is run
when the plugin is <b>loaded</b>. In order to extend Emacs&rsquo;s behavior while
running, we&rsquo;ll need to define some Elisp functions and key bindings to call
those functions, later. For now, let&rsquo;s just try our plugin out.
</p>
</div>
</div>
<div id="outline-container-orgd39da0e" class="outline-3">
<h3 id="compiling-our-plugin"><a id="orgd39da0e"></a><span class="section-number-3">3.3</span> Compiling our plugin</h3>
<div class="outline-text-3" id="text-compiling-our-plugin">
<p>
Unfortunately, I haven&rsquo;t yet been able to figure out a way to get <code>jbuilder</code>
to build shared objects. However, taking advantage of the fact that
executable files and shared objects have basically the same file format
(e.g., Mach-O or ELF), we can just pretend that the compiled executable is a
shared object. It&rsquo;s fine, trust me. Probably.
</p>

<p>
We&rsquo;ll set up our <code>jbuild</code> file like so:
</p>

<div class="org-src-container">
<pre class="src src-tuareg-jbuild"><span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">jbuild_version</span> <span class="org-highlight-numbers-number">1</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">executables</span>
 <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">names</span>     <span class="org-rainbow-delimiters-depth-4">(</span>plugin<span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span>
  <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">libraries</span> <span class="org-rainbow-delimiters-depth-4">(</span>ecaml<span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span><span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">rule</span> <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-builtin">copy</span> plugin.exe ecaml-bf.so<span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">alias</span>
 <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">name</span> plugin<span class="org-rainbow-delimiters-depth-3">)</span>
  <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">deps</span> <span class="org-rainbow-delimiters-depth-4">(</span>ecaml-bf.so<span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span><span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>

<span class="org-rainbow-delimiters-depth-1">(</span><span class="org-keyword">alias</span>
 <span class="org-rainbow-delimiters-depth-2">(</span><span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">name</span> runtest<span class="org-rainbow-delimiters-depth-3">)</span>
  <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">deps</span> <span class="org-rainbow-delimiters-depth-4">(</span><span class="org-rainbow-delimiters-depth-5">(</span><span class="org-keyword">alias</span> plugin<span class="org-rainbow-delimiters-depth-5">)</span><span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span>
  <span class="org-rainbow-delimiters-depth-3">(</span><span class="org-constant">action</span> <span class="org-rainbow-delimiters-depth-4">(</span><span class="org-builtin">run</span> emacs -Q -L . --batch --eval <span class="org-string">"(require 'ecaml-bf)"</span><span class="org-rainbow-delimiters-depth-4">)</span><span class="org-rainbow-delimiters-depth-3">)</span><span class="org-rainbow-delimiters-depth-2">)</span><span class="org-rainbow-delimiters-depth-1">)</span>
</pre>
</div>

<p>
Note the <code>copy</code> rule, which simply copies the <code>ocamlopt</code>-produced native
executable file to another name ending in <code>.so</code>. This is the filename
extension that Emacs will look for when we try to load our plugin.
</p>

<p>
We also define a <code>runtest</code> alias whose action runs Emacs in batch mode and
<a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Named-Features.html#index-require-1053"><code>require</code></a>s our plugin. (I added the <code>-Q</code> flag to skip user initialization,
etc. No need to load all of my <a href="https://github.com/syl20bnr/spacemacs">Spacemacs</a> config just to run a small test in
batch mode!) Let&rsquo;s try it out:
</p>

<div class="org-src-container">
<pre class="src src-sh">jbuilder build @plugin <span class="org-highlight-numbers-number">2</span>&gt;/dev/null &amp;&amp; jbuilder runtest
<span class="org-comment-delimiter"># </span><span class="org-comment">Hello from OCaml</span>
<span class="org-comment-delimiter"># </span><span class="org-comment">"Loaded Ecaml."</span>
</pre>
</div>

<p>
As you can see, the plugin printed our message to standard error. It also
printed a message from Ecaml itself. Normally these messages just end up in
your <code>*Messages*</code> buffer, so it&rsquo;s easy to check that they&rsquo;re there.
</p>
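
<p>
For reference, the <code>copy</code> rule and the <code>runtest</code> action boil down to roughly
the following manual steps. This is only a sketch: it assumes the <code>jbuild</code> file
sits at the project root and that jbuilder uses its default <code>_build/default</code>
build directory. The <code>build_dir</code> and <code>status</code> variables are names made up for
illustration.
</p>

```shell
# Sketch only: a manual stand-in for the copy rule and the runtest action.
# Assumption: jbuilder's default build directory, with the jbuild file at
# the project root. build_dir and status are hypothetical names.
build_dir="_build/default"

if [ -f "$build_dir/plugin.exe" ]; then
  # Like the (copy plugin.exe ecaml-bf.so) rule, except that the copy
  # lands in the current directory instead of the build directory.
  cp -f "$build_dir/plugin.exe" ./ecaml-bf.so
  # The same command as the runtest action; "-L ." adds the current
  # directory to the load path so that require can find ecaml-bf.so.
  emacs -Q -L . --batch --eval "(require 'ecaml-bf)"
  status="ran"
else
  echo "plugin.exe not found; run 'jbuilder build @plugin' first"
  status="skipped"
fi
```

<p>
Letting jbuilder do this is still the better option, since the alias
dependencies make sure the plugin is rebuilt before the test runs.
</p>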

<p>
Great! We successfully compiled and loaded our plugin into Emacs, without
writing a lick of Elisp (except for the <code>(require 'ecaml-bf)</code>). Next time we&rsquo;ll
start working on our interpreter. Stay tuned!
</p>
</div>
</div>
</div>
<div id="footnotes">
<h2 class="footnotes">Footnotes: </h2>
<div id="text-footnotes">

<div class="footdef"><sup><a id="fn.1" class="footnum" href="#fnr.1">1</a></sup> <div class="footpara"><p class="footpara">
I wrote much of an earlier version of Ecaml<sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup> (e.g., the C stubs and
low-level OCaml interface) during my internship at Jane Street in the summer of
2016 under the tremendous guidance of my mentor Stephen Weeks. It&rsquo;s a great
place to work, and I recommend that anyone interested in their internship program
check it out and/or <a href="https://www.janestreet.com/join-jane-street/internships/">apply</a>.
</p></div></div>

<div class="footdef"><sup><a id="fn.2" class="footnum" href="#fnr.2">2</a></sup> <div class="footpara"><p class="footpara">
Of course, it has since been much expanded, rewritten, and improved upon
by people much more experienced than me.
</p></div></div>


</div>
</div>]]></content><author><name>Aaron L. Zeng</name></author><category term="ecaml-getting-started" /><category term="ecaml" /><category term="emacs" /><category term="ocaml" /><summary type="html"><![CDATA[This post is part 1 of a series (next). The full code is available on GitHub.]]></summary></entry><entry><title type="html">This Is a Blog</title><link href="https://blag.bcc32.com/uncategorized/2017/10/29/this-is-a-blog/" rel="alternate" type="text/html" title="This Is a Blog" /><published>2017-10-29T02:41:39-04:00</published><updated>2017-10-29T02:41:39-04:00</updated><id>https://blag.bcc32.com/uncategorized/2017/10/29/this-is-a-blog</id><content type="html" xml:base="https://blag.bcc32.com/uncategorized/2017/10/29/this-is-a-blog/"><![CDATA[<p>So, I convinced myself I should start a blog. So here it is. Initially I used
WordPress but that was too much and blegh. This is now built on Jekyll.</p>]]></content><author><name>Aaron L. Zeng</name></author><category term="uncategorized" /><summary type="html"><![CDATA[So, I convinced myself I should start a blog. So here it is. Initially I used WordPress but that was too much and blegh. This is now built on Jekyll.]]></summary></entry></feed>