<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://jzhonx.github.io//feed.xml" rel="self" type="application/atom+xml" /><link href="https://jzhonx.github.io//" rel="alternate" type="text/html" /><updated>2026-04-05T08:04:36+00:00</updated><id>https://jzhonx.github.io//feed.xml</id><title type="html">JzNext</title><subtitle>Go, Haskell, Declarative languages</subtitle><author><name>Junxiang Zhou</name></author><entry><title type="html">Understanding Zippers</title><link href="https://jzhonx.github.io//haskell/2026/03/17/zippers.html" rel="alternate" type="text/html" title="Understanding Zippers" /><published>2026-03-17T00:00:00+00:00</published><updated>2026-03-17T00:00:00+00:00</updated><id>https://jzhonx.github.io//haskell/2026/03/17/zippers</id><content type="html" xml:base="https://jzhonx.github.io//haskell/2026/03/17/zippers.html"><![CDATA[<p>When working with tree-structured data, we often need to navigate to a specific node and modify it. In imperative languages, this is usually straightforward thanks to mutable state and parent pointers. In functional languages, however, immutability makes this pattern less obvious.</p>

<p>In this post, we’ll explore how to navigate and modify tree structures efficiently in functional languages using a technique called <strong>zippers</strong>.</p>

<h2 id="simple-json-query-language">Simple JSON query language</h2>

<p>Suppose we are implementing a simple JSON query tool with syntax similar to <code class="language-plaintext highlighter-rouge">jq</code>. The language allows us to access and modify values in a JSON object.</p>

<p>The core syntax:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">.field</code> — access a field</li>
  <li><code class="language-plaintext highlighter-rouge">=</code> — replace the current value</li>
  <li><code class="language-plaintext highlighter-rouge">|</code> — chain operations</li>
  <li><code class="language-plaintext highlighter-rouge">with_cursor($=filter, $cursor_movement)</code> — create a cursor at the node specified by <code class="language-plaintext highlighter-rouge">filter</code>, then execute the second argument as a query with the
cursor as the root node. Each <code class="language-plaintext highlighter-rouge">cursor_movement</code> in that query is one of:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">.field</code> — access a field relative to the current cursor node.</li>
      <li><code class="language-plaintext highlighter-rouge">.^</code> — access the parent node of the current cursor node.</li>
    </ul>
  </li>
</ul>

<p>In an imperative setting, we might represent the JSON as a tree with parent pointers, making navigation (both downward and upward) trivial.</p>

<p>In a functional language like Haskell, we generally avoid parent pointers because maintaining them correctly under immutability is difficult. Instead, we need a different approach.</p>

<h3 id="example">Example</h3>

<p>Suppose we have the following JSON object:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"a"</span><span class="p">:{</span><span class="w">
    </span><span class="nl">"b"</span><span class="p">:{</span><span class="w">
        </span><span class="nl">"x"</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="w">
        </span><span class="nl">"y"</span><span class="p">:</span><span class="mi">2</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The following two queries produce the same result using different syntax:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.a.b.x = 42 | .a.b.y = 43

with_cursor($=.a.b, $.x = 42 | $.y = 43)
</code></pre></div></div>
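
<p>One way to make these two queries concrete is a small abstract syntax tree. The sketch below is only an illustration: the names <code class="language-plaintext highlighter-rouge">Path</code>, <code class="language-plaintext highlighter-rouge">Query</code>, <code class="language-plaintext highlighter-rouge">Assign</code>, <code class="language-plaintext highlighter-rouge">Pipe</code>, and <code class="language-plaintext highlighter-rouge">WithCursor</code> are hypothetical, assigned values are restricted to integers, and the <code class="language-plaintext highlighter-rouge">.^</code> parent movement is left out to keep it short.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Hypothetical AST for the query syntax; every name here is illustrative.
type Path = [String] -- ".a.b" becomes ["a", "b"]

data Query
  = Assign Path Int -- ".a.b.x = 42"
  | Pipe Query Query -- "q1 | q2": run q1, then q2
  | WithCursor Path Query -- "with_cursor($=path, q)": paths in q are relative to the cursor
  deriving (Show, Eq)

-- The two example queries from above.
query1, query2 :: Query
query1 = Pipe (Assign ["a", "b", "x"] 42) (Assign ["a", "b", "y"] 43)
query2 = WithCursor ["a", "b"] (Pipe (Assign ["x"] 42) (Assign ["y"] 43))
</code></pre></div></div>

<p>The two values differ as syntax trees even though they describe the same update, which is why an interpreter is free to evaluate them with different strategies.</p>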

<p>We will use the example to explain how to implement the query language in Haskell, and how to use zippers to navigate
and modify the tree data structure efficiently.</p>

<h2 id="navigating-and-modifying-trees">Navigating and Modifying Trees</h2>

<h3 id="persistent-data-structure">Persistent data structure</h3>

<p>In Haskell, data structures are typically immutable. To modify a node in a tree, we need to create a new tree that
contains the modified node, while sharing the unchanged nodes with the original tree. This is known as a <strong>persistent
data structure</strong>.</p>

<p>We define a minimal tree type:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">data</span> <span class="kt">Tree</span>
  <span class="o">=</span> <span class="kt">Atom</span> <span class="kt">Int</span>
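  <span class="o">|</span> <span class="kt">Object</span> <span class="p">[(</span><span class="kt">String</span><span class="p">,</span> <span class="kt">Tree</span><span class="p">)]</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">)</span>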
</code></pre></div></div>

<p>Example value:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">root</span> <span class="o">::</span> <span class="kt">Tree</span>
<span class="n">root</span> <span class="o">=</span> <span class="kt">Object</span> <span class="p">[(</span><span class="s">"a"</span><span class="p">,</span> <span class="kt">Object</span> <span class="p">[(</span><span class="s">"b"</span><span class="p">,</span> <span class="kt">Object</span> <span class="p">[(</span><span class="s">"x"</span><span class="p">,</span> <span class="kt">Atom</span> <span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="s">"y"</span><span class="p">,</span> <span class="kt">Atom</span> <span class="mi">2</span><span class="p">)])])]</span>
</code></pre></div></div>

<h3 id="first-approach-root-based">First Approach: Root-based</h3>

<p>The first query <code class="language-plaintext highlighter-rouge">.a.b.x = 42 | .a.b.y = 43</code> can be implemented by accessing the target node and modifying it, then
accessing another target node through the modified root node and modifying it again.</p>

<p>To update a node, we recursively follow the path from the root to the target node and create new nodes along
the way. The unchanged nodes are shared between the original tree and the new tree. The number of nodes that
the <code class="language-plaintext highlighter-rouge">access</code> function creates is <code class="language-plaintext highlighter-rouge">O(depth(node))</code>, where <code class="language-plaintext highlighter-rouge">depth(node)</code> is the depth of the target node in the tree.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">access</span> <span class="o">::</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="kt">Tree</span> <span class="o">-&gt;</span> <span class="kt">Tree</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kt">Tree</span> <span class="o">-&gt;</span> <span class="kt">Tree</span>
<span class="n">access</span> <span class="kt">[]</span> <span class="n">f</span> <span class="n">t</span> <span class="o">=</span> <span class="n">f</span> <span class="n">t</span>
<span class="n">access</span> <span class="p">(</span><span class="n">k</span> <span class="o">:</span> <span class="n">ks</span><span class="p">)</span> <span class="n">f</span> <span class="p">(</span><span class="kt">Object</span> <span class="n">ts</span><span class="p">)</span>
  <span class="o">|</span> <span class="kr">let</span> <span class="p">(</span><span class="n">before</span><span class="p">,</span> <span class="n">rest</span><span class="p">)</span> <span class="o">=</span> <span class="n">break</span> <span class="p">((</span><span class="o">==</span> <span class="n">k</span><span class="p">)</span> <span class="o">.</span> <span class="n">fst</span><span class="p">)</span> <span class="n">ts</span>
  <span class="p">,</span> <span class="p">((</span><span class="kr">_</span><span class="p">,</span> <span class="n">v</span><span class="p">)</span> <span class="o">:</span> <span class="n">after</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">rest</span> <span class="o">=</span>
      <span class="kr">let</span> <span class="n">modifiedChild</span> <span class="o">=</span> <span class="n">access</span> <span class="n">ks</span> <span class="n">f</span> <span class="n">v</span>
       <span class="kr">in</span> <span class="kt">Object</span> <span class="p">(</span><span class="n">before</span> <span class="o">++</span> <span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">modifiedChild</span><span class="p">)</span> <span class="o">:</span> <span class="n">after</span><span class="p">)</span>
<span class="n">access</span> <span class="kr">_</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">error</span> <span class="s">"Invalid path to access"</span>
</code></pre></div></div>

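<p>When the path is empty, we apply the modification function <code class="language-plaintext highlighter-rouge">f</code> to the current node. Otherwise, we find the child with
key <code class="language-plaintext highlighter-rouge">k</code>, recurse into it with the remaining path, and rebuild the current node with the modified child. If the key is
missing or the current node is an <code class="language-plaintext highlighter-rouge">Atom</code>, the last clause raises an error. The cost is
<code class="language-plaintext highlighter-rouge">O(depth(node))</code> new nodes per modification.</p>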

<h4 id="example-of-access">Example of <code class="language-plaintext highlighter-rouge">access</code></h4>

<p>Suppose we set the “x” node to 42 with <code class="language-plaintext highlighter-rouge">access ["a", "b", "x"] (const $ Atom 42) root</code>. The new tree and the original
tree then look like this:</p>

<pre><code class="language-mermaid">flowchart TD
	subgraph original tree
    root1("root") --&gt;|a| a
    a("tree_a") --&gt;|b| b
    b("tree_b") --&gt;|x| x("1")
    b --&gt;|y| y("2")
	end
	
	subgraph new tree
		root2("root'") --&gt;|a| a2
		a2("tree_a'") --&gt;|b| b2
    b2("tree_b'") --&gt;|x| x2("42")
    b2 --&gt;|y| y
	end
</code></pre>

<p>In the diagram, <code class="language-plaintext highlighter-rouge">tree_a</code> stands for the <code class="language-plaintext highlighter-rouge">Tree</code> node that is the child of the root node under key “a”. <code class="language-plaintext highlighter-rouge">tree_b'</code> is a
rebuilt copy of <code class="language-plaintext highlighter-rouge">tree_b</code> that contains the new “x” node, <code class="language-plaintext highlighter-rouge">tree_a'</code> is a rebuilt copy of <code class="language-plaintext highlighter-rouge">tree_a</code> whose child “b” is <code class="language-plaintext highlighter-rouge">tree_b'</code>,
and <code class="language-plaintext highlighter-rouge">root'</code> is a rebuilt copy of the original root whose child “a” is <code class="language-plaintext highlighter-rouge">tree_a'</code>.</p>

<p>From the diagram, we can see that the modified node “x” is a new node with value 42, and its parent node “b” is also a
new node that shares the unchanged child node “y” with the original tree.</p>
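
<p>The same update can be checked in code. The snippet below is a self-contained sketch that repeats the definitions from above in a slightly condensed form. The <code class="language-plaintext highlighter-rouge">deriving (Show, Eq)</code> clause is an assumption added so that trees can be printed and compared, and <code class="language-plaintext highlighter-rouge">root'</code> and <code class="language-plaintext highlighter-rouge">expected</code> are names introduced here for illustration.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Tree
  = Atom Int
  | Object [(String, Tree)]
  deriving (Show, Eq) -- assumption: added so trees can be printed and compared

root :: Tree
root = Object [("a", Object [("b", Object [("x", Atom 1), ("y", Atom 2)])])]

-- Rebuild every Object on the path and apply f at the end of the path.
access :: [String] -&gt; (Tree -&gt; Tree) -&gt; Tree -&gt; Tree
access [] f t = f t
access (k : ks) f (Object ts)
  | (l, (_, v) : r) &lt;- break ((== k) . fst) ts =
      Object (l ++ (k, access ks f v) : r)
access _ _ _ = error "Invalid path to access"

-- The updated tree from the diagram.
root' :: Tree
root' = access ["a", "b", "x"] (const (Atom 42)) root

-- What we expect root' to be: only "x" differs from root.
expected :: Tree
expected = Object [("a", Object [("b", Object [("x", Atom 42), ("y", Atom 2)])])]
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">root' == expected</code> holds, and <code class="language-plaintext highlighter-rouge">root</code> itself still contains the value 1 under “x”, because the update builds new nodes instead of changing existing ones.</p>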

<h4 id="query-execution">Query execution</h4>

<p>The whole query can be translated to the following Haskell code:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span> <span class="n">access</span> <span class="p">[</span><span class="s">"a"</span><span class="p">,</span> <span class="s">"b"</span><span class="p">,</span> <span class="s">"x"</span><span class="p">]</span> <span class="p">(</span><span class="n">const</span> <span class="o">$</span> <span class="kt">Atom</span> <span class="mi">42</span><span class="p">)</span>
<span class="o">.</span> <span class="n">access</span> <span class="p">[</span><span class="s">"a"</span><span class="p">,</span> <span class="s">"b"</span><span class="p">,</span> <span class="s">"y"</span><span class="p">]</span> <span class="p">(</span><span class="n">const</span> <span class="o">$</span> <span class="kt">Atom</span> <span class="mi">43</span><span class="p">)</span>
<span class="p">)</span>
  <span class="n">root</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">.a.b.x = 42</code> is translated to <code class="language-plaintext highlighter-rouge">access ["a", "b", "x"] (const $ Atom 42)</code>, which evaluates to a function that takes a
tree and returns a new tree, and likewise for <code class="language-plaintext highlighter-rouge">.a.b.y = 43</code>. The <code class="language-plaintext highlighter-rouge">|</code> operator is translated to function composition, which is <code class="language-plaintext highlighter-rouge">.</code> in Haskell.
Note that <code class="language-plaintext highlighter-rouge">.</code> applies its right operand first, so the code above performs the “y” update before the “x” update; the result is
the same here because the two updates touch different fields. The composed function is then applied to the original tree <code class="language-plaintext highlighter-rouge">root</code> to
get the modified tree.</p>

<p>If there are <code class="language-plaintext highlighter-rouge">N</code> modifications in the query, the total number of nodes that are rebuilt is <code class="language-plaintext highlighter-rouge">O(N * depth(tree))</code>. For
modifications that are close to each other, this leads to a lot of redundant work. In our example, the root, “a”,
and “b” nodes are each rebuilt twice, which is inefficient.</p>
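
<p>The count can be made concrete with an instrumented variant of <code class="language-plaintext highlighter-rouge">access</code>. This is a sketch for illustration only: <code class="language-plaintext highlighter-rouge">accessCount</code> and <code class="language-plaintext highlighter-rouge">twoUpdates</code> are hypothetical names, and the function returns the new tree together with the number of <code class="language-plaintext highlighter-rouge">Object</code> nodes it had to rebuild.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Tree
  = Atom Int
  | Object [(String, Tree)]
  deriving (Show, Eq)

root :: Tree
root = Object [("a", Object [("b", Object [("x", Atom 1), ("y", Atom 2)])])]

-- Like access, but also counts how many Object nodes are rebuilt on the way.
accessCount :: [String] -&gt; (Tree -&gt; Tree) -&gt; Tree -&gt; (Tree, Int)
accessCount [] f t = (f t, 0)
accessCount (k : ks) f (Object ts)
  | (l, (_, v) : r) &lt;- break ((== k) . fst) ts =
      let (v', n) = accessCount ks f v
       in (Object (l ++ (k, v') : r), n + 1)
accessCount _ _ _ = error "Invalid path to access"

-- Run both updates of the first query and add up the rebuilt nodes.
twoUpdates :: (Tree, Int)
twoUpdates =
  let (t1, n1) = accessCount ["a", "b", "x"] (const (Atom 42)) root
      (t2, n2) = accessCount ["a", "b", "y"] (const (Atom 43)) t1
   in (t2, n1 + n2)
</code></pre></div></div>

<p>Each update rebuilds three <code class="language-plaintext highlighter-rouge">Object</code> nodes (the root, “a”, and “b”), so <code class="language-plaintext highlighter-rouge">snd twoUpdates</code> is 6 for our two-update query.</p>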

<h3 id="second-query-implementation">Second query implementation</h3>

<p>The query <code class="language-plaintext highlighter-rouge">with_cursor($=.a.b, $.x = 42 | $.y = 43)</code> introduces a cursor that allows us to focus on a specific node in
the tree and execute a query with the focused node as the root node. The <code class="language-plaintext highlighter-rouge">with_cursor</code> query can be implemented by using
a technique called <strong>Zippers</strong>.</p>

<h4 id="zippers">Zippers</h4>

<p>Zippers are a powerful technique for navigating and modifying persistent data structures like trees. They allow us to go
to a parent node or to a specific child node in a much more efficient way, without needing to always go back to the root
node to access a node.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">data</span> <span class="kt">Zipper</span> <span class="o">=</span> <span class="kt">Zipper</span>
  <span class="p">{</span> <span class="n">focus</span> <span class="o">::</span> <span class="kt">Tree</span>
  <span class="p">,</span> <span class="n">breadcrumbs</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Crumb</span><span class="p">]</span>
  <span class="p">}</span>
  
<span class="kr">data</span> <span class="kt">Crumb</span> <span class="o">=</span> <span class="kt">Crumb</span>
  <span class="p">{</span> <span class="n">before</span> <span class="o">::</span> <span class="p">[(</span><span class="kt">String</span><span class="p">,</span> <span class="kt">Tree</span><span class="p">)]</span>
  <span class="p">,</span> <span class="n">holeKey</span> <span class="o">::</span> <span class="kt">String</span>
  <span class="p">,</span> <span class="n">after</span> <span class="o">::</span> <span class="p">[(</span><span class="kt">String</span><span class="p">,</span> <span class="kt">Tree</span><span class="p">)]</span>
  <span class="p">}</span>
</code></pre></div></div>

<p>A <code class="language-plaintext highlighter-rouge">Zipper</code> consists of:</p>
<ul>
  <li>the <strong>currently focused tree node</strong></li>
  <li>a list of <strong>breadcrumbs</strong> that stores the path from the root to the current node, with the most recently visited parent at the head of the list.</li>
</ul>

<p>A <code class="language-plaintext highlighter-rouge">Crumb</code> looks like an <code class="language-plaintext highlighter-rouge">Object</code> node with one child removed: the child we most recently descended into. The <code class="language-plaintext highlighter-rouge">holeKey</code> stores the key of the removed child, <code class="language-plaintext highlighter-rouge">before</code> contains the preceding siblings, and <code class="language-plaintext highlighter-rouge">after</code>
contains the following siblings.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">emptyZipper</span> <span class="o">::</span> <span class="kt">Tree</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span>
<span class="n">emptyZipper</span> <span class="n">t</span> <span class="o">=</span> <span class="kt">Zipper</span> <span class="n">t</span> <span class="kt">[]</span>
</code></pre></div></div>

<p>We create a Zipper focusing on the root:</p>

<pre><code class="language-mermaid">flowchart TD

subgraph original tree
  root1("root") --&gt;|a| a
  a("tree_a") --&gt;|b| b("tree_b")
end

subgraph zipper
  subgraph focus
    focus_node("root")
    focus_node --&gt;|a| a
  end
end
</code></pre>

<p>In the diagram, the focus node has the same value as the original root node, and the breadcrumb stack is empty because we are at the root node.</p>

<h4 id="move-down">Move down</h4>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">goDown</span> <span class="o">::</span> <span class="kt">String</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span>
<span class="n">goDown</span> <span class="n">k</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="p">(</span><span class="kt">Object</span> <span class="n">ts</span><span class="p">)</span> <span class="n">bs</span><span class="p">)</span>
  <span class="o">|</span> <span class="p">(</span><span class="n">l</span><span class="p">,</span> <span class="p">(</span><span class="kr">_</span><span class="p">,</span> <span class="n">v</span><span class="p">)</span> <span class="o">:</span> <span class="n">r</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">break</span> <span class="p">((</span><span class="o">==</span> <span class="n">k</span><span class="p">)</span> <span class="o">.</span> <span class="n">fst</span><span class="p">)</span> <span class="n">ts</span> <span class="o">=</span> <span class="kt">Zipper</span> <span class="n">v</span> <span class="p">(</span><span class="kt">Crumb</span> <span class="n">l</span> <span class="n">k</span> <span class="n">r</span> <span class="o">:</span> <span class="n">bs</span><span class="p">)</span>
<span class="n">goDown</span> <span class="n">k</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="n">f</span> <span class="kr">_</span><span class="p">)</span> <span class="o">=</span> <span class="n">error</span> <span class="o">$</span> <span class="s">"Cannot go to child '"</span> <span class="o">++</span> <span class="n">k</span> <span class="o">++</span> <span class="s">"' of tree: "</span> <span class="o">++</span> <span class="n">show</span> <span class="n">f</span>
</code></pre></div></div>

<p>Moving downward creates a <code class="language-plaintext highlighter-rouge">Crumb</code> from the parent node by extracting the target child, which becomes the new focus node.
The <code class="language-plaintext highlighter-rouge">Crumb</code> is then pushed onto the breadcrumb stack.</p>
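
<p><code class="language-plaintext highlighter-rouge">goDown</code> calls <code class="language-plaintext highlighter-rouge">error</code> when the key is missing or the focus is an <code class="language-plaintext highlighter-rouge">Atom</code>. As an alternative sketch, which the rest of this post does not use, a total variant can return <code class="language-plaintext highlighter-rouge">Maybe</code> instead. The names <code class="language-plaintext highlighter-rouge">goDownMaybe</code> and <code class="language-plaintext highlighter-rouge">descend</code> are hypothetical.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Monad (foldM)

data Tree
  = Atom Int
  | Object [(String, Tree)]
  deriving (Show, Eq)

data Crumb = Crumb
  { before :: [(String, Tree)]
  , holeKey :: String
  , after :: [(String, Tree)]
  }

data Zipper = Zipper
  { focus :: Tree
  , breadcrumbs :: [Crumb]
  }

root :: Tree
root = Object [("a", Object [("b", Object [("x", Atom 1), ("y", Atom 2)])])]

-- Total variant of goDown: Nothing instead of a runtime error.
goDownMaybe :: String -&gt; Zipper -&gt; Maybe Zipper
goDownMaybe k (Zipper (Object ts) bs)
  | (l, (_, v) : r) &lt;- break ((== k) . fst) ts = Just (Zipper v (Crumb l k r : bs))
goDownMaybe _ _ = Nothing

-- Follow a whole path from the root, stopping at the first missing key.
descend :: [String] -&gt; Tree -&gt; Maybe Zipper
descend path t = foldM (flip goDownMaybe) (Zipper t []) path
</code></pre></div></div>

<p>With these definitions, <code class="language-plaintext highlighter-rouge">fmap focus (descend ["a", "b", "x"] root)</code> is <code class="language-plaintext highlighter-rouge">Just (Atom 1)</code>, while <code class="language-plaintext highlighter-rouge">fmap focus (descend ["a", "z"] root)</code> is <code class="language-plaintext highlighter-rouge">Nothing</code>.</p>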

<p>After we go down to “a”, the Zipper looks like this:</p>

<pre><code class="language-mermaid">flowchart TD

subgraph original tree
  root1("root") --&gt;|a| a
  a("tree_a") --&gt;|b| b("tree_b")
end

subgraph zipper
	subgraph crumb_0
    crumb_0_key("holeKey: a")
	end
	
  subgraph focus
    focus_node("tree_a")
    focus_node --&gt;|b| b
  end

  crumb_0 --&gt;|next| focus
end
</code></pre>

<p>In the diagram, the <code class="language-plaintext highlighter-rouge">crumb_0</code> is the top <code class="language-plaintext highlighter-rouge">Crumb</code> in the breadcrumb stack. The <code class="language-plaintext highlighter-rouge">holeKey</code> indicates that the “a” node is
taken away from the root node, and the <code class="language-plaintext highlighter-rouge">before</code> and <code class="language-plaintext highlighter-rouge">after</code> fields are empty because there is no sibling of “a”. The focus
node is identical to the “a” node in the original tree.</p>

<p>Then go down to “b”:</p>

<pre><code class="language-mermaid">flowchart TD

subgraph original tree
  root1("root") --&gt;|a| a
  a("tree_a") --&gt;|b| b("tree_b")
  b --&gt; |x| x("1")
  b --&gt; |y| y("2")
end

subgraph zipper
  subgraph crumb_0
    crumb_0_key("holeKey: a")
	end

	subgraph crumb_1
    crumb_1_key("holeKey: b")
	end
	
  subgraph focus
    focus_node("tree_b")
    focus_node --&gt;|x| x
    focus_node --&gt;|y| y
  end

  crumb_0 --&gt;|next| crumb_1
  crumb_1 --&gt;|next| focus
end
</code></pre>

<p>The focus node is identical to the “b” in the original tree. The <code class="language-plaintext highlighter-rouge">crumb_1</code> looks similar to <code class="language-plaintext highlighter-rouge">tree_a</code>, but the “b” node
is taken away and replaced with a hole, which is indicated by the <code class="language-plaintext highlighter-rouge">holeKey</code> field. The <code class="language-plaintext highlighter-rouge">before</code> and <code class="language-plaintext highlighter-rouge">after</code> fields are
empty.</p>

<p>Now we go down to “x”. The Zipper then looks like this:</p>

<pre><code class="language-mermaid">flowchart TD

subgraph original tree
  root1("root") --&gt;|a| a
  a("tree_a") --&gt;|b| b("tree_b")
  b --&gt; |x| x("1")
  b --&gt; |y| y("2")
end

subgraph zipper
  subgraph crumb_0
    crumb_0_key("holeKey: a")
	end

	subgraph crumb_1
    crumb_1_key("holeKey: b")
	end

  subgraph crumb_2
    crumb_2_key("holeKey: x")
    crumb_2_after("after") --&gt;|y| y
	end
	
  subgraph focus
    focus_node("1")
  end

  crumb_0 --&gt;|next| crumb_1
  crumb_1 --&gt;|next| crumb_2
  crumb_2 --&gt;|next| focus
end
</code></pre>

<p>The newly added <code class="language-plaintext highlighter-rouge">crumb_2</code> indicates that the “x” node is taken away from the “b” node, and the “y” node is a sibling of
“x”, so it is stored in the <code class="language-plaintext highlighter-rouge">after</code> field of the <code class="language-plaintext highlighter-rouge">Crumb</code>. The focus node is identical to the “x” node in the original
tree.</p>
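
<p>The three downward steps can be reproduced in code. The following self-contained sketch repeats the definitions from above; <code class="language-plaintext highlighter-rouge">deriving (Show, Eq)</code> is an assumption added for printing and comparison, and <code class="language-plaintext highlighter-rouge">atX</code> is a name introduced for illustration. Note that the breadcrumb list is stored innermost-first, so <code class="language-plaintext highlighter-rouge">crumb_2</code> from the diagram sits at the head of the list.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Tree
  = Atom Int
  | Object [(String, Tree)]
  deriving (Show, Eq)

data Crumb = Crumb
  { before :: [(String, Tree)]
  , holeKey :: String
  , after :: [(String, Tree)]
  }

data Zipper = Zipper
  { focus :: Tree
  , breadcrumbs :: [Crumb]
  }

root :: Tree
root = Object [("a", Object [("b", Object [("x", Atom 1), ("y", Atom 2)])])]

emptyZipper :: Tree -&gt; Zipper
emptyZipper t = Zipper t []

-- Focus on child k and push a Crumb describing the parent without that child.
goDown :: String -&gt; Zipper -&gt; Zipper
goDown k (Zipper (Object ts) bs)
  | (l, (_, v) : r) &lt;- break ((== k) . fst) ts = Zipper v (Crumb l k r : bs)
goDown k (Zipper f _) = error $ "Cannot go to child '" ++ k ++ "' of tree: " ++ show f

-- The zipper from the last diagram: root, then "a", then "b", then "x".
atX :: Zipper
atX = goDown "x" (goDown "b" (goDown "a" (emptyZipper root)))

-- focus atX                      is Atom 1
-- map holeKey (breadcrumbs atX)  is ["x", "b", "a"]
-- after (head (breadcrumbs atX)) is [("y", Atom 2)]
</code></pre></div></div>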

<h4 id="focus-modification">Focus modification</h4>

<p>Now we set the value of “x” to 42 by applying the <code class="language-plaintext highlighter-rouge">modifyZipper</code> function to the zipper:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">modifyZipper</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Tree</span> <span class="o">-&gt;</span> <span class="kt">Tree</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span>
<span class="n">modifyZipper</span> <span class="n">f</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="n">t</span> <span class="n">bs</span><span class="p">)</span> <span class="o">=</span> <span class="kt">Zipper</span> <span class="p">(</span><span class="n">f</span> <span class="n">t</span><span class="p">)</span> <span class="n">bs</span>
</code></pre></div></div>

<p>Modifying the focus node does not change the breadcrumbs, nor does it rebuild any ancestor or return a new root node. So,
apart from the cost of <code class="language-plaintext highlighter-rouge">f</code> itself, the time complexity of <code class="language-plaintext highlighter-rouge">modifyZipper</code> is <code class="language-plaintext highlighter-rouge">O(1)</code>.</p>

<p>We modify the “x” node to 42. Now in the zipper, the focus node is a new node with value 42.</p>

<h4 id="move-up">Move up</h4>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">goUp</span> <span class="o">::</span> <span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span>
<span class="n">goUp</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="n">t</span> <span class="p">(</span><span class="kt">Crumb</span> <span class="n">l</span> <span class="n">key</span> <span class="n">r</span> <span class="o">:</span> <span class="n">bs</span><span class="p">))</span> <span class="o">=</span> <span class="kt">Zipper</span> <span class="p">(</span><span class="kt">Object</span> <span class="p">(</span><span class="n">l</span> <span class="o">++</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">t</span><span class="p">)</span> <span class="o">:</span> <span class="n">r</span><span class="p">))</span> <span class="n">bs</span>
<span class="n">goUp</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="kr">_</span> <span class="kt">[]</span><span class="p">)</span> <span class="o">=</span> <span class="n">error</span> <span class="s">"Already at the top"</span>
</code></pre></div></div>

<p>Moving upward is the reverse of moving downward. It <strong>reassembles the tree</strong> by filling the hole in the Crumb with
the current focus node, popping the Crumb from the list of breadcrumbs, and making the reassembled tree the new focus
node.</p>

<p>After we go up once, the Zipper looks like this:</p>

<pre><code class="language-mermaid">flowchart TD

subgraph original tree
  root1("root") --&gt;|a| a
  a("tree_a") --&gt;|b| b("tree_b")
  b --&gt; |x| x("1")
  b --&gt; |y| y("2")
end

subgraph zipper
  subgraph crumb_0
    crumb_0_key("holeKey: a")
	end

	subgraph crumb_1
    crumb_1_key("holeKey: b")
	end
	
  subgraph focus
    focus_node("tree_b'")
    focus_node --&gt;|x| x2("42")
    focus_node --&gt;|y| y
  end

  crumb_0 --&gt;|next| crumb_1
  crumb_1 --&gt;|next| focus
end
</code></pre>

<p>In the above diagram, the new focus node is created by filling the hole in the crumb with the modified “x” node that
has value 42. The new focus node shares the unchanged child node “y” with the original “b” node.</p>
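
<p>Going up repeatedly rebuilds the whole tree. The sketch below combines the steps so far; <code class="language-plaintext highlighter-rouge">toRoot</code> is a hypothetical helper (not part of the code above) that calls <code class="language-plaintext highlighter-rouge">goUp</code> until the breadcrumb list is empty, and <code class="language-plaintext highlighter-rouge">updated</code> is a name introduced for illustration.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Tree
  = Atom Int
  | Object [(String, Tree)]
  deriving (Show, Eq)

data Crumb = Crumb
  { before :: [(String, Tree)]
  , holeKey :: String
  , after :: [(String, Tree)]
  }

data Zipper = Zipper
  { focus :: Tree
  , breadcrumbs :: [Crumb]
  }

root :: Tree
root = Object [("a", Object [("b", Object [("x", Atom 1), ("y", Atom 2)])])]

goDown :: String -&gt; Zipper -&gt; Zipper
goDown k (Zipper (Object ts) bs)
  | (l, (_, v) : r) &lt;- break ((== k) . fst) ts = Zipper v (Crumb l k r : bs)
goDown k (Zipper f _) = error $ "Cannot go to child '" ++ k ++ "' of tree: " ++ show f

-- Fill the hole of the top Crumb with the current focus.
goUp :: Zipper -&gt; Zipper
goUp (Zipper t (Crumb l key r : bs)) = Zipper (Object (l ++ (key, t) : r)) bs
goUp (Zipper _ []) = error "Already at the top"

modifyZipper :: (Tree -&gt; Tree) -&gt; Zipper -&gt; Zipper
modifyZipper f (Zipper t bs) = Zipper (f t) bs

-- Hypothetical helper: go up until no breadcrumbs are left, then return the tree.
toRoot :: Zipper -&gt; Tree
toRoot (Zipper t []) = t
toRoot z = toRoot (goUp z)

-- Descend to "x", replace it, and rebuild the tree around the new value.
updated :: Tree
updated =
  toRoot (modifyZipper (const (Atom 42)) (goDown "x" (goDown "b" (goDown "a" (Zipper root [])))))
</code></pre></div></div>

<p>The result <code class="language-plaintext highlighter-rouge">updated</code> equals <code class="language-plaintext highlighter-rouge">Object [("a", Object [("b", Object [("x", Atom 42), ("y", Atom 2)])])]</code>, the same tree that the root-based <code class="language-plaintext highlighter-rouge">access</code> produces for this update.</p>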

<h4 id="access-with-zipper">access with Zipper</h4>

<p>Unlike <code class="language-plaintext highlighter-rouge">access</code>, which returns a new root node, <code class="language-plaintext highlighter-rouge">accessZ</code> goes to the target node, applies the function to the zipper focused
there, and then goes back up to the position where it started. The number of nodes that are rebuilt by <code class="language-plaintext highlighter-rouge">accessZ</code> is
<code class="language-plaintext highlighter-rouge">O(distance(node, cursor))</code>, where <code class="language-plaintext highlighter-rouge">distance(node, cursor)</code> is the depth of the target node relative to the current cursor
node. If the target node is close to the cursor node, this is much smaller than the
<code class="language-plaintext highlighter-rouge">O(depth(node))</code> nodes rebuilt by <code class="language-plaintext highlighter-rouge">access</code>.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">accessZ</span> <span class="o">::</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span>
<span class="n">accessZ</span> <span class="kt">[]</span> <span class="n">f</span> <span class="n">z</span> <span class="o">=</span> <span class="n">f</span> <span class="n">z</span>
<span class="n">accessZ</span> <span class="p">(</span><span class="n">k</span> <span class="o">:</span> <span class="n">ks</span><span class="p">)</span> <span class="n">f</span> <span class="n">z</span> <span class="o">=</span> <span class="n">accessZ</span> <span class="n">ks</span> <span class="n">f</span> <span class="p">(</span><span class="n">goDown</span> <span class="n">k</span> <span class="n">z</span><span class="p">)</span> <span class="o">&amp;</span> <span class="n">goUp</span>

<span class="p">(</span><span class="o">&amp;</span><span class="p">)</span> <span class="o">::</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">b</span>
<span class="n">x</span> <span class="o">&amp;</span> <span class="n">f</span> <span class="o">=</span> <span class="n">f</span> <span class="n">x</span>
</code></pre></div></div>

<p>In the <code class="language-plaintext highlighter-rouge">accessZ</code> function, we first check if the path is empty. If it is, we apply the modification function <code class="language-plaintext highlighter-rouge">f</code> to the
current Zipper. If the path is not empty, we go down to the child node with key <code class="language-plaintext highlighter-rouge">k</code>, recursively call <code class="language-plaintext highlighter-rouge">accessZ</code> on the
child node with the remaining path <code class="language-plaintext highlighter-rouge">ks</code>, and then go back up to the original position.</p>

<p>The <code class="language-plaintext highlighter-rouge">(&amp;)</code> operator is a reverse function application operator, which allows us to write the operand before the function.
It is already defined in <code class="language-plaintext highlighter-rouge">Data.Function</code> in Haskell, but we define it here for completeness.</p>

<h4 id="with_cursor-implementation">with_cursor implementation</h4>

<p>We can implement the <code class="language-plaintext highlighter-rouge">with_cursor</code> query by using <code class="language-plaintext highlighter-rouge">accessZ</code> to navigate to the target node, apply the modification, and
then go back to the root node. The <code class="language-plaintext highlighter-rouge">withCursor</code> function takes a path to the target node, a modification function that
takes a Zipper and returns a modified Zipper, and the original tree. It returns a new tree with the modifications
applied.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">withCursor</span> <span class="o">::</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kt">Tree</span> <span class="o">-&gt;</span> <span class="kt">Tree</span>
<span class="n">withCursor</span> <span class="n">path</span> <span class="n">f</span> <span class="n">t</span> <span class="o">=</span> <span class="n">focus</span> <span class="o">$</span> <span class="n">accessZ</span> <span class="n">path</span> <span class="n">f</span> <span class="p">(</span><span class="n">emptyZipper</span> <span class="n">t</span><span class="p">)</span>
</code></pre></div></div>

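<p>The <code class="language-plaintext highlighter-rouge">accessWCursor</code> function is a helper that plays the same role as <code class="language-plaintext highlighter-rouge">access</code>, but on a Zipper: it applies a
<code class="language-plaintext highlighter-rouge">Tree -&gt; Tree</code> function at a path relative to the current focus and leaves the cursor where it was.</p>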

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">accessWCursor</span> <span class="o">::</span> <span class="p">[</span><span class="kt">String</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="kt">Tree</span> <span class="o">-&gt;</span> <span class="kt">Tree</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span> <span class="o">-&gt;</span> <span class="kt">Zipper</span>
<span class="n">accessWCursor</span> <span class="n">path</span> <span class="n">f</span> <span class="o">=</span> <span class="n">accessZ</span> <span class="n">path</span> <span class="p">(</span><span class="n">modifyZipper</span> <span class="n">f</span><span class="p">)</span>
</code></pre></div></div>

<h4 id="query-execution-1">Query execution</h4>

<p>The second query <code class="language-plaintext highlighter-rouge">with_cursor($=.a.b, $.x = 42 | $.y = 43)</code> will be translated to the following code:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">withCursor</span>
  <span class="p">[</span><span class="s">"a"</span><span class="p">,</span> <span class="s">"b"</span><span class="p">]</span>
  <span class="p">(</span><span class="n">accessWCursor</span> <span class="p">[</span><span class="s">"x"</span><span class="p">]</span> <span class="p">(</span><span class="n">const</span> <span class="o">$</span> <span class="kt">Atom</span> <span class="mi">42</span><span class="p">)</span> <span class="o">.</span> <span class="n">accessWCursor</span> <span class="p">[</span><span class="s">"y"</span><span class="p">]</span> <span class="p">(</span><span class="n">const</span> <span class="o">$</span> <span class="kt">Atom</span> <span class="mi">43</span><span class="p">))</span>
  <span class="n">root</span>
</code></pre></div></div>

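<p>It uses <code class="language-plaintext highlighter-rouge">withCursor</code> to navigate to the “b” node and then, with “b” as the cursor, modifies the “x” and “y” nodes
with <code class="language-plaintext highlighter-rouge">accessWCursor</code>. Because <code class="language-plaintext highlighter-rouge">.</code> composes right to left, “y” is updated before “x”; the two edits touch
different fields, so the result is the same.</p>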
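
<p>To try the translated query end to end, here is a minimal, self-contained sketch. The <code class="language-plaintext highlighter-rouge">Tree</code>, <code class="language-plaintext highlighter-rouge">Crumb</code> and <code class="language-plaintext highlighter-rouge">Zipper</code>
types, the navigation helpers and the sample <code class="language-plaintext highlighter-rouge">root</code> are assumptions made for this illustration and may differ from
the definitions used earlier in this post; only <code class="language-plaintext highlighter-rouge">withCursor</code> and <code class="language-plaintext highlighter-rouge">accessWCursor</code> are copied from above.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Maybe (fromMaybe)

-- Assumed tree shape: leaves hold an Int, inner nodes hold named children.
data Tree = Atom Int | Node [(String, Tree)]
  deriving (Eq, Show)

-- A crumb records the field we descended into and the siblings on either side.
data Crumb = Crumb String [(String, Tree)] [(String, Tree)]

-- The focused subtree plus the crumbs leading back to the root.
data Zipper = Zipper { focus :: Tree, crumbs :: [Crumb] }

emptyZipper :: Tree -&gt; Zipper
emptyZipper t = Zipper t []

-- Move the focus to the named child; Nothing if there is no such field.
goDown :: String -&gt; Zipper -&gt; Maybe Zipper
goDown key (Zipper (Node kvs) cs) =
  case break ((== key) . fst) kvs of
    (before, (_, child) : after) -&gt; Just (Zipper child (Crumb key before after : cs))
    _ -&gt; Nothing
goDown _ _ = Nothing

-- Rebuild the parent from the focus and the most recent crumb.
goUp :: Zipper -&gt; Maybe Zipper
goUp (Zipper t (Crumb key before after : cs)) =
  Just (Zipper (Node (before ++ (key, t) : after)) cs)
goUp _ = Nothing

modifyZipper :: (Tree -&gt; Tree) -&gt; Zipper -&gt; Zipper
modifyZipper f (Zipper t cs) = Zipper (f t) cs

-- Walk down the path, apply f, then walk back up the same number of steps.
-- If a field is missing, the zipper is returned unchanged.
accessZ :: [String] -&gt; (Zipper -&gt; Zipper) -&gt; Zipper -&gt; Zipper
accessZ [] f z = f z
accessZ (k : ks) f z =
  case goDown k z of
    Nothing -&gt; z
    Just child -&gt; fromMaybe z (goUp (accessZ ks f child))

withCursor :: [String] -&gt; (Zipper -&gt; Zipper) -&gt; Tree -&gt; Tree
withCursor path f t = focus $ accessZ path f (emptyZipper t)

accessWCursor :: [String] -&gt; (Tree -&gt; Tree) -&gt; Zipper -&gt; Zipper
accessWCursor path f = accessZ path (modifyZipper f)

-- Hypothetical sample document.
root :: Tree
root = Node [("a", Node [("b", Node [("x", Atom 0), ("y", Atom 0)]), ("c", Atom 7)])]

result :: Tree
result =
  withCursor
    ["a", "b"]
    (accessWCursor ["x"] (const $ Atom 42) . accessWCursor ["y"] (const $ Atom 43))
    root
</code></pre></div></div>

<p>With this sample tree, <code class="language-plaintext highlighter-rouge">result</code> has “x” set to 42 and “y” set to 43 under “b”, while the sibling “c” is untouched.</p>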

<p>If there are <code class="language-plaintext highlighter-rouge">N</code> modifications that are each just one step away from the cursor node, the total number of nodes that are
rebuilt is <code class="language-plaintext highlighter-rouge">O(N + depth(cursor))</code>, compared with <code class="language-plaintext highlighter-rouge">O(N * depth(tree))</code> when every modification starts from the root.</p>
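
<p>To make the difference concrete, here is a rough cost model, not a measurement. It assumes that every step down or up
allocates exactly one crumb or node, that the cursor sits <code class="language-plaintext highlighter-rouge">d</code> steps below the root, and that each of the <code class="language-plaintext highlighter-rouge">n</code> edits
replaces a direct child of the cursor. The names <code class="language-plaintext highlighter-rouge">rootBasedAllocs</code> and <code class="language-plaintext highlighter-rouge">zipperAllocs</code> are made up for this illustration.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Rough allocation counts under the assumptions stated above; not a benchmark.

-- Root-based access rebuilds the whole path for every edit:
-- the d + 1 nodes from the root down to the cursor, plus the new child.
rootBasedAllocs :: Int -&gt; Int -&gt; Int
rootBasedAllocs n d = n * (d + 2)

-- The zipper pays for the trip to the cursor once (d crumbs on the way down,
-- d rebuilt nodes on the way back up). Each edit then costs one crumb,
-- one new child and one rebuilt cursor node.
zipperAllocs :: Int -&gt; Int -&gt; Int
zipperAllocs n d = 2 * d + 3 * n
</code></pre></div></div>

<p>For <code class="language-plaintext highlighter-rouge">n = 100</code> and <code class="language-plaintext highlighter-rouge">d = 10</code> the model gives 1200 allocations for root-based access and 320 for the zipper. The constants
depend on the model; the point is the <code class="language-plaintext highlighter-rouge">n * d</code> versus <code class="language-plaintext highlighter-rouge">n + d</code> shape.</p>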

<h3 id="comparing-the-two-approaches">Comparing the two approaches</h3>

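<p>Let’s compare the two approaches in terms of node allocations and time complexity, for <code class="language-plaintext highlighter-rouge">N</code> modifications that each sit one step away from the cursor.</p>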

<table>
  <thead>
    <tr>
      <th>Approach</th>
      <th>Total node allocations</th>
      <th>Time complexity</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Root-based <code class="language-plaintext highlighter-rouge">access</code></td>
      <td><code class="language-plaintext highlighter-rouge">O(N * depth(tree))</code></td>
      <td><code class="language-plaintext highlighter-rouge">O(N * depth(tree))</code></td>
    </tr>
    <tr>
      <td>Zippers</td>
      <td><code class="language-plaintext highlighter-rouge">O(N + depth(cursor))</code></td>
      <td><code class="language-plaintext highlighter-rouge">O(N + depth(cursor))</code></td>
    </tr>
  </tbody>
</table>

<h2 id="when-to-use-zippers">When to use Zippers</h2>

<p>Zippers are not universally better than root-based access. Their advantage depends on the access pattern.</p>

<h3 id="high-spatial-locality">High spatial locality</h3>

<p>When a query performs many modifications in the same region of the tree, zippers avoid redundant rebuilds of the path
from the root. Performance tests in <a href="https://arxiv.org/abs/1908.10926">Performance Analysis of
Zippers</a> show up to 280% speedup over the root-based approach when modifications are
clustered together. The key factor is <strong>spatial locality</strong>: the closer the edits are to each other (and to the cursor),
the greater the benefit.</p>

<p>When modifications are scattered across unrelated parts of the tree, the zipper must navigate up and back down for each
one, and the overhead of creating crumbs on every step can make it <strong>slower</strong> than simply calling <code class="language-plaintext highlighter-rouge">access</code> from the root
each time.</p>

<h3 id="read-only-access-avoid-zippers">Read-only access (avoid zippers)</h3>

<p>Every <code class="language-plaintext highlighter-rouge">goDown</code> allocates a new crumb and every <code class="language-plaintext highlighter-rouge">goUp</code> rebuilds a parent node, even if we never modify anything. For read-only
lookups, this overhead is wasted. A cheaper alternative is to maintain a flat map from paths to values alongside the
tree. Modifications update both the tree (via a zipper) and the map; reads consult the map directly in <code class="language-plaintext highlighter-rouge">O(log n)</code> time
without any navigation.</p>

<p>For example, the read query <code class="language-plaintext highlighter-rouge">$.^.c</code> in:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>with_cursor($=.a.b, $temp = $.^.c + 1 | $.x = $temp)
</code></pre></div></div>

<p>can be resolved by computing the absolute path <code class="language-plaintext highlighter-rouge">["a", "c"]</code> and looking it up in the flat map, avoiding zipper
navigation entirely.</p>
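
<p>A minimal sketch of that lookup, assuming the flat map is a <code class="language-plaintext highlighter-rouge">Data.Map</code> keyed by absolute paths. The <code class="language-plaintext highlighter-rouge">Tree</code> type, the
sample <code class="language-plaintext highlighter-rouge">root</code> and the helpers <code class="language-plaintext highlighter-rouge">flatten</code>, <code class="language-plaintext highlighter-rouge">resolve</code> and <code class="language-plaintext highlighter-rouge">lookupRelative</code> are hypothetical names introduced here, not part
of the query tool described above.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import qualified Data.Map.Strict as Map

-- Assumed tree shape: leaves hold an Int, inner nodes hold named children.
data Tree = Atom Int | Node [(String, Tree)]
  deriving (Eq, Show)

type Path = [String]

-- Index every node of the tree by its absolute path from the root.
-- The path is accumulated in reverse and flipped when it is stored.
flatten :: Tree -&gt; Map.Map Path Tree
flatten = go []
  where
    go path t = Map.insert (reverse path) t (children path t)
    children path (Node kvs) = Map.unions [go (k : path) v | (k, v) &lt;- kvs]
    children _ (Atom _) = Map.empty

-- Turn a cursor path plus a relative path into an absolute path.
-- "^" moves to the parent; any other step names a field.
resolve :: Path -&gt; [String] -&gt; Path
resolve cursor = foldl step cursor
  where
    step p "^" = if null p then p else init p
    step p k = p ++ [k]

-- Read a value relative to the cursor without any zipper navigation.
lookupRelative :: Map.Map Path Tree -&gt; Path -&gt; [String] -&gt; Maybe Tree
lookupRelative index cursor rel = Map.lookup (resolve cursor rel) index

-- Hypothetical sample document.
root :: Tree
root = Node [("a", Node [("b", Node [("x", Atom 0)]), ("c", Atom 7)])]
</code></pre></div></div>

<p>With the sample tree, <code class="language-plaintext highlighter-rouge">lookupRelative (flatten root) ["a", "b"] ["^", "c"]</code> evaluates to <code class="language-plaintext highlighter-rouge">Just (Atom 7)</code>. The sketch builds
the index once; keeping it in sync after each zipper edit is left out.</p>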

<h3 id="rule-of-thumb">Rule of thumb</h3>

<table>
  <thead>
    <tr>
      <th>Scenario</th>
      <th>Preferred approach</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Many edits clustered in one subtree</td>
      <td>Zipper</td>
    </tr>
    <tr>
      <td>Edits scattered across the tree</td>
      <td>Root-based <code class="language-plaintext highlighter-rouge">access</code></td>
    </tr>
    <tr>
      <td>Read-only lookups</td>
      <td>Flat map / direct path lookup</td>
    </tr>
  </tbody>
</table>

<h2 id="complexity-of-zipper-implementation">Complexity of Zipper implementation</h2>

<p>A zipper implementation can involve a lot of boilerplate code, especially when the tree structure is complex. For
example, suppose we have the following <code class="language-plaintext highlighter-rouge">Value</code> tree:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">data</span> <span class="kt">Value</span> <span class="o">=</span> <span class="kt">Atom</span> <span class="kt">Int</span>
         <span class="o">|</span> <span class="kt">List</span> <span class="p">[</span><span class="kt">Value</span><span class="p">]</span>
         <span class="o">|</span> <span class="kt">Map</span> <span class="p">[(</span><span class="kt">String</span><span class="p">,</span> <span class="kt">Value</span><span class="p">)]</span>
         <span class="o">|</span> <span class="kt">BinOp</span> <span class="kt">String</span> <span class="kt">Value</span> <span class="kt">Value</span>
         <span class="o">|</span> <span class="kt">UnOp</span> <span class="kt">String</span> <span class="kt">Value</span>
</code></pre></div></div>

<p>Every constructor that can have children needs its own crumb constructor, one for each position a child can occupy, so the <code class="language-plaintext highlighter-rouge">Crumb</code> type grows with the tree type:</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">data</span> <span class="kt">ValueCrumb</span> <span class="o">=</span> <span class="kt">ListCrumb</span> <span class="kt">Int</span> <span class="p">[</span><span class="kt">Value</span><span class="p">]</span> <span class="p">[</span><span class="kt">Value</span><span class="p">]</span>
                <span class="o">|</span> <span class="kt">MapCrumb</span> <span class="kt">String</span> <span class="p">[(</span><span class="kt">String</span><span class="p">,</span> <span class="kt">Value</span><span class="p">)]</span> <span class="p">[(</span><span class="kt">String</span><span class="p">,</span> <span class="kt">Value</span><span class="p">)]</span>
                <span class="o">|</span> <span class="kt">BinOpLeftCrumb</span> <span class="kt">String</span> <span class="kt">Value</span>
                <span class="o">|</span> <span class="kt">BinOpRightCrumb</span> <span class="kt">String</span> <span class="kt">Value</span>
                <span class="o">|</span> <span class="kt">UnOpCrumb</span> <span class="kt">String</span>
</code></pre></div></div>
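
<p>The navigation code grows in the same way. The sketch below repeats the two types and adds one “down” function per
kind of child, plus a <code class="language-plaintext highlighter-rouge">plug</code> function that rebuilds the parent. The function names and the <code class="language-plaintext highlighter-rouge">ValueZipper</code> alias are
hypothetical, and the reading of the crumb fields (for example, that <code class="language-plaintext highlighter-rouge">BinOpLeftCrumb</code> stores the operator and the right
operand) is an assumption.</p>

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Value = Atom Int
           | List [Value]
           | Map [(String, Value)]
           | BinOp String Value Value
           | UnOp String Value
  deriving (Eq, Show)

data ValueCrumb = ListCrumb Int [Value] [Value]
                | MapCrumb String [(String, Value)] [(String, Value)]
                | BinOpLeftCrumb String Value
                | BinOpRightCrumb String Value
                | UnOpCrumb String

type ValueZipper = (Value, [ValueCrumb])

-- Rebuild the parent from the focused child and its crumb.
plug :: Value -&gt; ValueCrumb -&gt; Value
plug v (ListCrumb _ before after) = List (before ++ v : after)
plug v (MapCrumb key before after) = Map (before ++ (key, v) : after)
plug v (BinOpLeftCrumb op right) = BinOp op v right
plug v (BinOpRightCrumb op left) = BinOp op left v
plug v (UnOpCrumb op) = UnOp op v

goUp :: ValueZipper -&gt; Maybe ValueZipper
goUp (v, c : cs) = Just (plug v c, cs)
goUp (_, []) = Nothing

-- Focus on the i-th element of a list.
goIndex :: Int -&gt; ValueZipper -&gt; Maybe ValueZipper
goIndex i (List vs, cs)
  | i &gt;= 0, (before, v : after) &lt;- splitAt i vs = Just (v, ListCrumb i before after : cs)
goIndex _ _ = Nothing

-- Focus on a named field of a map.
goField :: String -&gt; ValueZipper -&gt; Maybe ValueZipper
goField key (Map kvs, cs) =
  case break ((== key) . fst) kvs of
    (before, (_, v) : after) -&gt; Just (v, MapCrumb key before after : cs)
    _ -&gt; Nothing
goField _ _ = Nothing

-- Focus on the left or right operand of a binary operator.
goBinLeft, goBinRight :: ValueZipper -&gt; Maybe ValueZipper
goBinLeft (BinOp op l r, cs) = Just (l, BinOpLeftCrumb op r : cs)
goBinLeft _ = Nothing
goBinRight (BinOp op l r, cs) = Just (r, BinOpRightCrumb op l : cs)
goBinRight _ = Nothing

-- Focus on the operand of a unary operator.
goOperand :: ValueZipper -&gt; Maybe ValueZipper
goOperand (UnOp op v, cs) = Just (v, UnOpCrumb op : cs)
goOperand _ = Nothing
</code></pre></div></div>

<p>Five constructors already need five crumb cases, five “down” functions and five <code class="language-plaintext highlighter-rouge">plug</code> equations, and every new
constructor adds to each of them.</p>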

<p>We will talk about how to reduce the boilerplate code in the next post.</p>

<h2 id="conclusion">Conclusion</h2>

<p>From an imperative-programming point of view, zippers can feel natural because they let us navigate and modify a tree
node “in place”, as opposed to the root-based approach, where every modification rebuilds the path from the root.
Zippers work best when edits are clustered: with a cursor near the action, each modification costs only the distance
from the cursor instead of the full depth of the tree.</p>

<p>However, they are not a universal replacement for root-based access. Scattered edits gain little from a zipper, and
read-only lookups are better served by a flat index. The right choice depends on the access pattern.</p>

<h2 id="further-reading">Further reading</h2>

<ul>
  <li><a href="https://learnyouahaskell.github.io/zippers.html">Learn You a Haskell for Great Good!</a>, chapter “Zippers”</li>
</ul>]]></content><author><name>Junxiang Zhou</name></author><category term="Haskell" /><summary type="html"><![CDATA[When working with tree-structured data, we often need to navigate to a specific node and modify it. In imperative languages, this is usually straightforward thanks to mutable state and parent pointers. In functional languages, however, immutability makes this pattern less obvious.]]></summary></entry></feed>