Foreword
Blockchains are fundamentally about codifying agreements and adhering to what was coded. There are no political backdoors or deals behind closed doors—just one simple premise: if I can read the code of a smart contract, I know exactly what I'm getting into. This is an empowering philosophy. It creates a true meritocracy that incentivizes people to learn, research, and become intimately familiar with the financial foundation upon which the world runs.
Enabling this philosophy requires complex layers of algorithms and game theory, which ultimately provide users with an experience similar to Web2. We often hear complaints that "this is too much research, not enough product," yet without this immense foundational work, we would merely be creating an illegal financial network where founders and CEOs could arbitrarily freeze and steal funds. Between decentralized and centralized systems lie protocols masquerading as the former, seeking quick traction before exit scamming—the worst of both worlds.
The Goblin Book aims to educate readers about true decentralization: why it matters, when it matters, and what tools and knowledge are necessary to thrive in a decentralized world. Beyond this, it focuses on interoperability (or ledger-of-ledger technology). Written based on modern understanding of high-performance blockchains, this book demonstrates that we need not choose between decentralization, security, and low latency.
While this book delves into the most fundamental aspects of blockchain technology, making it a demanding read, the insights it offers are invaluable for understanding the future of decentralized systems.
Introduction
This book is intended for tech-savvy readers interested in learning extremely advanced blockchain and interoperability concepts. Although it explains some of the most complex topics in blockchain, we do try to start at the basics.
Various chapters are accompanied by coding projects. The complete version of each project is checked in our book repository.
Getting Started
Developing interoperability technology—or working across multiple chains simultaneously—requires a complex local development setup. You'll often need to run various chains and off-chain services concurrently. Consider a simple application that aggregates traffic from Ethereum, Solana, and Hyperliquid. Such an application would need to operate:
Development Environments:
- Ethereum devnet
- Arbitrum devnet (required for Hyperliquid settlement)
- Solana devnet
- Hyperliquid devnet
Services:
- Relayer
- Indexer
- Database
- Frontend
Programming Languages and Tools:
- Solidity (via solc)
- Rust
- Golang
- Typescript
- Docker
This list continues to grow as projects become more sophisticated. Managing tool versions across different developers becomes a significant challenge. Currently, there are two main approaches for handling complex setups like this: Bazel and Nix. Throughout this book, we'll use Nix—our Goblin-approved solution—to manage these development environments efficiently.
NixOS
This book heavily leverages Nix in examples to make it easier for you to build and fetch tools used in examples. Get started by installing Nix. You do not need to actually learn the Nix language to read this book, although some basic knowledge may help you out.
Following Along
Whenever we provide code examples for you to execute in your shell, the code snippet will be accompanied by a Nix tab. The Nix tab shows you the commands necessary to load the tools into your shell for executing that snippet.
For example, here is a snippet to query the Union GraphQL API, which requires graphqurl to execute. If you click on the Nix tab and copy the lines there, graphqurl will be installed in your shell.
Don't worry about bloating your system. Once you close the shell, everything that was installed will be gone again.
Here, for example, we show how to query for packets using gq.
gq https://development.graphql.union.build/v1/graphql -q '
{
v2_packets(args: { p_limit: 3 }) {
packet_hash
packet_send_block_hash
}
}
'
nix shell nixpkgs#nodePackages.graphqurl
If you decide not to use Nix, do not worry. We rely mainly on common, open-source software that can usually be installed using npm or brew. All examples can still be followed with alternative installation methods.
Apple
Most developers building on Union use MacBooks as their main development machine, in combination with lightweight *nix VMs.
When developing locally on macOS, there are a few things to keep in mind:
- Docker does not have first-class support. We recommend OrbStack and our guide.
- Some applications need to be cross-compiled. For all Union-related services, we provide cross-compiled binaries. However, other projects may not be as widely supported.
OrbStack
OrbStack is a fast, light-weight alternative to Docker Desktop and traditional VMs for macOS. It provides a seamless way to run containers and Linux machines on your Mac with significantly better performance and resource efficiency than traditional solutions.
OrbStack integrates containerization and virtualization capabilities directly into macOS, allowing you to:
- Run Docker containers with native-like performance
- Create and manage lightweight Linux VMs
- Access containers and VMs via terminal, SSH, or VS Code
- Seamlessly share files between host and guest systems
- Use familiar Docker CLI commands without modification
Normally, these features are not available on macOS, or do not make use of the latest Apple hardware capabilities, which causes performance degradation. For a single Docker container you will not really notice this, but when building a relayer, for example, you will run multiple devnets on one machine, as well as a prover. Performance is key to mimicking production environments.
Dispatching an Asset Transfer
Let's dive into interoperability by performing a complex cross-chain operation programmatically. While this operation touches upon several technical concepts like asset standards, altVMs, indexing, light clients, and storage proofs, our main goal is to execute an end-to-end operation and understand the components involved. We'll explore the theoretical foundations in later chapters.
We will implement a TypeScript program that can manage EVM (Ethereum Virtual Machine) wallets, interact with multiple chains, and dispatch asset transfers through smart contract interactions. Finally, we will query an indexing service to trace our transfer's progress. While this code works in both frontend and backend environments thanks to TypeScript, we recommend Rust for production backends.
Setting up the project
mkdir asset-dispatcher
Create a flake.nix with the following configuration. This sets up Deno for your local development environment and adds code formatters (run with nix fmt). Enable the development environment by running nix develop.
{
description = "Example Union TypeScript SDK usage";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
flake-parts.url = "github:hercules-ci/flake-parts";
};
outputs =
inputs@{ flake-parts, nixpkgs, ... }:
flake-parts.lib.mkFlake { inherit inputs; } {
systems = [
"aarch64-darwin"
"aarch64-linux"
"x86_64-darwin"
"x86_64-linux"
];
perSystem =
{
config,
self',
inputs',
pkgs,
lib,
system,
...
}:
let
denortPerSystem = {
"aarch64-darwin" = {
target = "aarch64-apple-darwin";
sha256 = lib.fakeHash;
};
"aarch64-linux" = {
target = "aarch64-unknown-linux-gnu";
sha256 = lib.fakeHash;
};
"x86_64-darwin" = {
target = "x86_64-apple-darwin";
sha256 = lib.fakeHash;
};
"x86_64-linux" = {
target = "x86_64-unknown-linux-gnu";
sha256 = "sha256-7reSKyqBLw47HLK5AdgqL1+qW+yRP98xljtcnp69sw4=";
};
}.${system};
platform = builtins.trace "Sourcing DENORT with target: ${denortPerSystem.target}" denortPerSystem.target;
packageJson = lib.importJSON ./package.json;
pnpm = pkgs.pnpm;
deno = pkgs.deno;
denort = pkgs.fetchzip {
url = "https://dl.deno.land/release/v${deno.version}/denort-${platform}.zip";
sha256 = denortPerSystem.sha256;
stripRoot = false;
};
in
{
packages = {
default = pkgs.buildNpmPackage rec {
pname = packageJson.name;
inherit (packageJson) version;
src = ./.;
npmConfigHook = pnpm.configHook;
nativeBuildInputs = [
deno
denort
];
npmDepsHash = "sha256-ZztV7LzNfN2fGX1+cUq77DQLfxYPiCh4IK/fk/HbrAE=";
pnpmDeps = pnpm.fetchDeps {
inherit
pname
src
version
;
hash = npmDepsHash;
};
npmDeps = pnpmDeps;
doCheck = true;
checkPhase = ''
deno check 'src/**/*.ts'
'';
buildPhase = ''
runHook preBuild
DENORT_BIN=${denort}/denort deno compile --no-remote --output out src/index.ts
runHook postBuild
'';
installPhase = ''
mkdir -p $out
cp ./out $out
'';
doDist = false;
};
};
devShells.default = pkgs.mkShell {
buildInputs = with pkgs; [
nodejs
pnpm
deno
#nodePackages_latest.typescript-language-server
biome
nixfmt
];
};
};
};
}
Next, create src/index.ts. This will contain most of our logic. Add a simple test:
console.log("hello, world");
Run it with deno run src/index.ts to verify your environment works. You should see hello, world in your terminal.
Managing wallets
Let's modify index.ts to create and fund two wallets. Note: This example hardcodes mnemonics for demonstration purposes. In production, always use proper key management services.
import { createWalletClient, http } from "npm:viem";
import { mnemonicToAccount } from "npm:viem/accounts";
import { holesky, sepolia } from "npm:viem/chains";
const sepoliaWallet = createWalletClient({
account: mnemonicToAccount(mnemonic1),
chain: sepolia,
transport: http(),
});
const holeskyWallet = createWalletClient({
account: mnemonicToAccount(mnemonic2),
chain: holesky,
transport: http(),
});
console.log(`Sepolia address: ${sepoliaWallet.account.address}`);
console.log(`Holesky address: ${holeskyWallet.account.address}`);
Create two variables, mnemonic1 and mnemonic2, each containing a 12-word sentence (space-separated) as a string. Run the script and save your addresses. You can use the same mnemonic for both if you prefer.
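If you would rather not come up with the 12-word sentences yourself, viem can generate valid mnemonics for you. A quick one-off snippet (optional):
import { english, generateMnemonic } from "npm:viem/accounts";

// Prints two fresh 12-word mnemonics you can paste into mnemonic1 and mnemonic2.
console.log(generateMnemonic(english));
console.log(generateMnemonic(english));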
To fund our Sepolia address for contract interactions, we'll use a faucet.
Let's verify our faucet funding by checking the balance:
import { createPublicClient, erc20Abi, formatEther } from "npm:viem";
// Sepolia WETH; the same address is used for the approve call later in this chapter.
const WETH_ADDRESS = "0x7b79995e5f793A07Bc00c21412e50Ecae098E7f9";
const sepoliaClient = createPublicClient({
chain: sepolia,
transport: http(),
});
const gasBalance = await sepoliaClient.getBalance({
address: sepoliaWallet.account.address,
});
const erc20Balance = await sepoliaClient.readContract({
address: WETH_ADDRESS,
abi: erc20Abi,
functionName: "balanceOf",
args: [sepoliaWallet.account.address],
});
console.log([
`Sepolia Gas Balance: ${formatEther(gasBalance)} ETH (${gasBalance} wei)`,
`Sepolia Token Balance: ${formatEther(erc20Balance)} WETH`,
].join("\n"));
We use formatEther for human-readable output. The parenthesized value shows the raw balance. We'll discuss sats, decimals, and asset standards later, but note that ETH is stored in wei on-chain (1 ETH = 10^18 wei).
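As a quick sanity check of that arithmetic, viem's conversion helpers make the 10^18 relationship explicit:
import { formatEther, parseEther } from "npm:viem";

console.log(parseEther("1"));          // 1000000000000000000n (wei)
console.log(formatEther(10n ** 18n));  // "1"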
At this point, we have secured testnet funds and set up a local wallet (though not production-ready).
Performing the Asset Transfer
To perform the bridge operation, we'll interact directly with the Union contracts through their ABI. We will use the Union SDK package to import some types and the required ABIs. The SDK provides low-level bindings to various contracts, as well as backend clients and effects based on effect.website.
For now we are going to use the raw bindings, to show what happens under the hood. To perform an asset transfer, we need to go through three distinct steps:
- Gather configuration parameters.
- Approve the contracts.
- Send the bridge transfer.
For step 1, we will rely on Etherscan and the Union API. In production you might want to store hardcoded mappings, or dynamically fetch these values from your own APIs. Constructing the transaction is simple for an asset transfer. This is also the stage where we might add 1-click swaps or DEX integration later down the road.
Although step 3 seems trivial, it is actually quite annoying when dealing with multiple, independent ecosystems. That's why we are doing EVM to EVM for now, so we only deal with one execution environment implementation.
Configuration Parameters
Since Union leverages channels, we will need to query the channel ID to use between Sepolia and Holesky. We're using the ucs03-zkgm-0 protocol, so that's what we'll filter on. The v2_channel_recommendations view shows the channels officially supported by the Union team.
query RecommendedZkgmChannels @cached(ttl: 60) {
v2_channels(args: {
p_limit: 5,
p_recommended: true,
p_version: "ucs03-zkgm-0"
}) {
source_universal_chain_id
source_client_id
source_connection_id
source_channel_id
source_port_id
destination_universal_chain_id
destination_client_id
destination_connection_id
destination_channel_id
destination_port_id
}
}
For our transfer, we are interested in the source_channel_id for Sepolia (ethereum.11155111).
Since we are doing a WETH transfer, we can use Etherscan to find the asset parameters (symbol, decimals, and name). Union verifies on-chain that the provided parameters are correct; we still pass them to the contract because we want to calculate the packet hash ahead of time. Why does the contract use these values at all? They ensure that when Union instantiates a new asset on the destination chain, it is configured correctly (same symbol, decimals, and name).
Per chain, we can find the Union contracts here. For testnet deployments, these might have been updated since this book was written.
Finally we need to obtain the quote token address (the address of the asset on the destination side).
query GetTransferRequestDetails {
v2_util_get_transfer_request_details(args: {
p_source_universal_chain_id: "union.union-testnet-10",
p_destination_universal_chain_id: "ethereum.11155111",
p_base_token: "0x7b79995e5f793A07Bc00c21412e50Ecae098E7f9"
}) {
quote_token
source_channel_id
destination_channel_id
already_exists
wrap_direction
}
}
This should return
{
"data": {
"get_wrapped_transfer_request_details": [
{
"quote_token": "0x685a6d912eced4bdd441e58f7c84732ceccbd1e4",
"source_channel_id": 8,
"destination_channel_id": 47,
"already_exists": true
}
]
}
}
The source_channel_id should match the channel from the v2_channel_recommendations query.
The quote_token is deterministically derived from the contract addresses and channel IDs. If already_exists is false, the Union contract on the destination chain will instantiate a new asset, which is why the deterministic address-derivation algorithm is so important.
Approvals
Under the hood, the Union contract will withdraw funds from our account before bridging them to Holesky. This withdrawal is normally not allowed (for security reasons, imagine if smart contracts were allowed to just remove user funds!), so we need to approve the Union contract to allow it to withdraw.
import { erc20Abi, maxUint256 } from "npm:viem";
// ucs03address is the Union UCS03 (zkgm) contract address on Sepolia, taken from
// the per-chain deployments mentioned above.
await sepoliaWallet.writeContract({
address: "0x7b79995e5f793A07Bc00c21412e50Ecae098E7f9",
abi: erc20Abi,
functionName: "approve",
args: [ucs03address, maxUint256],
});
For convenience, we approve the contract for maxUint256, so that we do not need to do further approvals. From now on, the Union UCS03 contract can withdraw WETH on Sepolia.
Bridging
Executing the actual bridge operation takes quite a lot of code. Later we will use the alternative TypeScript client and effects API to simplify the flow.
When we interact with the send entrypoint, we submit a program. Union's bridge standard leverages a lightweight, non-Turing-complete VM. That way, we can do 1-click swaps, forwards, or other arbitrary logic. The args for our call in this case contain the Batch instruction, which is effectively a list of instructions to execute. Inside the batch, we have two FungibleAssetOrders. The first order transfers wrapped ETH using a 1:1 ratio (meaning that on the receiving side, the user will receive 100% of the amount). The second order has a 1:0 ratio, meaning that the user receives nothing on the destination side. Effectively, we are 'tipping' the protocol here. An alternative way to ensure this transfer is funded is to alter the ratio of the first transfer. For example, a 100:99 ratio would be a 1% transfer fee.
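As a small aside, the ratio-based fee is easy to compute explicitly. The helper below is not part of the Union SDK; it is just a sketch of the arithmetic described above, expressed in basis points:
// 100:99 ratio == 1% fee == 100 basis points.
function quoteAmountWithFee(baseAmount: bigint, feeBps: bigint): bigint {
  return (baseAmount * (10_000n - feeBps)) / 10_000n;
}

console.log(quoteAmountWithFee(100n, 100n)); // 99n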
import { Instruction } from "npm:@unionlabs/sdk/ucs03";
import { ucs03abi } from "npm:@unionlabs/sdk/evm/abi";
import { type Hex, toHex } from "npm:viem";
function generateSalt() {
const rawSalt = new Uint8Array(32);
crypto.getRandomValues(rawSalt);
return toHex(rawSalt) as Hex;
}
// We're actually enqueuing two transfers, the main transfer, and fee.
const instruction = new Instruction.Batch({
operand: [
// Our main transfer.
new Instruction.FungibleAssetOrder({
operand: [
// sender
sepoliaWallet.account.address,
// receiver
holeskyWallet.account.address,
// base token
WETH_ADDRESS,
// base amount
4n,
// symbol
"WETH",
// name
"Wrapped Ether",
// decimals
18,
// path
0n,
// quote token
"0xb476983cc7853797fc5adc4bcad39b277bc79656",
// quote amount
4n,
],
}),
// Our fee transfer.
new Instruction.FungibleAssetOrder({
operand: [
// sender
sepoliaWallet.account.address,
// receiver
holeskyWallet.account.address,
// base token
WETH_ADDRESS,
// base amount
1n,
// symbol
"WETH",
// name
"Wrapped Ether",
// decimals
18,
// path
0n,
// quote token
"0xb476983cc7853797fc5adc4bcad39b277bc79656",
// quote amount
0n,
],
}),
],
});
const transferHash = await sepoliaWallet.writeContract({
abi: ucs03abi,
functionName: "send",
address: ucs03address,
args: [
// obtained from the graphql Channels query
sourceChannelId,
// this transfer is timed out by timestamp, so we set the height to 0.
0n,
// The actual timeout. It is current time + 2 hours.
BigInt(Math.floor(Date.now() / 1000) + 7200),
generateSalt(),
{
opcode: instruction.opcode,
version: instruction.version,
operand: Instruction.encodeAbi(instruction),
},
],
});
The base token address (WETH_ADDRESS here) is the ERC20 address of the asset we want to send. You might notice that regular ETH does not have an address, because it is not an ERC20. To perform the transfer, ETH must be wrapped to WETH (optional if you already own WETH):
import { parseEther } from "npm:viem";
// WETH ABI - we only need the deposit function for wrapping
const WETH_ABI = [
{
name: "deposit",
type: "function",
stateMutability: "payable",
inputs: [],
outputs: [],
},
] as const;
// Create the wallet client and transaction
const hash = await sepoliaWallet.writeContract({
address: WETH_ADDRESS,
abi: WETH_ABI,
functionName: "deposit",
value: parseEther("0.0001"), // Amount of ETH to wrap
});
console.log(`Wrapping ETH: ${hash}`);
Once this transaction is included, the transfer is enqueued and will be picked up by a solver. Next we should monitor the transfer's progression using an indexer. The easiest solution is graphql.union.build, which is powered by Hubble. Later we will endeavour to obtain the data directly from public RPCs as well.
Tracking Transfer Progression
Once the transfer is enqueued on-chain, it goes through a pipeline of backend operations which are normally opaque to the end user, but useful for us for debugging (and fun to look at). Union refers to these steps as Traces, and they are indexed and stored for us by Hubble. Some of these include:
PACKET_SEND
PACKET_SEND_LC_UPDATE_L0
PACKET_RECV
PACKET_ACK
The PACKET_SEND was actually us performing the transfer. The other steps are executed by solvers. Later we will write a solver to explore what each entails.
To get the tracing data, we'll make a GraphQL query. For now we will just use fetch calls, but there are many high-quality GraphQL clients around.
const query = `
query {
v2_transfers(where: {transfer_send_transaction_hash:{_eq: "${transferHash}"}}) {
traces {
type
height
chain {
display_name
universal_chain_id
}
}
}
}`;
const result = await new Promise((resolve, reject) => {
const interval = setInterval(async () => {
try {
const request = fetch("https://graphql.union.build/v1/graphql", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
query,
variables: {},
}),
});
const response = await request;
const json = await response.json();
const transfers = json.data["v2_transfers"];
console.log({ json });
if (transfers?.length) {
clearInterval(interval);
resolve(transfers);
}
} catch (err) {
clearInterval(interval);
reject(err);
}
}, 5_000);
});
console.log(result);
For example, for the transaction hash 0xa7389117b99b7de4dcd71dc2acbe21d42826dd4d35174c72f23c0adb64144863, we get the following data:
{
"data": {
"v2_transfers": [
{
"traces": [
{
"type": "PACKET_SEND",
"height": 7839514,
"chain": {
"display_name": "Sepolia",
"universal_chain_id": "11155111.sepolia"
}
},
{
"type": "PACKET_SEND_LC_UPDATE_L0",
"height": null,
"chain": {
"display_name": "Union Testnet 9",
"universal_chain_id": "union-testnet-9.union"
}
},
{
"type": "PACKET_RECV",
"height": null,
"chain": {
"display_name": "Union Testnet 9",
"universal_chain_id": "union-testnet-9.union"
}
},
{
"type": "WRITE_ACK",
"height": null,
"chain": {
"display_name": "Union Testnet 9",
"universal_chain_id": "union-testnet-9.union"
}
},
{
"type": "WRITE_ACK_LC_UPDATE_L0",
"height": null,
"chain": {
"display_name": "Sepolia",
"universal_chain_id": "11155111.sepolia"
}
},
{
"type": "PACKET_ACK",
"height": null,
"chain": {
"display_name": "Sepolia",
"universal_chain_id": "11155111.sepolia"
}
}
]
}
]
}
}
Universal chain IDs are chain identifiers specifically used by Union, which are, as the name implies, universally unique. The reason for deviating from what the chains themselves use is described here.
If we want to monitor the progression of a transfer, we would poll this query. There are three important trace types to watch for.
- PACKET_SEND: our transaction was included on the source chain. From this moment on, explorer links using the transaction hash should return data (on average, the Union API is about 5-10 seconds faster than Etherscan, though).
- PACKET_RECV: the relayer has submitted a proof and the packet for the transfer. Funds are now usable on the destination side. The transfer flow is now 'completed' from the user's perspective.
- PACKET_ACK: the relayer has acknowledged the transfer on the source chain. If the open-filling API was used, this event will also trigger payment for the solver. This is only of interest to solvers and backend engineers.
Once we see the PACKET_RECV event, our funds will be usable on Holesky. The traces after that are used by the system to pay the solver and maintain bookkeeping.
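As a small convenience, we can derive the user-facing status from the traces returned by the query above. The helpers below are a sketch and assume the trace shape shown earlier:
type Trace = { type: string; height: number | null };

// True once the packet has been received, i.e. funds are usable on Holesky.
function isReceived(traces: Trace[]): boolean {
  return traces.some((trace) => trace.type === "PACKET_RECV");
}

// True once the acknowledgement has landed back on the source chain.
function isAcknowledged(traces: Trace[]): boolean {
  return traces.some((trace) => trace.type === "PACKET_ACK");
}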
We can query Holesky for our balance to verify that we received funds:
const holeskyClient = createPublicClient({
chain: holesky,
transport: http(),
});
const holeskyBalance = await holeskyClient.readContract({
address: "0xb476983cc7853797fc5adc4bcad39b277bc79656",
abi: erc20Abi,
functionName: "balanceOf",
args: [holeskyWallet.account.address],
});
console.log(
`Token balance: ${formatEther(holeskyBalance)} WETH (${holeskyBalance})`,
);
This should now return the amount sent in the first FungibleAssetOrder.
Summary
This was a hands-on way to introduce you to multichain programming. We have omitted the implementation details of many of the individual steps. You have now experienced the transfer flow that a regular user goes through when interacting through UIs. In the next chapter, we will go deeper into what each trace means. Later we will write a simple solver and show how orders are filled.
Overview
Before we explore how IBC and Union work, we take a short detour to get acquainted with interoperability in general.
At its core, interoperability is about relaying data between two smart contracts on different chains, the same way that the internet is used to relay data between two processes on different servers. In our analogy here, a smart contract functions as a standalone process. Building from this abstraction, we realise that a connection really acts as a way for smart contracts to send bytes to each other.
Often in various protocols, we talk about message sending between chains. These messages are effectively data packets. Just as TCP/IP provides guarantees about packet delivery and ordering across the internet, blockchain interoperability protocols must provide similar guarantees about message delivery and execution across chains. The key difference is that while internet protocols primarily ensure data integrity and delivery, blockchain interoperability protocols must also ensure consensus agreement and cryptographic verification of the messages being relayed.
This means that cross-chain communication requires not just moving data, but also proving that the data came from a valid source and was properly authorized. The relayers that facilitate this communication serve a role similar to routers in internet infrastructure, but with the additional responsibility of providing cryptographic proofs and handling consensus verification.
We will go through each layer of Union's protocol and explain how packet semantics, cryptographic verification, and guaranteed delivery are implemented. We shall see how (light) clients, connections, channels, and packets relate to each other and inspect the real-life usage of the core general-message-passing protocol in asset transfers.
The next sections are heavy on theoretical knowledge, after which we will continue to build a dApp which leverages Union to interact with Bitcoin derivatives.
Connections
At the beginning of the lifecycle of communication between two chains, a connection must be opened. This is a process by which one chain initiates the opening of the connection, and the other responds with certain data. We call this process a 4-way handshake, as each chain must send two messages. The handshake is used to bootstrap the connection, exchanging critical information such as the current validator set, chain identifier, and consensus mechanism. This data is stored on both chains and, once the handshake is completed, used for verifying future cross-chain messages.
sequenceDiagram
    participant Chain A
    participant Relayer
    participant Chain B
    Chain A->>Relayer: ConnectionOpenInit (includes Chain A's info)
    Relayer->>Chain B: Relay ConnectionOpenInit
    Chain B->>Relayer: ConnectionOpenTry (includes Chain B's info)
    Relayer->>Chain A: Relay ConnectionOpenTry
    Chain A->>Relayer: ConnectionOpenAck (verify Chain B's info)
    Relayer->>Chain B: Relay ConnectionOpenAck
    Chain B->>Relayer: ConnectionOpenConfirm
    Relayer->>Chain A: Relay ConnectionOpenConfirm
    Note over Chain A,Chain B: Connection Established
During this handshake:
- Chain A initiates with ConnectionOpenInit, sending its chain-specific parameters
- Chain B responds with ConnectionOpenTry, verifying Chain A's data and providing its own
- Chain A acknowledges with ConnectionOpenAck, confirming Chain B's information
- Chain B finalizes with ConnectionOpenConfirm, establishing the secure connection
Once established, this connection can be used for secure cross-chain communication, with both chains able to verify messages using the exchanged parameters and consensus proofs.
This connection effectively acts as a socket to read and write bytes between the two chains. Although this is powerful, we ideally want a more structured way to communicate, akin to HTTP. For that we use channels.
Multiple Connections
Usually the relation between chains and connections is one-to-one, meaning that only one connection exists between two chains. There is nothing preventing multiple from existing, however. You will probably see some duplicates for testing purposes: connections deployed to verify that the actual production one will work.
gq https://development.graphql.union.build/v1/graphql -q '
query Connections @cached(ttl: 60) {
v2_connections(args: { p_limit: 30 }) {
source_universal_chain_id
source_client_id
source_connection_id
destination_universal_chain_id
destination_client_id
destination_connection_id
}
}'
nix shell nixpkgs#nodePackages.graphqurl
There are uses for multiple connections outside of testing though. Connections may leverage different clients, and thus have different security guarantees. A 'fast' connection could leverage an oracle solution, while the 'slow' connection awaits full finality.
Channels
Channels provide an application-level communication protocol built on top of connections. While connections handle the basic secure transport between chains, channels implement message delivery and application-specific logic. Think of channels as dedicated message queues between specific applications on different chains, where messages are typed and have certain effects.
sequenceDiagram
    participant App on Chain A
    participant Chain A
    participant Chain B
    participant App on Chain B
    App on Chain A->>Chain A: Request channel creation
    Chain A->>Chain B: ChanOpenInit
    Chain B->>App on Chain B: Notify app
    Chain B->>Chain A: ChanOpenTry
    Chain A->>Chain B: ChanOpenAck
    Chain B->>Chain A: ChanOpenConfirm
    Note over Chain A,Chain B: Channel Established
    App on Chain A->>Chain A: Send packet
    Chain A->>Chain B: Packet transfer
    Chain B->>App on Chain B: Deliver packet
Each channel has key properties:
- Ordering: Controls packet delivery (ordered, unordered, or ordered with timeouts)
- Version: Application-specific string for protocol versioning
- State: Tracks the channel establishment process
The channel handshake ensures both applications:
- Agree on the version
- Are ready to process packets
- Can verify each other's packet commitments
Multiple channels can exist over a single connection, each serving different applications. For example, a token transfer application and a governance application could each have their own channel while sharing the underlying secure connection. In general, Union multiplexes traffic over connections and only maintains one connection per chain, while operating many different channels.
Channel Use Cases
Whenever a protocol has a structured message format, it should consider using a specific channel. This is useful for indexers, which use channel.version to read packets for further analysis.
We can query active channels by running:
gq https://development.graphql.union.build/v1/graphql -q '
query Channels @cached(ttl: 60) {
v2_channels(args: { p_limit: 30 }) {
source_universal_chain_id
source_client_id
source_connection_id
source_channel_id
source_port_id
destination_universal_chain_id
destination_client_id
destination_connection_id
destination_channel_id
destination_port_id
version
}
}
'
nix shell nixpkgs#nodePackages.graphqurl
You will probably see ucs03-zkgm-0 in the output, which is the multiplexed transfer protocol. Like a Swiss Army knife, it works for loads of complex applications. Another common version is ics20, which is used for legacy asset transfers. By multiplexed, we mean that a single channel serves many applications at the same time.
graph TB
    B1[Token Bridge] & B2[NFT Bridge] & B3[Governance] --- MC[ucs03-zkgm-0]
    MC --- Chain3[Chain B]
    Chain3 --- B4[Token Bridge] & B5[NFT Bridge] & B6[Governance]
In legacy channel configurations, there would be 3 individual channels. Multiplexing offers key advantages:
- Applications do not need to relay their own channels.
- Smart contract developers can leverage enshrined smart contracts.
- The channel implementation can use smart batching to limit the amount of packets necessary.
Packets
Packets are the unit of cross-chain communication that carry application data through established channels. For unordered channels, packets can be delivered in any sequence, making them ideal for applications where message ordering isn't critical. Union specifically chose not to support ordered channels due to their poor performance during congestion and incompatibility with fee markets.
sequenceDiagram
    participant App A
    participant Chain A
    participant Chain B
    participant App B
    App A->>Chain A: Send Packet
    Note over Chain A: Store Commitment
    Chain A-->>Chain B: Relay Packet + Proof
    Note over Chain B: Verify Proof
    Chain B->>App B: Execute Packet
    Note over Chain B: Store Receipt
    Chain B-->>Chain A: Acknowledge + Proof
    Note over Chain A: Mark Commitment
Each packet contains:
- Source channel
- Destination channel
- Timeout height or timestamp
- Data payload
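For reference, here is a rough TypeScript sketch of those fields; the names are illustrative and not the exact on-chain encoding:
interface Packet {
  sourceChannelId: number;
  destinationChannelId: number;
  timeoutHeight: bigint;     // 0n when timing out by timestamp instead
  timeoutTimestamp: bigint;  // unix seconds
  data: `0x${string}`;       // opaque application payload
}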
Packet Lifecycle:
- Application sends data through its channel
- Source chain stores a commitment to the packet
- Relayer delivers packet and proof to destination
- Destination verifies and executes packet
- Relayer returns acknowledgment to source
- Source chain cleans up the commitment
Timeouts prevent packets from being permanently stuck if the destination chain halts or refuses to process them. When a timeout occurs, the source chain reclaims the packet and notifies the sending application.
ucs03-zkgm
Union leverages a specialized channel with packet data for asset transfers. While analogous to ics20 in legacy IBC chains, it offers several advantages:
- Multi-Asset transfers
- Open Filling
- Ahead of Finality (AoF) filling
- Routing for GMP
The packet schema functions as a small program with various instructions executed by the IBC app:
struct ZkgmPacket {
bytes32 salt;
uint256 path;
Instruction instruction;
}
struct Instruction {
uint8 version;
uint8 opcode;
bytes operand;
}
Instructions use ethabi encoding to structure packets or perform operations. For example, the Forward instruction enables packet forwarding:
struct Forward {
uint32 channelId;
uint64 timeoutHeight;
uint64 timeoutTimestamp;
Instruction instruction;
}
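To make "ethabi encoding" concrete, here is how such an operand could be produced from TypeScript with viem. This is only an illustration of the encoding style (the exact layout is defined by the contracts), reusing the Instruction shape from the earlier transfer example:
import { encodeAbiParameters } from "npm:viem";

// Encode a Forward operand wrapping a (version, opcode, operand) instruction.
const forwardOperand = encodeAbiParameters(
  [
    { name: "channelId", type: "uint32" },
    { name: "timeoutHeight", type: "uint64" },
    { name: "timeoutTimestamp", type: "uint64" },
    {
      name: "instruction",
      type: "tuple",
      components: [
        { name: "version", type: "uint8" },
        { name: "opcode", type: "uint8" },
        { name: "operand", type: "bytes" },
      ],
    },
  ],
  // Example values only: channel 8, no height timeout, 2-hour timestamp timeout.
  [8, 0n, BigInt(Math.floor(Date.now() / 1000) + 7200), { version: 1, opcode: 3, operand: "0x" }],
);
console.log(forwardOperand);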
The most common instruction is FungibleAssetOrder:
struct FungibleAssetOrder {
bytes sender;
bytes receiver;
bytes baseToken;
uint256 baseAmount;
string baseTokenSymbol;
string baseTokenName;
uint256 baseTokenPath;
bytes quoteToken;
uint256 quoteAmount;
}
This instruction powers the official Union app's bridging functionality. Unlike other bridges, it includes both base and quote information, enabling users to specify desired asset conversions (e.g., USDC to unionUSDC). This design allows FungibleAssetOrder to handle non-equivalent asset swaps when solvers provide liquidity.
We can see this structure inside the packets live:
gq https://development.graphql.union.build/v1/graphql -q '
query Packets @cached(ttl: 60) {
v2_packets(args: { p_limit: 5 }) {
channel_version
decoded
}
}'
nix shell nixpkgs#nodePackages.graphqurl
The indexer uses the channel.version to decode the packet and show what is being transmitted. For ucs03-zkgm-0, you should observe something like
{
"data": {
"v2_packets": [
{
"channel_version": "ucs03-zkgm-0",
"decoded": {
"path": "0x0",
"salt": "0x39cdaec1be16a7f3a5c39db77ff337ba8675c8937e81b0d2418b1b52f404e4d4",
"instruction": {
"_index": "",
"opcode": 2,
"operand": {
"_type": "Batch",
"instructions": [
{
"_index": "0",
"opcode": 3,
"operand": {
"_type": "FungibleAssetOrder",
"sender": "0x307865663433356538653663353337363130666562636361656538356236363864623165636166653032",
"receiver": "0x50a22f95bcb21e7bfb63c7a8544ac0683dcea302",
"baseToken": "0x7562626e",
"baseAmount": "0x1",
"quoteToken": "0x9e8af87a38012f5bb809b8040b4e34439fb8122f",
"quoteAmount": "0x1",
"baseTokenName": "ubbn",
"baseTokenPath": "0x0",
"baseTokenSymbol": "ubbn",
"baseTokenDecimals": 0
},
"version": 1,
"_instruction_hash": "0x6d4debb95009b4c114d1faeac846042755da1e4814d92d300004c091d30581ae"
}
]
},
"version": 0,
"_instruction_hash": "0x44e9064c6fbb9ee05a91dc52a4fe66f0b6544bd5a7f747f445e3610e9bc910bc"
}
}
},
...
]}
}
Here we can see a packet with a FungibleAssetOrder, so we know these are funds being transmitted from one chain to another.
Fees
Rather than explicitly defining relayer and gas fees, FungibleAssetOrder incentivizes packet processing through the value difference between baseAmount and quoteAmount for equivalent assets:
FungibleAssetOrder({
...
baseToken: USDC,
baseAmount: 100,
quoteToken: USDC,
quoteAmount: 99,
})
This example sets a 1 USDC fee independent of the destination chain's gas token. Relayers evaluate packet settlement based on profitability.
Gas Station
The protocol addresses the common challenge of users lacking gas tokens after bridging through a composable instruction system. While some centralized bridges offer unreliable gas services, Union's approach uses the Batch instruction to combine multiple FungibleAssetOrder instructions atomically:
struct Batch {
Instruction[] instructions;
}
A transfer with gas deposit combines two orders:
Batch({
instructions: [
FungibleAssetOrder { actualTransferDetails.. },
FungibleAssetOrder { baseTokenAmount: 0, quoteToken: $GAS, quoteTokenAmount: 1 },
],
})
Relayers evaluate the batch's cumulative profit, converting gas tokens to USD value. For instance, if the first order yields 5 USD profit and the second costs 1 $GAS, relayers fulfill the packet when the net profit exceeds their threshold. The smart contract uses the relayer's balance for the gas portion, demonstrating open filling functionality.
Marking Commitments
Union's approach to handling commitments differs from traditional IBC implementations in an important security aspect. While IBC-classic allows commitments to be cleaned up due to unique sequencing, Union's optimistic packet execution model requires a different approach to prevent potential exploits.
The Security Challenge
A key security vulnerability could arise if commitments were cleaned (deleted) rather than marked:
- An attacker could send a packet
- Get it acknowledged
- Exploit the commitment cleanup to loop this sequence:
- Send the same packet again (generating same hash)
- Get acknowledgment
- Repeat
This attack vector exists because packet hashes can collide when identical packets are sent multiple times, unlike IBC-classic where sequence numbers ensure uniqueness.
Solution: Marking Instead of Cleaning
To prevent this attack while maintaining optimistic execution, Union:
- Keeps all commitments stored instead of cleaning them
- Marks fulfilled commitments as "acked" rather than deleting them
- Validates against this "acked" status to prevent replay attacks
This approach:
- Prevents the looping vulnerability
- Only costs about 4k more gas compared to cleaning
- Maintains security without compromising the optimistic execution model
The gas cost difference is negligible compared to the protocol level advantage that optimistic solving provides.
Clients
Clients are modules that track and verify the state of other chains. How they do this varies significantly based on the execution environment of the connected chains. Most clients are implemented as smart contracts.
Core Concepts
Every IBC client must provide:
- State tracking of the counterparty chain
- Verification of state updates
- Proof verification for individual transactions/packets
- Misbehavior detection
However, the implementation details can vary depending on the execution environment (EVM or Move for example).
We usually refer to both the code and to an instantiation of it as a client. The best way to grok this is to see a client as both the ERC20 code implementation and an actual ERC20 coin. There can be many clients on a chain, and new clients can be trustlessly instantiated after the code has been uploaded.
State Tracking
Clients must maintain a view of their counterparty chain's state. This typically includes:
- Latest verified header/block height
- Consensus state (if applicable)
- Client-specific parameters like timeout periods
- Commitment roots for verifying packet data
Much like how an ERC20 contract tracks balances, a client tracks these state components for its specific counterparty chain instance. The client logic defines what state to track, while each client instance maintains its own state values.
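As an illustration of that per-instance state (a sketch only; actual client implementations define their own layouts), the tracked data roughly looks like:
// Roughly what a client instance keeps about its counterparty chain.
interface ClientState {
  counterpartyChainId: string;
  latestHeight: bigint;           // latest verified header/block height
  trustingPeriodSeconds: number;  // client-specific timeout parameter
}

interface ConsensusState {
  timestamp: bigint;
  appHash: `0x${string}`;         // commitment root used to verify packet proofs
}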
Verification
Verification is how clients validate state updates from their counterparty chain. This process varies dramatically based on the chain's architecture:
- Tendermint chains verify through validator signatures
- Ethereum clients check PoW/PoS consensus rules
- L2s might verify through their parent chain's mechanisms
The client code implements the verification rules, while each instance enforces these rules on its specific counterparty chain's updates.
Inclusion Proofs
Clients must verify proofs that specific transactions or packets were included in the counterparty chain's state. This involves:
- Verifying the proof format matches the counterparty's tree structure
- Checking the proof against the stored commitment root
- Validating the claimed data matches the proof
For example:
- Tendermint chains use IAVL+ tree proofs
- Ethereum uses Merkle Patricia proofs
- Some L2s use their own specialized proof formats
Misbehavior Detection
Clients implement rules to detect and handle misbehavior from their counterparty chains. Common types include:
- Double signing - Same height with different state roots
- Invalid state transitions - Consensus rule violations
- Timeout violations - Not responding within parameters
When misbehavior is detected, clients can:
- Freeze to prevent further packet processing
- Allow governance intervention
- Implement automatic resolution mechanisms
Just as each ERC20 instance can be frozen independently, each client instance handles misbehavior for its specific counterparty chain relationship.
Implementations
Clients are the most complex portion of how IBC works. Implementations depend on deep cryptographic and algorithmic knowledge of consensus verification. Later we will describe how to implement one, but for now it is better to understand the protocol in full.
We can query for current live clients by running:
gq https://development.graphql.union.build/v1/graphql -q '
query Clients @cached(ttl: 60) {
v2_clients(args: { p_limit: 3 }) {
universal_chain_id
client_id
counterparty_universal_chain_id
}
}'
nix shell nixpkgs#nodePackages.graphqurl
This provides information for which client is live on which chain, and what other chain it is tracking.
Open Filling
Traditional Bridge Models
Traditional bridge protocols typically handle transfers through a single mechanism per token:
- Mint/Burn: The bridge mints a specific token on the destination chain
- Locked Pools: Assets are locked in token-specific pools
- Solver Networks: Solvers provide liquidity for particular tokens
Each model handles one token type at a time, requiring multiple separate transfers for different assets, and quite often requiring users to switch between various bridges.
One step forward in making the bridge model more flexible is to separate the relaying of information from the actual fulfilment of an order. The bridge protocol focuses on providing the initial data and relaying the acknowledgement, while different implementations can exist to actually provide the assets. We refer to this model as open filling.
We shall see that open filling has advantages in flexibility and can make better use of local optimizations. On some chains, liquidity pools may be abundant, while on others, the solver market is more mature. Open filling allows bridges to adjust to these market realities.
Open Filling
Union introduces "open filling", where the assets in a transfer can be provided in various ways, while still guaranteeing the atomic execution of the packet:
- A single transfer can include multiple different tokens
- Each token can use its own fill mechanism
- All fills are composed atomically in one transaction
graph TD
    A[User Transfer Request: Value X] --> B[FungibleAssetPacket]
    B --> C[100 USDC from Bridge]
    B --> D[0.015 BTC from Solver]
    B --> E[10 BABY from Pool]
    C --> F[Atomic Completion]
    D --> F
    E --> F
Besides flexibility, open filling can be used to implement features which traditionally we do not consider a core bridging service. One such feature we already encountered: gas station. We could also leverage this to implement an exchange, by specifying non-equivalent base and quote assets:
FungibleAssetOrder { baseToken: USD, baseTokenAmount: 100000, quoteToken: BTC, quoteTokenAmount: 1 },
Given this order, the only way to fill it is for the relayer either to swap USD for BTC on a DEX, or to provide 1 BTC itself and keep the $100,000. With open filling, we do not care about the implementation detail (and can support both at the same time).
Union Relayer
The Union IBC relayer is the infrastructure component that performs the I/O and transaction submissions. Think of it as a blockchain postal service - monitoring, packaging, and delivering messages between chains. It connects to various RPCs to detect new blocks and users interacting with IBC contracts, to then generate proofs and submit transactions to destination chains.
How Messages Flow
Let's walk through the process of how a message moves between chains using the relayer:
- A user submits a message on Chain A
- Chain A stores the message and emits an event
- The relayer detects this event
- The relayer queries Chain A to generate a consensus proof
- The relayer submits the proof and message to Chain B
- Chain B verifies and executes the message
- The relayer queries Chain B to generate a consensus proof
- The relayer confirms receipt back to Chain A using the proof from step 7
Depending on the packets being relayed, the relayer may earn a fee. ICS20 allows for 'tipping' the relayer, while Union chooses a UTXO-style model, which means that the relayer earns the leftover assets after a transfer occurs. Frontends usually display this as a fee to the user, but under the hood they construct a Batch of FungibleAssetOrders where the quote side will be zero, effectively tipping the relayer.
Core Functions
Chain Monitoring
The relayer maintains active connections to multiple blockchain networks simultaneously. For each chain, it:
- Subscribes to new blocks and events
- Tracks block confirmations
- Monitors chain health and consensus status
- Connects to a prover service
Chain monitoring must be highly reliable as missed events could lead to stuck packets. The relayer implements sophisticated retry and recovery mechanisms.
Proof Generation and Verification
For each supported chain pair, the relayer leverages a proving service to generate the actual proofs. It collects the public and private inputs before making the API call.
The reason relaying and proving are separated over this interface is that relaying is an I/O-heavy operation that requires fast internet access, while proving is a compute-heavy operation. Proving can also be distributed over various machines, which the API abstracts over. That way, the relayer does not need to know the proving implementation.
Implementations
There are three major IBC relayer implementations:
We will discuss Voyager's architecture in this book, as it is the most flexible to extensions and supports the widest array of implementations.
Architecture
Voyager leverages a stateless, message-based architecture. Internally it leverages PostgreSQL to maintain a queue of events and tasks to execute. Each RPC call to fetch data, transaction to be submitted, timer, or error encountered is represented as JSON stored in the database.
Plugins
Voyager leverages various plugins to submit transactions, handle new types of chains, and inspect the intermediate state of packets for filtering or modification.
Addresses
One would think that address handling would be a solved problem by now and that chains would handle it uniformly. That is not the case at all.
For a formal specification of how Union handles addresses, check the docs.
TLDR
Cosmos addresses use bech32 encoding with this format:
{HRP}1{address}{checksum}
The human-readable part (hrp) differentiates between chains (like union or stars). It's followed by the number 1, then the address, and finally a 6-byte checksum.
When querying transfers across multiple chains for address union1abc...123, searching for that specific string would miss transfers from the same address on other chains like stars1abc...xyz.
Union's SDKs and APIs solve this by supporting searches by:
- display style (chain-specific format shown in browsers)
- canonical format (without hrp/chain-specific info)
You can query the API to see all versions of an address:
gq https://development.graphql.union.build/v1/graphql -q '
query GetAddressTypesForDisplayAddress {
v2_util_get_address_types_for_display_address(args: {
p_display_address: "union1d03cn520attx29qugxh4wcyqm9r747j64ahcj3"
}) {
display
canonical
zkgm
}
}
'
nix shell nixpkgs#nodePackages.graphqurl
Your query should return exactly the following data.
{
"data": {
"get_address_types_for_display_address": [
{
"display": "union1d03cn520attx29qugxh4wcyqm9r747j64ahcj3",
"canonical": "0x6be389d14fead665141c41af576080d947eafa5a",
"zkgm": "0x756e696f6e31643033636e353230617474783239717567786834776379716d39723734376a36346168636a33"
}
]
}
}
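One detail worth noting from this response: the zkgm field appears to simply be the raw bytes of the display address, hex-encoded. We can reproduce it locally (a quick check, not an official SDK helper):
import { toHex } from "npm:viem";

const display = "union1d03cn520attx29qugxh4wcyqm9r747j64ahcj3";
// Prints the same value as the zkgm field above.
console.log(toHex(display));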
Chain IDs
Most blockchains use a unique identifier (like Ethereum's 1) to protect users against replay attacks and help wallets select the correct chain. Applications typically assume these IDs are globally unique across both mainnet and testnets. For example, Sepolia uses 11155111, while Ethereum mainnet uses 1.
Initially, Union also used these 'canonical' identifiers, but this approach revealed a critical issue: chain IDs aren't actually unique across different blockchain ecosystems. For instance, Aptos also uses ID 1, creating potential security vulnerabilities like replay attacks, especially for EVM-compatible Move-based chains.
To address this problem, Union implemented a more robust format:
{ hrp }.{ chainId }
In this structure, chainId represents how a chain identifies itself, while hrp (human-readable part) provides a recognizable prefix. For example, Union's testnet is identified as union.union-testnet-10.
This approach ensures true uniqueness across blockchain ecosystems while maintaining compatibility with existing systems.
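The format is simple enough to construct by hand; a tiny sketch:
// Build a Union universal chain ID from an hrp and the chain's own ID.
function universalChainId(hrp: string, chainId: string): string {
  return `${hrp}.${chainId}`;
}

console.log(universalChainId("union", "union-testnet-10")); // union.union-testnet-10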
Project: Nexus
In this project, we will implement the basics of a multichain decentralized exchange (dex). We will allow users to swap assets by trading against solvers. Our exchange is thus an intent-based protocol.
Functionality
Our application will have the following functionality:
- Swaps: choose assets.
- Bridge: choose destination chain.
- History: track historic trades
We will focus on implementing the logic only. Frontend, design and UX will not be covered (although we'd gladly accept PRs to expand the guide).
Architecture
We will refer to our exchange as Nexus: Nex rhymes with Dex, and it is short and memeable.
From a high level, our project will function approximately like so:
flowchart LR
    Frontend(Frontend App)
    subgraph SourceChain["Source Chain"]
        SourceContract(Nexus)
        SourceUnion(Union)
    end
    subgraph DestinationChain["Destination Chain"]
        DestContract(Union)
    end
    subgraph Data["Indexing Layer"]
        GraphQL(Union GraphQL API)
    end
    subgraph Solvers["Solver Layer"]
        Solver(Voyager Plugin)
    end
    Frontend --> |Submit Order| SourceContract
    Frontend --> |Query History| GraphQL
    SourceContract --> |Forward Order| SourceUnion
    SourceUnion --> |Route Order| Solver
    Solver --> |Settle| DestContract
    SourceContract -.-> |indexes| GraphQL
    DestContract -.-> |indexes| GraphQL
We will focus on how to submit orders to Nexus, call the Union Solidity API, and track order fulfilment. Finally, we shall implement a Voyager plugin to specifically solve for our protocol.
Requirements
Our app will focus on two core operations:
- Swaps: Trade between any ERC20 tokens supported by our solvers
- Bridge: Move assets between supported chains, with the ability to swap during the bridge
Each operation will maintain comprehensive historical data tracking user trades, token amounts, prices at execution time, and transaction status. This data will be used for:
- Displaying trade history
- Calculating PnL across chains
- Analyzing user trading patterns
Swaps
For our swaps, we will for now not rely on liquidity pools directly. Instead we will assume that our solver manages inventory efficiently. The solver may integrate with DEXes and choose to leverage centralized exchanges too.
Bridge
Our bridge functionality is simple: we will allow a user to select what chain to start at, and which chain to end at. Since we are building a multichain exchange, we will not allow swaps without bridging for now, although that will be relatively trivial to add.
Historic data
We will query the Union graphql API for data related to our contracts and users. For now we do not store them in another database, although if we want to do advanced analysis, that'd be the next step.
Swaps
Our swaps implementation consists of two main components: (1) TypeScript code calling our exchange contract, which then (2) turns the order into a set of FungibleAssetOrders and submits it to the Union contract. We do not call the Union contract directly, because we want our own interface to provide a nice API for other smart contracts to use, as well as to potentially build in governance controls.
Project Setup
Start by creating a flake.nix. We will be using foundry and managing our environment with the flake.
{
description = "Project Nexus";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
flake-utils.url = "github:numtide/flake-utils";
foundry.url = "github:shazow/foundry.nix";
};
outputs =
{
self,
nixpkgs,
flake-utils,
foundry,
}:
flake-utils.lib.eachDefaultSystem (
system:
let
pkgs = import nixpkgs {
inherit system;
overlays = [ foundry.overlay ];
};
in
{
devShells.default = pkgs.mkShell {
buildInputs = [
pkgs.foundry-bin # Provides forge, cast, anvil, etc.
];
};
}
);
}
Now you can run nix develop to activate the local environment and use forge and other tools. Verify the installation succeeded by running forge init nexus.
Next we need to install the Union evm contracts.
forge install OpenZeppelin/openzeppelin-contracts
forge install unionlabs/union@5f4607a0cba6b8db1991b1d24f08605e9ba8600e
You can choose a more recent commit hash as well by navigating to the Union monorepo.
Nexus Smart Contract
Our smart contract will have a few functions, but the most important one is the simple swap function, which accepts an Order and executes it.
struct Order {
uint32 destinationChainId;
bytes receiver;
address baseToken;
uint256 baseAmount;
bytes quoteToken;
uint256 quoteAmount;
bytes32 salt;
uint64 timeoutTimestamp;
}
Our order specifies the destinationChainId, which is where the user wants to receive their tokens. The salt is added to allow orders with exactly the same amount and assets to function: since Union hashes orders to protect against replay attacks, we need a way to alter that hash.
Next our swap function:
function swap(Order memory order) public {
// 1. Get channel ID for destination chain
// 2. Transfer tokens from user to contract
// 3. Create fungible asset order instruction
// 4. Call zkgm contract
}
We need to implement the 4 steps in our example.
Chain ID to Channel Mapping
First, we need to map destination chain IDs to Union channel IDs. Union uses channels to route orders between chains. We could compute this on the frontend when submitting orders, but we want Nexus to be callable by other smart contracts as well, hence why we store the mapping.
mapping(uint32 => uint32) public destinationToChannel;
...
function setChannelId(uint32 destinationChainId, uint32 channelId) external onlyOwner {
destinationToChannel[destinationChainId] = channelId;
}
Token Transfer
Next, we need to handle the ERC20 token transfer from user to Nexus contract:
function swap(Order memory order) public {
// 1. Get channel ID for destination chain
uint32 channelId = destinationToChannel[order.destinationChainId];
require(channelId != 0, "Invalid destination chain");
// 2. Transfer tokens from user to contract
IERC20(order.baseToken).safeTransferFrom(
msg.sender,
address(this),
order.baseAmount
);
}
Currently we assume the tokens will always be ERC20, which means that we cannot support native ETH. Union's transfer app handles this by optionally performing wrapping for the user. This is a good addition to the protocol to implement in a v2.
Order Instructions
Next we will construct our FungibleAssetOrder. We use the values from the channel mapping and the order to create it; it's just a simple formatting operation.
function swap(Order memory order) public {
...
// 3. Create fungible asset order instruction
Instruction memory instruction = zkgm.makeFungibleAssetOrder(
0,
channelId,
msg.sender,
order.receiver,
order.baseToken,
order.baseAmount,
order.quoteToken,
order.quoteAmount
);
}
Right now we set sender to msg.sender. This just means that on timeouts or other operational issues, the assets will be refunded to the address in the sender field. Our DEX does not need to handle unfilled orders by itself.
We could also make this a field in our Order, to allow users to specify a different address, or potentially make it a smart contract address and build our own refund mechanism.
Submit the Order
To interact with the IBC contract, we will need to store it in our own contract. For now, let's pass it during construction.
IZkgm public zkgm;
// Constructor to set the zkgm contract and initialize Ownable
constructor(address _zkgm) Ownable(msg.sender) {
require(_zkgm != address(0), "zkgm address cannot be zero");
zkgm = IZkgm(_zkgm);
}
When submitting the order, we should provide a timeoutTimestamp
. If the order isn't completed before the timeout, the funds will be refunded. This ensures that if solvers do not want to handle the order (because of price fluctuations) or if there is an outage on the Union network, the user will still get their funds back.
function swap(Order calldata order) external {
...
// 4. Call zkgm contract
zkgm.send(
channelId,
order.timeoutTimestamp, // Could be current time + some buffer
0, // Optional block timeout
order.salt,
instruction
);
}
Deployment
Finally, we will deploy our contract to Holesky so it can interact directly with the Union testnet.
We can obtain the zkgm address (called ucs03) from Union's deployment.json.
forge create \
--rpc-url $HOLESKY_RPC_URL \
--private-key $PRIVATE_KEY \
src/Nexus.sol:Nexus --constructor-args $IBC_HANDLER
This will deploy your contract. You will still need to configure the supported routes. We will do this in the SDK section.
Extending the Contract
Once you've completed this part of the project, consider adding some additional features yourself, such as unit tests, events, or bigger features. The full codebase for the above can be found here. Feel free to clone it and tinker around if you get stuck.
Relayer Fees
Right now our code relies on the relayer being compensated by the base assets being worth more than the quote assets (making the fill a profitable trade for the relayer). If the price delta is too small, relayers will not pick up the order. We could instead use the Batch instruction to include an explicit relayer tip as well.
Supported Assets
Nexus currently creates orders for any asset, which means we might produce invalid orders that will always time out. Limiting the assets we accept prevents these errors from occurring.
Local Swaps
Right now we always submit orders to Union, but if the destinationChainId == localChainId
, we could use a local dex instead.
SDK
Even though UI and design are out of scope for this guide, we will still go through interacting with our contract from Typescript. The code can be easily used inside React or Svelte applications.
Setup
For our JavaScript-side logic, we will extend our flake.nix with the right tools:
{
description = "Project Nexus";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
flake-utils.url = "github:numtide/flake-utils";
foundry.url = "github:shazow/foundry.nix";
};
outputs =
{
self,
nixpkgs,
flake-utils,
foundry,
}:
flake-utils.lib.eachDefaultSystem (
system:
let
pkgs = import nixpkgs {
inherit system;
overlays = [ foundry.overlay ];
};
in
{
devShells.default = pkgs.mkShell {
buildInputs = [
pkgs.foundry-bin # Provides forge, cast, anvil, etc.
pkgs.nodejs # Node.js for JavaScript/TypeScript runtime
pkgs.nodePackages.typescript # TypeScript compiler (tsc)
pkgs.nodePackages.ts-node
];
};
}
);
}
We can now scaffold our SDK project. Here we use Typescript as it helps us potentially catch more bugs early on.
nix develop
mkdir sdk && cd sdk
npm init
tsc --init
Set some sensible values when prompted:
package name: (nexus) sdk
version: (1.0.0)
description: SDK for the Nexus Exchange
entry point: (index.js)
test command:
git repository:
keywords:
author:
license: (ISC) MIT
Next we set up a default source file:
mkdir src
echo 'console.log("Hello, TypeScript!");' > src/index.ts
We also edit our package.json to configure Typescript. Extend the scripts section with build and start scripts:
"scripts": {
"build": "tsc",
"start": "ts-node src/index.ts",
}
We can now run our Typescript code by running
npm start
> sdk@1.0.0 start
> ts-node src/index.ts
Hello, TypeScript!
Dependencies and Tools
We'll leverage viem
to interact with our contracts. Depending on your frontend framework, you might also want to use wagmi
if you are building a frontend application.
npm install viem
We can now import items from viem and use them. Add the following line to your index.ts.
import { createPublicClient, createWalletClient, http, parseAbi } from "viem";
Have a look in the viem repository to see what other features are available.
ABI
We defined our contract logic already. Next we'll want types that describe how to interact with our contract. We could define all the types ourselves, but it is better to parse the ABI:
const abi = parseAbi([
  `struct Order {
    uint32 destinationChainId;
    bytes receiver;
    address baseToken;
    uint256 baseAmount;
    bytes quoteToken;
    uint256 quoteAmount;
    bytes32 salt;
    uint64 timeoutTimestamp;
  }`,
  `function swap(Order order) external`,
]);
Here we copied portions of our ABI into index.ts. Even better is to point tooling at our contracts and generate bindings; for larger contracts and complex codebases, we recommend doing so.
Interacting with Nexus
In this example, we will start a swap from Ethereum (Holesky, where we deployed Nexus) to other chains, so we will instantiate just a single client. In a real app, we would keep a mapping of chainIds to clients and use a different client depending on the source chain.
import { holesky } from "viem/chains";
import { mnemonicToAccount } from "viem/accounts";
const publicClient = createPublicClient({
chain: holesky,
transport: http(),
});
// In a frontend app, we'd use the wallet extension instead of this one.
const account = mnemonicToAccount(
"test test test test test test test test test test test junk",
);
const walletClient = createWalletClient({
account,
chain: holesky,
transport: http(),
});
Our swap function is a simple contract call. We will first perform a simulation to verify it succeeds. Most likely, users will first need to grant an ERC20 allowance to the Nexus contract before performing the swap, since the contract pulls the base tokens with safeTransferFrom.
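For illustration, here is a minimal sketch of granting that allowance with viem. The nexusAddress constant is a placeholder for the address of our deployed Nexus contract (we reuse it in later snippets), and the ABI fragment covers only approve:
// Placeholder: the address of our deployed Nexus contract.
const nexusAddress = "0x0000000000000000000000000000000000000000";

const erc20Abi = parseAbi([
  "function approve(address spender, uint256 amount) returns (bool)",
]);

async function approveBaseToken(baseToken: `0x${string}`, amount: bigint) {
  // Simulate first, then send the approval from the user's account.
  const { request } = await publicClient.simulateContract({
    account,
    address: baseToken,
    abi: erc20Abi,
    functionName: "approve",
    args: [nexusAddress, amount],
  });
  return walletClient.writeContract(request);
}
In a frontend you would typically approve exactly order.baseAmount right before calling swap. Once the allowance is in place, the swap call below can pull the tokens.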
async function swap(order: Order) {
const { request } = await publicClient.simulateContract({
address: nexusAddress,
abi,
functionName: "swap",
args: [order],
});
const hash = await walletClient.writeContract(request);
return hash;
}
To perform a swap, we call the function:
const order = {
  destinationChainId: 43114,
  receiver: "0x1234...",
  baseToken: "0xabcd...",
  baseAmount: BigInt("1000000000000000000"),
  quoteToken: "0x5678...",
  quoteAmount: BigInt("2000000000000000000"),
  salt: "0x0000000000000000000000000000000000000000000000000000000000000001",
  // Assumed here: an absolute timestamp one hour from now, in the unit the zkgm contract expects.
  timeoutTimestamp: BigInt(Math.floor(Date.now() / 1000) + 3600),
} as const;
const txHash = await swap(order);
console.log({ txHash });
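The salt above is hard-coded for brevity; in practice each order should get a fresh random value so that identical orders do not collide. A minimal sketch using viem's bytesToHex and the Web Crypto API (available in modern Node and browsers):
import { bytesToHex } from "viem";

// Generate a random 32-byte salt for each order.
function randomSalt(): `0x${string}` {
  const bytes = new Uint8Array(32);
  crypto.getRandomValues(bytes);
  return bytesToHex(bytes);
}
We could then build the order with salt: randomSalt() instead of a fixed constant.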
Since we do not have a relayer running for our protocol at the moment, this order will most likely not be processed. In the next section we shall configure a personal Voyager instance and ensure it has liquidity to solve for our protocol. The call will also currently fail because we haven't whitelisted any routes yet; let's set up that configuration now.
We can fetch 'recommended' channels from the API. Here we are looking for channels which use zkgm
. The returned value shows all available routes starting from the source chain specified in p_source_universal_chain_id; adjust that value to match the chain you deployed to.
gq https://graphql.union.build/v1/graphql -q '
query RecommendedChannelsSepolia @cached(ttl: 60) {
v2_channels(args: {
p_limit: 5,
p_recommended: true,
p_source_universal_chain_id: "ethereum.11155111"
}) {
source_universal_chain_id
source_client_id
source_connection_id
source_channel_id
source_port_id
destination_universal_chain_id
destination_client_id
destination_connection_id
destination_channel_id
destination_port_id
version
}
}'
nix shell nixpkgs#nodePackages.graphqurl
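If you prefer to fetch these channels from TypeScript rather than gq, here is a hedged sketch using the global fetch API against the same endpoint; the selection below is a subset of the fields shown above:
// Fetch recommended channels for a source chain from the Union GraphQL API.
async function fetchRecommendedChannels() {
  const response = await fetch("https://graphql.union.build/v1/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      query: `query {
        v2_channels(args: {
          p_limit: 5,
          p_recommended: true,
          p_source_universal_chain_id: "ethereum.11155111"
        }) {
          destination_universal_chain_id
          source_channel_id
          destination_channel_id
        }
      }`,
    }),
  });
  const { data } = await response.json();
  return data.v2_channels;
}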
We can set the route in Nexus by making a call with our deployer private key, using the setChannelId
function. We will write a Typescript helper function again. First we extend the ABI definition:
const abi = parseAbi([
...,
`function setChannelId(uint32 destinationChainId, uint32 channelId)`,
]);
And then we define our helper function:
async function setChannelId(destinationChainId: number, channelId: number) {
const { request } = await publicClient.simulateContract({
address: nexusAddress,
abi,
functionName: "setChannelId",
args: [destinationChainId, channelId],
});
const hash = await walletClient.writeContract(request);
return hash;
}
We can call this using our admin private key (swapping the account that the walletClient signs with) and call the function with the right chainId and channelId to set the route.
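As an illustration, here is a sketch of doing this with the deployer key rather than the helper above, passing the admin account explicitly to simulateContract so the request is signed by the owner. The key shown is the well-known first account of the test mnemonic used earlier, and channel 7 / chain 43114 are placeholder values you would take from the query results:
import { privateKeyToAccount } from "viem/accounts";

// The deployer (owner) of Nexus must send this call. Replace this key with
// the one that actually deployed your contract.
const adminAccount = privateKeyToAccount(
  "0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80",
);

// Placeholder route: orders destined for chain ID 43114 go over Union channel 7.
const { request } = await publicClient.simulateContract({
  account: adminAccount,
  address: nexusAddress,
  abi,
  functionName: "setChannelId",
  args: [43114, 7],
});
console.log({ setChannelTx: await walletClient.writeContract(request) });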
Now our swap function will succeed and enqueue a swap.
Indexing
Once the swap is enqueued and we receive the txHash
, we can monitor its progression through the indexer. We can query the details using gq
again, but we will leave that for you to figure out.
Inside our app, we should periodically poll (once every 3 seconds is reasonable). That way, we will see additional traces appear, which we can use to track the transfer's progression. For executing the queries, we'll leverage apollo
.
import { ApolloClient, InMemoryCache, gql } from "@apollo/client";
const client = new ApolloClient({
uri: "https://development.graphql.union.build/v1/graphql",
cache: new InMemoryCache(),
});
const PACKET_QUERY = gql`
query GetPacket($txHash: String!) {
v2_packets(args: { p_transaction_hash: $txHash }) {
source_universal_chain_id
destination_universal_chain_id
decoded
traces {
type
block_hash
transaction_hash
event_index
}
}
}
`;
Apollo will do some typechecking and smart caching for us, which is very helpful. Notice how we now pass the txHash
as an argument to the PACKET_QUERY
as well.
For our poll function, we will continuously poll until we see the PACKET_RECV
trace, which means the packet has been received on the destination side. In actual frontends, we would do something similar (periodic polling), but connect it to our effects or stores.
async function pollPacketStatus(txHash: string) {
const interval = setInterval(async () => {
try {
const { data } = await client.query({
query: PACKET_QUERY,
variables: { txHash },
fetchPolicy: "network-only", // Don't use cache
});
const packet = data.v2_packets[0];
if (packet) {
console.log({ packet });
// Optional: Stop polling if we see a completion trace
if (packet.traces.some((t) => t.type === "PACKET_RECV")) {
clearInterval(interval);
}
}
} catch (error) {
console.error("Error polling packet:", error);
}
}, 3000);
// Cleanup after 5 minutes to prevent indefinite polling
setTimeout(() => clearInterval(interval), 300000);
return () => clearInterval(interval); // Return cleanup function
}
pollPacketStatus(txHash);
We now have code to submit and track orders. In the next section, we shall see how to inspect historic orders for specific accounts and how to perform aggregate statistics on them.
Next Steps
The Typescript code is still very limited: we lack ways to perform admin-specific operations, handle approvals, or query for whitelisted assets.
Light Clients
Trust-minimized interoperability protocols use light clients to secure message passing between blockchains. Light clients can be implemented as smart contracts, Cosmos SDK modules, or components within wallets. Their fundamental purpose is to verify the canonicity of new blocks—confirming that a block is a valid addition to a blockchain—without requiring the full block data.
Block Structure Fundamentals
A blockchain block typically consists of two main sections:
Header: Contains metadata about the block, including:
- Block producer information
- Block height and timestamp
- Previous block hash
- State root (a cryptographic summary of the blockchain's state)
- Transaction root (a Merkle root of all transactions in the block)
- Other consensus-specific data
Body: Contains the complete list of transactions included in the block.
The header has a fixed size (typically a few hundred bytes), while the body's size varies dramatically based on the number and complexity of transactions. This size difference is crucial for understanding light client efficiency.
The key distinction between light clients and full nodes lies in their data requirements:
- Light clients only process block headers, which enables efficient verification with minimal data (kilobytes instead of megabytes or gigabytes)
- Full nodes process both headers and bodies, requiring significantly more computational resources and storage
This efficiency makes light clients ideal for cross-chain communication, mobile applications, and resource-constrained environments.
Light clients achieve security through cryptographic verification rather than data replication. They:
- Track validator sets from the source blockchain
- Verify consensus signatures on new block headers
- Validate state transitions through cryptographic proofs
- Maintain only the minimal state required for validation
This approach ensures that even with minimal data, light clients can detect invalid or malicious blocks.
Light clients form the backbone of trustless bridge infrastructure:
- Smart contract-based light clients on Ethereum can verify Cosmos chain blocks
- Cosmos modules can verify Ethereum blocks using embedded light clients
- Cross-rollup communication can leverage light client technology for L2-to-L2 messaging
When implemented as bridge components, light clients enable secure cross-chain asset transfers and message passing without requiring trusted third parties.
Wallets and User Interfaces
Modern wallet implementations increasingly incorporate light client technology:
- Mobile wallets can verify transactions without syncing the entire chain
- Browser extensions can validate state without backend reliance
- Hardware wallets can verify complex operations with limited resources
This improves both security and user experience by reducing dependency on remote (RPC) servers.
Ethereum Light Client Deep Dive
Ethereum's light client protocol is particularly significant for Union's architecture. It uses a combination of:
- Consensus verification: Validating signatures from the beacon chain's validator set
- Sync committees: Tracking rotating sets of validators for efficient verification
- Merkle proofs: Verifying transaction inclusion and state values without downloading the full state
Ethereum light clients can securely validate blocks with just a few kilobytes of data, compared to the hundreds of megabytes required for full validation. This efficiency makes them ideal for cross-chain applications.
In subsequent sections, we'll examine how Union leverages these light client principles to secure cross-chain communication and explore implementation details of the Ethereum light client that secures a significant portion of Union's traffic.
Ethereum Light Client
This chapter explores the architecture and operation of the Ethereum light client, along with the cryptographic foundations that ensure the validity of its proofs. We'll begin with a high-level overview of the protocol before diving into the technical details of each component.
Protocol Overview
A light client exposes several public functions that can be invoked by external parties. Each function call modifies the client's internal state S
, storing critical data that informs subsequent operations. The light client lifecycle can be divided into three distinct phases:
- Initialization: Occurs exactly once when the client is created
- Updating: Called repeatedly to verify each new block
- Freezing: Invoked once if malicious behavior is detected
Initialization typically happens during contract deployment for smart contract-based light clients. During this phase, we provide the client with its initial trusted state, which can be either:
- The genesis block of the chain
- A recent, well-established checkpoint block
This initial block is known as the "trusted height" and serves as the foundation for all subsequent verifications. Since this block is assumed to be correct without cryptographic verification within the light client itself, its selection is critical. In production environments, a governance proposal or similar consensus mechanism often validates this block before it's passed to the light client.
Once initialized, the light client can begin verifying new blocks. The update function accepts a block header and associated cryptographic proofs, then:
- Verifies the header's cryptographic integrity
- Validates the consensus signatures from the sync committee
- Updates the client's internal state to reflect the new "latest verified block"
Updates can happen in sequence (verifying each block) or can skip intermediate blocks using more complex proof mechanisms. The efficiency of this process is what makes light clients practical for cross-chain communication.
If the light client detects conflicting information or invalid proofs that suggest an attack attempt, it can enter a "frozen" state. This is a safety mechanism that prevents the client from processing potentially fraudulent updates. Recovery from a frozen state typically requires governance intervention.
Since initialization is rather trivial, we will not dive deeper into it.
Updating
Since Ethereum is finalized by the Beacon Chain, our Ethereum light client accepts beacon block data as update input. A beacon block roughly has this structure:
class BeaconBlockBody(Container):
randao_reveal: BLSSignature
eth1_data: Eth1Data
graffiti: Bytes32
proposer_slashings: List[ProposerSlashing, MAX_PROPOSER_SLASHINGS]
attester_slashings: List[AttesterSlashing, MAX_ATTESTER_SLASHINGS]
attestations: List[Attestation, MAX_ATTESTATIONS]
deposits: List[Deposit, MAX_DEPOSITS]
voluntary_exits: List[SignedVoluntaryExit, MAX_VOLUNTARY_EXITS]
sync_aggregate: SyncAggregate
execution_payload: ExecutionPayload
bls_to_execution_changes: List[SignedBLSToExecutionChange, MAX_BLS_TO_EXECUTION_CHANGES]
We are specifically interested in sync_aggregate
, which is a structure describing the votes of the sync committee:
class SyncAggregate(Container):
sync_committee_bits: Bitvector[SYNC_COMMITTEE_SIZE]
sync_committee_signature: BLSSignature
The sync_committee_bits
indicate which members voted (not all need to vote), and the sync_committee_signature
is a BLS signature of the members referenced in the bit vector.
BLS signatures (Boneh-Lynn-Shacham) are a type of cryptographic signature scheme that allows multiple signatures to be aggregated into a single signature. This makes them space and compute efficient (you can aggregate hundreds of signatures into one). Just as we aggregate signatures, we can aggregate public keys as well, such that the aggregate public key can verify the aggregated signature.
For our SyncAggregate, computing the aggregate pubkey is simple:
def _aggregate_pubkeys(committee, bits):
pubkeys = []
for i, bit in enumerate(bits):
if bit:
pubkeys.append(committee[i])
return bls.Aggregate(pubkeys)
At scale, we can aggregate thousands (if not hundreds of thousands) of signatures and public keys, while only verifying their aggregates.
To our light client, a block is considered final as long as at least two-thirds of the sync committee members have attested to it.
class LightClient():
def update(self, block: BeaconBlockBody):
# Count how many committee members signed
signature_count = sum(block.sync_aggregate.sync_committee_bits)
# Need 2/3+ committee participation for finality
if signature_count < (SYNC_COMMITTEE_SIZE * 2) // 3:
raise ValueError("Insufficient signatures from sync committee")
# Construct aggregate public key from the current committee and bit vector
aggregate_pubkey = _aggregate_pubkeys(
self.current_sync_committee,
block.sync_aggregate.sync_committee_bits
)
Now we have the aggregate_pubkey for the committee and have verified that enough members signed. Notice that to obtain the sync committee public keys, we used self.current_sync_committee. This is set during initialization and later updated in our update function.
Next we have to construct the digest (what has been signed) before we verify the aggregated signature. If we didn't compute the digest ourselves, but obtained it from the block, then the caller could fraudulently pass a correct digest, but have other values in the block altered.
signing_root = self._compute_signing_root(block)
# Verify the aggregated signature against the aggregated public key
if not bls.Verify(
aggregate_pubkey,
signing_root,
block.sync_aggregate.sync_committee_signature
):
raise ValueError("Invalid sync committee signature")
Since the signature and block are both valid, we can now trust the contents of the passed beacon block. Next the light client will store data from the block:
self.latest_block_root = self._compute_block_root(block)
self.latest_slot = block.slot
Finally, we have to update the sync committee. The committee rotates every sync committee period (256 epochs), and thus if this is at the boundary, we have to update these values. Luckily Ethereum makes this easy for us, and provides what the next sync committee will be:
if block.slot % (SLOTS_PER_EPOCH * EPOCHS_PER_SYNC_COMMITTEE_PERIOD) == 0:
self.current_sync_committee = self.next_sync_committee
self.next_sync_committee = block.next_sync_committee
SLOTS_PER_EPOCH
and EPOCHS_PER_SYNC_COMMITTEE_PERIOD
can be hardcoded, or stored in the light client state. Each epoch is 32 slots (approximately 6.4 minutes), so a full sync committee period lasts about 27.3 hours.
With this relatively simple protocol, we now have a (python) smart contract that can track Ethereum's blocks.
Optimizations
In actuality, the beacon block is still too large for a light client. The actual light client uses the LightClientHeader
data structure, which consists of a beacon header and execution header.
The beacon header is used to prove the consensus and transition the internal state, as well as immediately prove that the execution header is valid. The block height in the execution header is then used for further client operations, such as transaction timeouts. Using the execution height instead of the beacon height for timeouts has advantages for users and developers, ensuring they do not even need to be aware of the Beacon Chain's existence.
Another significant optimization relates to signature aggregation. Since the majority of the sync committee always signs, we instead aggregate the public keys of the non-signers, and subtract that from the aggregated total. Effectively, if on average 90% of members sign, we submit the 10% that did not sign. This results in an approximate 80% computational reduction (by avoiding the need to process 90% of the signatures individually), as well as reducing the size of the client update transaction.
Inclusion Proofs
Now that we understand how to verify blocks using a light client, we will show how, based on these verified blocks, we can verify state proofs and, by extension, messages and transfers. We will explore how to efficiently prove that a piece of data is included in a larger dataset without needing to reveal or access the entire dataset.
Merkle Trees
A Merkle tree (or hash tree) is a data structure that allows for efficient and secure verification of content in a large body of data. Named after Ralph Merkle, who patented it in 1979, these trees are fundamental building blocks in distributed systems and cryptographic applications.
Structure of a Merkle Tree
A Merkle tree is a binary tree where:
- Each leaf node contains the hash of a data block
- Each non-leaf node contains the hash of the concatenation of its child nodes
Let's visualize a simple Merkle tree with 4 data blocks:
graph TD
    Root["Root Hash: H(H1 + H2)"] --> H1["H1: H(H1-1 + H1-2)"]
    Root --> H2["H2: H(H2-1 + H2-2)"]
    H1 --> H1-1["H1-1: Hash(Data 1)"]
    H1 --> H1-2["H1-2: Hash(Data 2)"]
    H2 --> H2-1["H2-1: Hash(Data 3)"]
    H2 --> H2-2["H2-2: Hash(Data 4)"]
In this diagram:
- We start with four data blocks: Data 1, Data 2, Data 3, and Data 4
- We compute the hash of each data block to create the leaf nodes (H1-1, H1-2, H2-1, H2-2)
- We then pair the leaf nodes and hash their concatenated values to form the next level (H1, H2)
- Finally, we concatenate and hash the results to get the root hash
The root hash uniquely represents the entire dataset. If any piece of data in the tree changes, the root hash will also change. Root hashes are present in block headers for most blockchains. Ethereum, for example, has the state_root field in each header. With the state root, we can construct a proof for any data stored on Ethereum, so whenever we write value V to storage in a Solidity smart contract at block H, we can construct a proof showing that from H onwards, that slot contains V. This proof will remain valid until we update or delete the value.
def prove(state_root, proof, V) -> bool
For Merkle trees specifically, we construct Merkle Inclusion proofs. Constructing the proof is relatively compute intensive and requires access to the full state and history, so only archive nodes are capable of doing so.
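On Ethereum, such proofs are served by nodes through the eth_getProof RPC method, which viem exposes as getProof. A hedged sketch follows; the address and storage key are placeholders, and the RPC endpoint you use must actually serve eth_getProof:
import { createPublicClient, http } from "viem";
import { holesky } from "viem/chains";

const proofClient = createPublicClient({ chain: holesky, transport: http() });

// Ask the node for a Merkle-Patricia proof of one storage slot of one account.
const proof = await proofClient.getProof({
  address: "0x0000000000000000000000000000000000000000",
  storageKeys: [
    "0x0000000000000000000000000000000000000000000000000000000000000000",
  ],
});
console.log(proof.storageProof[0]);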
Inclusion Proofs
An inclusion proof (also called a Merkle proof) is a way to verify that a specific data block is part of a Merkle tree without having to reveal the entire tree.
An inclusion proof consists of:
- The data block to be verified
- A "proof path" - a list of hashes that, combined with the data block's hash in the right order, will reproduce the root hash
Let's visualize how a Merkle proof works for Data 2 in our example:
In this visualization, we're proving that Data 2 is included in the tree. The pink node is the data we're proving (Data 2), and the blue nodes represent the proof path hashes we need.
The proof for Data 2 consists of:
- The data itself: Data 2
- The proof path: [H1-1, H2]
graph TD
    Root["Root Hash: H(H1 + H2)"] --> H1["H1: H(H1-1 + H1-2)"]
    Root --> H2["H2: H(H2-1 + H2-2)"]
    H1 --> H1-1["H1-1: Hash(Data 1)"]
    H1 --> H1-2["H1-2: Hash(Data 2)"]
    H2 --> H2-1["H2-1: Hash(Data 3)"]
    H2 --> H2-2["H2-2: Hash(Data 4)"]
To verify that Data 2 is indeed part of the Merkle tree with root hash R, a verifier would:
- Compute H1-2 = Hash(Data 2)
- Compute H1 = Hash(H1-1 + H1-2) using the provided H1-1
- Compute Root = Hash(H1 + H2) using the provided H2
- Compare the computed Root with the known Root hash
If they match, it proves Data 2 is part of the tree. Effectively we recompute the state root lazily.
For a Merkle tree with n leaves, the proof size and verification time are both O(log n), making it highly efficient even for very large datasets.
For example, in a tree with 1 million leaf nodes:
- A full tree would require storing 1 million hashes
- A Merkle proof requires only about 20 hashes (log₂ 1,000,000 ≈ 20)
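Here is a minimal sketch of those verification steps in TypeScript, using viem's keccak256 as the hash function (the diagrams above are hash-agnostic) and assuming the proof lists sibling hashes from the leaf level up to the root:
import { concat, keccak256, type Hex } from "viem";

// Recompute the root from a leaf and its proof path. `index` is the leaf's
// position in the tree; at each level it tells us whether the running hash is
// the left or the right child.
function verifyInclusion(leaf: Hex, proof: Hex[], index: number, root: Hex): boolean {
  let node = keccak256(leaf);
  for (const sibling of proof) {
    node =
      index % 2 === 0
        ? keccak256(concat([node, sibling])) // running hash is the left child
        : keccak256(concat([sibling, node])); // running hash is the right child
    index = Math.floor(index / 2);
  }
  return node === root;
}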
Message Verification using Inclusion Proofs
Now that we have a clearly defined model on how to get blocks from chain source
on chain destination
using a light client, and how to prove state from source
on destination
using source
's state root, we will show a simple model on how to securely perform cross chain messaging using a state proof.
First, we need to write our message to a known storage location on the source chain. This is typically done through a smart contract:
// On source chain
contract MessageSender {
// Maps message IDs to actual messages
mapping(uint256 => bytes) public messages;
// Event emitted when a new message is stored
event MessageStored(uint256 indexed messageId, bytes message);
function sendMessage(bytes memory message) public returns (uint256) {
uint256 messageId = uint256(keccak256(message));
messages[messageId] = message;
emit MessageStored(messageId, message);
return messageId;
}
}
When sendMessage
is called, the message is stored in the contract's state at a specific storage slot that can be deterministically calculated from the messageId.
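As a hedged sketch of that calculation: for a Solidity mapping, the slot for messages[messageId] is keccak256(abi.encode(key, baseSlot)), where baseSlot is the slot the mapping itself occupies in MessageSender's storage layout (assumed to be 1 here, mirroring the receiver contract below; dynamic bytes values add another layer of indirection that we gloss over):
import { encodeAbiParameters, keccak256 } from "viem";

// Compute the storage slot that holds messages[messageId] on the source chain.
function messageStorageSlot(messageId: bigint, baseSlot = BigInt(1)): `0x${string}` {
  return keccak256(
    encodeAbiParameters(
      [{ type: "uint256" }, { type: "uint256" }],
      [messageId, baseSlot],
    ),
  );
}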
Next, we need to update the light client on the destination chain to reflect the latest state of the source chain:
// On destination chain
contract LightClient {
// Latest verified block header from source chain
BlockHeader public latestHeader;
function updateBlockHeader(BlockHeader memory newHeader, Proof memory proof) public {
// Verify the proof that newHeader is a valid successor to latestHeader
require(verifyProof(latestHeader, newHeader, proof), "Invalid block proof");
// Update the latest header
latestHeader = newHeader;
}
function getStateRoot() public view returns (bytes32) {
return latestHeader.stateRoot;
}
}
The light client maintains a record of the latest verified block header, which includes the state root of the source chain. Regular updates to this light client ensure that the destination chain has access to recent state roots.
Finally, we can prove the existence of the message on the destination chain using a Merkle inclusion proof against the state root:
// On destination chain
contract MessageReceiver {
LightClient public lightClient;
address public sourceSenderContract;
constructor(address _lightClient, address _sourceSender) {
lightClient = LightClient(_lightClient);
sourceSenderContract = _sourceSender;
}
function verifyAndProcessMessage(
uint256 messageId,
bytes memory message,
bytes32[] memory proofNodes,
uint256 proofPath
) public {
// Get the latest state root from the light client
bytes32 stateRoot = lightClient.getStateRoot();
// Calculate the storage slot for this message in the source contract
bytes32 storageSlot = keccak256(abi.encode(messageId, uint256(1))); // Slot for messages[messageId]
// Verify the inclusion proof against the state root
require(
verifyStorageProof(
stateRoot,
sourceSenderContract,
storageSlot,
message,
proofNodes,
proofPath
),
"Invalid state proof"
);
// Message is verified, now process it
processMessage(messageId, message);
}
function processMessage(uint256 messageId, bytes memory message) internal {
// Application-specific message handling
// ...
}
function verifyStorageProof(
bytes32 stateRoot,
address contractAddress,
bytes32 slot,
bytes memory expectedValue,
bytes32[] memory proofNodes,
uint256 proofPath
) internal pure returns (bool) {
// This function verifies a Merkle-Patricia trie proof
// It proves that at the given storage slot in the specified contract,
// the value matches expectedValue in the state with stateRoot
// Implementation details omitted for brevity
// This would use the proofNodes and proofPath to reconstruct the path
// from the leaf (storage value) to the state root
return true; // Placeholder
}
}
This mechanism ensures that messages can only be processed on the destination chain if they were genuinely recorded on the source chain, without requiring trust in any intermediaries. The security of the system relies on:
- The integrity of the light client, which only accepts valid block headers
- The cryptographic properties of Merkle trees, which make it impossible to forge inclusion proofs
- The immutability of blockchain state, which ensures the message cannot be altered once written
By combining light client verification with state inclusion proofs, we establish a trustless bridge for cross-chain communication that maintains the security properties of both blockchains.
ZK
In the previous chapter, we learned how a simple light client operates. In this chapter, we will look into leveraging zero-knowledge cryptography to improve the onchain efficiency of our light client.
Recall how a light client verifies that a block is canonical. In the case of Ethereum, we track the sync committee and aggregate the BLS public key. The signature already comes pre-aggregated in the block.
def _aggregate_pubkeys(committee, bits):
pubkeys = []
for i, bit in enumerate(bits):
if bit:
pubkeys.append(committee[i])
return bls.Aggregate(pubkeys)
For other chains, pre-aggregation might not occur. For example, Tendermint simply has each validator signature (and vote) appended in the block. This means that to verify if the block is canonical, we have to perform a signature verification for each validator individually. Here is a pseudo-Python Tendermint block verifier (it doesn't handle voting correctly and misses some components).
import base64
import json
from typing import List, Dict, Any
import nacl.exceptions
import nacl.signing
def verify_tendermint_votes(block_data: Dict[Any, Any]) -> bool:
"""Verifies validator signatures for a Tendermint block."""
# Extract block ID and validator votes
block_id = block_data["block_id"]["hash"]
precommits = block_data["last_commit"]["signatures"]
validators = block_data["validators"]
# Track validation results
valid_votes = 0
total_votes = len(precommits)
for vote in precommits:
if vote["signature"] is None:
continue
validator_idx = vote["validator_address"]
validator_pubkey = validators[validator_idx]["pub_key"]["value"]
# Decode signature and public key
signature = base64.b64decode(vote["signature"])
pubkey = base64.b64decode(validator_pubkey)
# Create vote message (block ID + vote data)
vote_data = {
"type": "precommit",
"height": block_data["header"]["height"],
"round": vote["round"],
"block_id": block_id,
"timestamp": vote["timestamp"],
"validator_address": validator_idx
}
msg = json.dumps(vote_data, sort_keys=True).encode()
# ANCHOR: verify-signature
verify_key = nacl.signing.VerifyKey(pubkey)
try:
verify_key.verify(msg, signature)
valid_votes += 1
except nacl.exceptions.BadSignatureError:
pass
# ANCHOR_END: verify-signature
# Return true if 2/3+ of validators had valid signatures
return valid_votes >= (2 * total_votes // 3 + 1)
Note how for each vote, we perform:
verify_key = nacl.signing.VerifyKey(pubkey)
try:
verify_key.verify(msg, signature)
valid_votes += 1
except nacl.exceptions.BadSignatureError:
pass
Although this is just a single verify operation, computationally it is quite expensive. Doing this in Solidity would mean that we would spend about ±2 million gas per block verification. This also means that with more validators operational, we have a linear increase in computational cost. This cost translates into a higher fee for end users, making it something we want to avoid.
We can leverage zero-knowledge cryptography to have a constant computational cost, irrespective of the number of signatures we verify, as well as perform arbitrary other computation, such as vote-weight tallying.
First we will explore how to leverage Gnark to build a high performance circuit, analyzing an actual production circuit. Next we will re-implement the same logic using a zkvm.
Intro to Gnark
Gnark is a zkSNARK library developed by ConsenSys that allows developers to design and implement zero-knowledge circuits in Go. Unlike other circuit libraries that use domain-specific languages, Gnark lets you write circuits directly in Go, making it more accessible to developers already familiar with the language.
At its core, a zero-knowledge circuit defines a computational relationship between public inputs and private witness values. When we generate a proof, we're essentially proving that we know witness values that satisfy this relationship, without revealing those values.
When using Gnark, we define a struct that describes our computation, and as member fields, the inputs of our computation. Below we have a circuit which proves knowledge of Sudoku solutions. The two inputs are Challenge
, which is the puzzle, and Solution
, which is the completely filled in puzzle.
// SudokuCircuit represents a Sudoku circuit. It contains two grids: the
// challenge and solution grids (named Challenge and Solution respectively). The
// challenge grid is public, while the solution grid is private.
type SudokuCircuit struct {
Challenge SudokuGrid `gnark:"Challenge,public"`
Solution SudokuGrid `gnark:"Solution,secret"`
}
Our SudokuGrid
type is just a simple 9x9 matrix:
// SudokuGrid represents a 9x9 Sudoku grid in-circuit.
type SudokuGrid [9][9]frontend.Variable
Next is where we define the actual circuit. In this book, we will not go into concepts such as arithmetization or R1CS, although we will see some of that in the actual code. For now it suffices to know that we define rules that our inputs have to abide by. We refer to these rules as constraints.
For a sudoku solution to be checked, we have to verify the following:
- Each cell may only have a value from 1 to 9.
- Each row may only contain unique values.
- Each column may only contain unique values.
- Each 3x3 sub grid may only contain unique values.
- The solution is for the given puzzle.
Rule 5 in particular is easy to forget. If we did not implement it, our checker would accept any validly solved Sudoku, regardless of whether it matches our puzzle.
We define the constraints inside the Define function, which accepts an api frontend.API
, which is the object we use to actually define our circuit.
// Define defines the constraints of the Sudoku circuit.
func (circuit *SudokuCircuit) Define(api frontend.API) error {
For each rule, we call various assertions on the api
. These assertions are not run immediately. Instead, we are building a program under the hood, to be run later against actual variables. The for loop itself is not part of our circuit either; it just executes each instruction one by one, effectively unrolling the loop.
// Constraint 1: Each cell value in the CompleteGrid must be between 1 and 9
for i := 0; i < 9; i++ {
for j := 0; j < 9; j++ {
api.AssertIsLessOrEqual(circuit.Solution[i][j], 9)
api.AssertIsLessOrEqual(1, circuit.Solution[i][j])
}
}
For rules 2, 3, and 4, the implementation is similar:
// Constraint 2: Each row in the CompleteGrid must contain unique values
for i := 0; i < 9; i++ {
for j := 0; j < 9; j++ {
for k := j + 1; k < 9; k++ {
api.AssertIsDifferent(circuit.Solution[i][j], circuit.Solution[i][k])
}
}
}
// Constraint 3: Each column in the CompleteGrid must contain unique values
for j := 0; j < 9; j++ {
for i := 0; i < 9; i++ {
for k := i + 1; k < 9; k++ {
api.AssertIsDifferent(circuit.Solution[i][j], circuit.Solution[k][j])
}
}
}
// Constraint 4: Each 3x3 sub-grid in the CompleteGrid must contain unique values
for boxRow := 0; boxRow < 3; boxRow++ {
for boxCol := 0; boxCol < 3; boxCol++ {
for i := 0; i < 9; i++ {
for j := i + 1; j < 9; j++ {
row1 := boxRow*3 + i/3
col1 := boxCol*3 + i%3
row2 := boxRow*3 + j/3
col2 := boxCol*3 + j%3
api.AssertIsDifferent(circuit.Solution[row1][col1], circuit.Solution[row2][col2])
}
}
}
}
Rule 5 uses the Select
method, which is how we can implement if
statements in the circuit.
// Constraint 5: The values in the IncompleteGrid must match the CompleteGrid where provided
for i := 0; i < 9; i++ {
for j := 0; j < 9; j++ {
isCellGiven := api.IsZero(circuit.Challenge[i][j])
api.AssertIsEqual(api.Select(isCellGiven, circuit.Solution[i][j], circuit.Challenge[i][j]), circuit.Solution[i][j])
}
}
Actually generating a proof requires compiling and setting up the circuit.
// compile the circuit
ccs, err := frontend.Compile(ecc.BN254.ScalarField(), r1cs.NewBuilder, &SudokuCircuit{})
if err != nil {
	return fmt.Errorf("failed to compile circuit: %v", err)
}
// perform the setup. NB! In practice use MPC. This is currently UNSAFE
// approach.
pk, vk, err := groth16.Setup(ccs)
if err != nil {
	return fmt.Errorf("failed to setup circuit: %v", err)
}
In this code example, we are doing an unsafe setup, which makes the circuit unsafe for actual usage. Union ran a setup ceremony, where multiparty computation is used to make the setup safe and production ready. The circuit
, vk
and pk
can be written to disk for later usage; we should only compile and setup once.
With these values, and some sample inputs, we can start generating proofs.
// create the circuit assignments
assignment := &SudokuCircuit{
Challenge: NewSudokuGrid(nativeChallenge),
Solution: NewSudokuGrid(nativeSolution),
}
// create the witness
witness, err := frontend.NewWitness(assignment, ecc.BN254.ScalarField())
if err != nil {
return fmt.Errorf("failed to create witness: %v", err)
}
// generate the proof
proof, err := groth16.Prove(ccs, pk, witness)
if err != nil {
return fmt.Errorf("failed to generate proof: %v", err)
}
We could now serialize the proof and send it to a verifier (someone on a different machine). The verifier reconstructs a witness containing only the public inputs (the challenge) and checks the proof against the verification key:
// create the public assignment: the verifier only knows the challenge
publicAssignment := &SudokuCircuit{
	Challenge: NewSudokuGrid(nativeChallenge),
}
// create the public witness
publicWitness, err := frontend.NewWitness(publicAssignment, ecc.BN254.ScalarField(), frontend.PublicOnly())
if err != nil {
	return fmt.Errorf("failed to create public witness: %v", err)
}
// verify the proof against the verification key and public inputs
if err := groth16.Verify(proof, vk, publicWitness); err != nil {
	return fmt.Errorf("invalid proof: %v", err)
}
As you can see, they have only access to the challenge
, not the solution
. The proof will verify that the prover has a valid solution, without leaking information on that solution.
Now that we know the basics of Gnark, let's analyze the light client circuit. We will see that the applied techniques are very similar to those in the Sudoku circuit. Effectively we are re-implementing cryptographic primitives using the frontend.API
.
Light Client Circuit
Before reading this section, make sure that you are familiar with Gnark. In this chapter, we will analyze a light client circuit, verifying a modified Tendermint consensus (CometBLS).
At the heart of our light client is a data structure representing validators and their signatures:
type Validator struct {
HashableX frontend.Variable
HashableXMSB frontend.Variable
HashableY frontend.Variable
HashableYMSB frontend.Variable
Power frontend.Variable
}
type TendermintLightClientInput struct {
Sig gadget.G2Affine
Validators [MaxVal]Validator
NbOfVal frontend.Variable
NbOfSignature frontend.Variable
Bitmap frontend.Variable
}
Each validator is represented by its public key coordinates (stored in a special format to work within field size limitations) and voting power. The TendermintLightClientInput combines these validators with signature data and metadata such as the number of validators and a bitmap indicating which validators have signed. This is the equivalent of our LightClientHeader
as seen in the light client chapter.
The circuit uses several helper functions to efficiently manipulate large field elements:
// Given a variable of size N and limbs of size M, split the variable in N/M limbs.
func Unpack(api frontend.API, packed frontend.Variable, sizeOfInput int, sizeOfElem int) []frontend.Variable {
nbOfElems := sizeOfInput / sizeOfElem
if sizeOfElem == 1 {
return api.ToBinary(packed, nbOfElems)
} else {
unpacked := api.ToBinary(packed, sizeOfInput)
elems := make([]frontend.Variable, nbOfElems)
for i := 0; i < nbOfElems; i++ {
elems[i] = api.FromBinary(unpacked[i*sizeOfElem : (i+1)*sizeOfElem]...)
}
return elems
}
}
// Reconstruct a value from its limbs.
func Repack(api frontend.API, unpacked []frontend.Variable, sizeOfInput int, sizeOfElem int) []frontend.Variable {
nbOfElems := sizeOfInput / sizeOfElem
elems := make([]frontend.Variable, nbOfElems)
for i := 0; i < nbOfElems; i++ {
elems[i] = api.FromBinary(unpacked[i*sizeOfElem : (i+1)*sizeOfElem]...)
}
return elems
}
The Unpack function splits a variable into smaller components, while Repack does the opposite. These functions are needed when working with cryptographic values that exceed the size of the prime field.
The core logic of the light client is in the Verify
method:
// Union whitepaper: Algorithm 2. procedure V
func (lc *TendermintLightClientAPI) Verify(message *gadget.G2Affine, expectedValRoot frontend.Variable, powerNumerator frontend.Variable, powerDenominator frontend.Variable) error {
This function verifies that:
- The validator set matches a known root hash
- A sufficient number of validators (by voting power) have signed
- The signature is cryptographically valid
Let's break down the steps in the verification process:
lc.api.AssertIsLessOrEqual(lc.input.NbOfVal, MaxVal)
lc.api.AssertIsLessOrEqual(lc.input.NbOfSignature, lc.input.NbOfVal)
// Ensure that at least one validator/signature are provided
lc.api.AssertIsLessOrEqual(1, lc.input.NbOfSignature)
These constraints ensure basic properties: the number of validators doesn't exceed the maximum, the number of signatures doesn't exceed the number of validators, and there's at least one signature. Next the circuit defines a helper closure with logic to be executed for each validator.
// Facility to iterate over the validators in the lc, this function will
// do the necessary decoding/marshalling for the caller.
//
// This function will reconstruct each validator from the secret inputs by:
// - re-composing the public key from its shifted/msb values
forEachVal := func(f func(i int, signed frontend.Variable, cannotSign frontend.Variable, publicKey *gadget.G1Affine, power frontend.Variable, leaf frontend.Variable) error) error {
...
Note that the function accepts another closure, which it will call after reconstructing some values and adding constraints.
Inside this function, for each validator, we:
Compute a hash of the validator's data (similar to a Merkle leaf)
validator := lc.input.Validators[i]
h, err := mimc.NewMiMC(lc.api)
if err != nil {
return fmt.Errorf("new mimc: %w", err)
}
// Union whitepaper: (11) H_pre
//
h.Write(validator.HashableX, validator.HashableY, validator.HashableXMSB, validator.HashableYMSB, validator.Power)
leaf := h.Sum()
Reconstruct the full public key by combining its components
// Reconstruct the public key from the merkle leaf
/*
pk = (val.pk.X | (val.pk.XMSB << 253), val.pk.Y | (val.pk.YMSB << 253))
*/
shiftedX := Unpack(lc.api, validator.HashableX, 256, 1)
shiftedX[253] = validator.HashableXMSB
unshiftedX := Repack(lc.api, shiftedX, 256, 64)
shiftedY := Unpack(lc.api, validator.HashableY, 256, 1)
shiftedY[253] = validator.HashableYMSB
unshiftedY := Repack(lc.api, shiftedY, 256, 64)
var rebuiltPublicKey gadget.G1Affine
rebuiltPublicKey.X.Limbs = unshiftedX
rebuiltPublicKey.Y.Limbs = unshiftedY
Determine if this validator has signed by checking the bitmap
cannotSign := lc.api.IsZero(bitmapMask)
Apply the provided function to process this validator. This is where we pass an additional closure to calculate aggregated values over the entire validator set. This is a pattern often used in functional programming.
aggregatedPublicKey, nbOfKeys, err := bls.WithAggregation(
func(aggregate func(selector frontend.Variable, publicKey *sw_emulated.AffinePoint[emulated.BN254Fp])) error {
if err := forEachVal(func(i int, signed frontend.Variable, cannotSign frontend.Variable, publicKey *gadget.G1Affine, power frontend.Variable, leaf frontend.Variable) error {
actuallySigned := lc.api.Select(cannotSign, 0, signed)
// totalVotingPower = totalVotingPower + power
totalVotingPower = lc.api.Add(totalVotingPower, lc.api.Select(cannotSign, 0, power))
// currentVotingPower = currentVotingPower + if signed then power else 0
currentVotingPower = lc.api.Add(currentVotingPower, lc.api.Select(actuallySigned, power, 0))
// Optionally aggregated public key if validator at index signed
aggregate(actuallySigned, publicKey)
leafHashes[i] = lc.api.Select(cannotSign, 0, merkle.LeafHash([]frontend.Variable{leaf}))
return nil
}); err != nil {
return err
}
return nil
})
if err != nil {
return err
}
The aggregated values of interest are:
totalVotingPower := frontend.Variable(0)
currentVotingPower := frontend.Variable(0)
We sum the voting power because we do not want to verify that 2/3 of validators attested to the block, but that 2/3 of the voting power did. In Tendermint-based chains, validators can have a variable amount of stake, as opposed to Ethereum, where it is always 32 ETH.
Finally we verify the aggregated values.
// Ensure that we actually aggregated the correct number of signatures
lc.api.AssertIsEqual(nbOfKeys, lc.input.NbOfSignature)
// Ensure that the current sum of voting power exceed the expected threshold
votingPowerNeeded := lc.api.Mul(totalVotingPower, powerNumerator)
currentVotingPowerScaled := lc.api.Mul(currentVotingPower, powerDenominator)
lc.api.AssertIsLessOrEqual(votingPowerNeeded, currentVotingPowerScaled)
// Verify that the merkle root is equal to the given root (public input)
rootHash := merkle.RootHash(leafHashes, lc.input.NbOfVal)
lc.api.AssertIsEqual(expectedValRoot, rootHash)
return bls.VerifySignature(aggregatedPublicKey, message, &lc.input.Sig)
These verify that:
- The claimed number of signatures matches the actual count.
- The voting power of signers exceeds the required threshold (expressed as a fraction).
- The validator set matches a known Merkle root.
Much like our Sudoku example, this circuit defines a relationship between public inputs (the expected validator root, message, and signature) and private witness data (the validator set details). When we generate a proof, we're demonstrating knowledge of a valid validator set that signed the message, without revealing the validator details.
This chapter does not cover some of the cryptographic primitives that had to be implemented to perform hashing or BLS aggregation and verification. Those can be found here.
Next we will explore the trusted setup ceremony, an alternative to doing an unsafe setup. All custom circuits that produce Groth16 SNARKs require one.
Make sure to DYOR: