Client applications, packages, and a public mixnet.
Title
Description
🐈
Namenlos public mixnet (coming soon)
A public Katzenpost mix network that anyone may use, without the burden of operating their own.
📦
kpclientd packages and binaries (coming soon)
Pre-built distribution packages and binaries for the client daemon, so users will not need to build from source.
💬
katzenqt group chat
A decentralised group chat application running over the Katzenpost mix network and the Pigeonhole storage services. Pre-built packages are forthcoming; in the meantime, build and run from source.
For operators
You wish to run your own Katzenpost mix network with friends and collaborators.
Deploying and operating Katzenpost servers: installation, configuration, the Docker test mixnet, NAT considerations, and a full configuration appendix.
A focused recipe for running a single Katzenpost mix server inside a Docker container, intended for operators who wish to participate in an existing mix network.
Pinned versions of the Katzenpost stack and brief instructions for building each component (kpclientd, thin clients, katzenqt, server-side) from source.
For application developers
You wish to build software that integrates with Katzenpost.
The protocol specifications that the implementation must honour: Sphinx, KEMSphinx, the wire protocol, the directory authority, replay detection, and more.
This topic collects basic commands for installed Katzenpost server components
in a single convenient place. All system commands require superuser privileges.
The commands in this topic do not apply to the Katzenpost Docker image, which has
its own controls. For more information, see Using the Katzenpost Docker test network.
Systemd commands
These commands match the suggested systemd setup described in Installing
Katzenpost.
Table 1. Systemd control commands for Katzenpost
Task
Command
Start a mix node.
systemctl start katzenpost-mixserver
Stop a mix node.
systemctl stop katzenpost-mixserver
Restart a mix node.
systemctl restart katzenpost-mixserver
Start a directory authority node.
systemctl start katzenpost-authority
Stop a directory authority node.
systemctl stop katzenpost-authority
Restart a directory authority node.
systemctl restart katzenpost-authority
Server CLI commands
The primary Katzenpost server binaries are katzenpost-mixserver,
which instantiates a mix node, gateway node, or service provider depending on its
configuration, and katzenpost-authority, which instantiates a directory
authority.
Table 2. Command options for the Katzenpost server binaries
Task
Command
Control a mix node.
Run katzenpost-mixserver -h for options.
$ katzenpost-mixserver -h
Usage of katzenpost-mixserver:
-f string
Path to the authority config file. (default "katzenpost.toml")
-g Generate the keys and exit immediately.
-v Get version info.
The -f parameter can
be used to specify a customized path and filename for the server
configuration file, which is typically
/etc/katzenpost-mixserver/katzenpost.toml.
The -g option is used to generate the public and
private signing and link keys. By default, these must be manually
copied to the directory defined by the DataDir
parameter in
/etc/katzenpost-mixserver/katzenpost.toml.
Control a directory authority.
Run katzenpost-authority -h for options.
$ katzenpost-authority -h
Usage of katzenpost-authority:
-f string
Path to the authority config file. (default "authority.toml")
-g Generate the keys and exit immediately.
-v Get version info.
The -f parameter can
be used to specify a customized path and filename for the server
configuration file, which is typically
/etc/katzenpost-authority/authority.toml.
The -g option is used to generate the public and private signing and link
keys. By default, these must be manually copied to the directory defined by the
DataDir parameter in
/etc/katzenpost-authority/authority.toml.
Management interface
Katzenpost provides a management interface that is accessed through a Unix domain
socket. The interface supports run-time changes to nodes without requiring a restart.
By default, the management interface is disabled. To enable it, change the
Management section of the node's configuration file so that Enable = true.
Use the socat command-line
utility to connect to the management socket and issue commands, with the following
syntax:
# socat unix:/node-datadir/management_sock STDOUT
The following commands are supported.
QUIT - Exit the management socket session.
SHUTDOWN - Shut down the server gracefully.
ADD_USER - Add a user and associate it with a public link key provided in
either hexadecimal or Base64 format.
ADD_USER user key
UPDATE_USER - Update a user's link key.
UPDATE_USER user key
REMOVE_USER - Remove a user.
REMOVE_USER user
SET_USER_IDENTITY - Set a user's identity key.
SET_USER_IDENTITY user key
REMOVE_USER_IDENTITY - Remove a user's identity key. This command must be
followed up with a REMOVE_USER command.
REMOVE_USER_IDENTITY user
USER_IDENTITY - Retrieve a user's identity key.
USER_IDENTITY user
SEND_RATE - Set the packet rate limit to a per-minute integer value.
SEND_RATE value
SEND_BURST - Set the packet burst-rate limit to a per-minute integer
value.
SEND_BURST value
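The socat session above can also be scripted. The following is a minimal sketch, not part of Katzenpost: it assumes that each command is sent as a single newline-terminated text line and that one read is enough to collect the reply. The reply format is not described here, and the server may send a greeting before any command is issued, so treat the returned text as opaque. The function names and the socket path argument are hypothetical.

```python
import socket


def format_command(name, *args):
    """Build one management command line, for example 'REMOVE_USER alice'."""
    parts = [name.upper(), *args]
    # Assumption: arguments are whitespace-separated, so none may contain whitespace.
    if any(ch.isspace() for part in parts for ch in part):
        raise ValueError("command parts must not contain whitespace")
    return " ".join(parts) + "\n"


def send_command(sock_path, name, *args, timeout=5.0):
    """Send one command over the Unix domain socket and return the first reply chunk."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        sock.sendall(format_command(name, *args).encode("ascii"))
        return sock.recv(4096).decode("ascii", errors="replace")
```

For example, `send_command("/node-datadir/management_sock", "SEND_RATE", "30")` would issue the same line that you would otherwise type into socat, provided that the management interface is enabled and the caller has permission to open the socket.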
Monitoring
Katzenpost logging information can be viewed in real time with the following
commands:
# journalctl -u katzenpost-mixserver -f -n 2000
or
# journalctl -u katzenpost-authority -f -n 2000
Logging levels include ERROR, WARNING, NOTICE, INFO, and DEBUG, with INFO as the
default. For information about setting the log level, see the documentation for each
node type in Components
and configuration of the Katzenpost mixnet.
This section provides an overview of how to download Katzenpost, set up a development
environment, build the code, install the Katzenpost binaries, and configure the
components.
Requirements
An up-to-date Debian or Ubuntu Linux system is assumed as the build
and hosting environment for all Katzenpost server-side components. Required packages
include the following:
git
gcc
build-essential
libc-dev-bin
Obtain the Katzenpost code
Complete the following steps to set up a local Katzenpost git repository.
Download the latest version of the Go programming language from https://go.dev/dl and unzip it in a
suitable location. As root, set the necessary environment
variables:
# export PATH=$PATH:/<your Go location>/bin
# export GO111MODULE=on
# export CGO_CFLAGS_ALLOW="-DPARAMS=sphincs-shake-256f"
The go/bin path must be included in your user $PATH
environment variable.
Note
Do not use the Debian/Ubuntu golang packages. They are
probably too old.
Build server components
To build a Katzenpost server component, navigate to the directory containing its
source code and run go build. The paths shown are relative to the
Katzenpost repository root.
Table 1. Server component directories
Component
Source code directory
Binary
Mix, gateway, or service node
server/cmd/server/
server
Directory authority
authority/cmd/dirauth/
dirauth
Build clients
The Katzenpost client components are useful for testing an operational mixnet. To
build them, navigate to the directory containing each component's source code and
run go build. The paths shown are relative to the Katzenpost repository root.
Note
The Katzen chat
client is under development and not currently functional. For more information
about the clients generally, see
Clients.
Currently, the best way to construct a node configuration file is to start from one
of the samples in Appendix: Configuration files from the Docker test mixnet, and to
modify it based on the published component parameters, while checking the latest
state of the code tree. Bear in mind that the IP address:port scheme used in the
Docker image is specific to that container environment, and is not transferable to
a production network without modification.
Katzenpost currently has no configuration automation tool that is ready for
general use.
Configure systemd
If you are running your Katzenpost components under systemd, create and install a systemd
service file for each node type that you plan to deploy. The following scripts are
examples of how to do this.
To create a systemd service file for a directory authority.
The first time that you run a server binary directly or using
systemd, identity and encryption keys are automatically generated and installed if
they are not already present. The key location is specified by the value of
DataDir in the [Server] section of the configuration. For
configuration parameter details, see Components
and configuration of the Katzenpost mixnet. For server binary command-line options,
see the Quickstart guide.
Once the keys are in place, restart the server to begin operations.
Katzenpost provides a ready-to-deploy Docker
image for developers who need a non-production test environment for developing
and testing client applications and server-side plugins. By running this image on
a single computer, you avoid the
need to build and manage a complex multi-node mixnet. The image can also be run using
Podman.
The test mix network includes the following components:
If both Docker and Podman are present on your system, Katzenpost uses
Podman. Podman is a drop-in daemonless equivalent to Docker that does not
require superuser privileges to run.
On Debian/Ubuntu, these software requirements can be installed with the following commands
(running as superuser). Apt will pull in the needed
dependencies.
# apt update
# apt install git golang make podman podman-compose
Note: You can also install Docker and docker-compose instead of Podman, but Podman is recommended as it runs rootless by default.
Preparing to run the container image
Complete the following procedure to obtain, build, and deploy the Katzenpost test
network.
Install the Katzenpost code repository, hosted at https://github.com/katzenpost. The main Katzenpost
repository contains code for the server components as well as the docker image.
Clone the repository with the following command (your directory location may
vary):
Navigate to the new katzenpost subdirectory and ensure
that the code is up to date.
~$ cd katzenpost
~/katzenpost$ git checkout main
~/katzenpost$ git pull
(Optional) Create a development branch and check it
out.
~/katzenpost$ git checkout -b devel
(Optional) If you are using Podman, enable and start the Podman socket service as a regular user (no superuser privileges are needed):
$ systemctl --user enable --now podman.socket
Note
Modern Podman automatically handles the DOCKER_HOST environment variable and socket configuration. The Makefile will detect and use Podman automatically.
Operating the test mixnet
Navigate to katzenpost/docker. The Makefile
contains target operations to create, manage, and test the self-contained Katzenpost
container network. To invoke a target, run a command using the following
pattern:
~/katzenpost/docker$ make target
Running make with no target specified returns a list of available
targets.
Table 1. Makefile targets
[none]
Display this list of targets.
start
Run the test network in the background.
stop
Stop the test network.
wait
Wait for the test network to have consensus.
watch
Display live log entries until Ctrl-C.
status
Show test network consensus status.
show-latest-vote
Show latest consensus vote.
run-ping
Send a ping over the test network.
clean-bin
Stop all components and delete binaries.
clean-local
Stop all components, delete binaries, and delete data.
clean-local-dryrun
Show what clean-local would delete.
clean
Same as clean-local, but also deletes the go_deps image.
Starting and monitoring the mixnet
The first time that you run make start, the Docker image is
downloaded, built, installed, and started. This takes several minutes. When the
build is complete, the command exits while the network remains running in the
background.
~/katzenpost/docker$ make start
Subsequent runs of make start either start or restart the
network without building the components from scratch. The exception to this is when
you delete any of the Katzenpost binaries (dirauth.alpine, server.alpine, etc.).
In that case, make start rebuilds just the parts of the network
dependent on the deleted binary. For more information about the files created during
the Docker build, see the section called “Network topology and components”.
Note
When running make start, be aware of the following
considerations:
If you are using Podman (recommended), no superuser privileges are required.
Simply run make start as a regular user.
If you intend to use Docker instead of Podman, you may need to run make
as superuser depending on your Docker configuration. You can override the automatic
Podman selection by adding the argument docker=docker to the command:
~/katzenpost/docker$ make start docker=docker
After the make start command exits, the mixnet runs in the
background, and you can run make watch to display a live log of
the network activity.
~/katzenpost/docker$ make watch
...
<output>
...
When installation is complete, the mix servers vote and reach a consensus. You can
use the wait target to wait for the mixnet to get consensus and
be ready to use. This can also take several minutes:
~/katzenpost/docker$ make wait
...
<output>
...
You can confirm that installation and configuration are complete by issuing the
status command from the same or another terminal. When the
network is ready for use, status begins returning consensus
information similar to the following:
~/katzenpost/docker$ make status
...
00:15:15.003 NOTI state: Consensus made for epoch 1851128 with 3/3 signatures: &{Epoch: 1851128 GenesisEpoch: 1851118
...
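If you want a script to decide when the network is ready, the consensus line can be matched mechanically. This is a minimal sketch that assumes the log line keeps the wording shown in the sample above; the regular expression and function name are illustrative and not part of Katzenpost.

```python
import re

# Matches the "Consensus made for epoch N with X/Y signatures" fragment of a log line.
CONSENSUS_RE = re.compile(
    r"Consensus made for epoch (?P<epoch>\d+) "
    r"with (?P<signed>\d+)/(?P<total>\d+) signatures"
)


def parse_consensus(line):
    """Return epoch and signature counts as integers, or None if the line does not match."""
    match = CONSENSUS_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}
```

Feeding each line of the make status output to `parse_consensus` and waiting for the first non-None result gives a simple readiness check.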
Testing the mixnet
At this point, you should have a locally running mix network. You can test whether
it is working correctly by using run-ping, which launches a
packet into the network and watches for a successful reply. Run the following
command:
~/katzenpost/docker$ make run-ping
If the network is functioning properly, the resulting output contains lines
similar to the following:
19:29:53.541 INFO gateway1_client: sending loop decoy
!19:29:54.108 INFO gateway1_client: sending loop decoy
19:29:54.632 INFO gateway1_client: sending loop decoy
19:29:55.160 INFO gateway1_client: sending loop decoy
!19:29:56.071 INFO gateway1_client: sending loop decoy
!19:29:59.173 INFO gateway1_client: sending loop decoy
!Success rate is 100.000000 percent 10/10)
If run-ping fails to receive a reply, it eventually times out
with an error message. If this happens, try the command again.
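For automated checks, the summary line of the run-ping output can be parsed rather than read by eye. The sketch below assumes the summary keeps the form shown in the sample output above; the pattern and function name are illustrative only.

```python
import re

# Matches the final summary, e.g. "Success rate is 100.000000 percent 10/10)".
SUMMARY_RE = re.compile(
    r"Success rate is (?P<percent>[\d.]+) percent (?P<ok>\d+)/(?P<sent>\d+)"
)


def ping_summary(output):
    """Return (percent, replies_received, pings_sent), or None if no summary is present."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return float(match["percent"]), int(match["ok"]), int(match["sent"])
```

A wrapper script could treat a missing summary, or a result where the second and third values differ, as a reason to check consensus status and retry.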
Note
If you attempt to use run-ping too quickly after
starting the mixnet, and consensus has not been reached, the utility may crash
with an error message or hang indefinitely. If this happens, issue (if
necessary) a Ctrl-C key sequence to abort, check the
consensus status with the status command, and then retry
run-ping.
Shutting down the mixnet
The mix network continues to run in the background until you stop it with the
following command, which can be issued from any terminal:
~/katzenpost/docker$ make stop
When you stop the network, the binaries and data are left in place. This allows
for a quick restart.
Uninstalling and cleaning up
Several command targets can be used to uninstall the Docker image and restore your
system to a clean state. The following examples demonstrate the commands and their
output.
clean-bin
To stop the network and delete the compiled binaries, run the following
command:
This command leaves in place the cryptographic keys, the state data, and
the logs.
clean-local-dryrun
To display a preview of what clean-local would remove,
without actually deleting anything, run the following command:
~/katzenpost/docker$ make clean-local-dryrun
clean-local
To delete both compiled binaries and data, run the following
command:
~/katzenpost/docker$ make clean-local
[ -e voting_mixnet ] && cd voting_mixnet && DOCKER_HOST=unix:///run/user/1000/podman/podman.sock docker-compose down --remove-orphans; rm -fv running.stamp
Removing voting_mixnet_mix2_1 ... done
Removing voting_mixnet_auth1_1 ... done
Removing voting_mixnet_auth2_1 ... done
Removing voting_mixnet_gateway1_1 ... done
Removing voting_mixnet_mix1_1 ... done
Removing voting_mixnet_auth3_1 ... done
Removing voting_mixnet_mix3_1 ... done
Removing voting_mixnet_servicenode1_1 ... done
Removing voting_mixnet_metrics_1 ... done
removed 'running.stamp'
rm -vf ./voting_mixnet/*.alpine
removed './voting_mixnet/echo_server.alpine'
removed './voting_mixnet/fetch.alpine'
removed './voting_mixnet/memspool.alpine'
removed './voting_mixnet/panda_server.alpine'
removed './voting_mixnet/pigeonhole.alpine'
removed './voting_mixnet/reunion_katzenpost_server.alpine'
removed './voting_mixnet/server.alpine'
removed './voting_mixnet/voting.alpine'
git clean -f -x voting_mixnet
Removing voting_mixnet/
git status .
On branch main
Your branch is up to date with 'origin/main'.
clean
To stop the network and delete the binaries, the data, and the go_deps
image, run the following command as superuser:
~/katzenpost/docker$ sudo make clean
Network topology and components
The Docker image deploys a working mixnet with all components and component groups
needed to perform essential mixnet functions:
message mixing (including packet reordering, timing randomization, injection
of decoy traffic, obfuscation of senders and receivers, and so on)
service provisioning
internal authentication and integrity monitoring
interfacing with external clients
Warning
While suited for client development and testing, the test mixnet omits performance
and security redundancies. Do not use it in production.
The following diagram illustrates the components and their network interactions. The
gray blocks represent nodes, and the arrows represent information transfer.
Figure 1. Test network topology
On the left, the Client transmits a message (shown by
purple arrows) through the Gateway node, across three
mix node layers, to the Service node. The Service node
processes the request and responds with a reply (shown by the green arrows) that
traverses the mix node layers before exiting the mixnet
via the Gateway node and arriving at the Client.
On the right, directory authorities Dirauth 1, Dirauth 2, and Dirauth 3 provide
PKI services. The directory authorities receive mix descriptors from the other
nodes, collate these into a consensus document containing validated network
status and authentication materials, and make that document available to the other nodes.
The elements in the topology diagram map to the mixnet's component nodes as shown
in the following table. Note that all nodes share the same IP address (127.0.0.1, i.e.,
localhost), but are accessed through different ports. Each node type links to additional
information in Components and configuration of the Katzenpost mixnet.
The following tree
output shows the location, relative to the katzenpost
repository root, of the files created by the Docker build. During testing and use,
you would normally touch only the TOML configuration file associated with each node,
as highlighted in the listing. For help in understanding these files and a complete
list of configuration options, follow the links in Table 2: Test mixnet
hosts.
This section of the Katzenpost technical documentation provides an introduction to
the software components that make up Katzenpost and guidance on how to configure each
component. The intended reader is a system administrator who wants to implement a
working, production Katzenpost network.
The core of Katzenpost consists of two program executables, dirauth and server. Running the dirauth command runs a
directory authority node, or dirauth, that
functions as part of the mixnet's public-key infrastructure (PKI). Running the
server command runs a mix node, a
gateway node, or a service node, depending
on the configuration. Configuration settings are provided in an associated
katzenpost-authority.toml or
katzenpost.toml file, respectively.
In addition to the server components, Katzenpost also supports connections to
client applications hosted externally to the mix network and communicating with it
through gateway nodes.
A model mix network is shown in Figure 1.
Figure 1. The pictured element types correspond to discrete client and server programs
that Katzenpost requires to function.
The mix network contains an n-layer topology of mix nodes, with
three nodes per layer in this example. Sphinx packets traverse the network in one
direction only. The gateway nodes allow clients to interact with the mix network.
The service nodes provide mix network services that mix network clients can interact with.
All messages sent by clients are handed to a connector daemon
hosted on the client system, passed across the Internet to a gateway, and then relayed
to a service node by way of the mix node layers. The service node sends its reply back
across the mix node layers to a gateway, which transmits it across the Internet to
be received by the targeted client. The mix, gateway, and service nodes send mix
descriptors to the dirauths and retrieve a consensus
document from them, described below.
In addition to the server components, Katzenpost supports connections to client
applications hosted externally to the mix network and communicating with it through
gateway nodes and, in some cases, a client connector.
Directory authorities (dirauths)
Dirauths compose the decentralized public key infrastructure (PKI) that serves as
the root of security for the entire mix network. Clients, mix nodes, gateway nodes,
and service nodes rely on the PKI/dirauth system to maintain and sign an up-to-date
consensus document, providing a view of the network including connection information
and public cryptographic key materials and signatures.
Every 20 minutes (the current value for an epoch), each mix,
gateway, and service node signs a mix descriptor and uploads it to the dirauths. The
dirauths then vote on a new consensus document. If consensus is reached, each
dirauth signs the document. Clients and nodes download the document as needed and
verify the signatures. Consensus fails when 1/2 + 1 of the nodes fail, which yields
greater fault tolerance than, for example, Byzantine Fault Tolerance, which fails
when 1/3 + 1 of the nodes fail.
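The comparison can be made concrete with a little arithmetic. The sketch below assumes that a strict majority of dirauth signatures is required for consensus, which is one reading of the 1/2 + 1 figure, and uses the classical bound n >= 3f + 1 for Byzantine Fault Tolerance. The function names are illustrative and are not taken from the Katzenpost code.

```python
def signatures_required(n_dirauths):
    """Strict majority: more than half of the dirauths must sign."""
    return n_dirauths // 2 + 1


def majority_failures_tolerated(n_dirauths):
    """Dirauths that can fail while a strict majority can still sign."""
    return n_dirauths - signatures_required(n_dirauths)


def bft_failures_tolerated(n_nodes):
    """Classical BFT bound: n >= 3f + 1, so f = (n - 1) // 3."""
    return (n_nodes - 1) // 3
```

Under these assumptions, a nine-dirauth network needs five signatures and survives four failed dirauths, whereas a nine-node BFT system survives only two faulty nodes.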
The PKI signature scheme is fully configurable by the dirauths. Our recommendation
is to use a hybrid signature scheme consisting of classical Ed25519 and the
post-quantum, stateless, hash-based signature scheme known as Sphincs+ (with the
parameters: "sphincs-shake-256f"), which is designated in Katzenpost
configurations as "Ed25519 Sphincs+". Examples are provided below.
Mix nodes
The mix node is the fundamental building block of the mix network.
Katzenpost mix nodes are arranged in a layered topology to achieve the best
levels of anonymity and ease of analysis while being flexible enough to scale with
traffic demands.
Gateway nodes
Gateway nodes provide external client access to the mix network. Because gateways
are uniquely positioned to identify clients, they are designed to have as little
information about client behavior as possible. Gateways are randomly selected and
have no persistent relationship with clients and no knowledge of whether a client's
packets are decoys or not. When client traffic through a gateway is slow, the node
additionally generates decoy traffic.
Service nodes
Service nodes provide functionality requested by clients. They are
logically positioned at the deepest point of the mix network, with incoming queries
and outgoing replies both needing to traverse all n layers of
mix nodes. A service node's functionality may involve storing messages, publishing
information outside of the mixnet, interfacing with a blockchain node, and so on.
Service nodes also process decoy packets.
Clients
Client applications should be designed so that the following conditions are
met:
Separate service requests from a client are unlinkable. Repeating the same
request may lead to linkability.
Service nodes and clients have no persistent relationship.
Clients generate a stream of packets addressed to random or pseudorandom
services regardless of whether a real service request is being made. Most of
these packets will be decoy traffic.
Traffic from a client to a service node must be correctly coupled with
decoy traffic. This can mean that the service node is chosen independently
from traffic history, or that the transmitted packet replaces a decoy packet
that was meant to go to the desired service.
Katzenpost currently includes several client applications. All applications
make extensive use of Sphinx single-use reply blocks (SURBs), which enable service
nodes to send replies without knowing the location of the client. Newer clients
require a connection through the client connector, which
provides multiplexing and privilege separation with a consequent reduction in
processing overhead.
The following client applications are available.
Table 1. Katzenpost clients
Name
Needs connector
Description
Code
Ping
no
The mix network equivalent of an ICMP ping utility, used
for network testing.
This section documents the configuration parameters for each type of Katzenpost
server node. Each node has its own configuration file in TOML format.
Configuring directory authorities
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/auth1/authority.toml. In a
real-world mixnet, the component hosts would not be sharing a single IP address. For
more information about the test mixnet, see Using the Katzenpost
Docker test network.
Specifies the human-readable identifier for a node, and must be unique
per mixnet. The identifier can be an FQDN but does not have to
be.
Type: string
Required: Yes
WireKEMScheme
Specifies the key encapsulation mechanism (KEM) scheme for the PQ Noise-based
wire protocol (link layer) that nodes use to communicate with each other.
PQ Noise is a post-quantum variation of the Noise protocol framework, which
algebraically transforms ECDH handshake patterns into KEM
encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
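To illustrate the idea behind the NIKE-to-KEM conversion, here is a toy sketch of a generic hashed ElGamal construction. It is not Katzenpost's implementation: plain modular Diffie-Hellman stands in for X25519, SHA-256 is an assumed choice of hash, and the set of values that are hashed is also an assumption. The parameters are for illustration only and are not secure.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group standing in for X25519 (illustration only, NOT secure).
P = 2**255 - 19
G = 2


def nike_keygen():
    """Return a (private, public) NIKE key pair."""
    private = secrets.randbelow(P - 3) + 2  # scalar in [2, P - 2]
    return private, pow(G, private, P)


def nike_derive(private, peer_public):
    """Non-interactive key exchange: both sides compute the same group element."""
    return pow(peer_public, private, P)


def _hash_secret(shared, ciphertext, recipient_public):
    """Hash the raw shared element together with the public values (assumed inputs)."""
    digest = hashlib.sha256()
    for number in (shared, ciphertext, recipient_public):
        digest.update(number.to_bytes(32, "big"))
    return digest.digest()


def kem_encapsulate(recipient_public):
    """Return (ciphertext, shared_secret); the ciphertext is an ephemeral public key."""
    ephemeral_private, ephemeral_public = nike_keygen()
    shared = nike_derive(ephemeral_private, recipient_public)
    return ephemeral_public, _hash_secret(shared, ephemeral_public, recipient_public)


def kem_decapsulate(private, public, ciphertext):
    """Recover the shared secret from the ciphertext using the recipient's key pair."""
    shared = nike_derive(private, ciphertext)
    return _hash_secret(shared, ciphertext, public)
```

The sender never needs a long-term key: a fresh ephemeral key pair is generated for each encapsulation, and its public half travels as the KEM ciphertext.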
Specifies the cryptographic signature scheme which will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth
nodes similarly sign PKI documents using the same scheme.
The following signature schemes are supported: "ed25519", "ed448",
"Ed25519 Sphincs+", "Ed448-Sphincs+", "Ed25519-Dilithium2",
"Ed448-Dilithium3"
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains
the transport protocol, IP address, and port number that the node will
bind to for incoming connections. Katzenpost supports URLs that
start with either "tcp://" or "quic://" such as:
["tcp://192.168.1.1:30001"] and ["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
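Before deploying a configuration, it can be useful to sanity-check each entry of an Addresses list. The following is a minimal sketch using only the Python standard library; it checks the two URL schemes named above and is not the validation that Katzenpost itself performs. The function name is hypothetical.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"tcp", "quic"}


def parse_address(url):
    """Split an address URL into (scheme, host, port), rejecting unsupported forms."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported transport in {url!r}")
    # parts.port raises ValueError on its own if the port is not a valid number.
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"address {url!r} must include a host and a port")
    return parts.scheme, parts.hostname, parts.port
```

Running `parse_address` over every entry catches a missing port or a mistyped scheme before the node is started.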
DataDir
Specifies the absolute path to a node's state directory. This is
where persistence.db is written to disk and
where a node stores its cryptographic key materials when started with
the "-g" command-line option.
Type: string
Required: Yes
Dirauth: Authorities
section
An Authorities section is configured for each peer authority. We
recommend using TOML's style
for multi-line quotations for key materials.
Specifies the human-readable identifier for the node which must be
unique per mixnet. The identifier can be an FQDN but does not have to
be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme used by all directory
authority nodes. PKISignatureScheme must match the scheme
specified in the Server section of the configuration.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
Specifies the key encapsulation mechanism (KEM) scheme for the PQ Noise-based
wire protocol (link layer) that nodes use to communicate with each other.
PQ Noise is a post-quantum variation of the Noise protocol framework, which
algebraically transforms ECDH handshake patterns into KEM
encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
Specifies a list of one or more address URLs in a format that contains
the transport protocol, IP address, and port number that the node will
bind to for incoming connections. Katzenpost supports URLs that
start with either "tcp://" or "quic://" such as:
["tcp://192.168.1.1:30001"] and ["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
Dirauth: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
Specifies the maximum allowed rate of packets per client per gateway
node. Rate limiting is done on the gateway nodes.
Type: uint64
Required: Yes
Mu
Specifies the inverse of the mean of the exponential distribution from
which the Sphinx packet per-hop mixing delay will be sampled.
Type: float64
Required: Yes
MuMaxDelay
Specifies the maximum Sphinx packet per-hop mixing delay in
milliseconds.
Type: uint64
Required: Yes
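The relationship between Mu and MuMaxDelay can be sketched as follows. The example assumes that Mu is a rate per millisecond, so that the sampled delay is in milliseconds, and that a sample above MuMaxDelay is simply clamped to the maximum; neither detail is stated above. The function name is illustrative.

```python
import random


def sample_delay_ms(rate, max_delay_ms, rng=random):
    """Draw an exponential delay with the given rate (inverse of the mean), then clamp it."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    # expovariate(rate) has mean 1 / rate, matching "the inverse of the mean".
    return min(rng.expovariate(rate), float(max_delay_ms))
```

With a rate of 0.005 the mean delay is 1 / 0.005 = 200 ms, so a larger Mu gives shorter per-hop delays. The same pattern applies to the other rate and maximum-delay pairs in this section.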
LambdaP
Specifies the inverse of the mean of the exponential distribution that
clients sample to determine the time interval between sending messages,
whether actual messages from the FIFO egress queue or decoy messages if
the queue is empty.
Type: float64
Required: Yes
LambdaPMaxDelay
Specifies the maximum send delay interval for LambdaP in
milliseconds.
Type: uint64
Required: Yes
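The LambdaP behavior described above, where each send slot carries either a queued message or a decoy, can be sketched as a small simulation. It assumes the rate is per millisecond and that intervals are clamped to LambdaPMaxDelay; the function name and the event tuples are illustrative and do not reflect the client's real data structures.

```python
import random
from collections import deque


def schedule_sends(queue, lambda_p, max_delay_ms, slots, rng=random):
    """Simulate `slots` send events and return (time_ms, kind, payload) tuples."""
    now_ms = 0.0
    events = []
    for _ in range(slots):
        # Wait an exponentially distributed interval, capped at the maximum delay.
        now_ms += min(rng.expovariate(lambda_p), float(max_delay_ms))
        if queue:
            # A real message is waiting: send the oldest one (FIFO order).
            events.append((now_ms, "message", queue.popleft()))
        else:
            # Nothing queued: fill the slot with a decoy.
            events.append((now_ms, "decoy", None))
    return events
```

Because the send times are drawn independently of the queue contents, an observer of the timing alone cannot tell which slots carried real messages.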
LambdaL
Specifies the inverse of the mean of the exponential distribution that
clients sample to determine the delay interval between loop
decoys.
Type: float64
Required: Yes
LambdaLMaxDelay
Specifies the maximum send delay interval for LambdaL in
milliseconds.
Type: uint64
Required: Yes
LambdaD
LambdaD is the inverse of the mean of the exponential distribution
that clients sample to determine the delay interval between decoy drop
messages.
Type: float64
Required: Yes
LambdaDMaxDelay
Specifies the maximum send delay interval for LambdaD in milliseconds.
Type: uint64
Required: Yes
LambdaM
LambdaM is the inverse of the mean of the exponential distribution
that mix nodes sample to determine the delay between mix loop
decoys.
Type: float64
Required: Yes
LambdaG
LambdaG is the inverse of the mean of the exponential distribution
that gateway nodes sample to determine the delay between gateway node
decoys.
Warning
Do not set this value manually in the TOML configuration file. The
field is used internally by the dirauth server state machine.
Type: float64
Required: Yes
LambdaMMaxDelay
Specifies the maximum delay for LambdaM in milliseconds.
Type: uint64
Required: Yes
LambdaGMaxDelay
Specifies the maximum delay for LambdaG in milliseconds.
Specifies the human-readable identifier for a node, and must be unique
per mixnet. The identifier can be an FQDN but does not have to
be.
Type: string
IdentityPublicKeyPem
Path and file name of a mix node's public identity signing key, also
known as the identity key, in PEM format.
Type: string
Required: Yes
Dirauth: SphinxGeometry
section
Sphinx is an encrypted nested-packet format designed primarily for mixnets.
The original Sphinx paper described a non-interactive key exchange
(NIKE) employing classical encryption. The Katzenpost implementation
strongly emphasizes configurability, supporting key encapsulation mechanisms
(KEMs) as well as NIKEs, and enabling the use of either classical or hybrid
post-quantum cryptography. Hybrid constructions offset the newness of
post-quantum algorithms by offering heavily tested classical algorithms as a
fallback.
Note
Sphinx, the nested-packet format, should not be confused with Sphincs or Sphincs+, which
are post-quantum signature schemes.
Katzenpost Sphinx also relies on the following classical cryptographic
primitives:
CTR-AES256, a stream cipher
HMAC-SHA256, a message authentication code (MAC) function
HKDF-SHA256, a key derivation function (KDF)
AEZv5, a strong pseudorandom permutation (SPRP)
All dirauths must be configured to use the same SphinxGeometry
parameters. Any geometry not advertised by the PKI document will fail. Each
dirauth publishes the hash of its SphinxGeometry parameters in the
PKI document for validation by its peer dirauths.
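The idea of comparing geometries by hash can be sketched in Python as follows. This is only a conceptual illustration: SHA-256 over sorted-key JSON is a stand-in chosen for the sketch, not the encoding or hash function that dirauths actually use, and the function names and parameter values are hypothetical.

```python
import hashlib
import json

def geometry_digest(geometry):
    """Return a hex digest over a canonical encoding of the parameters.

    SHA-256 over sorted-key JSON is an assumption of this sketch.
    """
    canonical = json.dumps(geometry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def geometries_match(local_geometry, peer_digest):
    """Compare local parameters against a digest published by a peer."""
    return geometry_digest(local_geometry) == peer_digest

if __name__ == "__main__":
    # Made-up values; real values must come from gensphinx.
    local = {"NrHops": 5, "UserForwardPayloadLength": 2000, "NIKEName": "x25519"}
    peer = dict(local)
    print(geometries_match(local, geometry_digest(peer)))
```

Because every parameter feeds the digest, a single differing value on one dirauth is enough to make the comparison fail.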
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
The settings in this section are generated by the gensphinx utility, which computes the Sphinx geometry based on the
following user-supplied directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The gensphinx output in TOML should then be pasted unchanged
into the node's configuration file, as shown below.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
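To make the linear relationship between hop count and header size concrete, here is a small Python sketch. It assumes, for illustration only, that the routing information is the per-hop length multiplied by the number of hops and that the rest of the header is a fixed overhead. The function names and numbers are hypothetical, and real values must always be taken from gensphinx output rather than computed this way.

```python
def routing_info_length(nr_hops, per_hop_routing_info_length):
    """Assumed relation: one fixed-size routing block per hop."""
    return nr_hops * per_hop_routing_info_length

def header_length(nr_hops, per_hop_routing_info_length, fixed_overhead):
    """Assumed relation: fixed overhead plus the routing information."""
    return fixed_overhead + routing_info_length(nr_hops, per_hop_routing_info_length)

if __name__ == "__main__":
    # Hypothetical sizes in bytes, chosen only to show the growth pattern.
    for hops in (3, 5, 7):
        print(hops, header_length(hops, 100, 50))
```

Each additional hop adds the same number of bytes to the header, which is why the hop count must be fixed before any of the other lengths can be derived.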
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest
routing-information block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a
Recipient section, both of which are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
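The mutual exclusivity of NIKEName and KEMName can be checked with a few lines of Python. This sketch assumes the configuration has already been loaded into a dictionary and that an empty or missing string means "not set"; rejecting a configuration where neither is set is also an assumption. The function name and the KEM name used in the usage example are hypothetical.

```python
def selected_sphinx_scheme(geometry):
    """Return ("NIKE", name) or ("KEM", name) for a SphinxGeometry mapping.

    Raises ValueError if both names are set, or (by assumption) if
    neither is set.
    """
    nike = geometry.get("NIKEName", "")
    kem = geometry.get("KEMName", "")
    if nike and kem:
        raise ValueError("NIKEName and KEMName are mutually exclusive")
    if not nike and not kem:
        raise ValueError("one of NIKEName or KEMName must be set")
    return ("NIKE", nike) if nike else ("KEM", kem)

if __name__ == "__main__":
    print(selected_sphinx_scheme({"NIKEName": "x25519", "KEMName": ""}))
```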
Configuring mix nodes
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/mix1/katzenpost.toml. In a
real-world mixnet, the component hosts would not be sharing a single IP address. For
more information about the test mixnet, see Using the Katzenpost Docker test network.
Specifies the human-readable identifier for a node, and must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
WireKEM
WireKEM specifies the key encapsulation mechanism (KEM) scheme
for the PQ Noise-based wire protocol (link layer) that nodes use
to communicate with each other. PQ Noise is a post-quantum variation of
the Noise protocol framework, which algebraically transforms ECDH handshake
patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Addresses is overridden if BindAddresses is true. In that scenario, one or more advertised, external
addresses are provided as the value of Addresses, and are advertised in the PKI document.
Note that BindAddresses, below, holds the
address values for non-advertised, internal-only listeners. The addition of
BindAddresses to the node configuration
is required for hosts connecting to the Internet through network address
translation (NAT).
Type: []string
Required: Yes
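The address URL format described above can be checked before a configuration is deployed. The following Python sketch uses the standard library URL parser to split an entry into transport, host, and port, and rejects transports other than the two the text names. The function name is hypothetical, and the sketch does not claim to mirror the validation that Katzenpost itself performs.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = ("tcp", "quic")

def parse_address(url):
    """Split an address URL into (scheme, host, port).

    Raises ValueError for an unsupported transport or a missing or
    invalid host or port.
    """
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError("unsupported transport in %r" % url)
    if parts.hostname is None or parts.port is None:
        raise ValueError("address must include host and port: %r" % url)
    return parts.scheme, parts.hostname, parts.port

if __name__ == "__main__":
    for entry in ["tcp://192.168.1.1:30001", "quic://192.168.1.1:40001"]:
        print(parse_address(entry))
```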
BindAddresses
If true, allows setting of listener
addresses that the server will bind to and accept connections on. These
addresses are not advertised in the PKI document. For more information, see
Addresses, above.
Type: bool, []string
Required: No
MetricsAddress
Specifies the address/port to bind the Prometheus metrics endpoint
to.
Type: string
Required: No
DataDir
Specifies the absolute path to a node's state directory. This is where
persistence.db is written to disk and where a node stores its cryptographic
key materials when started with the "-g" command-line option.
Type: string
Required: Yes
IsGatewayNode
If true, the server is a gateway
node.
Type: bool
Required: No
IsServiceNode
If true, the server is a service
node.
Type: bool
Required: No
Mix node: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
The name of the wire protocol key-encapsulation mechanism (KEM) to use.
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"]. The value of Addresses is advertised in the PKI document.
Type: []string
Required: Yes
Mix node: Management section
The Management section specifies connectivity information for the
Katzenpost control protocol, which can be used to make run-time configuration
changes. A configuration resembles the following, substituting the node's configured
DataDir value as part of the Path value:
Specifies the path to the management interface socket. If left empty, then management_sock
is located in the configuration's defined DataDir.
Type: string
Required: No
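The fallback for an empty Path can be expressed in a few lines of Python. The function name and the example directory are hypothetical; only the file name management_sock and the use of DataDir come from the description above.

```python
import posixpath

def management_socket_path(path, data_dir):
    """Return the configured socket path, or DataDir/management_sock if empty."""
    if path:
        return path
    return posixpath.join(data_dir, "management_sock")

if __name__ == "__main__":
    # "/var/lib/katzenpost" is an example DataDir, not a documented default.
    print(management_socket_path("", "/var/lib/katzenpost"))
```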
Mix node: SphinxGeometry section
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
The settings in this section are generated by the gensphinx utility, which computes the Sphinx geometry based on the
following user-supplied directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The gensphinx output in TOML should then be pasted unchanged
into the node's configuration file, as shown below.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest
routing-information block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a
Recipient section, both of which are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
Mix node: Debug section
The Debug section is the Katzenpost server debug configuration
for advanced tuning.
Specifies the number of worker instances to use for inbound Sphinx
packet processing.
Type: int
Required: No
NumProviderWorkers
Specifies the number of worker instances to use for provider-specific
packet processing.
Type: int
Required: No
NumKaetzchenWorkers
Specifies the number of worker instances to use for Kaetzchen-specific
packet processing.
Type: int
Required: No
SchedulerExternalMemoryQueue
If true, the experimental disk-backed external memory
queue is enabled.
Type: bool
Required: No
SchedulerQueueSize
Specifies the maximum scheduler queue size before random entries will
start getting dropped. A value less than or equal to zero is treated as
unlimited.
Type: int
Required: No
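The queue-size behaviour described for SchedulerQueueSize can be pictured with the following Python sketch: once the queue exceeds its limit, a randomly chosen entry is discarded, and a limit of zero or less disables the check. The class is hypothetical and deliberately simplified (it ignores packet priorities and timing), so it should be read as an illustration of the policy rather than a description of the scheduler's implementation.

```python
import random

class RandomDropQueue:
    """Bounded queue that drops a random entry when over capacity."""

    def __init__(self, max_size, rng=None):
        self.max_size = max_size          # <= 0 means unlimited
        self.rng = rng or random.Random()
        self.items = []
        self.dropped = 0

    def enqueue(self, item):
        self.items.append(item)
        # Enforce the limit only when one is configured.
        if self.max_size > 0 and len(self.items) > self.max_size:
            victim = self.rng.randrange(len(self.items))
            self.items.pop(victim)
            self.dropped += 1

if __name__ == "__main__":
    q = RandomDropQueue(max_size=3, rng=random.Random(0))
    for packet_id in range(10):
        q.enqueue(packet_id)
    print(len(q.items), q.dropped)
```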
SchedulerMaxBurst
Specifies the maximum number of packets that will be dispatched per
scheduler wakeup event.
Type: int
Required: No
UnwrapDelay
Specifies the maximum unwrap delay due to queueing in
milliseconds.
Type: int
Required: No
GatewayDelay
Specifies the maximum gateway node worker delay due to queueing in milliseconds.
Type: int
Required: No
ServiceDelay
Specifies the maximum provider delay due to queueing in
milliseconds.
Type: int
Required: No
KaetzchenDelay
Specifies the maximum kaetzchen delay due to queueing in
milliseconds.
Type: int
Required: No
SchedulerSlack
Specifies the maximum scheduler slack due to queueing and/or
processing in milliseconds.
Type: int
Required: No
SendSlack
Specifies the maximum send-queue slack due to queueing and/or
congestion in milliseconds.
Type: int
Required: No
DecoySlack
Specifies the maximum decoy sweep slack due to external
delays such as latency before a loop decoy packet will be considered
lost.
Type: int
Required: No
ConnectTimeout
Specifies the maximum time a connection can take to establish a
TCP/IP connection in milliseconds.
Type: int
Required: No
HandshakeTimeout
Specifies the maximum time a connection can take for a link-protocol
handshake in milliseconds.
Type: int
Required: No
ReauthInterval
Specifies the interval at which a connection will be reauthenticated
in milliseconds.
Type: int
Required: No
SendDecoyTraffic
If true, decoy traffic is enabled.
This parameter is experimental and untuned,
and is disabled by default.
Note
This option will be removed once decoy traffic is fully implemented.
Type: bool
Required: No
DisableRateLimit
If true, the per-client rate limiter is disabled.
Note
This option should only be used for testing.
Type: bool
Required: No
GenerateOnly
If true, the server immediately halts
and cleans up after long-term key generation.
Type: bool
Required: No
Configuring gateway nodes
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/gateway1/katzenpost.toml.
In a real-world mixnet, the component hosts would not be sharing a single IP
address. For more information about the test mixnet, see Using the Katzenpost Docker test network.
Specifies the human-readable identifier for a node, and must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
WireKEM
WireKEM specifies the key encapsulation mechanism (KEM) scheme
for the PQ Noise-based wire protocol (link layer) that nodes use
to communicate with each other. PQ Noise is a post-quantum variation of
the Noise protocol framework, which algebraically transforms ECDH handshake
patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Addresses is overridden if BindAddresses is true. In that scenario, one or more advertised, external
addresses are provided as the value of Addresses, and are advertised in the PKI document.
Note that BindAddresses, below, holds the
address values for non-advertised, internal-only listeners. The addition of
BindAddresses to the node configuration
is required for hosts connecting to the Internet through network address
translation (NAT).
Type: []string
Required: Yes
BindAddresses
If true, allows setting of listener
addresses that the server will bind to and accept connections on. These
addresses are not advertised in the PKI document. For more information, see
Addresses, above.
Type: bool, []string
Required: No
MetricsAddress
Specifies the address/port to bind the Prometheus metrics endpoint
to.
Type: string
Required: No
DataDir
Specifies the absolute path to a node's state directory. This is where
persistence.db is written to disk and where a node stores its cryptographic
key materials when started with the "-g" command-line option.
Type: string
Required: Yes
IsGatewayNode
If true, the server is a gateway
node.
Type: bool
Required: No
IsServiceNode
If true, the server is a service
node.
Type: bool
Required: No
Gateway node: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
The Gateway section of the configuration is required for configuring a Gateway
node. The section must contain UserDB and SpoolDB
definitions. Bolt is an
embedded database library for the Go programming language that Katzenpost
has used in the past for its user and spool databases. Because Katzenpost
currently persists data on Service nodes instead of Gateways, these databases
will probably be deprecated in favour of in-memory concurrency structures. In
the meantime, it remains necessary to configure a Gateway node as shown below,
only changing the file paths as needed:
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
The name of the wire protocol key-encapsulation mechanism (KEM) to use.
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"]. The value of Addresses is advertised in the PKI document.
Type: []string
Required: Yes
Gateway node: Management section
The Management section specifies connectivity information for the
Katzenpost control protocol, which can be used to make run-time configuration
changes. A configuration resembles the following, substituting the node's configured
DataDir value as part of the Path value:
Specifies the path to the management interface socket. If left empty, then management_sock
is located in the configuration's defined DataDir.
Type: string
Required: No
Gateway node: SphinxGeometry section
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
The settings in this section are generated by the gensphinx utility, which computes the Sphinx geometry based on the
following user-supplied directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The gensphinx output in TOML should then be pasted unchanged
into the node's configuration file, as shown below.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest
routing-information block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a
Recipient section, both of which are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
Gateway node: Debug section
The Debug section is the Katzenpost server debug configuration
for advanced tuning.
Specifies the number of worker instances to use for inbound Sphinx
packet processing.
Type: int
Required: No
NumProviderWorkers
Specifies the number of worker instances to use for provider-specific
packet processing.
Type: int
Required: No
NumKaetzchenWorkers
Specifies the number of worker instances to use for Kaetzchen-specific
packet processing.
Type: int
Required: No
SchedulerExternalMemoryQueue
If true, the experimental disk-backed external memory
queue is enabled.
Type: bool
Required: No
SchedulerQueueSize
Specifies the maximum scheduler queue size before random entries will
start getting dropped. A value less than or equal to zero is treated as
unlimited.
Type: int
Required: No
SchedulerMaxBurst
Specifies the maximum number of packets that will be dispatched per
scheduler wakeup event.
Type: int
Required: No
UnwrapDelay
Specifies the maximum unwrap delay due to queueing in
milliseconds.
Type: int
Required: No
GatewayDelay
Specifies the maximum gateway node worker delay due to queueing in milliseconds.
Type: int
Required: No
ServiceDelay
Specifies the maximum provider delay due to queueing in
milliseconds.
Type: int
Required: No
KaetzchenDelay
Specifies the maximum kaetzchen delay due to queueing in
milliseconds.
Type: int
Required: No
SchedulerSlack
Specifies the maximum scheduler slack due to queueing and/or
processing in milliseconds.
Type: int
Required: No
SendSlack
Specifies the maximum send-queue slack due to queueing and/or
congestion in milliseconds.
Type: int
Required: No
DecoySlack
Specifies the maximum decoy sweep slack due to external
delays such as latency before a loop decoy packet will be considered
lost.
Type: int
Required: No
ConnectTimeout
Specifies the maximum time a connection can take to establish a
TCP/IP connection in milliseconds.
Type: int
Required: No
HandshakeTimeout
Specifies the maximum time a connection can take for a link-protocol
handshake in milliseconds.
Type: int
Required: No
ReauthInterval
Specifies the interval at which a connection will be reauthenticated
in milliseconds.
Type: int
Required: No
SendDecoyTraffic
If true, decoy traffic is enabled.
This parameter is experimental and untuned,
and is disabled by default.
Note
This option will be removed once decoy traffic is fully implemented.
Type: bool
Required: No
DisableRateLimit
If true, the per-client rate limiter is disabled.
Note
This option should only be used for testing.
Type: bool
Required: No
GenerateOnly
If true, the server immediately halts
and cleans up after long-term key generation.
Type: bool
Required: No
Configuring service nodes
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/servicenode1/authority.toml.
In a real-world mixnet, the component hosts would not be sharing a single IP
address. For more information about the test mixnet, see Using the Katzenpost Docker test network.
Specifies the human-readable identifier for a node, and must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
WireKEM
WireKEM specifies the key encapsulation mechanism (KEM) scheme
for the PQ Noise-based wire protocol (link layer) that nodes use
to communicate with each other. PQ Noise is a post-quantum variation of
the Noise protocol framework, which algebraically transforms ECDH handshake
patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Addresses is overridden if BindAddresses is true. In that scenario, one or more advertised, external
addresses are provided as the value of Addresses, and are advertised in the PKI document.
Note that BindAddresses, below, holds the
address values for non-advertised, internal-only listeners. The addition of
BindAddresses to the node configuration
is required for hosts connecting to the Internet through network address
translation (NAT).
Type: []string
Required: Yes
BindAddresses
If true, allows setting of listener
addresses that the server will bind to and accept connections on. These
addresses are not advertised in the PKI document. For more information, see
Addresses, above.
Type: bool, []string
Required: No
MetricsAddress
Specifies the address/port to bind the Prometheus metrics endpoint
to.
Type: string
Required: No
DataDir
Specifies the absolute path to a node's state directory. This is where
persistence.db is written to disk and where a node stores its cryptographic
key materials when started with the "-g" command-line option.
Type: string
Required: Yes
IsGatewayNode
If true, the server is a gateway
node.
Type: bool
Required: No
IsServiceNode
If true, the server is a service
node.
Type: bool
Required: No
Service node: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
The ServiceNode section contains configurations for each network
service that Katzenpost supports.
Services, termed Kaetzchen, can be divided into built-in and external services.
External services are provided through the CBORPlugin, a Go programming language implementation of the Concise Binary Object
Representation (CBOR), a binary data serialization format. While
built-in services need only be activated, external services are invoked by a
separate command and connected to the mixnet over a Unix socket. The plugin
allows mixnet services to be added in any programming language.
Specifies the protocol capability exposed by the agent.
Type: string
Required: Yes
Endpoint
Specifies the provider-side Endpoint where the agent will accept
requests. While not required by the specification, this server only
supports Endpoints that are lower-case local parts of an email address.
Type: string
Required: Yes
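A conservative pre-flight check for the Endpoint restriction might look like the Python sketch below. It only tests that the value is non-empty, entirely lower-case, and free of "@" and whitespace; this is a simplification chosen for illustration, and the server's own normalization rules may be stricter. The function name is hypothetical.

```python
def is_supported_endpoint(endpoint):
    """Return True if endpoint looks like a lower-case email local part.

    Simplified check (an assumption of this sketch): non-empty, equal to
    its lower-cased form, and without "@" or whitespace.
    """
    if not endpoint or endpoint != endpoint.lower():
        return False
    return "@" not in endpoint and not any(ch.isspace() for ch in endpoint)

if __name__ == "__main__":
    for candidate in ("echo", "Echo", "echo@example.org"):
        print(candidate, is_supported_endpoint(candidate))
```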
Command
Specifies the full path to the external plugin program that implements
this Kaetzchen service.
Type: string
Required: Yes
MaxConcurrency
Specifies the number of worker goroutines to start for this
service.
Type: int
Required: Yes
Config
Specifies extra per-agent arguments to be passed to the agent's
initialization routine.
Type: map[string]interface{}
Required: Yes
Disable
If true, disables a configured
agent.
Type: bool
Required: No
Per-service parameters:
echo
The internal echo service must be enabled on every
service node of a production mixnet for decoy traffic to work
properly.
spool
The spool service supports the catshadow storage protocol, which
is required by the Katzen chat client. The example configuration
above shows spool enabled with the setting:
Disable = false
Note
Spool, properly memspool, should
not be confused with the spool database on gateway
nodes.
data_store
Specifies the full path to the service database
file.
Type: string
Required: Yes
log_dir
Specifies the path to the node's log directory.
Type: string
Required: Yes
pigeonhole
The pigeonhole courier service supports the
Blinding-and-Capability scheme (BACAP)-based unlinkable messaging
protocols detailed in Place-holder for research paper link. Most of our future protocols
will use the pigeonhole courier service.
db
Specifies the full path to the service database
file.
Type: string
Required: Yes
log_dir
Specifies the path to the node's log directory.
Type: string
Required: Yes
panda
The panda storage and authentication service
currently does not work properly.
fileStore
Specifies the full path to the service database
file.
The http service is completely optional, but allows
the mixnet to be used as an HTTP proxy. This may be useful for
integrating with existing software systems.
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
The name of the wire protocol key-encapsulation mechanism (KEM) to use.
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"]. The value of Addresses is advertised in the PKI document.
Type: []string
Required: Yes
Service node: Management section
The Management section specifies connectivity information for the
Katzenpost control protocol, which can be used to make run-time configuration
changes. A configuration resembles the following, substituting the node's configured
DataDir value as part of the Path value:
Specifies the path to the management interface socket. If left empty, then management_sock
is located in the configuration's defined DataDir.
Type: string
Required: No
Service node: SphinxGeometry section
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
The settings in this section are generated by the gensphinx utility, which computes the Sphinx geometry based on the
following user-supplied directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The gensphinx output in TOML should then be pasted unchanged
into the node's configuration file, as shown below.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest
routing-information block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a
Recipient section, both of which are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
Service node: Debug section
The Debug section is the Katzenpost server debug configuration
for advanced tuning.
Specifies the number of worker instances to use for inbound Sphinx
packet processing.
Type: int
Required: No
NumProviderWorkers
Specifies the number of worker instances to use for provider-specific
packet processing.
Type: int
Required: No
NumKaetzchenWorkers
Specifies the number of worker instances to use for Kaetzchen-specific
packet processing.
Type: int
Required: No
SchedulerExternalMemoryQueue
If true, the experimental disk-backed external memory
queue is enabled.
Type: bool
Required: No
SchedulerQueueSize
Specifies the maximum scheduler queue size before random entries will
start getting dropped. A value less than or equal to zero is treated as
unlimited.
Type: int
Required: No
SchedulerMaxBurst
Specifies the maximum number of packets that will be dispatched per
scheduler wakeup event.
Type: int
Required: No
UnwrapDelay
Specifies the maximum unwrap delay due to queueing in
milliseconds.
Type: int
Required: No
GatewayDelay
Specifies the maximum gateway node worker delay due to queueing in milliseconds.
Type: int
Required: No
ServiceDelay
Specifies the maximum provider delay due to queueing in
milliseconds.
Type: int
Required: No
KaetzchenDelay
Specifies the maximum kaetzchen delay due to queueing in
milliseconds.
Type: int
Required: No
SchedulerSlack
Specifies the maximum scheduler slack due to queueing and/or
processing in milliseconds.
Type: int
Required: No
SendSlack
Specifies the maximum send-queue slack due to queueing and/or
congestion in milliseconds.
Type: int
Required: No
DecoySlack
Specifies the maximum decoy sweep slack due to external
delays such as latency before a loop decoy packet will be considered
lost.
Type: int
Required: No
ConnectTimeout
Specifies the maximum time a connection can take to establish a
TCP/IP connection in milliseconds.
Type: int
Required: No
HandshakeTimeout
Specifies the maximum time a connection can take for a link-protocol
handshake in milliseconds.
Type: int
Required: No
ReauthInterval
Specifies the interval at which a connection will be reauthenticated
in milliseconds.
Type: int
Required: No
SendDecoyTraffic
If true, decoy traffic is enabled.
This parameter is experimental and untuned,
and is disabled by default.
Note
This option will be removed once decoy traffic is fully implemented.
Type: bool
Required: No
DisableRateLimit
If true, the per-client rate limiter is disabled.
Note
This option should only be used for testing.
Type: bool
Required: No
GenerateOnly
If true, the server immediately halts
and cleans up after long-term key generation.
Any Katzenpost server node can be configured to run behind a properly configured
router that supports network address translation (NAT) and similar network topologies
that traverse public and private network boundaries. This applies to directory
authorities, gateways that allow clients to connect to the network, mix nodes, and
service nodes that provide protocols over the mix network such as ping and spool
services for storing messages or rendezvous information.
Typically, the router connecting a LAN with the Internet blocks incoming connections
by default, and must be configured to forward traffic from the Internet to a destination
host based on port number. These target addresses are most often drawn from RFC 1918
private address space, although more exotic topologies involving public IP addresses may
also be targeted. (Router configuration for NAT topologies in general is beyond the
scope of this topic.) For such cases, where the host listens on a LAN-side address:port
but is accessed publicly using a different address:port, Katzenpost
provides mechanisms to specify both addresses.
Note
Katzenpost does not support NAT traversal protocols such as NAT-PMP, STUN, TURN, and UPnP.
Addresses and BindAddresses
In a direct network connection, the values defined in the server
Addresses parameter define the addresses on which the node
listens for incoming connections, and which are advertised to other mixnet components
in the PKI document. By supplying the optional BindAddresses
parameter, you can define a second address group: LAN-side addresses that are
not advertised in the PKI document. This is useful for NAT
scenarios, which involve both public and private address spaces.
Note
The Addresses and BindAddresses
parameters are closely analogous to Tor's Address and
ORPort parameters. For more information, see the
torrc man page.
Addresses
Specifies a list of one or more address URIs in a format that
contains the transport protocol (typically TCP), an IP address,
and a port number that the node will bind to for incoming
connections. This value is advertised in the PKI document.
Required: Yes
BindAddresses
If this parameter is present, it sets listener
address:port values that the server
will bind to and accept connections on, but that are not
advertised in the PKI document. In this case,
Addresses defines public addresses on
the Internet side of a NAT router, while
BindAddresses defines a different set
of addresses behind the NAT router.
Required: No
Note
Directory authorities do not support the BindAddresses parameter,
but can still be used behind NAT. For more information, see Hosting a directory authority behind NAT.
Hosting mix, gateway, and service nodes behind NAT
This section provides an example of a Katzenpost topology that makes use of the
BindAddresses parameter. In this scenario, a mix node behind
NAT listens on local addresses for connections, while advertising a public address
and port to its peer, a directory authority, that is assumed to have a publicly routable
address.
The configuration file on the NATed mix node is katzenpost.toml.
The relevant section of the configuration file is
[Server].
The Addresses parameter specifies the publicly routable
address:port, 203.0.113.10:1234, over which the mix
node can be reached from the Internet. This value is periodically advertised in
the PKI document to other components of the mix network.
The BindAddresses parameter specifies the LAN
address:port, 192.168.0.2:1234, on which the node
listens for incoming Sphinx packets from peers.
The NAT router has two configured addresses, public address 203.0.113.10 and
private LAN address 192.168.0.1.
The NAT router forwards traffic for 203.0.113.10:1234 to the mix node's LAN
address:port, 192.168.0.2:1234, where the configured
listener is bound.
The configuration in this example applies equally well to a NATed gateway node or
service provider. A NATed gateway node would also be reachable by a client with
knowledge of the gateway's public address.
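The relationship between the two parameters can be sketched in Python. This is a minimal illustration rather than Katzenpost code: the helper name resolve_addresses and the plain dict standing in for the parsed [Server] section are hypothetical, and the tcp:// scheme on the example addresses is an assumption based on the URL format described for Addresses.

```python
def resolve_addresses(server_cfg):
    """Return (listen, advertised) address lists for a [Server] section.

    ASSUMPTION: server_cfg is a plain dict standing in for the parsed TOML
    section; this helper is hypothetical and not part of Katzenpost.
    """
    # Addresses is always what gets advertised in the PKI document.
    advertised = list(server_cfg["Addresses"])
    # BindAddresses is optional. When present, the node binds to it and
    # Addresses is only advertised; otherwise the node binds to Addresses.
    listen = list(server_cfg.get("BindAddresses") or advertised)
    return listen, advertised


# Values taken from the NATed mix node example in this section.
nat_node = {
    "Addresses": ["tcp://203.0.113.10:1234"],
    "BindAddresses": ["tcp://192.168.0.2:1234"],
}
# The same node with a direct connection and no NAT router in front of it.
direct_node = {"Addresses": ["tcp://203.0.113.10:1234"]}

if __name__ == "__main__":
    print(resolve_addresses(nat_node))
    print(resolve_addresses(direct_node))
```

For the NATed node the two lists differ, which is exactly the case that the router's port-forwarding rule has to bridge.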
Hosting a directory authority behind NAT
Directory authorities have no support for the BindAddresses
parameter. They also do not advertise an address in the PKI document, because peers
must already know the address in order to fetch the document, which means that addresses
for dirauths must be provided out-of-band.
Consequently, the Addresses parameter for dirauths performs the same
function as BindAddresses on the other node types, that is, to define the
node's listening address:port values, but not an advertised
address. In a NAT scenario, these addresses can refer to any target that is situated
on
the LAN side of the NAT router.
Figure 2. Accessing a directory authority behind NAT
The configuration file on the NATed dirauth is authority.toml.
The relevant section of the configuration file is
[Server].
The Addresses parameter specifies a private RFC 1918 address:port, 192.168.0.2:1234. By definition, this address
cannot be reached directly from the Internet.
There is no BindAddresses parameter.
The NAT device has two configured addresses, public address 198.51.100.50, and
LAN address 192.168.0.1.
The NAT device routes traffic targeting 198.51.100.50:1234 to the
address:port specified in Addresses,
192.168.0.2:1234.
The dirauth does not advertise its address on the mix network. The address
must be provided to peers out-of-band.
Appendix: Configuration files from the Docker test mixnet
As an aid to administrators implementing a Katzenpost mixnet, this appendix provides
lightly edited examples of configuration files for each Katzenpost node type. These
files are drawn from a built instance of the Docker test
mixnet. These code listings are meant to be used as a reference alongside the
detailed configuration documentation in Components and configuration of the Katzenpost mixnet. You cannot use these
listings as a drop-in solution in your own mixnets for reasons explained in the Network topology and components section of the Docker test mixnet documentation.
A detailed design specification for our PQ Noise-based wire protocol, which is used for transport encryption between all mix nodes and dirauth nodes.
The key words “MUST”, “MUST NOT”, “REQUIRED”,
“SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD
NOT”, “RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC2119.
The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data structures, except for
cryptographic attributes, which are specified as opaque byte vectors.
x | y denotes the concatenation of x and y.
1. Introduction
The Katzenpost Mix Network Wire Protocol (KMNWP) is the custom wire protocol for
all network communications to, from, and within the Katzenpost Mix Network. This protocol
provides mutual authentication, and an additional layer of cryptographic security
and forward secrecy.
1.2 Key Encapsulation Mechanism
This protocol can use any Key Encapsulation Mechanism (KEM). However, it is recommended that
most users select a hybrid post-quantum KEM such as Xwing XWING.
2. Core Protocol
The protocol is based on Kyber and Trevor Perrin’s Noise Protocol Framework (see References), along with the “Post Quantum Noise” paper PQNOISE. Earlier versions of our transport were
based on NOISEHFS.
Our transport protocol begins with a prologue and a Noise handshake, followed by a stream
of Noise Transport messages in a minimal framing layer, over a TCP/IP connection.
Our Noise protocol is configurable via the KEM selection in the TOML configuration
files. Here is an example PQ Noise protocol string:
Noise_pqXX_Xwing_ChaChaPoly_BLAKE2b
The protocol string is a very condensed description of our protocol. We use the pqXX
two-way Noise pattern, which is described as follows:
pqXX:
    -> e
    <- ekem, s
    -> skem, s
    <- skem
The next part of the protocol string specifies the KEM, Xwing,
which is a hybrid KEM where the shared secret outputs of both X25519 and MLKEM768 are
combined.
Finally, the ChaChaPoly_BLAKE2b parts of the protocol string
indicate which AEAD cipher and hash function we are using.
As a non-standard modification to the Noise protocol, the 65535 byte message length
limit is increased to 1300000 bytes. We send very large messages over our Noise protocol
because we use the Sphincs+ signature scheme, whose signatures are about
49k bytes.
It is assumed that all parties using the KMNWP protocol have a fixed long- or short-lived
Xwing keypair XWING, the public
component of which is known to the other party in advance. How such keys are distributed
is beyond the scope of this document.
2.1 Handshake Phase
All sessions start in the Handshake Phase, in which an anonymous authenticated
handshake is conducted.
The handshake is an unmodified Noise handshake, with a fixed prologue prefacing
the initiator’s first Noise handshake message. This prologue is also used as the
prologue input to the Noise HandshakeState
Initialize() operation for both the initiator and responder.
The prologue is defined to be the following structure:
As all Noise handshake messages are fixed sizes, no additional framing is
required for the handshake.
Implementations MUST preserve the Noise handshake hash [h] for
the purpose of implementing authentication (Section 2.3).
Implementations MUST reject handshake attempts by terminating the session
immediately upon any Noise protocol handshake failure and when, as a responder, they
receive a Prologue containing an unknown protocol_version value.
Implementations SHOULD impose reasonable timeouts for the handshake process, and
SHOULD terminate sessions that are taking too long to handshake.
2.1.1 Handshake Authentication
Mutual authentication is done by exchanging fixed-size payloads as part of the
pqXX handshake, consisting of the following structure:
ad_len - The length of the optional additional data.
additional_data - Optional additional data, such as a
username, if any.
unix_time - 0 for the initiator, the approximate number
of seconds since 1970-01-01 00:00:00 UTC for the responder.
The initiator MUST send the AuthenticateMessage after it has
received the peer’s response (so after -> s, se in Noise
parlance).
The contents of the optional additional_data field are
deliberately left up to the implementation. However, it is RECOMMENDED that
implementations pad the field to a consistent length regardless of contents to
avoid leaking information about the authenticating identity.
To authenticate the remote peer given an AuthenticateMessage, the receiving peer
must validate the s component of the Noise handshake (the remote
peer’s long term public key) with the known value, along with any of the information
in the additional_data field such as the user name, if any.
If the validation procedure succeeds, the peer is considered authenticated. If
the validation procedure fails for any reason, the session MUST be terminated
immediately.
Responders MAY add a slight amount (±10 seconds) of random noise to the
unix_time value to avoid leaking precise load information via packet queueing delay.
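As a rough illustration of the unix_time rule, the following Python sketch shows one way a responder might apply the optional noise. The function names are hypothetical, and drawing the noise uniformly from the range is an assumption; the specification only bounds it to 10 seconds in either direction.

```python
import secrets
import time

MAX_JITTER_SECONDS = 10  # the 10-second bound stated in the text


def responder_unix_time(now=None):
    """Return the responder's unix_time with up to 10 s of random noise.

    ASSUMPTION: uniform noise; `now` is injectable only to ease testing.
    """
    if now is None:
        now = int(time.time())
    # secrets.randbelow(21) is uniform over 0..20; shift it to -10..+10.
    jitter = secrets.randbelow(2 * MAX_JITTER_SECONDS + 1) - MAX_JITTER_SECONDS
    return now + jitter


def initiator_unix_time():
    """The initiator always sends 0 in the unix_time field."""
    return 0
```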
2.2 Data Transfer Phase
Upon successfully concluding the handshake the session enters the Data Transfer
Phase, where the initiator and responder can exchange KMNWP messages.
A KMNWP message is defined to be the following structure:
The ciphertext_length field includes the Noise
protocol overhead of 16 bytes, for the Noise Transport message containing
the Ciphertext.
All outgoing Message(s) are preceded by a Noise Transport Message containing a
CiphertextHeader, indicating the size of the Noise Transport
Message transporting the Message Ciphertext. After generating both Noise Transport
Messages, the sender MUST call the Noise CipherState Rekey()
operation.
To receive incoming Ciphertext messages, first the Noise Transport Message
containing the CiphertextHeader is consumed off the network, authenticated and
decrypted, giving the receiver the length of the Noise Transport Message containing
the actual message itself. The second Noise Transport Message is consumed off the
network, authenticated and decrypted, with the resulting message being returned to
the caller for processing. After receiving both Noise Transport Messages, the
receiver MUST call the Noise CipherState Rekey() operation.
Implementations MUST immediately terminate the session if any of the
DecryptWithAd() operations fail.
Implementations MUST immediately terminate the session if an unknown command is
received in a Message, or if the Message is otherwise malformed in any way.
Implementations MAY impose a reasonable idle timeout, and terminate the session
if it expires.
3. Predefined Commands
3.1 The no_op Command
The no_op command is an explicit No
Operation, to be used to implement functionality such as keep-alives and/or
application layer padding.
Implementations MUST NOT send any message payload accompanying this command, and
all received command data MUST be discarded without interpretation.
3.2 The disconnect Command
The disconnect command is a command that is used to signal
explicit session termination. Upon receiving a disconnect command, implementations
MUST interpret the command as a signal from the peer that no additional commands
will be sent, and destroy the cryptographic material in the receive CipherState.
While most implementations will likely wish to terminate the session upon
receiving this command, any additional behavior is explicitly left up to the
implementation and application.
Implementations MUST NOT send any message payload accompanying this command, and
MUST NOT send any further traffic after sending a disconnect command.
3.3 The send_packet Command
The send_packet command is the command that is used by the
initiator to transmit a Sphinx Packet over the network. The command’s message is the
Sphinx Packet destined for the responder.
Initiators MUST terminate the session immediately upon reception of a
send_packet command.
4. Command Padding
We use traffic padding to hide from a passive network observer which command has
been sent or received.
Among the set of padded commands we exclude the Consensus command
because its contents are a very large payload which is usually many times larger
than our Sphinx packets. Therefore we only pad these commands:
However, we split them up into two directions, client to server and server to client,
because they differ in size due to the difference in size between
SendPacket and Message:
The GetConsensus command is a special case because we only want to
pad it when it is sent over the mixnet. We do not want to pad it when sending to the
dirauths. Padding it in that case would do no harm, but it would needlessly take up
bandwidth without providing any privacy benefits.
5. Anonymity Considerations
Adversaries being able to determine that two parties are communicating via KMNWP
is beyond the threat model of this protocol. At a minimum, it is trivial to determine
that a KMNWP handshake is being performed, due to the length of each handshake message,
and the fixed positions of the various public keys.
6. Security Considerations
It is imperative that implementations use ephemeral keys for every handshake as the
security properties of the Kyber KEM are totally lost if keys are ever reused.
Kyber was chosen as the KEM algorithm due to its conservative parameterization,
simplicity of implementation, and high performance in software. It is hoped that the
addition of a quantum resistant algorithm will provide forward secrecy even in the
event that large scale quantum computers are applied to historical intercepts.
Acknowledgments
I would like to thank Trevor Perrin for providing feedback during the design of this
protocol, and answering questions regarding Noise.
References
XWING
Manuel Barbosa, Deirdre Connolly, João Diogo Duarte, Aaron Kaiser, Peter Schwabe,
Karoline Varner, Bas Westerbaan, “X-Wing: The Hybrid KEM You’ve Been Looking
For”,
https://eprint.iacr.org/2024/039.pdf.
PQNOISE
Yawning Angel, Benjamin Dowling, Andreas Hülsing, Peter Schwabe and Florian Weber,
“Post Quantum Noise”, September 2023,
https://eprint.iacr.org/2022/539.pdf.
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol
Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246.
This document defines the Sphinx cryptographic packet format for decryption mix
networks, and provides a parameterization based around generic cryptographic
primitive types. This document does not introduce any new cryptography, but is meant to
serve as an implementation guide.
The following terms are used in this specification.
message
A variable-length sequence of octets sent anonymously through the network.
Short messages are sent in a single packet; long messages are fragmented
across multiple packets.
packet
A Sphinx packet, of fixed length for each class of traffic, carrying a message
payload and metadata for routing. Packets are routed anonymously through the mixnet
and cryptographically transformed at each hop.
header
The packet header consisting of several components, which convey the
information necessary to verify packet integrity and correctly process the
packet.
payload
The fixed-length portion of a packet containing an encrypted message or
part of a message, to be delivered anonymously.
group element
An individual element of the group.
group generator
A group element capable of generating any other element of the group, via
repeated applications of the generator and the group operation.
Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”,
“SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD
NOT”, “RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC2119.
The C style Presentation Language as described in RFC5246 Section 4 is used to represent data structures,
except for cryptographic attributes, which are specified as opaque byte vectors.
x | y denotes the concatenation of x and y.
x ^ y denotes the bitwise XOR of x and y.
byte denotes an 8-bit octet.
x[a:b] denotes the sub-vector of x where a/b denote the
start/end byte indexes (inclusive-exclusive); a/b may be omitted to signify the
start/end of the vector x respectively.
x[y] denotes the y’th element of list x.
x.len denotes the length of list x.
ZEROBYTES(N) denotes N bytes of 0x00.
RNG(N) denotes N bytes of cryptographic random data.
LEN(N) denotes the length in bytes of N.
CONSTANT_TIME_CMP(x, y) denotes a constant time comparison
between the byte vectors x and y, returning true iff x and y are equal.
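For readers who prefer code to notation, the conventions above map closely onto Python's bytes type. The sketch below is illustrative only: the lowercase function names are hypothetical, and requiring equal-length operands for XOR is an assumption that the specification does not state explicitly. The slice notation x[a:b] corresponds directly to Python slicing.

```python
import hmac
import secrets


def concat(x: bytes, y: bytes) -> bytes:
    """x | y: concatenation of x and y."""
    return x + y


def xor(x: bytes, y: bytes) -> bytes:
    """x ^ y: bitwise XOR (ASSUMPTION: operands have equal length)."""
    if len(x) != len(y):
        raise ValueError("XOR operands must have the same length")
    return bytes(a ^ b for a, b in zip(x, y))


def zerobytes(n: int) -> bytes:
    """ZEROBYTES(N): N bytes of 0x00."""
    return bytes(n)


def rng(n: int) -> bytes:
    """RNG(N): N bytes of cryptographic random data."""
    return secrets.token_bytes(n)


def constant_time_cmp(x: bytes, y: bytes) -> bool:
    """CONSTANT_TIME_CMP(x, y): True iff x and y are equal."""
    return hmac.compare_digest(x, y)
```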
1. Introduction
The Sphinx cryptographic packet format is a compact and provably secure design
introduced by George Danezis and Ian Goldberg SPHINX09.
It supports a full set of security features: indistinguishable replies, hiding the
path length and relay position, detection of tagging attacks and replay attacks, as well
as providing unlinkability for each leg of the packet’s journey over the network.
2. Cryptographic Primitives
This specification uses the following cryptographic primitives as the foundational
building blocks for Sphinx:
H(M) - A cryptographic hash function which takes an octet
array M to produce a digest consisting of a HASH_LENGTH byte
octet array. H(M) MUST be pre-image and collision resistant.
MAC(K, M) - A cryptographic message authentication code
function which takes a M_KEY_LENGTH byte octet array key
K and arbitrary length octet array message
M to produce an authentication tag consisting of a
MAC_LENGTH byte octet array.
KDF(SALT, IKM) - A key derivation function which takes an
arbitrary length octet array salt SALT and an arbitrary
length octet array initial key IKM, to produce an octet array
of arbitrary length.
S(K, IV) - A pseudo-random generator (stream cipher) which
takes a S_KEY_LENGTH byte octet array key
K and a S_IV_LENGTH byte octet array
initialization vector IV to produce an octet array key stream
of arbitrary length.
SPRP_Encrypt(K, M)/SPRP_Decrypt(K, M) - A strong
pseudo-random permutation (SPRP) which takes a
SPRP_KEY_LENGTH byte octet array key K
and arbitrary length message M, and produces the encrypted
ciphertext or decrypted plaintext respectively.
When used with the default payload authentication mechanism, the SPRP MUST be
“fragile” in that any amount of modifications to M results in
a large number of unpredictable changes across the whole message upon a
SPRP_Encrypt() or SPRP_Decrypt()
operation.
EXP(X, Y) - An exponentiation function which takes the
GROUP_ELEMENT_LENGTH byte octet array group elements
X and Y, and returns X ^^ Y as a GROUP_ELEMENT_LENGTH byte octet array.
Let G denote the generator of the group, and
EXP_KEYGEN() return a
GROUP_ELEMENT_LENGTH byte octet array group element
usable as private key.
The group defined by G and EXP(X, Y)
MUST be one in which the Decisional Diffie-Hellman (DDH) problem is hard.
EXP_KEYGEN() - Returns a new “suitable” private key for
EXP().
2.1 Sphinx Key Derivation Function
Sphinx Packet creation and processing uses a common Key Derivation Function (KDF)
to derive the required MAC and symmetric cryptographic keys from a per-hop shared
secret.
The output of the KDF is partitioned according to the following structure:
The Sphinx Packet Format is parameterized by the implementation based on the
application and security requirements.
AD_LENGTH - The constant amount of per-packet unencrypted
additional data in bytes.
PAYLOAD_TAG_LENGTH - The length of the message payload
authentication tag in bytes. This SHOULD be set to at least 16 bytes (128
bits).
PER_HOP_RI_LENGTH - The length of the per-hop Routing
Information (Section 4.1.1) in bytes.
NODE_ID_LENGTH - The node identifier length in bytes.
RECIPIENT_ID_LENGTH - The recipient identifier length in
bytes.
SURB_ID_LENGTH - The Single Use Reply Block
(Section 7) identifier length in bytes.
MAX_HOPS - The maximum number of hops a packet can
traverse.
PAYLOAD_LENGTH - The per-packet message payload length in
bytes, including a PAYLOAD_TAG_LENGTH byte authentication
tag.
KDF_INFO - A constant opaque byte vector used as the info
parameter to the KDF for the purpose of domain separation.
3.2 Sphinx Packet Geometry
The Sphinx Packet Geometry is derived from the Sphinx Parameter Constants
(Section 3.1). These are all derived parameters, and are
primarily of interest to implementors.
ROUTING_INFO_LENGTH - The total length of the “routing
information” Sphinx Packet Header component in bytes:
header - The packet header consists of several components,
which convey the information necessary to verify packet integrity and correctly
process the packet.
payload - The application message data.
4.1 Sphinx Packet Header
The Sphinx Packet Header refers to the block of data immediately preceding the
Sphinx Packet Payload in a Sphinx Packet.
The structure of the Sphinx Packet Header is defined as follows:
additional_data - Unencrypted per-packet Additional Data
(AD) that is visible to every hop. The AD is authenticated on a per-hop
basis.
As the additional_data is sent in the clear and traverses the network
unaltered, implementations MUST take care to ensure that the field cannot be
used to track individual packets.
group_element - An element of the cyclic group, used to
derive the per-hop key material required to authenticate and process the
rest of the SphinxHeader and decrypt a single layer of the Sphinx Packet
Payload encryption.
routing_information - A vector of per-hop routing
information, encrypted and authenticated in a nested manner. Each element of
the vector consists of a series of routing commands, specifying all of the
information required to process the packet.
The precise encoding format is specified in Section 4.1.1.
MAC - A message authentication code tag covering the
additional_data, group_element, and routing_information.
4.1.1 Per-hop routing information
The routing_information component of the Sphinx Packet Header contains a vector
of per-hop routing information. When processing a packet, the per hop processing is
set up such that the first element in the vector contains the routing commands for
the current hop.
The structure of the routing information is as follows:
While the NullCommand padding field is specified as opaque,
implementations SHOULD zero fill the padding. The choice of 0x00
as the terminal NullCommand is deliberate to ease implementation, as
ZEROBYTES(N) produces a valid NullCommand RoutingCommand,
resulting in “appending zero filled padding” producing valid output.
Implementations MUST pad the routing_commands vector so that it is exactly
PER_HOP_RI_LENGTH bytes, by appending a terminal NullCommand
if necessary.
Every non-terminal hop’s routing_commands MUST include a
NextNodeHopCommand.
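The padding rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper name is hypothetical, the already-serialized routing commands are passed in as raw bytes, and the value 64 is only a placeholder, since the real PER_HOP_RI_LENGTH comes from the packet geometry.

```python
PER_HOP_RI_LENGTH = 64  # ASSUMPTION: placeholder value for illustration


def pad_routing_commands(routing_commands: bytes,
                         per_hop_ri_length: int = PER_HOP_RI_LENGTH) -> bytes:
    """Pad serialized routing commands to exactly per_hop_ri_length bytes.

    Zero bytes are appended because ZEROBYTES(N) is a valid terminal
    NullCommand, so zero-filled padding yields valid output.
    """
    if len(routing_commands) > per_hop_ri_length:
        raise ValueError("routing commands exceed PER_HOP_RI_LENGTH")
    padding_len = per_hop_ri_length - len(routing_commands)
    return routing_commands + bytes(padding_len)
```

Rejecting oversized input instead of truncating it is a design choice of this sketch; truncation would silently drop routing commands.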
4.2 Sphinx Packet Payload
The Sphinx Packet Payload refers to the block of data immediately following the
Sphinx Packet Header in a Sphinx Packet.
For most purposes the structure of the Sphinx Packet Payload can be treated as a
single contiguous byte vector of opaque data.
Upon packet creation, the payload is repeatedly encrypted (unless it is a SURB
Reply, see Section 7.0) via keys derived from the Diffie-Hellman
key exchange between the packet’s group_element and the public
key of each node in the path.
Authentication of packet integrity is done by prepending a tag set to a known
value to the plaintext prior to the first encrypt operation. By virtue of the
fragile nature of the SPRP function, any alteration to the encrypted payload as it
traverses the network will result in an irrecoverably corrupted plaintext when the
payload is decrypted by the recipient.
5. Sphinx Packet Creation
For the sake of brevity, the pseudocode for all of the operations will take a vector
of the following PathHop structure as a parameter named path[] to specify the path
a packet will traverse, along with the per-hop routing commands and per-hop public keys.
struct {
/* There is no need for a node_id here, as
routing_commands[0].next_hop specifies that
information for all non-terminal hops. */
opaque public_key[GROUP_ELEMENT_LENGTH];
RoutingCommand routing_commands<1...2^8-1>;
} PathHop;
It is assumed that each routing_commands vector except for the terminal entry
contains at least a RoutingCommand consisting of a partially assembled
NextNodeHopCommand with the next_hop element filled in with the
identifier of the next hop.
5.1 Create a Sphinx Packet Header
Both the creation of a Sphinx Packet and the creation of a SURB require the
generation of a Sphinx Packet Header, so it is specified as a distinct operation.
Inputs: additional_data The Additional Data that is visible to
every node along the path in the header.
path The vector of PathHop structures in hop order,
specifying the node id, public key, and routing commands for each hop.
Outputs: sphinx_header The resulting Sphinx Packet Header.
payload_keys The vector of SPRP keys used to encrypt the
Sphinx Packet Payload, in hop order.
The Sphinx_Create_Header operation consists of the following
steps:
Derive the key material for each hop.
num_hops = path.len
route_keys = [ ]
route_group_elements = [ ]
priv_key = EXP_KEYGEN()

/* Calculate the key material for the 0th hop. */
group_element = EXP( G, priv_key )
route_group_elements += group_element
shared_secret = EXP( path[0].public_key, priv_key )
route_keys += Sphinx_KDF( KDF_INFO, shared_secret )
blinding_factor = route_keys[0].blinding_factor

/* Calculate the key material for rest of the hops. */
for i = 1; i < num_hops; ++i:
    shared_secret = EXP( path[i].public_key, priv_key )
    for j = 0; j < i; ++j:
        shared_secret = EXP( shared_secret, route_keys[j].blinding_factor )
    route_keys += Sphinx_KDF( KDF_INFO, shared_secret )
    group_element = EXP( group_element, route_keys[i-1].blinding_factor )
    route_group_elements += group_element
At the conclusion of the derivation process:
route_keys - A vector of per-hop SphinxKeys.
route_group_elements - A vector of per-hop group
elements.
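The reason the sender applies every earlier blinding factor to each hop's shared secret is that hop i only ever sees the blinded group element, yet must arrive at the same secret. The sketch below checks that property in a toy group. Everything here is an assumption for illustration: the modulus, the base, the SHA-256 stand-in for the blinding-factor part of Sphinx_KDF, and the function names. It is not secure and is not the Katzenpost implementation.

```python
import hashlib

# Toy group parameters (illustration only, NOT secure).
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # base element


def exp(x: int, y: int) -> int:
    """Stand-in for EXP(X, Y): modular exponentiation in the toy group."""
    return pow(x, y, P)


def blinding_factor(shared_secret: int) -> int:
    """Stand-in for the blinding-factor output of Sphinx_KDF."""
    digest = hashlib.sha256(shared_secret.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % (P - 1) or 1


def sender_derive(path_public_keys, priv_key):
    """Mirror the derivation loop: per-hop group elements and shared secrets."""
    group_elements, shared_secrets, blinds = [], [], []
    group_element = exp(G, priv_key)
    for public_key in path_public_keys:
        secret = exp(public_key, priv_key)
        for b in blinds:                 # apply all earlier blinding factors
            secret = exp(secret, b)
        group_elements.append(group_element)
        shared_secrets.append(secret)
        b = blinding_factor(secret)
        blinds.append(b)
        group_element = exp(group_element, b)  # what the next hop will see
    return group_elements, shared_secrets


def hop_derive(group_element: int, node_private_key: int) -> int:
    """What a mix computes from the group element it receives."""
    return exp(group_element, node_private_key)


node_privs = [1234567, 7654321, 1357911]
node_pubs = [exp(G, k) for k in node_privs]
elements, sender_secrets = sender_derive(node_pubs, priv_key=424242)
hop_secrets = [hop_derive(e, k) for e, k in zip(elements, node_privs)]
```

With these inputs, sender_secrets and hop_secrets agree element by element, which is what allows each hop to recompute its keys from the header alone.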
Derive the routing_information keystream and encrypted padding for each
hop.
ri_keystream = [ ]
ri_padding = [ ]

for i = 0; i < num_hops; ++i:
    keystream = ZEROBYTES( ROUTING_INFO_LENGTH + PER_HOP_RI_LENGTH ) ^
                  S( route_keys[i].header_encryption,
                     route_keys[i].header_encryption_iv )
    ks_len = LEN( keystream ) - (i + 1) * PER_HOP_RI_LENGTH

    padding = keystream[ks_len:]
    if i > 0:
        prev_pad_len = LEN( ri_padding[i-1] )
        padding = padding[:prev_pad_len] ^ ri_padding[i-1] |
            padding[prev_pad_len:]

    ri_keystream += keystream[:ks_len]
    ri_padding += padding
At the conclusion of the derivation process:
ri_keystream - A vector of per-hop routing_information
encryption keystreams.
ri_padding - The per-hop encrypted routing_information
padding.
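The slice lengths in this step are easier to follow with numbers. The sketch below only does the length arithmetic. It assumes ROUTING_INFO_LENGTH equals PER_HOP_RI_LENGTH multiplied by MAX_HOPS, and the concrete values passed in are hypothetical.

```python
def ri_lengths(num_hops: int, per_hop_ri_length: int, max_hops: int):
    """Return (hop index, keystream bytes kept, padding bytes) for each hop."""
    # Assumption: the routing info block holds one fixed-size slot per hop.
    routing_info_length = per_hop_ri_length * max_hops
    keystream_length = routing_info_length + per_hop_ri_length
    rows = []
    for i in range(num_hops):
        ks_len = keystream_length - (i + 1) * per_hop_ri_length
        pad_len = keystream_length - ks_len
        rows.append((i, ks_len, pad_len))
    return rows


# Hypothetical parameters: 10-byte slots, at most 5 hops, a 3-hop path.
rows = ri_lengths(num_hops=3, per_hop_ri_length=10, max_hops=5)
```

Under these assumptions the padding grows by one slot per hop (10, 20, 30 bytes) while the retained keystream shrinks by the same amount (50, 40, 30 bytes).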
Create the routing_information block.
/* Start with the terminal hop, and work backwards. */
i = num_hops - 1

/* Encode the terminal hop's routing commands. As the
   terminal hop can never have a NextNodeHopCommand, there
   are no per-hop alterations to be made. */
ri_fragment = path[i].routing_commands |
    ZEROBYTES( PER_HOP_RI_LENGTH - LEN( path[i].routing_commands ) )

/* Encrypt and MAC. */
ri_fragment ^= ri_keystream[i]
mac = MAC( route_keys[i].header_mac, additional_data |
           route_group_elements[i] | ri_fragment |
           ri_padding[i-1] )
routing_info = ri_fragment
if num_hops < MAX_HOPS:
    pad_len = (MAX_HOPS - num_hops) * PER_HOP_RI_LENGTH
    routing_info = routing_info | RNG( pad_len )

/* Calculate the routing info for the rest of the hops. */
for i = num_hops - 2; i >= 0; --i:
    cmds_to_encode = [ ]

    /* Find and finalize the NextNodeHopCommand. */
    for j = 0; j < LEN( path[i].routing_commands ); j++:
        cmd = path[i].routing_commands[j]
        if cmd.command == next_node_hop:
            /* Finalize the NextNodeHopCommand. */
            cmd.MAC = mac
        cmds_to_encode = cmds_to_encode + cmd /* Append */

    /* Append a terminal NullCommand. */
    ri_fragment = cmds_to_encode |
        ZEROBYTES( PER_HOP_RI_LENGTH - LEN( cmds_to_encode ) )

    /* Encrypt and MAC */
    routing_info = ri_fragment | routing_info /* Prepend. */
    routing_info ^= ri_keystream[i]
    if i > 0:
        mac = MAC( route_keys[i].header_mac, additional_data |
                   route_group_elements[i] | routing_info |
                   ri_padding[i-1] )
    else:
        mac = MAC( route_keys[i].header_mac, additional_data |
                   route_group_elements[i] | routing_info )
At the conclusion of the derivation process:
routing_info - The completed routing_info block.
mac - The MAC for the 0th hop.
Assemble the completed Sphinx Packet Header and Sphinx Packet Payload
SPRP key vector.
Mix nodes process incoming packets first by performing the
Sphinx_Unwrap operation to authenticate and decrypt the packet,
and if applicable prepare the packet to be forwarded to the next node.
If Sphinx_Unwrap returns an error for any given packet, the packet
MUST be discarded with no additional processing.
After a packet has been unwrapped successfully, a replay detection tag is checked to
ensure that the packet has not been seen before. If the packet is a replay, the packet
MUST be discarded with no additional processing.
The routing commands for the current hop are interpreted and executed, and finally
the packet is forwarded to the next mix node over the network or presented to the
application if the current node is the final recipient.
6.1 Sphinx_Unwrap Operation
The Sphinx_Unwrap operation performs the majority of the per-hop
packet processing: it handles authentication and decryption, and modifies the packet
prior to forwarding it to the next node.
Inputs:
private_routing_key A group element GROUP_ELEMENT_LENGTH
bytes in length, that serves as the unwrapping Mix’s private key.
sphinx_packet A Sphinx packet to unwrap.
Outputs:
error Indicating an unsuccessful unwrap operation if
applicable.
sphinx_packet The resulting Sphinx packet.
routing_commands A vector of RoutingCommand, specifying
the post unwrap actions to be taken on the packet.
replay_tag A tag used to detect whether this packet was
processed before.
The Sphinx_Unwrap operation consists of the following steps:
(Optional) Examine the Sphinx Packet Header’s Additional Data.
If the header’s additional_data element contains information
required to complete the unwrap operation, such as specifying the packet format
version or the cryptographic primitives used, examine it now.
Implementations MUST NOT treat the information in the
additional_data element as trusted until after the completion
of Step 3 (“Validate the Sphinx Packet Header”).
Calculate the hop’s shared secret, and replay_tag.
Upon the completion of the Sphinx_Unwrap operation,
implementations MUST take several additional steps. As the exact behavior is
mostly implementation specific, pseudocode will not be provided for most of the
post processing steps.
Apply replay detection to the packet.
The replay_tag value returned by Sphinx_Unwrap MUST be
unique across all packets processed with a given
private_routing_key.
The exact specifics of how to detect replays are left up to the
implementation; however, any replays that are detected MUST be discarded
immediately.
Act on the routing commands, if any.
The exact specifics of how implementations choose to apply routing commands are
deliberately left unspecified; however, in general:
If there is a NextNodeHopCommand, the packet
should be forwarded to the next node based on the
next_hop field upon completion of the post
processing.
The lack of a NextNodeHopCommand indicates that the packet is
destined for the current node.
If there is a SURBReplyCommand, the packet should
be treated as a SURBReply destined for the current node, and decrypted
accordingly (see Section 7.2).
If the implementation supports multiple recipients on a single node,
the RecipientCommand command should be used to
determine the correct recipient for the packet, and the payload
delivered as appropriate.
It is possible for both a RecipientCommand and a NextNodeHopCommand
to be present simultaneously in the routing commands for a given hop.
The behavior when this situation occurs is implementation defined.
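One possible shape for this decision logic is sketched below. The class names, the plan_actions function, and the action labels are hypothetical. Dropping a packet that carries both a RecipientCommand and a NextNodeHopCommand is only one of the implementation-defined choices the text allows, picked here for simplicity.

```python
from dataclasses import dataclass


@dataclass
class NextNodeHop:
    next_hop: bytes


@dataclass
class Recipient:
    recipient: bytes


@dataclass
class SURBReply:
    id: bytes


def _first(commands, kind):
    """Return the first command of the given type, or None."""
    return next((c for c in commands if isinstance(c, kind)), None)


def plan_actions(routing_commands):
    """Return a list of (action, argument) tuples for one unwrapped packet."""
    next_hop = _first(routing_commands, NextNodeHop)
    recipient = _first(routing_commands, Recipient)
    surb_reply = _first(routing_commands, SURBReply)

    if next_hop is not None and recipient is not None:
        # Implementation defined; this sketch chooses to drop the packet.
        return [("drop", None)]
    if next_hop is not None:
        return [("forward", next_hop.next_hop)]

    # No NextNodeHopCommand: the packet is destined for this node.
    actions = []
    if surb_reply is not None:
        actions.append(("decrypt_surb_reply", surb_reply.id))
    else:
        actions.append(("authenticate_payload", None))
    if recipient is not None:
        actions.append(("deliver", recipient.recipient))
    return actions
```

Returning a plan instead of acting immediately keeps the sketch testable and leaves replay detection, which must run first, outside this function.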
Authenticate the packet if required.
If the packet is destined for the current node, the integrity of the payload
MUST be authenticated.
A Single Use Reply Block (SURB) is a delivery token with a short lifetime, that
can be used by the recipient to reply to the initial sender.
SURBs allow for anonymous replies when the recipient does not know the sender of
the message. Usage of SURBs guarantees anonymity properties but also makes the reply
messages indistinguishable from forward messages, both to external adversaries and
to the mix nodes.
When a SURB is created, a matching reply block Decryption Token is created, which
is used to decrypt the reply message that is produced and delivered via the SURB.
The Sphinx SURB wire encoding is implementation defined, but for the purposes of
illustrating creation and use, the following will be used:
Structurally a SURB consists of three parts, a pre-generated Sphinx Packet
Header, a node identifier for the first hop to use when using the SURB to reply,
and cryptographic keying material by which to encrypt the reply’s payload. All
elements must be securely transmitted to the recipient, perhaps as part of a
forward Sphinx Packet’s Payload, but the exact specifics of how to accomplish
this are left up to the implementation.
When creating a SURB, the terminal routing_commands vector SHOULD include a
SURBReplyCommand, containing an identifier to ensure that the payload can be
decrypted with the correct set of keys (Decryption Token). The routing command
is left optional, as it is conceivable that implementations may choose to use
trial decryption, and/or limit the number of outstanding SURBs to solve this
problem.
7.2 Decrypt a Sphinx Reply Originating from a SURB
A Sphinx Reply packet that was generated using a SURB is externally
indistinguishable from a forward Sphinx Packet as it traverses the network.
However, the recipient of the reply has an additional decryption step: the
packet starts off unencrypted and accumulates layers of Sphinx Packet Payload
decryption as it traverses the network.
Determining which decryption token to use when decrypting the SURB reply can
be done via the SURBReplyCommand’s id field, if one is included at the time of
the SURB’s creation.
Inputs:
decryption_token The vector of keys allowing a client
to decrypt the reply ciphertext payload. This decryption_token is
generated when the SURB is created.
payload The Sphinx Packet ciphertext payload.
Outputs:
error Indicating an unsuccessful unwrap operation if
applicable.
message The plaintext message.
The Sphinx_Decrypt_SURB_Reply operation consists of the following steps:
Encrypt the message to reverse the decrypt operations the payload
acquired as it traversed the network.
The process for using a SURB to reply anonymously is slightly different from the
standard packet creation process, as the Sphinx Packet Header is already generated
(as part of the SURB), and there is an additional layer of Sphinx Packet Payload
encryption that must be performed.
Depending on the mix topology, there is no hard requirement that the per-hop
routing info is padded to one fixed constant length.
For example, assuming a layered topology (referred to as stratified topology
in the literature) MIXTOPO10, where the layer
of any given mix node is public information, as long as the following two
invariants are maintained, there is no additional information available to an
adversary:
All packets entering any given mix node in a certain layer are
uniform in length.
All packets leaving any given mix node in a certain layer are uniform
in length.
The only information available to an external or internal observer is the
layer of any given mix node (via the packet length), which is information they
are assumed to have by default in such a design.
9.2 Additional Data Field Considerations
The Sphinx Packet Construct is crafted such that any given packet is bitwise
unlinkable after a Sphinx_Unwrap operation, provided that the optional
Additional Data (AD) facility is not used. This property ensures that external
passive adversaries are unable to track a packet based on content as it
traverses the network. As the on-the-wire AD field is static through the
lifetime of a packet (i.e., left unaltered by the Sphinx_Unwrap
operation), implementations and applications that wish to use this facility MUST
NOT transmit AD that can be used to distinctly identify individual packets.
9.3 Forward Secrecy Considerations
Each node acting as a mix MUST regenerate their asymmetric key pair
relatively frequently. Upon key rotation the old private key MUST be securely
destroyed. As each layer of a Sphinx Packet is encrypted via key material
derived from the output of an ephemeral/static Diffie-Hellman key exchange,
without the rotation, the construct does not provide Perfect Forward Secrecy.
Implementations SHOULD implement defense-in-depth mitigations, for example by
using strongly forward-secure link protocols to convey Sphinx Packets between
nodes.
This frequent mix routing key rotation can limit SURB usage by directly
reducing the lifetime of SURBs. In order to have a strong Forward Secrecy
property while maintaining a higher SURB lifetime, designs such as forward
secure mixes SFMIX03 could be used.
9.4 Compulsion Threat Considerations
Reply Blocks (SURBs), forward and reply Sphinx packets are all vulnerable to
the compulsion threat, if they are captured by an adversary. The adversary can
request iterative decryptions or keys from a series of honest mixes in order to
perform a deanonymizing trace of the destination.
While a general solution to this class of attacks is beyond the scope of this
document, applications that seek to mitigate or resist compulsion threats could
implement the defenses proposed in COMPULS05
via a series of routing command extensions.
9.5 SURB Usage Considerations for Volunteer Operated Mix Networks
Given a hypothetical scenario where Alice and Bob both wish to keep their
location on the mix network hidden from the other, and Alice has somehow
received a SURB from Bob, Alice MUST NOT utilize the SURB directly because in
a volunteer operated mix network the first hop specified by the SURB could be
operated by Bob for the purpose of deanonymizing Alice.
This problem could be solved via the incorporation of a “cross-over
point” such as that described in MIXMINION, for example by having Alice delegate the transmission
of a SURB Reply to a randomly selected cross-over point in the mix network, so
that if the first hop in the SURB’s return path is a malicious mix, the only
information gained is the identity of the cross-over point.
10. Security Considerations
10.1 Sphinx Payload Encryption Considerations
The payload encryption’s use of a fragile (non-malleable) SPRP is deliberate
and implementations SHOULD NOT substitute it with a primitive that does not
provide such a property (such as a stream cipher based PRF). In particular there
is a class of correlation attacks (tagging attacks) targeting anonymity systems
that involve modification to the ciphertext that are mitigated if alterations to
the ciphertext result in unpredictable corruption of the plaintext (avalanche
effect).
Additionally, as the PAYLOAD_TAG_LENGTH based tag-then-encrypt payload
integrity authentication mechanism is predicated on the use of a non-malleable
SPRP, implementations that substitute a different primitive MUST authenticate
the payload using a different mechanism.
Alternatively, extending the MAC contained in the Sphinx Packet Header to
cover the Sphinx Packet Payload will both defend against tagging attacks and
authenticate payload integrity. However, such an extension does not work with
the SURB construct presented in this specification, unless the SURB is only used
to transmit payload that is known to the creator of the SURB.
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol
Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246.
This document defines the replay detection for any protocol that uses the Sphinx
cryptographic packet format. This document is meant to serve as an implementation
guide and to document the existing replay protection for deployed mix networks.
The following terms are used in this specification.
epoch
A fixed time interval defined in section 4.2 Sphinx Mix and Provider Key
Rotation. The epoch is currently set to 20 minutes.
A new PKI document containing public key material
is published for each epoch and is valid only for that epoch.
packet
A Sphinx packet, of fixed
length for each class of traffic, carrying a message payload and metadata for routing.
Packets are routed anonymously through the mixnet and cryptographically transformed at
each hop.
header
The packet header consisting of several components, which convey the
information necessary to verify packet integrity and correctly process the
packet.
payload
The fixed-length portion of a packet containing an encrypted message or
part of a message, to be delivered anonymously.
group
A finite set of elements and a binary operation that satisfy the
properties of closure, associativity, invertibility, and the presence of an
identity element.
group element
An individual element of the group.
group generator
A group element capable of generating any other element of the group, via
repeated applications of the generator and the group operation.
SEDA
Staged Event Driven Architecture. 1. A
highly parallelizable computation model. 2. A computational pipeline
composed of multiple stages connected by queues utilizing active queue
management algorithms that can evict items from the queue based on dwell
time or other criteria where each stage is a thread pool. 3. The only
correct way to efficiently implement a software based router on general
purpose computing hardware.
Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”,
“SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD
NOT”, “RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC2119.
1. Introduction
The Sphinx cryptographic packet format is a compact and provably secure design
introduced by George Danezis and Ian Goldberg SPHINX09.
Although it supports replay detection, the exact mechanism of replay detection is
described neither in SPHINX09 nor in our
SPHINXSPEC. Therefore we shall describe in detail
how to efficiently detect Sphinx packet replay attacks.
2. Sphinx Cryptographic Primitives
This specification borrows the following cryptographic primitives and constants from our
SPHINXSPEC:
H(M) - A cryptographic hash function which takes a byte
array M to produce a digest consisting of a HASH_LENGTH byte
array. H(M) MUST be pre-image and collision resistant.
EXP(X, Y) - An exponentiation function which takes the
GROUP_ELEMENT_LENGTH byte array group elements
X and Y, and returns X ^^ Y as a GROUP_ELEMENT_LENGTH byte array.
Let G denote the generator of the group, and
EXP_KEYGEN() return a GROUP_ELEMENT_LENGTH
byte array group element usable as private key.
The Decision Diffie-Hellman problem MUST be hard in the group defined by
G and EXP(X, Y).
2.1 Sphinx Parameter Constants
HASH_LENGTH - 32 bytes. Katzenpost currently uses
SHA-512/256. RFC6234
GROUP_ELEMENT_LENGTH - 32 bytes. Katzenpost currently
uses X25519. RFC7748
3. System Overview
Mixes, as currently deployed, have two modes of operation:
Sphinx routing keys and replay caches are persisted to disk
Sphinx routing keys and replay caches are held only in memory
These two modes of operation fundamentally represent a tradeoff between mix server
availability and notional compulsion attack resistance. Ultimately it is the mix
operator’s decision to make, since it affects the security and availability of their mix
servers. In particular, since mix networks are vulnerable to various types of
compulsion attacks (see SPHINXSPEC section 9.4 Compulsion Threat Considerations),
there is some advantage to NOT persisting the Sphinx routing keys to disk. The mix
operator can simply power off the mix server before seizure, rather than physically
destroying the disk, in order to prevent capture of the Sphinx routing keys. An argument
can be made for the use of full disk encryption; however, this may not be practical for
servers hosted in remote locations.
On the other hand, persisting Sphinx routing keys and replay caches to disk is useful
because it allows mix operators to shut down their mix server for maintenance purposes
without losing these Sphinx routing keys and replay caches. This means that as soon as
the maintenance operation is completed the mix server is able to rejoin the network. Our
current PKI system KATZMIXPKI does NOT provide a
mechanism to notify Directory Authorities of such an outage or maintenance period.
Therefore a loss of Sphinx routing keys results in a mix outage until the
next epoch.
Both modes of operation completely prevent replay attacks after a system
restart. In the case of disk persistence, replay attacks are prevented because all
packets traversing the mix have their replay tags persisted to the on-disk cache, which is
used again to prevent replays after a system restart. In the case of
memory persistence, replays are prevented upon restart because the Sphinx routing keys
are destroyed and therefore the mix will not participate in the network until at least
the next epoch rotation. However, availability of the mix may require two epoch rotations
because, in accordance with KATZMIXPKI, mixes publish
future epoch keys so that Sphinx packets flowing through the network can seamlessly
straddle the epoch boundaries.
4. Sphinx Packet Replay Cache
4.1 Sphinx Replay Tag Composition
The following excerpt from our SPHINXSPEC shows
how the replay tag is calculated.
However, this tag is not utilized in replay detection until the rest of the Sphinx
packet is fully processed and its header MAC verified as described in SPHINXSPEC.
4.2 Sphinx Replay Tag Caching
It would be sufficient to use a key value store or hashmap to detect the presence of
a duplicate replay tag; however, we additionally employ a bloom filter to increase
performance. Sphinx keys must periodically be rotated and destroyed to mitigate
compulsion attacks, and therefore our replay caches must likewise be rotated. This kind
of key erasure scheme limits the window of time in which an adversary can perform a
compulsion attack. See our PKI specification KATZMIXPKI for more details regarding epoch key rotation and the grace
period before and after the epoch boundary.
We tune our bloom filter for line-speed; that is to say the bloom filter for a given
replay cache is tuned for the maximum number of Sphinx packets that can be sent on the
wire during the epoch duration of the Sphinx routing key. This of course has to take
into account the size of the Sphinx packets as well as the maximum line speed of the
network interface. This is a conservative tuning heuristic given that there must be more
than this maximum number of Sphinx packets in order for there to be duplicate packets.
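As a rough worked example of this heuristic, the sketch below bounds the number of packets per epoch and then applies the standard textbook Bloom filter sizing formulas. The 1 Gbit/s line rate, 2000-byte packet size, and 0.1% false positive target are hypothetical inputs, not Katzenpost defaults. The 1200-second epoch is taken from the terminology section above.

```python
import math


def max_packets_per_epoch(line_rate_bps: int, packet_size_bytes: int,
                          epoch_seconds: int) -> int:
    """Upper bound on packets the interface can carry during one epoch."""
    packets_per_second = line_rate_bps // 8 // packet_size_bytes
    return packets_per_second * epoch_seconds


def bloom_parameters(n_items: int, false_positive_rate: float):
    """Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes."""
    m_bits = math.ceil(-n_items * math.log(false_positive_rate) / math.log(2) ** 2)
    k_hashes = max(1, round(m_bits / n_items * math.log(2)))
    return m_bits, k_hashes


# Hypothetical inputs: 1 Gbit/s link, 2000-byte packets, 20-minute epoch.
n = max_packets_per_epoch(1_000_000_000, 2000, 1200)
m_bits, k_hashes = bloom_parameters(n, 0.001)
```

With these inputs the bound is 75,000,000 packets per epoch. A real deployment would also need to account for the grace period around epoch boundaries, during which a key remains in use slightly longer than one epoch.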
Our bloom filter with hashmap replay detection cache looks like this:
Figure 1. replay cache
Note that this diagram does NOT express the full complexity of the replay caching
system. In particular it does not describe how entries are entered into the bloom filter
and hashmap. Upon either bloom filter mismatch or hashmap mismatch both data structures
must be locked and the replay tag inserted into each.
For the disk persistence mode of operation the hashmap can simply be replaced with an
efficient key value store. Persistent stores may use a write back cache and other
techniques for efficiency.
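A minimal in-memory sketch of such a cache follows. It is an illustration under stated assumptions, not the Katzenpost implementation: the ReplayCache class and its parameters are hypothetical, a Python set stands in for the hashmap, BLAKE2b with distinct salts stands in for the bloom filter's hash functions, and a single lock is held for the whole check, which is simpler than locking only on a mismatch.

```python
import hashlib
import threading


class ReplayCache:
    """Bloom filter in front of an exact set of replay tags."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 4):
        assert num_bits % 8 == 0
        self._bits = bytearray(num_bits // 8)
        self._num_bits = num_bits
        self._num_hashes = num_hashes
        self._seen = set()              # stand-in for the hashmap
        self._lock = threading.Lock()

    def _positions(self, tag: bytes):
        """Bit positions for a tag, one per salted hash."""
        for i in range(self._num_hashes):
            digest = hashlib.blake2b(
                tag, digest_size=8, salt=i.to_bytes(8, "big")
            ).digest()
            yield int.from_bytes(digest, "big") % self._num_bits

    def is_replay(self, tag: bytes) -> bool:
        """Return True if tag was seen before; otherwise record it."""
        positions = list(self._positions(tag))
        with self._lock:
            in_filter = all(
                self._bits[p // 8] & (1 << (p % 8)) for p in positions
            )
            # The exact set is consulted only when the filter says "maybe".
            if in_filter and tag in self._seen:
                return True
            for p in positions:
                self._bits[p // 8] |= 1 << (p % 8)
            self._seen.add(tag)
            return False


cache = ReplayCache()
first = cache.is_replay(b"tag-1")    # new tag, gets recorded
second = cache.is_replay(b"tag-1")   # same tag again
```

A bloom filter never reports a false negative, so a "not present" answer lets the common case skip the more expensive exact lookup.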
4.3 Epoch Boundaries
Since mixes publish future epoch keys (see KATZMIXPKI) so that Sphinx packets flowing through the network can
seamlessly straddle the epoch boundaries, our replay detection forms a special kind
of double bloom filter system. During the epoch grace period mixes perform trial
decryption of Sphinx packets. The replay cache used will be the one that is
associated with the Sphinx routing key which was successfully used to decrypt
(unwrap transform) the Sphinx packet. This is not a double bloom filter in the
normal sense of the term, since each bloom filter used is distinct and associated
with its own cache; furthermore, replay tags are only ever inserted into one cache
and one bloom filter.
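The selection rule can be sketched as below. The function name and the shape of its arguments are hypothetical: unwrap is any callable that returns a (payload, replay_tag) pair or raises ValueError when the key does not authenticate the packet, and a plain set stands in for each key's replay cache.

```python
def unwrap_with_trial_decryption(packet, candidate_keys, unwrap):
    """Try each candidate routing key; use the cache of the key that works.

    candidate_keys: iterable of (routing_key, replay_cache) pairs, where
    replay_cache is a set of replay tags belonging to that key only.
    Returns the payload, or None if the packet is invalid or a replay.
    """
    for routing_key, replay_cache in candidate_keys:
        try:
            payload, replay_tag = unwrap(routing_key, packet)
        except ValueError:
            continue  # wrong key for this packet, try the next one
        if replay_tag in replay_cache:
            return None  # replay: discard with no further processing
        # The tag goes into exactly one cache, that of the successful key.
        replay_cache.add(replay_tag)
        return payload
    return None  # no candidate key could unwrap the packet
```

Outside the grace period candidate_keys would hold a single entry, so the loop degenerates to one unwrap and one cache check.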
4.4 Cost Of Checking Replays
The cost of checking a replay tag from a single replay cache is the sum of the
following operations:
Sphinx packet unwrap operation
A bloom filter lookup
A hashmap or cache lookup
Therefore these operations are roughly O(1) in complexity. However, Sphinx packets
processed near epoch boundaries will not be constant time due to trial decryption
with two Sphinx routing keys, as mentioned above in section “4.3 Epoch
Boundaries”.
5. Concurrent Processing of Sphinx Packet Replay Tags
The best way to implement a software based router is with a SEDA computational pipeline. We therefore need a mechanism to allow multiple
threads to reference our rotating Sphinx keys and associated replay caches. Here we
shall describe a shadow memory system which the mix server uses such that the individual
worker threads always have a reference to the current set of candidate mix keys
and associated replay caches.
5.1 PKI Updates
The mix server periodically updates its knowledge of the network by downloading
a new consensus document as described in KATZMIXPKI. The individual threads in the “cryptoworker”
thread pool which process Sphinx packets make use of a MixKey
data structure which consists of:
Sphinx routing key material (public and private X25519 keys)
Replay Cache
Reference Counter
Each thread in the “cryptoworker” thread pool has its own hashmap
associating epochs with a reference to the MixKey. The mix server
PKI thread maintains a single hashmap which associates the epochs with the
corresponding MixKey. We shall refer to this hashmap as
MixKeys. After a new MixKey is added to
MixKeys, a “reshadow” operation is performed for
each “cryptoworker” thread. The “reshadow” operation
performs two tasks:
Removes entries from each “cryptoworker” thread’s hashmap
that are no longer present in MixKeys and decrements the
MixKey reference counter.
Adds entries present in MixKeys but are not present in
the thread’s hashmap and increments the MixKey reference
counter.
Once a given MixKey reference counter is decremented to zero,
the MixKey and its associated on-disk data are purged. Note that
we do not discuss synchronization primitives; however, it should be obvious that
updating the replay cache should make use of a mutex or similar primitive to
avoid data races between “cryptoworker” threads.
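The “reshadow” operation and its reference counting can be sketched as follows. This is a single-threaded illustration with hypothetical names: the key material and replay cache fields of MixKey are omitted, a purged flag stands in for deleting on-disk data, and the counter is assumed to count worker references only.

```python
class MixKey:
    """Per-epoch key record; key material and replay cache are omitted."""

    def __init__(self, epoch: int):
        self.epoch = epoch
        self.ref_count = 0
        self.purged = False

    def ref(self) -> None:
        self.ref_count += 1

    def deref(self) -> None:
        self.ref_count -= 1
        if self.ref_count == 0:
            # Stand-in for destroying the key and its on-disk data.
            self.purged = True


def reshadow(worker_keys: dict, mix_keys: dict) -> None:
    """Bring one worker's epoch-to-MixKey map in line with MixKeys."""
    # Task 1: drop entries that are no longer present in MixKeys.
    for epoch in list(worker_keys):
        if epoch not in mix_keys:
            worker_keys.pop(epoch).deref()
    # Task 2: add entries that MixKeys has and the worker lacks.
    for epoch, key in mix_keys.items():
        if epoch not in worker_keys:
            key.ref()
            worker_keys[epoch] = key
```

A key is purged only after the last worker has dropped it, so a worker that is still processing a packet under an old key never loses the key mid-operation.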
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC6234
Eastlake 3rd, D. and T. Hansen, “US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)”, RFC 6234, DOI 10.17487/RFC6234, May 2011,
https://www.rfc-editor.org/info/rfc6234.
SEDA
Welsh, M., Culler, D., Brewer, E., “SEDA: An Architecture
for Well-Conditioned, Scalable Internet Services”, 2001, ACM
Symposium on Operating Systems Principles,
http://www.sosp.org/2001/papers/welsh.pdf.
The following terms are used in this specification.
PKI
Public key infrastructure
directory authority system
Refers to specific PKI schemes used by Mixminion and Tor.
MSL
Maximum segment lifetime, currently set to 120 seconds.
mix descriptor
A database record which describes a component mix.
family
Identifier of security domains or entities operating one or more mixes in
the network. This is used to inform the path selection algorithm.
nickname
A nickname string that is unique in the consensus document, see Katzenpost
Mix Network Specification section 2.2. Network Topology.
layer
The layer indicates which network topology layer a particular mix resides
in.
provider
A service operated by a third party that Clients communicate directly with
to communicate with the Mixnet. It is responsible for Client authentication,
forwarding outgoing messages to the Mixnet, and storing incoming messages
for the Client. The Provider MUST have the ability to perform cryptographic
operations on the relayed messages.
Conventions used in this document
The key words “MUST”, “MUST NOT”, “REQUIRED”,
“SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD
NOT”, “RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC2119.
The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data structures for
additional cryptographic wire protocol commands. KATZMIXWIRE
1. Introduction
Mixnets are designed with the assumption that a Public Key Infrastructure (PKI)
exists and that it gives each client the same view of the network. This specification is
inspired by the Tor and Mixminion Directory Authority systems MIXMINIONDIRAUTH, TORDIRAUTH, whose main features are precisely what we
need for our PKI. These are decentralized systems meant to be collectively operated by
multiple entities.
The mix network directory authority system (PKI) is essentially a cooperative
decentralized database and voting system that is used to produce network consensus
documents which mix clients periodically retrieve and use for their path selection
algorithm when creating Sphinx packets. These network consensus documents are derived
from a voting process between the Directory Authority servers.
This design prevents mix clients from using only a partial view of the network for
their path selection so as to avoid fingerprinting and bridging attacks FINGERPRINTING, BRIDGING, and LOCALVIEW.
The PKI is also used by Authority operators to specify network-wide parameters. For
example, in the Katzenpost Decryption Mix Network KATZMIXNET the Poisson mix strategy is used and, therefore, all clients must
use the same lambda parameter for their exponential distribution function when choosing
hop delays in the path selection. The Mix Network Directory Authority system, aka PKI,
SHALL be used to distribute such network-wide parameters in the network consensus
document that have an impact on security and performance.
1.2 Security properties overview
This Directory Authority system has the following feature goals and security
properties:
All Directory Authority servers must agree with each other on the set of
Directory Authorities.
All Directory Authority servers must agree with each other on the set of
mixes.
This system is intentionally designed to provide identical network
consensus documents to each mix client. This mitigates epistemic attacks
against the client path selection algorithm such as fingerprinting and
bridge attacks FINGERPRINTING, BRIDGING.
This system is NOT byzantine-fault-tolerant, it instead allows for manual
intervention upon consensus fault by the Directory Authority operators.
Further, these operators are responsible for expelling bad acting operators
from the system.
This system enforces the network policies such as mix join policy wherein
intentionally closed mixnets will prevent arbitrary hosts from joining the
network by authenticating all descriptor signatures with a list of allowed
public keys.
The Directory Authority system for a given mix network is essentially the
root of all authority.
1.3 Differences from Tor and Mixminion directory authority systems
In this document we specify a Directory Authority system which is different from
that of Tor’s and Mixminion’s in a number of ways:
The list of valid mixes is expressed in an allowlist. For the time being
there is no specified “bandwidth authority” system which
verifies the health of mixes (Further research required in this area).
There’s no non-directory channel to inform clients that a node is down,
so the result is a lot of packet loss, since clients will continue to
include the missing node in their path selection until the keys published by the
node expire and it falls out of the consensus.
The schema of the mix descriptors is different from that used in
Mixminion and Tor, including a change which allows our mix descriptor to
express n Sphinx mix routing public keys in a single
mix descriptor whereas in the Tor and Mixminion Directory Authority systems,
n descriptors are used.
The serialization format of mix descriptors is different from that used
in Mixminion and Tor.
The shared random number computation is performed every voting round, and
is required for a vote to be accepted by each authority. The shared random
number is used to deterministically generate the network topology.
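To illustrate how a shared random number can drive a deterministic topology, here is a minimal Python sketch. The ordering rule (sorting by a hash of the seed and the identity key) and the round-robin layer assignment are assumptions made for illustration only; they are not the algorithm used by Katzenpost. The point is simply that every authority holding the same seed and the same set of mixes derives the same layers.

```python
import hashlib

def assign_layers(seed: bytes, mix_ids: list[bytes], num_layers: int) -> list[list[bytes]]:
    """Deterministically place mixes into layers from a shared seed.

    Hypothetical scheme: order mixes by SHA-256(seed || id), then deal
    them round-robin into the layers.
    """
    ordered = sorted(set(mix_ids), key=lambda m: hashlib.sha256(seed + m).digest())
    layers: list[list[bytes]] = [[] for _ in range(num_layers)]
    for i, mix in enumerate(ordered):
        layers[i % num_layers].append(mix)
    return layers
```

Because the result depends only on the seed and the set of identities, the order in which descriptors arrived at each authority does not matter.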
2. Overview of mix PKI interaction
Each Mix MUST rotate the key pair used for Sphinx packet processing periodically
for forward secrecy reasons and to keep the list of seen packet tags short
SPHINX09, SPHINXSPEC. The Katzenpost Mix Network uses a fixed
interval (epoch), so that key rotations happen simultaneously
throughout the network, at predictable times.
Each Directory Authority server MUST use some time synchronization protocol in order
to correctly use this protocol. This Directory Authority system requires time
synchronization to within a few minutes.
Let each epoch be exactly 1200 seconds (20 minutes) in duration,
and the 0th Epoch begin at 2017-06-01 00:00 UTC.
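The epoch arithmetic implied by this definition can be sketched as follows. The function and constant names are illustrative and do not come from the Katzenpost code base; only the period and the start of epoch 0 are taken from the text above.

```python
from datetime import datetime, timedelta, timezone

EPOCH_ZERO = datetime(2017, 6, 1, 0, 0, tzinfo=timezone.utc)  # start of epoch 0
EPOCH_PERIOD = 1200  # seconds, as stated above

def epoch_at(t: datetime) -> tuple[int, int]:
    """Return (epoch number, seconds elapsed within that epoch) for time t."""
    elapsed = int((t - EPOCH_ZERO).total_seconds())
    if elapsed < 0:
        raise ValueError("time precedes epoch 0")
    return divmod(elapsed, EPOCH_PERIOD)

def epoch_start(n: int) -> datetime:
    """Return the UTC time at which epoch n begins."""
    return EPOCH_ZERO + timedelta(seconds=n * EPOCH_PERIOD)
```

For example, `epoch_at(epoch_start(3))` yields `(3, 0)`, and one second earlier the result is `(2, 1199)`.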
To facilitate smooth operation of the network and to allow for delays that span
across epoch boundaries, Mixes MUST publish keys to the PKI for at least 3 epochs
in advance, unless the mix will be otherwise unavailable in the near future due to
planned downtime.
At an epoch boundary, messages encrypted to keys from the previous epoch are accepted
for a grace period of 2 minutes.
Thus, at any time, keys for all Mixes for the Nth through (N + 2)th epochs will be
available, allowing for a maximum round trip (forward message + SURB) delay + transit
time of 40 minutes. SURB lifetime is limited to a single epoch because of the key
rotation epoch; however, this shouldn’t present any usability problems, since SURBs
are only used for sending ACK messages from the destination Provider to the sender.
2.1 PKI protocol schedule
There are two main constraints on the Authority schedule:
There MUST be enough key material extending into the future so that
clients are able to construct Sphinx packets with forward and reply paths.
All participants should have enough time to participate in the protocol;
upload descriptors, vote, generate documents, download documents, establish
connections for user traffic.
The epoch duration of 20 minutes is more than adequate for these two constraints.
NOTE: perhaps we should make it shorter? but first let’s do some scaling
and bandwidth calculations to see how bad it gets…
2.1.1 Directory authority server schedule
Directory Authority server interactions are conducted according to the following
schedule, where T is the beginning of the current epoch, and
P is the length of the epoch period.
T - Epoch begins
T + P/2 - Vote exchange
T + (5/8)*P - Reveal exchange
T + (6/8)*P - Tabulation and signature exchange
T + (7/8)*P - Publish consensus
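As a worked example, the schedule above can be turned into absolute offsets for a given epoch start. This is a small sketch; the function name and phase labels are illustrative, and the default period of 1200 seconds is the epoch duration stated earlier in this specification.

```python
from fractions import Fraction

# Offsets from the start of the epoch (T), as fractions of the period P.
AUTHORITY_SCHEDULE = [
    ("vote exchange", Fraction(1, 2)),
    ("reveal exchange", Fraction(5, 8)),
    ("tabulation and signature exchange", Fraction(6, 8)),
    ("publish consensus", Fraction(7, 8)),
]

def authority_deadlines(t: int, p: int = 1200) -> dict[str, int]:
    """Map each phase to its absolute start time, given epoch start t in seconds."""
    return {name: t + int(frac * p) for name, frac in AUTHORITY_SCHEDULE}
```

With a 20 minute epoch the phases begin 600, 750, 900 and 1050 seconds after the epoch starts.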
2.1.2 Mix schedule
Mix PKI interactions are conducted according to the following schedule, where T
is the beginning of the current epoch.
T + P/8 - Deadline for publication of all mixes' documents for the
next epoch.
T + (7/8)*P - This marks the beginning of the period where mixes
perform staggered fetches of the PKI consensus document.
T + (8/9)*P - Start establishing connections to the new set of
relevant mixes in advance of the next epoch.
T + P - 1MSL - Start accepting new Sphinx packets encrypted to
the next epoch’s keys.
T + P + 1MSL - Stop accepting new Sphinx packets encrypted to the
previous epoch’s keys, close connections to peers no longer listed in the PKI
documents and erase the list of seen packet tags.
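The last two schedule entries define a window during which a given epoch's Sphinx key is accepted. A minimal sketch of that check follows, assuming an MSL of 120 seconds (the value given in the Katzenpost mixnet specification's terminology) and treating the "stop accepting" instant as exclusive, which is an interpretation rather than something the text states.

```python
EPOCH_PERIOD = 1200  # seconds (P)
MSL = 120            # seconds, maximum segment lifetime (assumed value)

def key_accepted(key_epoch: int, now: int) -> bool:
    """True if packets encrypted to key_epoch's key are accepted at `now`.

    `now` is measured in seconds since the start of epoch 0.
    """
    start = key_epoch * EPOCH_PERIOD
    # Accepted from 1 MSL before the epoch begins until 1 MSL after it ends.
    return start - MSL <= now < start + EPOCH_PERIOD + MSL
```

Around each epoch boundary there is therefore a span of 2 MSL during which keys for two adjacent epochs are both accepted.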
Mix layer changes are controlled by the Directory Authorities and therefore a mix
can be reassigned to a different layer in our stratified topology at any new epoch.
Mixes will maintain incoming and outgoing connections to the various nodes until all
mix keys have expired, iff the node is still listed anywhere in the current
document.
3. Voting for consensus protocol
In our Directory Authority protocol, all the actors conduct their behavior according
to a common schedule as outlined in section “2.1 PKI Protocol Schedule”. The
Directory Authority servers exchange messages to reach consensus about the network.
Other tasks they perform include collecting mix descriptor uploads from each mix for
each key rotation epoch, voting, shared random number generation, signature exchange
and publishing of the network consensus documents.
3.1 Protocol messages
There are only two document types in this protocol:
mix_descriptor: A mix descriptor describes a mix.
directory: A directory contains a list of descriptors and
other information that describe the mix network.
Mix descriptor and directory documents MUST be properly signed.
3.1.1 Mix descriptor and directory signing
Mixes MUST compose mix descriptors which are signed using their private identity
key, an ed25519 key. Directories are signed by one or more Directory Authority
servers using their authority key, also an ed25519 key. In all cases, signing is
done using JWS RFC7515.
3.2 Vote exchange
As described in section “2.1 PKI Protocol Schedule”, the Directory
Authority servers begin the voting process 1/8 of an epoch period after the start
of a new epoch. Authorities exchange vote directory messages with one another.
Authorities archive votes from other authorities and make them available for
retrieval. Upon receiving a new vote, the authority examines it for new descriptors
and includes any valid descriptors in its view of the network.
Each Authority includes in its vote a hashed value committing to a choice of a
random number for the vote. See section 4.3 for more details.
3.2.1 Voting Wire Protocol Commands
The Katzenpost Wire Protocol as described in KATZMIXWIRE is
used by Authorities to exchange votes. We define additional wire protocol commands
for sending votes:
enum {
:
vote(22),
vote_status(23),
} Command;
The structures of these commands are defined as follows:
The vote command is used to send a PKI document to a peer Authority during the
voting period of the PKI schedule.
The payload field contains the signed and serialized PKI document representing
the sending Authority’s vote. The public_key field contains the public identity key
of the sending Authority which the receiving Authority can use to verify the
signature of the payload. The epoch_number field is used by the receiving party to
quickly check the epoch for the vote before deserializing the payload.
Each authority MUST include its commit value for the shared random computation in
this phase along with its signed vote. This computation is derived from the Tor
Shared Random Subsystem, TORSRV.
3.2.3 The vote_status Command
The vote_status command is used to reply to a vote command. The error_code field
indicates if there was a failure in the receiving of the PKI document.
The epoch_number field of the vote struct is compared with the epoch that is
currently being voted on. vote_too_early and vote_too_late are replied back to the
voter to report that their vote was not accepted.
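The cheap checks a receiving Authority can make before deserializing the payload might look like the sketch below. The numeric codes follow the vote_status enumeration given later in this specification; the order of the checks, the mapping of a future epoch to vote_too_early, and the function and parameter names are assumptions. Signature verification of the payload is deliberately left out.

```python
from enum import IntEnum

class VoteStatus(IntEnum):
    OK = 0
    TOO_EARLY = 1
    TOO_LATE = 2
    NOT_AUTHORIZED = 3
    ALREADY_RECEIVED = 6

def check_vote(epoch_number: int, public_key: bytes, voting_epoch: int,
               authorized: set[bytes], seen: set[bytes]) -> VoteStatus:
    """Pre-checks performed before deserializing the vote payload."""
    if public_key not in authorized:
        return VoteStatus.NOT_AUTHORIZED
    if epoch_number > voting_epoch:
        return VoteStatus.TOO_EARLY
    if epoch_number < voting_epoch:
        return VoteStatus.TOO_LATE
    if public_key in seen:
        return VoteStatus.ALREADY_RECEIVED
    seen.add(public_key)  # remember that this authority has voted this round
    return VoteStatus.OK
```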
3.3 Reveal exchange
As described in section “2.1 PKI Protocol Schedule”, the Directory
Authority servers exchange the reveal values after they have exchanged votes which
contain a commit value. Authorities exchange reveal messages with one another.
3.3.1 Reveal Wire Protocol Commands
The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by Authorities to exchange reveal values previously
committed to in their votes. We define additional wire protocol commands for
exchanging reveals:
enum {
:
reveal(25),
reveal_status(26),
} Command;
The structures of these commands are defined as follows:
The reveal command is used to send a reveal value to a peer authority during the
reveal period of the PKI schedule.
The payload field contains the signed and serialized reveal value. The public_key
field contains the public identity key of the sending Authority which the receiving
Authority can use to verify the signature of the payload. The epoch_number field is
used by the receiving party to quickly check the epoch for the reveal before
deserializing the payload.
3.3.3 The reveal_status Command
The reveal_status command is used to reply to a reveal command. The error_code
field indicates if there was a failure in the receiving of the shared random reveal
value.
The epoch_number field of the reveal struct is compared with the epoch that is
currently being voted on. reveal_too_early and reveal_too_late are replied back to
the authority to report that its reveal was not accepted. The status code
reveal_not_authorized is used if the Authority is rejected. The status code
reveal_already_received is used to report that a valid reveal command was already
received for this round.
3.4 Cert exchange
The Cert command is the same as a Vote but contains the set of Reveal values as
seen by the voting peer. In order to ensure that a misconfigured or malicious
Authority operator cannot amplify their ability to influence the threshold voting
process, after Reveal messages have been exchanged, Authorities vote again,
including the Reveals seen by them. Authorities may not introduce new MixDescriptors
at this phase in the protocol.
Otherwise, a consensus partition can be obtained by withholding Reveal values from
a threshold number of Peers. In the case of an even number of Authorities, a denial
of service by a single Authority was observed.
3.5 Vote tabulation for consensus computation
The main design constraint of the vote tabulation algorithm is that it MUST be a
deterministic process that produces the same result for each directory authority
server. This result is known as a network consensus file.
A network consensus file is a well-formed directory struct whose
status field is set to consensus and which
contains 0 or more descriptors. The mix directory is signed by 0 or more directory
authority servers. If it is signed by the full voting group, it is called a fully
signed consensus.
Validate each vote directory:
that the liveness fields correspond to the following epoch
status is vote
version number matches ours
Compute a consensus directory:
Here we include a modified section from the Mixminion PKI spec MIXMINIONDIRAUTH:
For each distinct mix identity in any vote directory:
If there are multiple nicknames for a given identity, do not include any
descriptors for that identity.
If half or fewer of the votes include the identity, do not include any
descriptors for the identity. This also guarantees that there will
be only one identity per nickname.
If we are including the identity, then for each distinct descriptor that
appears in any vote directory:
Do not include the descriptor if it will have expired on the date
the directory will be published.
Do not include the descriptor if it is superseded by other
descriptors for this identity.
Do not include the descriptor if it is not valid in the next epoch.
Otherwise, include the descriptor.
Sort the list of descriptors by the signature field so that creation of
the consensus is reproducible.
Set directory status field to
consensus.
Compute a shared random number from the values revealed in the
“Reveal” step. Authorities whose reveal value does not
verify their commit value MUST be excluded from the consensus round.
Authorities MUST ensure that their peers participate in Commit-and-Reveal
and use the correct Reveal values obtained from other Peers as part of the
“Cert” exchange.
Generate or update the network topology using the shared random number as
a seed to a deterministic random number generator that determines the order
that new mixes are placed into the topology.
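The descriptor selection rules above can be sketched in Python as follows. This is a simplified illustration, not the Katzenpost implementation: the `Descriptor` type and its fields are hypothetical, and the expiry, supersession and validity rules are collapsed into a single check that the descriptor's epoch equals the next epoch. The shared random computation and topology generation are not covered.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Descriptor:
    identity: bytes   # mix identity key
    nickname: str
    epoch: int        # epoch the descriptor is valid for (simplification)
    signature: bytes

def tabulate(votes: list[list[Descriptor]], next_epoch: int) -> list[Descriptor]:
    """Select descriptors for the consensus from a list of vote directories."""
    nicknames = defaultdict(set)   # identity -> nicknames seen
    voters = defaultdict(set)      # identity -> indices of votes listing it
    candidates = defaultdict(set)  # identity -> distinct descriptors
    for i, vote in enumerate(votes):
        for d in vote:
            nicknames[d.identity].add(d.nickname)
            voters[d.identity].add(i)
            candidates[d.identity].add(d)
    chosen = []
    for identity, descs in candidates.items():
        if len(nicknames[identity]) > 1:
            continue  # multiple nicknames for one identity
        if 2 * len(voters[identity]) <= len(votes):
            continue  # half or fewer of the votes include the identity
        chosen.extend(d for d in descs if d.epoch == next_epoch)
    # Sorting by signature makes the result independent of vote order.
    return sorted(chosen, key=lambda d: d.signature)
```

Since the output depends only on the set of votes and not on the order in which they were received, every authority holding the same votes computes the same descriptor list.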
3.6 Signature collection
Each Authority signs its view of consensus, and exchanges detached Signatures
with the other Authorities. Upon receiving each Signature it is added to the signatures on the
Consensus if it validates the Consensus. The Authority SHOULD warn the administrator
if a network partition is detected.
If there is disagreement about the consensus directory, each authority collects
signatures from only the servers which it agrees with about the final consensus.
TODO: consider exchanging peers' votes amongst authorities (or hashes thereof)
to ensure that an authority has distributed a single unique vote amongst its
peers.
3.7 Publication
If the consensus is signed by a majority of members of the voting group then it’s
a valid consensus and it is published.
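A minimal sketch of this publication check is shown below. It assumes that "majority" means a strict majority of the voting group and that the signatures counted have already been verified against this authority's own consensus; the function and parameter names are illustrative.

```python
def is_publishable(signers: set[bytes], voting_group: set[bytes]) -> bool:
    """True if a strict majority of the voting group signed the consensus.

    `signers` holds the authority keys whose detached signatures were
    already verified against this authority's consensus.
    """
    valid = signers & voting_group  # ignore signatures from outsiders
    return 2 * len(valid) > len(voting_group)
```

With four authorities, two signatures are not enough under this interpretation; three are.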
4. PKI Protocol data structures
4.1 Mix descriptor format
Note that there is no signature field. This is because mix descriptors are
serialized and signed using JWS. The IdentityKey field is a
public ed25519 key. The MixKeys field is a map from epoch to
public X25519 keys which is what the Sphinx packet format uses.
Note
XXX David: replace the following example with a JWS example:
After the votes are collected from the voting round, and before signature
exchange, the Shared Random Value field of the consensus document is the output of
H over the input string calculated as follows:
Validated Reveal commands received including the authorities own reveal
are sorted by reveal value in ascending order and appended to the input in
format IdentityPublicKeyBytes_n | RevealValue_n
However, instead of the Identity Public Key bytes, the Reveal is encoded
with the BLAKE2b 256-bit hash of the public key bytes.
If a SharedRandomValue for the previous epoch exists, it is appended to
the input string, otherwise 32 NUL (x00) bytes are used.
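The steps listed above can be sketched in Python. This covers only the items shown in this section, so any other components of the input string are omitted. H is not named here, so SHA3-256 is used purely as a stand-in; that choice, and the function names, are assumptions.

```python
import hashlib
from typing import Optional

def shared_random_input(reveals: dict[bytes, bytes],
                        previous_srv: Optional[bytes]) -> bytes:
    """Build the reveal portion of the input string.

    `reveals` maps each authority's identity public key bytes to its
    validated reveal value (including this authority's own).
    """
    out = b""
    # Sort by reveal value in ascending order.
    for public_key, reveal in sorted(reveals.items(), key=lambda kv: kv[1]):
        # The public key is replaced by its BLAKE2b 256-bit hash.
        out += hashlib.blake2b(public_key, digest_size=32).digest() + reveal
    # Append the previous shared random value, or 32 NUL bytes if none exists.
    out += previous_srv if previous_srv is not None else b"\x00" * 32
    return out

def shared_random_value(reveals: dict[bytes, bytes],
                        previous_srv: Optional[bytes] = None) -> bytes:
    # Assumption: SHA3-256 stands in for H.
    return hashlib.sha3_256(shared_random_input(reveals, previous_srv)).digest()
```

Sorting by reveal value means the result does not depend on the order in which reveals arrived.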
The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by both clients and by Directory Authority peers. In
the following section we describe additional wire protocol commands for publishing
mix descriptors, voting and consensus retrieval.
5.1 Mix descriptor publication
The following commands are used for publishing mix descriptors and setting
mix descriptor status:
The vote command is used to send a PKI document to a peer
Authority during the voting period of the PKI schedule.
The payload field contains the signed and serialized PKI document
representing the sending Authority’s vote. The public_key field contains the
public identity key of the sending Authority which the receiving Authority can
use to verify the signature of the payload. The epoch_number field is used by
the receiving party to quickly check the epoch for the vote before deserializing
the payload.
6.2. The vote_status command
The vote_status command is used to reply to a vote
command. The error_code field indicates if there was a failure in the receiving
of the PKI document.
enum {
vote_ok(0),               /* None error condition. */
vote_too_early(1),        /* The Authority should try again later. */
vote_too_late(2),         /* This round of voting was missed. */
vote_not_authorized(3),   /* The voter's key is not authorized. */
vote_not_signed(4),       /* The vote signature verification failed */
vote_malformed(5),        /* The vote payload was invalid */
vote_already_received(6), /* The vote was already received */
vote_not_found(7),        /* The vote was not found */
}
The epoch_number field of the vote struct is compared with the epoch that is
currently being voted on. vote_too_early and vote_too_late are replied back to
the voter to report that their vote was not accepted.
6.3. The get_vote command
The get_vote command is used to request a PKI document
(vote) from a peer Authority. The epoch field contains the epoch from which to
request the vote, and the public_key field contains the public identity key of
the Authority of the requested vote. A successful query is responded to with a
vote command, and queries that fail are responded to with a vote_status command
with error_code vote_not_found(7).
7. Retrieval of consensus
Providers in the Katzenpost mix network system KATZMIXNET may cache validated network consensus files and serve
them to clients over the mix network’s link layer wire protocol KATZMIXWIRE. We define additional wire protocol
commands for requesting and sending PKI consensus documents:
The get_consensus command is a command that is used to retrieve a recent
consensus document. If a given get_consensus command contains an Epoch value
that is either too big or too small then a reply consensus command is sent with
an empty payload. Otherwise if the consensus request is valid then a consensus
command containing a recent consensus document is sent in reply.
Initiators MUST terminate the session immediately upon reception of a
get_consensus command.
7.2 The consensus command
The consensus command is a command that is used to send a recent consensus
document. The error_code field indicates if there was a failure in retrieval of
the PKI consensus document.
The cert command is used to send a PKI document to a peer
Authority during the voting period of the PKI schedule. It is the same as the
vote command, but must contain the set of
SharedRandomCommit and SharedRandomReveal values as seen by the Authority during
the voting process.
7.4. The CertStatus command
The cert_status command is the response to a
cert command, and is the same as a
vote_status response, other than the command identifier.
Responses are CertOK, CertTooEarly, CertNotAuthorized, CertNotSigned,
CertAlreadyReceived, and CertTooLate.
8. Signature exchange
Signature exchange is the final round of the consensus protocol and consists
of the Sig and SigStatus commands.
8.1. The sig command
The sig command contains a detached Signature from PublicKey
of Consensus for Epoch.
8.2. The sig_status command
The sig_status command is the response to a
sig command. Responses are SigOK, SigNotAuthorized,
SigNotSigned, SigTooEarly, SigTooLate, SigAlreadyReceived, and SigInvalid.
Make a Bandwidth Authority system to measure health of the network. Also
perform load balancing as described in PEERFLOW?
Implement byzantine attack defenses as described in MIRANDA and MIXRELIABLE where mix link performance proofs are recorded and
used in a reputation system.
Choose a different serialization/schema language?
Use an append-only Merkle tree instead of this voting protocol.
11. Anonymity considerations
This system is intentionally designed to provide identical network consensus
documents to each mix client. This mitigates epistemic attacks against the
client path selection algorithm such as fingerprinting and bridge attacks FINGERPRINTING, BRIDGING.
If consensus has failed and thus there is more than one consensus file,
clients MUST NOT use this compromised consensus and MUST refuse to run.
We try to avoid randomizing the topology because doing so splits the
anonymity sets on each mix into two. That is, packets belonging to the previous
topology versus the current topology are trivially distinguishable. On the other
hand, if enough mixes fall out of consensus, the mixnet will eventually need to be
rebalanced to avoid attacker-compromised path selection. One example of this
would be the case where the adversary controls the only mix in one layer of the
network topology.
12. Security considerations
The Directory Authority / PKI system for a given mix network is essentially
the root of all authority in the system. The PKI controls the contents of the
network consensus documents that mix clients download and use to inform their
path selection. Therefore if the PKI as a whole becomes compromised then so will
the rest of the system in terms of providing the main security properties
described as traffic analysis resistance. Therefore a decentralized voting
protocol is used so that the system is more resilient when attacked, in
accordance with the principle of least authority. SECNOTSEP
Short epoch durations make it more practical to make corrections to
network state using the PKI voting rounds.
Fewer epoch keys published in advance is a more conservative security policy
because it implies reduced exposure to key compromise attacks.
A bad acting Directory Authority who lies on each vote and votes
inconsistently can trivially cause a denial of service for each voting round.
Acknowledgements
We would like to thank Nick Mathewson for answering design questions and thorough
design review.
MIRANDA
Leibowitz, H., Piotrowska, A., Danezis, G., Herzberg, A., “No right
to remain silent: Isolating Malicious Mixes”, 2017,
https://eprint.iacr.org/2017/1000.pdf.
MIXRELIABLE
Dingledine, R., Freedman, M., Hopwood, D., Molnar, D., “A Reputation
System to Increase MIX-Net Reliability”, 2001, Information Hiding, 4th
International Workshop,
https://www.freehaven.net/anonbib/cache/mix-acc.pdf.
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol
Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246.
This document describes the high level architecture and detailed protocols and
behavior required of mix nodes participating in the Katzenpost Mix Network.
The following terms are used in this specification.
KiB
Defined as 1024 8 bit octets.
mixnet
A mixnet also known as a mix network is a network of mixes that can be
used to build various privacy preserving protocols.
mix
A cryptographic router that is used to compose a mixnet. Mixes use a
cryptographic operation on messages being routed which provides bitwise
unlinkability with respect to input versus output messages. Katzenpost is a
decryption mixnet that uses the Sphinx cryptographic packet format.
node
Clients are NOT considered nodes in the mix network. However note that
network protocols are often layered; in our design documents we describe
“mixnet hidden services” which can be operated by mixnet clients. Therefore
if you are using node in some adherence to mathematical terminology one
could conceivably designate a client as a node. That having been said, it
would not be appropriate to the discussion of our core mixnet protocol to
refer to the clients as nodes.
entry mix, entry node
A mix that has some additional features:
An entry mix is always the first hop in routes where the message
originates from a client.
An entry mix authenticates client’s direct connections via the
mixnet’s wire protocol.
An entry mix queues reply messages and allows clients to retrieve
them later.
service mix
A service mix is a mix that has some additional features:
A service mix is always the last hop in routes where the message
originates from a client.
A service mix runs mixnet services which use a Sphinx SURB based
protocol.
user
An agent using the Katzenpost system.
client
Software run by the User on its local device to participate in the Mixnet.
Again, let us reiterate that a client is not considered a “node in the
network” at the level of analysis where we are discussing the core mixnet
protocol in this document.
Katzenpost
A project to design many improved decryption mixnet protocols.
classes of traffic
We distinguish the following classes of traffic:
SURB Replies (also sometimes referred to as ACKs)
Forward messages
packet
A Sphinx packet, of fixed
length for each class of traffic, carrying a message payload and metadata for routing.
Packets are routed anonymously through the mixnet and cryptographically transformed
at each hop.
payload
The fixed-length portion of a packet containing an encrypted message or
part of a message, to be delivered anonymously.
message
A variable-length sequence of octets sent anonymously through the network.
Short messages are sent in a single packet; long messages are fragmented
across multiple packets.
MSL
Maximum segment lifetime, currently set to 120 seconds.
Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”,
“SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”,
“RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC2119.
1. Introduction
This specification provides the design of a mix network meant to provide an anonymous
messaging protocol between clients and public mixnet services.
Various system components such as client software, end to end messaging protocols,
Sphinx cryptographic packet format and wire protocol are described in their own
specification documents.
2. System Overview
The presented system design is based on LOOPIX. Below,
we present the system overview.
The entry mixes are responsible for authenticating clients, accepting packets from
the client, and forwarding them to the mix network, which then relays packets to the
destination service mix. Our network design uses a strict topology where forward messages
traverse the network from entry mix to service mix. Service mixes can optionally reply
if the forward message contained a Single Use Reply Block (see SPHINXSPEC).
The PKI system handles the distribution of various network-wide parameters
and of the information required for each participant to participate in the network,
such as the IP address/port combinations at which each node can be reached and
cryptographic public keys. The specification for the PKI is beyond the scope of this
document and is instead covered in KATZMIXPKI.
The mix network provides neither reliable nor in-order delivery semantics. The
described mix network is neither a user facing messaging system nor is it an
application. It is intended to be a low-level protocol which can be composed to form
more elaborate mixnet protocols with stronger, more useful privacy notions.
2.1 Threat Model
We cannot present here the threat model of the higher-level mixnet protocols.
However, this low-level core mixnet protocol does have its own threat model, which
we attempt to elucidate here.
We assume that the clients only talk to mixnet services. These services make use
of a client provided delivery token known as a SURB (Single Use Reply Block) to send
their replies to the client without knowing the client’s entry mix. This system
guarantees third-party anonymity, meaning that no parties other than client and the
service are able to learn that the client and service are communicating. Note that
this is in contrast with other designs, such as Mixminion, which provide sender
anonymity towards recipients as well as anonymous replies.
Mixnet clients will randomly select an entry node to use and may reconnect if
disconnected for under a duration threshold. The entry mix can determine the
approximate message volume originating from and destined to a given client. We
assume that the entry mix follows the protocol and might be an honest-but-curious
adversary.
External local network observers cannot determine the number of Packets
traversing their region of the network because of the use of decoy traffic sent by
the clients. Global observers will not be able to de-anonymize packet paths if there
are enough packets traversing the mix network. Longer term statistical disclosure
attacks are likely possible in order to link senders and receivers.
A malicious mix only has the ability to remember which input packets correspond
to the output packets. To discover the entire path all of the mixes in the path
would have to be malicious. Moreover, the malicious mixes can drop, inject, modify
or delay the packets for more or less time than specified.
2.2 Network Topology
The Katzenpost Mix Network uses a layered topology consisting of a fixed number
of layers, each containing a set of mixes. At any given time each Mix MUST only be
assigned to one specific layer. Each Mix in a given layer is connected to every
Mix in the previous and next layers, and/or to every participating Provider in the
case of the mixes in layer 0 or layer N (the first and last layers).
Note: Multiple distinct connections are collapsed in the figure for sake of
brevity/clarity.
The network topology MUST also maximize the number of security domains traversed
by the packets. This can be achieved by not allowing mixes from the same security
domain to be in different layers.
Requirements for the topology:
Should allow for non-uniform throughput of each mix (Get bandwidth
weights from the PKI).
Should maximize distribution among security domains, in this case the mix
descriptor specified family field would indicate the security domain or
entity operating the mix.
Other legal jurisdictional region awareness for increasing the cost of
compulsion attacks.
3. Packet Format Overview
For the packet format of the transported messages we use the Sphinx cryptographic
packet format. For the detailed description of the packet format, construction, processing
and security / anonymity considerations, see SPHINXSPEC, “The Sphinx Mix Network Cryptographic Packet Format
Specification”.
As the Sphinx packet format is generic, the Katzenpost Mix Network must provide a
concrete instantiation of the format, as well as additional Sphinx per-hop routing
information commands.
3.1 Sphinx Cryptographic Primitives
For the current version of the Katzenpost Mix Network, let the following
cryptographic primitives be used as described in the Sphinx specification.
H(M) - As the output of this primitive is only used
locally to a Mix, any suitable primitive may be used.
MAC(K, M) - HMAC-SHA256 RFC6234, M_KEY_LENGTH of 32 bytes (256 bits), and MAC_LENGTH of
32 bytes (256 bits).
KDF(SALT, IKM) - HKDF-SHA256, HKDF-Expand only, with SALT
used as the info parameter.
S(K, IV) - CTR-AES256 [SP80038A], S_KEY_LENGTH of 32 bytes (256 bits), and S_IV_LENGTH
of 12 bytes (96 bits), using a 32 bit counter.
SPRP_Encrypt(K, M)/SPRP_Decrypt(K, M) - AEZv5 AEZV5, SPRP_KEY_LENGTH of 48 bytes (384 bits). As
there is a disconnect between AEZv5 as specified and the Sphinx usage, let
the following be the AEZv5 parameters:
nonce - 16 bytes, reusing the per-hop Sphinx header IV.
additional_data - Unused.
tau - 0 bytes.
EXP(X, Y) - X25519 RFC7748
scalar multiply, GROUP_ELEMENT_LENGTH of 32 bytes (256 bits), G is the
X25519 base point.
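As an illustration of the KDF entry above, the following sketch implements HKDF-Expand from RFC 5869 with SHA-256, feeding IKM in directly as the pseudorandom key and SALT as the info parameter. The function name and the `length` parameter are assumptions; the Sphinx specification determines how much output is actually requested.

```python
import hashlib
import hmac

def kdf(salt: bytes, ikm: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256, IKM as the PRK and SALT as info."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(ikm, block + salt + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]
```

Skipping the HKDF-Extract step is only appropriate when the input keying material is already uniformly random, which is the usual justification for an expand-only construction.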
3.2 Sphinx Packet Parameters
The following parameters are used for the Katzenpost Mix Network instantiation
of the Sphinx Packet Format:
AD_SIZE - 2 bytes.
SECURITY_PARAMETER - 32 bytes. (except for our SPRP which
we plan to upgrade)
PER_HOP_RI_SIZE - (XXX/ya: Addition is hard, let’s go
shopping.)
NODE_ID_SIZE - 32 bytes, the size of the Ed25519 public
key, used as Node identifiers.
RECIPIENT_ID_SIZE - 64 bytes, the maximum size of
local-part component in an e-mail address.
SURB_ID_SIZE - Single Use Reply Block ID size, 16 bytes.
MAX_HOPS - 5, the ingress provider, a set of three mixes,
and the egress provider.
PAYLOAD_SIZE - (XXX/ya: Subtraction is hard, let’s go
shopping.)
KDF_INFO - The byte string
Katzenpost-kdf-v0-hkdf-sha256.
The Sphinx Packet Header additional_data field is specified as
follows:
Double check to ensure that this causes the rest of the packet header to be 4
byte aligned, when wrapped in the wire protocol command and framing. This might need
to have 3 bytes reserved instead.
All nodes MUST reject Sphinx Packets that have additional_data
that is not as specified in the header.
Design decision.
We can eliminate a trial decryption step per packet around the epoch
transitions by having a command that rewrites the AD on a per-hop basis and
including an epoch identifier.
I am uncertain as to if the additional complexity is worth it for a situation
that can happen for a few minutes out of every epoch.
3.3 Sphinx Per-hop Routing Information Extensions
The following extensions are added to the Sphinx Per-Hop Routing Information
commands.
Let the following additional routing commands be defined in the extension
RoutingCommandType range (0x80 - 0xff):
enum {
mix_delay(0x80),
} KatzenpostCommandType;
The mix_delay command structure is as follows:
struct {
uint32_t delay_ms;
} NodeDelayCommand;
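A possible serialization of this command is sketched below. It assumes that the command is laid out as a single command type byte followed by the uint32_t delay in big-endian (network) byte order; neither the framing nor the byte order is stated in this section, so both are assumptions.

```python
import struct

MIX_DELAY = 0x80  # command type from the extension range (0x80 - 0xff)

def encode_mix_delay(delay_ms: int) -> bytes:
    """Serialize a mix_delay command as a type byte plus a uint32 delay."""
    return struct.pack(">BI", MIX_DELAY, delay_ms)

def decode_mix_delay(raw: bytes) -> int:
    """Return delay_ms, rejecting anything that is not a mix_delay command."""
    cmd, delay_ms = struct.unpack(">BI", raw)
    if cmd != MIX_DELAY:
        raise ValueError("not a mix_delay command")
    return delay_ms
```

Under these assumptions a delay of 1500 ms encodes to the five bytes `80 00 00 05 dc`.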
4. Mix Node Operation
All Mixes behave in the following manner:
Accept incoming connections from peers, and open persistent connections to
peers as needed (Section 4.1).
Periodically interact with the PKI to publish Identity and Sphinx packet
public keys, and to obtain information about the peers it should be
communicating with, along with periodically rotating the Sphinx packet keys for
forward secrecy (Section 4.2).
Process inbound Sphinx Packets, delay them for the specified time and forward
them to the appropriate Mix and/or Provider (Section 4.3).
All Nodes are identified by their link protocol signing key, for the purpose of the
Sphinx packet source routing hop identifier.
All Nodes participating in the Mix Network MUST share a common view of time, via
NTP or similar time synchronization mechanism.
4.1 Link Layer Connection Management
All communication to and from participants in the Katzenpost Mix Network is done
via the Katzenpost Mix Network Wire Protocol KATZMIXWIRE.
Nodes are responsible for establishing the connection to the next hop, for
example, a mix in layer 0 will accept inbound connections from all Providers listed
in the PKI, and will proactively establish connections to each mix in layer 1.
Nodes MAY accept inbound connections from unknown Nodes, but MUST NOT relay any
traffic until they become known via listing in the PKI document, and MUST terminate
the connection immediately if authentication fails for any other reason.
Nodes MUST impose an exponential backoff when reconnecting if a link layer
connection gets terminated, and the minimum retry interval MUST be no shorter than
5 seconds.
Nodes MAY rate limit inbound connections as required to keep load and/or resource
use at a manageable level, but MUST be prepared to handle at least one persistent
long lived connection per potentially eligible peer at all times.
4.2 Sphinx Mix and Provider Key Rotation
Each Node MUST rotate the key pair used for Sphinx packet processing periodically
for forward secrecy reasons and to keep the list of seen packet tags short. The
Katzenpost Mix Network uses a fixed interval (epoch), so that key
rotations happen simultaneously throughout the network, at predictable times.
Let each epoch be exactly 10800 seconds (3 hours) in duration,
and the 0th Epoch begin at 2017-06-01 00:00 UTC. For more details
see our “Katzenpost Mix Network Public Key Infrastructure
Specification” document. KATZMIXPKI
4.3 Sphinx Packet Processing
The detailed processing of the Sphinx packet is described in the Sphinx
specification: “The Sphinx Mix Network Cryptographic Packet Format
Specification”. Below is an overview of the steps a node
performs upon receiving a packet:
Record the time of reception.
Perform a Sphinx_Unwrap operation to authenticate and
decrypt a packet, discarding it immediately if the operation fails.
Apply replay detection to the packet, discarding replayed packets
immediately.
Act on the routing commands.
All packets processed by Mixes MUST contain the following commands.
NextNodeHopCommand, specifying the next Mix or Provider
that the packet will be forwarded to.
NodeDelayCommand, specifying the delay in milliseconds to
be applied to the packet, prior to forwarding it to the Node specified by
the NextNodeHopCommand, as measured from the time of reception.
Mixes MUST discard packets that have any commands other than a
NextNodeHopCommand or a NodeDelayCommand.
Note that this does not apply to Providers or Clients, which have additional
commands related to recipient and SURB (Single Use Reply Block)
processing.
Nodes MUST continue to accept the previous epoch’s key for up to 1MSL past the
epoch transition, to tolerate latency and clock skew, and MUST start accepting the
next epoch’s key 1MSL prior to the epoch transition where it becomes the current
active key.
Upon the final expiration of a key (1MSL past the epoch transition), Nodes MUST
securely destroy the private component of the expired Sphinx packet processing key
along with the backing store used to maintain replay information associated with the
expired key.
Nodes MAY discard packets at any time, for example to keep congestion and/or load
at a manageable level. However, assuming the Sphinx_Unwrap
operation was successful, the packet MUST be fed into the replay detection
mechanism.
Nodes MUST ensure that the time a packet is forwarded to the next Node is around
the time of reception plus the delay specified in
NodeDelayCommand. Since exact millisecond processing is
impractical, implementations MAY tolerate a small window around that time for
packets to be forwarded. That tolerance window SHOULD be kept minimal.
Nodes MUST discard packets that have been delayed for significantly more time
than specified by the NodeDelayCommand.
5. Anonymity Considerations
5.1 Topology
Layered topology is used because it offers the best level of anonymity and ease
of analysis, while being flexible enough to scale up traffic. By contrast, most mixnet
papers discuss their security properties in the context of a cascade topology, which
does not scale well, or a free-route network, which quickly becomes intractable to
analyze as the network grows and provides slightly worse anonymity than a
layered topology. MIXTOPO10
Important considerations when assigning mixes to layers, in order of decreasing
importance, are:
Security: do not allow mixes from one security domain to be in different
layers to maximise the number of security domains traversed by a packet
Performance: arrange mixes in layers to maximise the capacity of the
layer with the lowest capacity (the bottleneck layer)
Security: arrange mixes in layers to maximise the number of jurisdictions
traversed by a packet (this is harder to do really well than it seems,
requires understanding of legal agreements such as MLATs).
5.2 Mixing strategy
The mixing technique used is the Poisson mix strategy (LOOPIX, KESDOGAN98), which
REQUIRES that a packet at each hop in the route is delayed by some amount of time,
randomly selected by the sender from an exponential distribution. This strategy
prevents timing correlation of the incoming and outgoing traffic of
each node. Additionally, the parameters of the distribution used for generating the
delay can be tuned up and down depending on the amount of traffic in the network and
the application for which the system is deployed.
6. Security Considerations
The source of all authority in the mixnet system comes from the Directory Authority
system which is also known as the mixnet PKI. This system gives the mixes and clients
a consistent view of the network while allowing human intervention when needed. All
public mix key material and network connection information is distributed by this
Directory Authority system.
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol
Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246.
RFC6234
Eastlake 3rd, D. and T. Hansen, “US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)”, RFC 6234, DOI 10.17487/RFC6234, May 2011,
https://www.rfc-editor.org/info/rfc6234.
In the context of continuous time mixing strategies such as the memoryless mix
used by Katzenpost, n-1 attacks may use strategic packet loss. Nodes can also fail
for benign reasons. Determining whether or not a given failure is an n-1 attack is
outside the scope of this work.
This document describes how mix nodes will communicate statistics to the mix
network directory authorities, telling them about the packet loss they are
observing.
The following terms are used in this specification.
Wire protocol
Refers to our PQ Noise based protocol which currently uses TCP but in the
near future will optionally use QUIC. This protocol has messages known as
wire protocol commands, which are used for various mixnet
functions such as sending or retrieving a message, dirauth voting etc. For
more information, please see our design doc: wire protocol specification
Providers
Refers to a set of nodes on the edge of the network which have two roles:
handling incoming client connections and running mixnet services. Soon we should
get rid of Providers and replace them with two different
sets, gateway nodes and service nodes.
Epoch
The Katzenpost epoch is currently set to a 20 minute duration. Each new
epoch there is a new PKI document published containing public key material
that will only be valid for that epoch.
1. Design Overview
Nodes (mixes, gateways, and providers) need to upload packet-loss statistics to the
directory authorities, so that authorities can label malfunctioning nodes as such
in the consensus in the next epoch.
Nodes currently sign and upload a Descriptor in each epoch.
In the future, they would instead upload an UploadDescStats containing:
Descriptor
Stats
Signature
Stats contains a map from pairs-of-mixes to the ratio of count-of-loops-sent vs.
count-of-loops-received.
2. Tracking packet loss and detecting faulty mixes
Katzenpost lets different elements in the network track whether other elements are
functioning correctly. A node A will do this by sending packets in randomly generated
loops through the network, and tracking whether each loop comes back or not. When a
loop comes back, the node will mark that as evidence that the nodes on the path of
that loop are functioning correctly.
Experimental setup, node A:
Data: each network node A collects a record of emitted
test loops in a certain epoch, their paths and whether they returned or not.
Importantly, each loop is the same length and includes l steps.
A segment is defined as a possible connection from a device in the network to
another, for example from a node in the layer k to a node in
the layer k+1. Each loop is a sequence of such segments.
Each node A will create 3 hashmaps,
sent_loops_A, completed_loops_A and
ratios_A. Each of these will use a pair of concatenated
mixnode ID’s as the key. The ordering of the ID’s will be from lesser topology
layer to greater, e.g. the two-tuple (n, n+1) which is represented here as a 64
byte array:
Every time the node A sends out a test loop, for each segment in the loop
path, it will increment the value in sent_loops_A.
When a test loop returns, for each segment in the loop path, it will increment
the value in completed_loops_A.
For each mix-node-pair p, generate a new map entry in ratios_A:
if sent_loops_A[p] == 0, set ratios_A[p] = 1; otherwise set
ratios_A[p] = completed_loops_A[p] / sent_loops_A[p].
Plot the resulting distribution, and calculate the standard deviation to
detect anomalies. Have the node report significant anomalies after a sufficient
time period so as not to leak information on the route of individual loops.
Anomalies may have to be discarded if the corresponding
sent_loops_A[p] is small.
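The bookkeeping steps above can be sketched in Go as follows. This is a minimal illustration, not the actual implementation: it assumes 32-byte node identifiers (so that a pair forms the 64-byte key mentioned above), and all type and function names (nodeID, segmentKey, loopStats, and so on) are hypothetical.

```go
package main

import "fmt"

// nodeID is assumed to be a 32-byte node identifier, so that a pair of IDs
// fits the 64-byte key described in the text.
type nodeID [32]byte

// segmentKey is the concatenation of two node IDs, lower layer first.
type segmentKey [64]byte

func makeKey(from, to nodeID) segmentKey {
	var k segmentKey
	copy(k[:32], from[:])
	copy(k[32:], to[:])
	return k
}

// loopStats holds the per-epoch counters kept by a node A.
type loopStats struct {
	sent      map[segmentKey]uint64 // sent_loops_A
	completed map[segmentKey]uint64 // completed_loops_A
}

func newLoopStats() *loopStats {
	return &loopStats{
		sent:      make(map[segmentKey]uint64),
		completed: make(map[segmentKey]uint64),
	}
}

// bump increments the counter of every segment along a loop path.
func bump(m map[segmentKey]uint64, path []nodeID) {
	for i := 0; i+1 < len(path); i++ {
		m[makeKey(path[i], path[i+1])]++
	}
}

func (s *loopStats) loopSent(path []nodeID)     { bump(s.sent, path) }
func (s *loopStats) loopReturned(path []nodeID) { bump(s.completed, path) }

// ratios computes ratios_A for the given pairs: 1 when no loop was sent over
// a pair, otherwise completed/sent.
func (s *loopStats) ratios(pairs []segmentKey) map[segmentKey]float64 {
	out := make(map[segmentKey]float64, len(pairs))
	for _, p := range pairs {
		if s.sent[p] == 0 {
			out[p] = 1
			continue
		}
		out[p] = float64(s.completed[p]) / float64(s.sent[p])
	}
	return out
}

func main() {
	a, b, c := nodeID{1}, nodeID{2}, nodeID{3}
	path := []nodeID{a, b, c}
	s := newLoopStats()
	s.loopSent(path)
	s.loopSent(path)
	s.loopReturned(path) // only one of the two loops came back
	r := s.ratios([]segmentKey{makeKey(a, b), makeKey(b, c), makeKey(c, a)})
	fmt.Println(r[makeKey(a, b)], r[makeKey(b, c)], r[makeKey(c, a)])
}
```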
You would expect the distribution of values in completed_loops to
approximate a binomial distribution. In the absence of faulty nodes,
ratios should be 1. When there are some faulty nodes, values
at faulty nodes should approach 0 (if the node doesn’t work at all), and be binomially
distributed at nodes that can share a loop with faulty nodes.
Therefore each mix node generates a statistics report to upload to the dirauth nodes,
of the struct type:
The report is subsequently uploaded to the directory authorities, which combine the
reports of individual nodes into a health status of the network and arrive at a
consensus decision about the topology of the network.
3. Uploading Stats to Dirauths
Stats reports are uploaded along with the mix descriptor every Epoch. A
cryptographic signature covers both of these fields:
Statistics reports collected during the XXX period of time, that is, the time
between descriptor N+1 upload and descriptor N+2 upload, are what will affect the
topology choices in epoch N+2 if the dirauths collectively decide to act on the very
latest statistics reports in order to determine for example if a mix node should be
removed from the network:
Here I present a modification of the Sphinx cryptographic packet format that uses
a KEM instead of a NIKE whilst preserving the properties of bitwise unlinkability,
constant packet size and route length hiding.
We’ll express our KEM Sphinx header in pseudo code. The Sphinx body will be exactly
the same as in SPHINXSPEC. Our basic KEM API has three functions:
PRIV_KEY, PUB_KEY = GEN_KEYPAIR(RNG)
ct, ss = ENCAP(PUB_KEY) - Encapsulate generates a shared
secret, ss, for the public key and encapsulates it into a ciphertext.
ss = DECAP(PRIV_KEY, ct) - Decapsulate computes the shared
key, ss, encapsulated in the ciphertext, ct, for the private key.
Additional notation includes:
|| = concatenate two binary blobs together
PRF = pseudo random function, a cryptographic hash function,
e.g. Blake2b.
Therefore we must embed these KEM ciphertexts in the KEMSphinx header, one KEM
ciphertext per mix hop.
2. Post Quantum Hybrid KEM
Special care must be taken in order to correctly compose a hybrid post quantum KEM that
is IND-CCA2 robust.
The hybrid post quantum KEMs found in Cloudflare’s circl library are suitable to be
used with Noise or TLS but not with KEM Sphinx because they are not IND-CCA2 robust.
Noise and TLS achieve IND-CCA2 security by mixing in the public keys and ciphertexts
into the hash object and therefore do not require an IND-CCA2 KEM.
Firstly, our post quantum KEM is IND-CCA2; however, we must specifically take care to
make our NIKE to KEM adapter have semantic security. Secondly, we must make a security
preserving KEM combiner.
2.1 NIKE to KEM adapter
We easily achieve our IND-CCA2 security by means of hashing together the DH shared
secret along with both of the public keys:
The KEM Combiners paper ??? makes the observation that if a
KEM combiner is not security preserving then the resulting hybrid KEM will not have
IND-CCA2 security if one of the composing KEMs does not have IND-CCA2 security.
Likewise the paper points out that when using a security preserving KEM combiner,
if
only one of the composing KEMs has IND-CCA2 security then the resulting hybrid KEM
will have IND-CCA2 security.
Our KEM combiner uses the split PRF design from the paper: when combining two KEM
shared secrets together, we use a hash function to also mix in the values of both KEM
ciphertexts. In this pseudo code example we are hashing together the two shared
secrets from the two underlying KEMs, ss1 and ss2. Additionally the two ciphertexts
from the underlying KEMs, cct1 and cct2, are also hashed together:
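One reading of the split PRF construction can be sketched in Go as follows. Each shared secret keys a PRF evaluated over both ciphertexts, and the two outputs are XORed. Using HMAC-SHA256 as the PRF is an assumption of this sketch, and the sketch also assumes fixed-length ciphertexts so that plain concatenation is unambiguous.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// prf is HMAC-SHA256 keyed with a shared secret (an assumed instantiation).
func prf(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// splitPRF combines two KEM shared secrets. Each secret keys a PRF over the
// concatenation of both ciphertexts, and the two PRF outputs are XORed.
func splitPRF(ss1, ss2, cct1, cct2 []byte) []byte {
	cct := make([]byte, 0, len(cct1)+len(cct2))
	cct = append(cct, cct1...)
	cct = append(cct, cct2...)
	a, b := prf(ss1, cct), prf(ss2, cct)
	out := make([]byte, len(a))
	for i := range out {
		out[i] = a[i] ^ b[i]
	}
	return out
}

func main() {
	ss1, ss2 := []byte("secret-from-kem-1"), []byte("secret-from-kem-2")
	cct1, cct2 := []byte("ciphertext-1"), []byte("ciphertext-2")
	k := splitPRF(ss1, ss2, cct1, cct2)
	fmt.Printf("%d-byte combined secret: %x\n", len(k), k)
}
```

Because both ciphertexts enter each PRF call, changing either ciphertext changes the combined secret even if the underlying shared secrets stay the same.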
MAC for this hop (authenticates header fields 1 thru 4)
KEM Sphinx header elements:
1. Version number (MACed but not encrypted)
2. One KEM ciphertext for use with the next hop
3. Encrypted per routing commands AND KEM ciphertexts, one for each additional hop
4. MAC for this hop (authenticates header fields 1 thru 4)
We can say that KEMSphinx differs from NIKE Sphinx by replacing the header’s group
element (e.g. an X25519 public key) field with the KEM ciphertext. Subsequent KEM
ciphertexts for each hop are stored inside the Sphinx header “routing
information” section.
First we must have a data type to express a mix hop, and we can use lists of these
hops to express a route:
type PathHop struct {
    public_key       kem.PublicKey
    routing_commands Commands
}
Here’s how we construct a KEMSphinx packet header where path is a list of PathHop,
and indicates the route through the network:
Derive the KEM ciphertexts for each hop.
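// Pseudo code: collect one shared secret and one KEM ciphertext per hop.
route_keys = []
route_kems = []
for i := 0; i < num_hops; i++ {
    kem_ct, ss := ENCAP(path[i].public_key)
    route_kems = append(route_kems, kem_ct)
    route_keys = append(route_keys, ss)
}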
Derive the routing_information keystream and encrypted padding for each hop.
Same as in SPHINXSPEC, except that each routing info
slot is now increased by the size of the KEM ciphertext.
Create the routing_information block.
Here we modify the Sphinx implementation to pack the next KEM ciphertext into each
routing information block. Each of these blocks is decrypted for each mix hop,
which will decrypt the KEM ciphertext for the next hop in the route.
Assemble the completed Sphinx Packet Header and Sphinx Packet Payload SPRP
key vector. Same as in SPHINXSPEC except the
kem_element field is set to the first KEM ciphertext,
route_kems[0]:
Most of the design here will be exactly the same as in SPHINXSPEC. However there are a few notable differences:
The shared secret is derived from the KEM ciphertext instead of a DH.
Next hop’s KEM ciphertext stored in the encrypted routing information.
Acknowledgments
I would like to thank Peter Schwabe for the original idea of simply replacing the
Sphinx NIKE with a KEM and for answering all my questions. I’d also like to thank
Bas Westerbaan for answering questions.
In order to join or initiate a conversation, participants need to
exchange cryptographic key material. To address this problem we have
a slightly unusual design: Contact vouchers.
In many systems, invites to conversations flow from an existing
member of the conversation to the user being invited. In our
“Contact Voucher” protocol this flow is reversed: A member
wishing to join a conversation hands a “Contact Voucher”
(out of band) to the existing member, who then inducts the new
member into the group.
This design mitigates two potential problems with the former way of
doing it:
If the Contact Voucher is observed by a third party, the
third party does not get to read either participant’s actual
messages.
Passive adversaries learn
that the voucher was spent, but do not get to observe
further interactions.
Active adversaries can
create a new fake group to induct the member into, but do
not learn anything about the existing group.
In the future, to prevent this one-way impersonation, we
could allow a “both parties bring something on
paper to the meeting” variant:
Bob brings Contact Voucher
Alice brings fingerprint for the
VoucherReplyPublicKey (thwarts the active attacker)
Only one thing needs to be delivered out of band to achieve a
2-pass protocol (instead of a 3-pass protocol).
Only one of the parties needs to bring key material to a meeting
in order to establish contact.
Self-authenticating BACAP payload
The first message sent (The VoucherPayload) is authenticated in
the following manner:
The VoucherPayload is computed (first).
A cryptographic hash of the VoucherPayload is computed. This
hash is the Voucher.
The Voucher is then used to
derive a BACAP read/write capability set.
The VoucherPayload is uploaded to the sequence described by
the capability (at index 0).
Anyone who intercepts the
Voucher can read
and write the sequence.
But: Since the Voucher is a
hash over the VoucherPayload, writing the sequence with
anything but the VoucherPayload will be detectable by the
recipient.
This means that the contents cannot be
undetectably modified by the interceptor.
2.9 - Certificate format
Certificate format
David Stainton
Abstract
This document proposes a certificate format that Katzenpost mix servers, directory
authority servers and clients will use.
The following terms are used in this specification.
Conventions Used in This Document
The key words “MUST”, “MUST NOT”,
“REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”,
“MAY”, and “OPTIONAL” in this document are to be
interpreted as described in RFC2119.
1. Introduction
Mixes and Directory Authority servers need to have key agility in the sense of
operational abilities such as key rotation and key revocation. That is, we wish for
mixes and authorities to periodically utilize a long-term signing key for generating
certificates for new short-term signing keys.
Yet another use-case for these certificates is to replace the use of JOSE RFC7515
in the voting Directory Authority system KATZMIXPKI for the multi-signature documents
exchanged for voting and consensus.
1.1. Document Format
The CBOR RFC7049 serialization format is used to
serialize certificates:
Signature is a cryptographic signature which has an associated signer ID.
type Signature struct {
    // Identity is the identity of the signer.
    Identity []byte
    // Signature is the actual signature value.
    Signature []byte
}
Certificate structure for serializing certificates.
type certificate struct {
    // Version is the certificate format version.
    Version uint32
    // Expiration is seconds since Unix epoch.
    Expiration int64
    // KeyType indicates the type of key
    // that is certified by this certificate.
    KeyType string
    // Certified is the data that is certified by
    // this certificate.
    Certified []byte
    // Signatures are the signature of the certificate.
    Signatures []Signature
}
That is, one or more signatures sign the certificate. However the
Certified field is not the only information that is signed.
The Certified field along with the other non-signature fields are
all concatenated together and signed. Before serialization the signatures are sorted
by their identity so that the output is binary deterministic.
1.2 Certificate Types
The certificate type field indicates the type of certificate.
So far we have only two types:
identity key certificate
directory authority certificate
Both mixes and directory authority servers have a secret, long-term identity key.
This key is ideally stored encrypted and offline; it is used to sign key certificate
documents. Key certificates contain a medium-term signing key that is used to sign
other documents. In the case of an “authority signing key”, it is used
to sign vote and consensus documents whereas the “mix signing key” is
used to sign mix descriptors which are uploaded to the directory authority servers.
1.3. Certificate Key Types
It’s more practical to continue using Ed25519 ED25519 keys but it’s also possible that in the future we could upgrade
to a stateless hash based post quantum cryptographic signature scheme such as
SPHINCS-256 or SPHINCS+. SPHINCS256
Our golang implementation is agnostic to the specific cryptographic signature scheme
which is used. Cert can handle single and multiple signatures per document and has
a variety of helper functions that ease use for multi signature use cases.
Acknowledgments
This specification was inspired by Tor Project’s certificate format specification
document:
C. Bormann, P. Hoffman, “Concise Binary Object Representation (CBOR)”, Internet Engineering Task Force (IETF), October 2013,
https://www.rfc-editor.org/info/rfc7049.
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC7693
Saarinen, M-J., Ed., and J-P. Aumasson, “The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)”, RFC 7693, DOI 10.17487/RFC7693, November 2015,
http://www.rfc-editor.org/info/rfc7693.
Bernstein, D., Hopwood, D., Hulsing, A., Lange, T., Niederhagen, R., Papachristodoulou,
L., Schwabe, P., Wilcox O’Hearn, Z.,
“SPHINCS: practical stateless hash-based signatures”,
http://sphincs.cr.yp.to/sphincs-20141001.pdf.
2.10 - Provider-side autoresponder extension
Provider-side autoresponder extension (Kaetzchen)
Yawning Angel
Kali Kaneko
David Stainton
Abstract
This interface is meant to provide support for various autoresponder agents
“Kaetzchen” that run on Katzenpost provider instances, thus
bypassing the need to run a discrete client instance to provide functionality. The
use-cases for such agents include, but are not limited to, user identity key lookup,
a discard address, and a loop-back responder for the purpose of cover
traffic.
The key words “MUST”, “MUST NOT”, “REQUIRED”,
“SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD
NOT”, “RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC2119.
Terminology
The following terms are used in this specification.
SURB
Single use reply block. SURBs are used to achieve recipient anonymity,
that is to say, SURBs function as a cryptographic delivery token that
you can give to another client entity so that they can send you a
message without them knowing your identity or location on the network.
See SPHINXSPEC and SPHINX.
BlockSphinxPlaintext
The payload structure which is encapsulated by the Sphinx body.
1. Extension Overview
Each Kaetzchen agent will register as a potential recipient on its Provider. When
the Provider receives a forward packet destined for a Kaetzchen instance, it will
hand off the fully unwrapped packet along with its corresponding SURB to the agent,
which will then act on the packet and optionally reply utilizing the SURB.
1.1 Agent Requirements
Each agent operation MUST be idempotent.
Each agent operation request and response MUST fit within one Sphinx
packet.
Each agent SHOULD register a recipient address that is prefixed with (Or
another standardized delimiter, agreed to by all participating providers in
a given mixnet).
Each agent SHOULD register a recipient address that consists of an RFC5322
dot-atom value, and MUST register recipient addresses that are at most 64
octets in length.
The first byte of the agent’s response payload MUST be 0x01 to allow
clients to easily differentiate between SURB-ACKs and agent
responses.
1.2 Mix Message Formats
Messages from clients to Kaetzchen use the following payload format in the forward
Sphinx packet:
struct {
    uint8_t flags;
    uint8_t reserved; /* Set to 0x00. */
    select (flags) {
        case 0:
            opaque padding[sizeof(SphinxSURB)];
        case 1:
            SphinxSURB surb;
    }
    opaque plaintext[];
} KaetzchenMessage;
The plaintext component of a KaetzchenMessage MUST be padded by
appending “0x00” bytes to make the final total size of a
KaetzchenMessage equal to that of a
BlockSphinxPlaintext.
Messages (replies) from the Kaetzchen to client use the following payload format
in the SURB generated packet:
struct {
    opaque plaintext[];
} KaetzchenReply;
The plaintext component of a KaetzchenReply MUST be padded by
appending “0x00” bytes to make the final total size of a
KaetzchenReply equal to that of a
BlockSphinxPlaintext.
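The request layout and padding rule can be sketched in Go as follows. The sizes surbLen and plaintextLen are hypothetical placeholders, since the real values depend on the Sphinx parameterization, and the function name buildKaetzchenMessage is an illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sizes for illustration; the real values depend on the Sphinx
// parameterization of the network.
const (
	surbLen      = 16 // stands in for sizeof(SphinxSURB)
	plaintextLen = 64 // stands in for the size of a BlockSphinxPlaintext
)

// buildKaetzchenMessage lays out flags, reserved, the SURB (or padding of the
// same size), and the request plaintext padded with 0x00 bytes.
func buildKaetzchenMessage(surb, request []byte) ([]byte, error) {
	if surb != nil && len(surb) != surbLen {
		return nil, errors.New("SURB has the wrong size")
	}
	if len(request) > plaintextLen-2-surbLen {
		return nil, errors.New("request does not fit in one packet")
	}
	// make() zero-fills, so the reserved byte, the SURB padding and the
	// trailing plaintext padding are already 0x00.
	msg := make([]byte, plaintextLen)
	if surb != nil {
		msg[0] = 1 // flags: a SURB is present
		copy(msg[2:], surb)
	}
	copy(msg[2+surbLen:], request)
	return msg, nil
}

func main() {
	surb := make([]byte, surbLen)
	for i := range surb {
		surb[i] = 0xAA
	}
	msg, err := buildKaetzchenMessage(surb, []byte("ping"))
	fmt.Println(len(msg), msg[0], err)
}
```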
2. PKI Extensions
Each provider SHOULD publish the list of publicly accessible Kaetzchen agent
endpoints in its MixDescriptor, along with any other information required to utilize
the agent.
The Provider should make this information available in the form of a map in which the
keys are the labels used to identify a given service, and the values are maps with
arbitrary keys.
Valid service names refer to the services defined in extensions to this
specification. Every service MUST be implemented by one and only one Kaetzchen
agent.
For each service, the provider MUST advertise a field for the endpoint at which
the Kaetzchen agent can be reached, as a key value pair where the key is
endpoint, and the value is the provider side endpoint
identifier.
In the event that the mix keys for the entire return path are compromised, it is
possible for adversaries to unwrap the SURB and determine the final recipient of the
reply.
Depending on what sort of operations a given agent implements, there may be
additional anonymity impact that requires separate consideration.
Clients MUST NOT have predictable retransmission behavior; otherwise active
confirmation attacks become possible, which could be used to discover the ingress
Provider of the client.
4. Security Considerations
It is possible to use this mechanism to flood a victim with unwanted traffic by
constructing a request with a SURB destined for the target.
Depending on the operations implemented by each agent, the added functionality may
end up being a vector for Denial of Service attacks in the form of CPU or network
overload.
Unless the agent implements additional encryption, message integrity and privacy
are limited to that which is provided by the base Sphinx packet format and
parameterization.
Acknowledgments
The inspiration for this extension comes primarily from a design by Vincent
Breitmoser.
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
BlockSphinxPlaintext
The payload structure which is encapsulated by the Sphinx body.
classes of traffic
We distinguish the following classes of traffic:
SURB Replies (also sometimes referred to as ACKs)
Forward messages
client
Software run by the User on its local device to participate in the Mixnet.
Again let us reiterate that a client is not considered a “node in the
network” at the level of analysis where we are discussing the core mixnet
protocol in this document.
directory authority system
Refers to specific PKI schemes used by Mixminion and Tor.
entry mix, entry node
A mix that has some additional features:
An entry mix is always the first hop in routes where the message
originates from a client.
An entry mix authenticates client’s direct connections via the
mixnet’s wire protocol.
An entry mix queues reply messages and allows clients to retrieve
them later.
epoch
A fixed time interval defined in section 4.2 Sphinx Mix and Provider Key
Rotation. The epoch is currently set to 20 minutes.
A new PKI document containing public key material
is published for each epoch and is valid only for that epoch.
family
Identifier of security domains or entities operating one or more mixes in
the network. This is used to inform the path selection algorithm.
group
A finite set of elements and a binary operation that satisfy the
properties of closure, associativity, invertibility, and the presence of an
identity element.
group element
An individual element of the group.
group generator
A group element capable of generating any other element of the group, via
repeated applications of the generator and the group operation.
header
The packet header consisting of several components, which convey the
information necessary to verify packet integrity and correctly process the
packet.
KiB
Defined as 1024 8-bit octets.
Katzenpost
A project to design many improved decryption mixnet protocols.
layer
The layer indicates which network topology layer a particular mix resides
in.
message
A variable-length sequence of octets sent anonymously through the network.
Short messages are sent in a single packet; long messages are fragmented
across multiple packets.
mix descriptor
A database record which describes a component mix.
mix
A cryptographic router that is used to compose a mixnet. Mixes use a
cryptographic operation on messages being routed which provides bitwise
unlinkability with respect to input versus output messages. Katzenpost is a
decryption mixnet that uses the Sphinx cryptographic packet format.
mixnet
A mixnet, also known as a mix network, is a network of mixes that can be
used to build various privacy preserving protocols.
MSL
Maximum segment lifetime, currently set to 120 seconds.
nickname
A nickname string that is unique in the consensus document, see Katzenpost
Mix Network Specification section 2.2. Network Topology.
node
Clients are NOT considered nodes in the mix network. However note that
network protocols are often layered; in our design documents we describe
“mixnet hidden services” which can be operated by mixnet clients. Therefore
if you are using node in some adherence to mathematical terminology one
could conceivably designate a client as a node. That having been said, it
would not be appropriate to the discussion of our core mixnet protocol to
refer to the clients as nodes.
packet
A Sphinx packet, of fixed length for each class of traffic, carrying a message
payload and metadata for routing. Packets are routed anonymously through the
mixnet and cryptographically transformed at each hop.
payload
The fixed-length portion of a packet containing an encrypted message or
part of a message, to be delivered anonymously.
PKI
Public key infrastructure
provider
A service operated by a third party that Clients communicate directly with
to communicate with the Mixnet. It is responsible for Client authentication,
forwarding outgoing messages to the Mixnet, and storing incoming messages
for the Client. The Provider MUST have the ability to perform cryptographic
operations on the relayed messages.
SEDA
Staged Event Driven Architecture. 1. A
highly parallelizable computation model. 2. A computational pipeline
composed of multiple stages connected by queues utilizing active queue
management algorithms that can evict items from the queue based on dwell
time or other criteria where each stage is a thread pool. 3. The only
correct way to efficiently implement a software based router on general
purpose computing hardware.
service mix
A service mix is a mix that has some additional features:
A service mix is always the last hop in routes where the message
originates from a client.
A service mix runs mixnet services which use a Sphinx SURB based
protocol.
SURB
Single use reply block. SURBs are used to achieve recipient anonymity,
that is to say, SURBs function as a cryptographic delivery token that
you can give to another client entity so that they can send you a
message without them knowing your identity or location on the network.
See SPHINXSPEC and SPHINX.
user
An agent using the Katzenpost system.
wire protocol
Refers to our PQ Noise based protocol which currently uses TCP but in the
near future will optionally use QUIC. This protocol has messages known as
wire protocol commands, which are used for various mixnet
functions such as sending or retrieving a message, dirauth voting, etc. For
more information, please see our design doc: wire protocol specification.
References
AB96
Anderson, R., Biham, E., “Two Practical and Provably Secure Block Ciphers: BEAR and LION”, 1996.
Piotrowska, A., Hayes, J., Elahi, T., Meiser, S., Danezis, G., “The Loopix Anonymity System”, USENIX, August 2017,
https://arxiv.org/pdf/1703.00536.pdf.
MIRANDA
Leibowitz, H., Piotrowska, A., Danezis, G., Herzberg, A., “No right
to remain silent: Isolating Malicious Mixes”, 2017,
https://eprint.iacr.org/2017/1000.pdf.
Dingledine, R., Freedman, M., Hopwood, D., Molnar, D., “A Reputation
System to Increase MIX-Net Reliability”, 2001, Information Hiding, 4th
International Workshop,
https://www.freehaven.net/anonbib/cache/mix-acc.pdf.
Maines, L., Piva, M., Rimoldi, A., Sala, M., “On the provable security of BEAR and LION schemes”, May 2011, arXiv:1105.0259,
https://arxiv.org/abs/1105.0259.
Yawning Angel, Benjamin Dowling, Andreas Hülsing, Peter Schwabe and Florian Weber,
“Post Quantum Noise”, September 2023,
https://eprint.iacr.org/2022/539.pdf.
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119.
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol
Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246.
Eastlake 3rd, D. and T. Hansen, “US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)”, RFC 6234, DOI 10.17487/RFC6234, May 2011,
https://www.rfc-editor.org/info/rfc6234.
RFC7049
C. Bormann, P. Hoffman, “Concise Binary Object Representation (CBOR)”, Internet Engineering Task Force (IETF), October 2013,
https://www.rfc-editor.org/info/rfc7049.
Saarinen, M-J., Ed., and J-P. Aumasson, “The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)”, RFC 7693, DOI 10.17487/RFC7693, November 2015,
http://www.rfc-editor.org/info/rfc7693.
Welsh, M., Culler, D., Brewer, E., “SEDA: An Architecture
for Well-Conditioned, Scalable Internet Services”, 2001, ACM
Symposium on Operating Systems Principles,
http://www.sosp.org/2001/papers/welsh.pdf.
Manuel Barbosa, Deirdre Connolly, João Diogo Duarte, Aaron Kaiser, Peter Schwabe,
Karoline Varner, Bas Westerbaan, “X-Wing: The Hybrid KEM You’ve Been Looking
For”,
https://eprint.iacr.org/2024/039.pdf.
The Pigeonhole protocol establishes anonymous cryptographic communication channels,
each of which has a readcap (read capability) and a writecap (write capability).
For example, if Alice and Bob want to communicate, they can each create their own
Pigeonhole/BACAP channel and exchange the readcaps for those channels. When Bob
writes to his channel, Alice can read those messages because she has Bob’s readcap.
Likewise, when Alice writes to her channel, Bob can read those messages because he
has Alice’s readcap. This is the most basic construction using BACAP and Pigeonhole.
Here we extend this basic design to work as a minimal group-chat protocol, without
key rotation.
BACAP primitives give us two message types:
SingleMessage
AllOrNothingMessage (used for big messages: upload n chunks to a temporary stream
and then put a pointer to that in your own stream as a single message.)
The group state consists of:
a MembershipCap for each member, containing:
a BACAP readcap
a nickname
a MembershipHash (a hash over all of the MembershipCaps)
The group chat is completely decentralized. Each member must keep track of every other
member.
Group Chat Message Types
All messages are SingleMessage if they fit in one BACAP slot, or an
AllOrNothingMessage if they are too big.
Text type payloads are normal chat text messages.
// TextPayload encapsulates a normal text message.
type TextPayload struct {
    // Payload contains a normal UTF-8 text message to be displayed inline.
    Payload []byte
}
Introduction type messages introduce new group members.
// Introduction introduces a new member to the group.
type Introduction struct {
    // DisplayName is the party's name to be displayed in chat clients.
    DisplayName string

    // UniversalReadCap is the BACAP UniversalReadCap
    // which lets you read all messages posted by this user.
    UniversalReadCap *bacap.UniversalReadCap
}
FileUpload type
A FileUpload can be used for various purposes such as uploading an image to
be displayed inline by the chat client. Likewise, a sound bite could be made visible
in the chat along with a play-button. Beyond that, we can support arbitrary file
attachments.
The protocol flow for making a new group from scratch (using whatever authentication
protocol) is essentially for everybody to exchange PleaseAdd messages.
// PleaseAdd is a message used by a client to try and gain access to a chat group.
type PleaseAdd struct {
    // DisplayName is the party's name to be displayed in chat clients.
    DisplayName string

    // UniversalReadCap is the BACAP UniversalReadCap
    // which lets you read all messages posted by this user.
    UniversalReadCap *bacap.UniversalReadCap
}

type SignedPleaseAdd struct {
    // PleaseAdd contains the CBOR serialized PleaseAdd struct.
    PleaseAdd []byte

    // Signature contains the cryptographic signature over the PleaseAdd field.
    Signature []byte
}
For introduction to an existing group over an existing channel between an introducer
member and new member, an Invitation message is used.
type Invitation struct {
    GroupName string
}
The Invitation protocol flow works as follows.
There exists a group called YoloGroup. A member of the group invites a potential new
member with an Invitation message.
If the invited party wants to join, then they reply with a
SignedPleaseAdd message meaning “I want to join your group.” This
provides the invited party’s BACAP universal readcap, their display name, and a
cryptographic signature produced by their BACAP writecap.
The introducer receives the SignedPleaseAdd message.
If the introducer does not like the DisplayName, they reply to the invited party
with a PleaseReviseDisplayName message that contains the original
SignedPleaseAdd. Then they wait for a new
SignedPleaseAdd.
If the introducer approves of the DisplayName, then:
Because existing members need the new member’s readcap, the introducer
publishes the SignedPleaseAdd to their own BACAP stream for the
rest of the group to read.
Because the new member needs existing members’ readcaps, the introducer
replies to the new member with a ReplyWho message containing readcaps
for all existing members.
IMPORTANT: The content of both replies must
be sent in the same AllOrNothingMessage, despite the
SignedPleaseAdd being written to the introducer’s own BACAP
stream for the group and the ReplyWho being written to the BACAP
stream the introducer is using to communicate with the new member.
NOTE: We could send all of this information
as part of the initial Invitation, but that would allow silent
members to read other members’ streams without them knowing it, which is an
anti-goal.
Addenda
GOOD QUESTION: If we are adding a lot of people at once, do we really need to upload
all of the members n times?
A Katzenpost mixnet client has several responsibilities at minimum:
compose Sphinx packets
decrypt SURB replies
send and receive Noise protocol messages
keep up to date with the latest PKI document
This document describes the design of the new Katzenpost mix network
client known as client2. In particular we discuss its multiplexing and
privilege separation design elements as well as the protocol used by the
thin client library.
Applications integrate with Katzenpost through a connector library,
known as the thin client library, which lets them talk to the connector
daemon and in that way interact with the mix network. The library itself
does not do any mixnet-related cryptography, since that is already
handled by the connector daemon. In particular, the daemon strips the
cryptographic signatures from the PKI document before passing it on to
clients using the connector library. Likewise, the library does not
decrypt SURB replies or compose Sphinx packets: Noise, Sphinx, and PKI
related cryptography are all handled by the daemon.
2. Connector library and daemon protocol
The thin client daemon protocol uses a local network socket,
either Unix domain socket or TCP.
2.3 Protocol messages
There are two protocol message types, Request and Response, and both are
always CBOR encoded. Each message is sent as a length-prefixed CBOR blob:
a big-endian unsigned four-byte integer (uint32) giving the length,
followed by the CBOR encoding itself.
The client sends the Request messages and the daemon sends the Response messages.
2.4 Protocol description
Upon connecting to the daemon socket the client must wait for two
messages. The first message received must have its is_status field set
to true and its is_connected field indicating whether or not the
daemon has a mixnet PQ Noise protocol connection to an entry node.
The client then awaits the second message, which contains the PKI
document in its payload field. This marks the end of the initial
connection sequence. Note that this PKI document is stripped of all
cryptographic signatures.
In the next protocol phase, the client may send Request messages to
the daemon in order to cause the daemon to encapsulate the given payload
in a Sphinx packet and send it to the gateway node. Likewise the daemon
may send the client Response messages at any time during this protocol
phase. These Response messages may indicate a connection status
change, a new PKI document, or a message-sent or reply event.
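The initial connection sequence above can be sketched as a small routine that consumes the first two Response messages. The Response struct here is a hypothetical, trimmed-down view that keeps only the three fields named in the text. The function names, and the choice to treat an empty payload in the second message as an error, are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Response is a hypothetical subset of the daemon's Response message.
type Response struct {
	IsStatus    bool   // is_status
	IsConnected bool   // is_connected
	Payload     []byte // payload
}

// awaitInitial consumes the two messages that open every session:
// a status message, then a message carrying the stripped PKI document.
func awaitInitial(recv func() (Response, error)) (connected bool, pkiDoc []byte, err error) {
	first, err := recv()
	if err != nil {
		return false, nil, err
	}
	if !first.IsStatus {
		return false, nil, errors.New("first message must have is_status set")
	}
	second, err := recv()
	if err != nil {
		return false, nil, err
	}
	if len(second.Payload) == 0 {
		// Assumption: an empty payload cannot be a PKI document.
		return false, nil, errors.New("second message must carry the PKI document")
	}
	return first.IsConnected, second.Payload, nil
}

// fromSlice turns a fixed list of messages into a recv function for the demo.
func fromSlice(msgs []Response) func() (Response, error) {
	i := 0
	return func() (Response, error) {
		if i >= len(msgs) {
			return Response{}, errors.New("no more messages")
		}
		m := msgs[i]
		i++
		return m, nil
	}
}

func main() {
	recv := fromSlice([]Response{
		{IsStatus: true, IsConnected: true},
		{Payload: []byte("stripped PKI document bytes")},
	})
	connected, doc, err := awaitInitial(recv)
	fmt.Println(connected, len(doc), err)
}
```

In a real client, recv would read one length-prefixed CBOR blob from the daemon socket and decode it. Only after this sequence completes would the client begin sending Request messages.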
Pigeonhole Protocol Design Specification
Abstract
In this specification we describe the components and protocols that
compose Pigeonhole scattered storage. We define the behavior of
communication clients that send and retrieve individual messages,
BACAP streams, and AllOrNothing streams. Client actions are mediated
through courier services that interact with storage replicas.
Introduction
Pigeonhole scattered storage enables persistent anonymous
communication in which participants experience a coherent sequence of
messages or a continuous data stream, but where user relationships and
relations between data blocks remain unlinkable not only from the
perspective of third-party observers, but also from that of the mixnet
components. This latter attribute provides resilience against
deanonymization by compromised mixnet nodes.
The data blocks that Pigeonhole stores are supplied by the BACAP
(Blinding-and-Capability) scheme. The Pigeonhole protocol scatters
messages around the many storage servers and among a space of BACAP Box IDs.
From a passive network observer’s perspective all of this is seemingly random.
All communication among users consists of
user-generated read or write queries to Pigeonhole storage, never
directly to other users.
Many protocols can be composed using Pigeonhole communication channels,
including group communications. This specification
describes the protocols that are also detailed in our paper, in the section entitled
“5.6. End-to-end reliable group channels”. For more
information about BACAP, see “Echomix: a Strong Anonymity System with
Messaging”, chapter 4: https://arxiv.org/abs/2501.02933. For an
understanding of how the core BACAP primitives are implemented, see
https://github.com/katzenpost/hpqc/blob/main/bacap/bacap.go.
Message-layout snippets in this specification are given in the
trunnel binary-format description language (matching
pigeonhole/pigeonhole_messages.trunnel); code snippets showing
in-memory courier or replica state are given in Go.
Glossary
Box: BACAP’s unit of data storage. Each box has a box ID (which also serves as its public key), a signature, and a signed ciphertext payload.
Courier: Service that runs on a service node and interacts with
storage replicas. Proxies requests from clients and routes replies
back to clients (via SURBs).
Storage replica: The actual storage nodes where message ciphertexts
are stored and retrieved.
Intermediate replica: See “5.4.1. Writing messages”:
Intermediate replicas are chosen independently of the two final
replicas for that box ID, which are derived using the sharding
scheme. The reason Alice designates intermediate replicas, as
opposed to addressing the final replicas directly, is to avoid
revealing to the courier which shard the box falls into.
Designated replica (a.k.a. final replica, shard replica):
One of the two replicas selected deterministically for a given Box
ID by the Shard2 consistent-hash algorithm. The intermediate
replicas replicate writes through to the designated replicas.
EnvelopeHash: BLAKE2b-256(CourierEnvelope.sender_pubkey || CourierEnvelope.ciphertext). Used by the courier for deduplication
of retransmissions and for demultiplexing replica replies.
MKEM: Multi-recipient KEM addressed to the pair of intermediate
replicas. One MKEM ciphertext carries, for each recipient, a
separate DEK encapsulation (Dek1, Dek2); either recipient may
decapsulate and recover the padded ReplicaInnerMessage.
Replica-epoch: A one-week period, distinct from the 20-minute
mixnet PKI epoch, during which a given replica-side MKEM envelope
keypair is valid. See “Epochs” below.
Deployment requirements
Minimum number of storage replicas
A conforming deployment MUST run at least four storage replicas
(n ≥ 4). The reason is the intermediate-replica unlinkability property
described in the Glossary and in §5.4.1 of the paper: for each
CourierEnvelope, the client picks two intermediate replicas which
MUST be disjoint from the two final (sharded) replicas that hold
the Box. Disjointness is what prevents the courier from learning
which shard the Box falls into.
n = 2: there is only one possible shard set, and the
intermediate set is necessarily identical to it.
n = 3: only one replica sits outside any given two-shard set,
so the client cannot pick two independent non-shard intermediates.
Implementations in this case fall back to an intermediate set that
includes at least one shard replica, which tells the courier that
replica is one of the two shards for the Box.
n ≥ 4: at least two non-shard replicas exist, so the
intermediate set can be drawn uniformly at random from the
non-shard subset, and disjointness is preserved.
Information exposure when n < 4
When the disjointness invariant is broken (n = 3), the courier
learns that at least one of its two intermediate replicas is in the
Box’s shard set. This exposure is bounded:
The courier does NOT learn the Box ID. The Box ID lives inside the
MKEM-encrypted inner message addressed to the intermediate
replicas; only the replicas can decrypt it.
The courier does NOT learn the other shard member. Knowing one
element of the two-element shard set does not constrain the
identity of the other among the remaining replicas.
The courier does NOT gain a way to link Boxes in the same
sequence. BACAP’s unlinkability of consecutive Box IDs (paper §4)
is a property of the capability-derivation scheme and is
independent of sharding.
Consequently, the main unlinkability properties of Pigeonhole are
not lost at n = 3, but the defense-in-depth margin provided by
disjoint intermediate replicas is. Operators SHOULD treat n ≥ 4 as
the minimum supported configuration.
Epochs
Katzenpost has two distinct notions of “epoch” that operate on very
different timescales, and Pigeonhole touches both:
Mixnet epoch (a.k.a. normal epoch, PKI epoch): the short
cadence on which the directory authorities publish a new PKI
document. The default is 20 minutes. Mix nodes, gateways, service
nodes, and clients all synchronise on this epoch. Mix-node Sphinx
replay keys rotate once per mixnet epoch.
Pigeonhole storage-replica epoch: the long cadence on which
each storage replica rotates its MKEM envelope keypair. The default
is one week. A replica publishes, in every mixnet-epoch PKI
descriptor, two envelope public keys: the current replica-epoch
key and the next replica-epoch key.
Epoch tolerance for CourierEnvelope
A CourierEnvelope carries an epoch field that identifies the
replica-epoch whose envelope key the client used to encrypt the MKEM
ciphertext. Conforming couriers and storage replicas MUST accept
epoch ∈ {current − 1, current, current + 1} where current is the
courier’s / replica’s own view of the current replica-epoch.
The current − 1 tolerance handles the grace window immediately
after a replica-epoch boundary, when a client with a slightly stale
PKI view still encrypts to the previous envelope public key.
Combined with current, this gives a two replica-epoch data TTL
— roughly two weeks — because the future replica-epoch key is by
definition a key that hasn’t started being used yet, so current
and current − 1 are the epochs where actual data flows.
The current + 1 tolerance handles the same boundary seen from the
other side: a client whose clock or PKI view is slightly ahead of
the courier / replica.
Envelopes outside this three-epoch window MUST be rejected, because
by definition no replica still holds the matching decapsulation key:
Older than current − 1: the envelope public key has been pruned
from replicas (see the replica’s envelope-key GC worker). The
replica cannot decrypt and the courier can fail fast rather than
forward a doomed request.
Newer than current + 1: no replica has generated that envelope
key yet (replicas generate current and current + 1 only).
Couriers SHOULD reject with a dedicated envelope-level error code
(EnvelopeErrorInvalidEpoch) so clients can distinguish “stale
encryption” from other courier-side rejections.
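The tolerance window can be sketched as a single predicate. This is an illustrative check rather than the courier's or replica's code, and the function name epochAccepted is an assumption.

```go
package main

import "fmt"

// epochAccepted reports whether an envelope's replica-epoch lies in
// {current-1, current, current+1}. It subtracts the smaller value from
// the larger so that unsigned arithmetic can never wrap around.
func epochAccepted(envelope, current uint64) bool {
	if envelope >= current {
		return envelope-current <= 1
	}
	return current-envelope <= 1
}

func main() {
	const current = 100
	for _, e := range []uint64{98, 99, 100, 101, 102} {
		fmt.Println(e, epochAccepted(e, current))
	}
}
```

With current set to 100, epochs 99, 100 and 101 are accepted. Epochs 98 and 102 would be answered with EnvelopeErrorInvalidEpoch.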
Pigeonhole message format and constants
The Pigeonhole message types are defined in trunnel at
pigeonhole/pigeonhole_messages.trunnel;
the Go bindings live in
pigeonhole/trunnel_messages.go.
All integer fields are big-endian and all variable-length fields carry
an explicit length prefix. This trunnel encoding replaces the earlier
CBOR encoding with a fixed-overhead binary format whose serialised
size can be computed deterministically from the Sphinx geometry.
Carriage of these messages differs by hop:
Client → Courier: a CourierQuery (its layout is given in the
“CourierQuery” section) is carried inside
a Sphinx packet payload. The reverse direction uses a SURB supplied
by the client.
Courier → Replica and Replica → Replica: the courier and
replicas do not use Sphinx packets between themselves. They communicate
over the Katzenpost wire protocol defined in core/wire and
core/wire/commands; the relevant commands are ReplicaMessage,
ReplicaMessageReply, ReplicaWrite, and ReplicaWriteReply. Some of
these commands embed trunnel-serialised pigeonhole blobs as their
payload.
Fundamental size constants
Constant | Value | Source
BACAP_BOX_ID_SIZE | 32 bytes | Ed25519 public key
BACAP_SIGNATURE_SIZE | 64 bytes | Ed25519 signature
HASH_SIZE | 32 bytes | BLAKE2b-256
MKEM_DEK_SIZE | 60 bytes | mkem.DEKSize
CTIDH-1024 public key | 160 bytes | hpqc/nike/ctidh/ctidh1024
X25519 public key | 32 bytes | hpqc/nike/x25519
Hybrid CTIDH-1024 × X25519 NIKE public key | 192 bytes | sum of the above
BACAP payload encryption overhead | 16 bytes | ChaCha20-Poly1305 AEAD
MKEM encapsulation overhead | 28 bytes | ChaCha20-Poly1305 nonce (12) + tag (16)
Maximum BACAP payload
The maximum plaintext BACAP payload a single Box can carry is derived
backwards from the Sphinx UserForwardPayloadLength by subtracting
every layer of framing and cryptographic overhead that sits between the
Sphinx payload and the BACAP plaintext. The authoritative calculation
is performed by NewGeometryFromSphinx in
pigeonhole/geo/geometry.go.
With a hybrid CTIDH-1024 × X25519 NIKE the fixed per-packet overhead
between UserForwardPayloadLength and the BACAP plaintext is on the
order of 560 bytes; the exact figure depends on the configured NIKE
scheme and should always be obtained from
geometry.NewGeometryFromSphinx() rather than computed by hand.
The CourierEnvelope as seen by the courier
The courier, upon unwrapping the Sphinx payload of a client’s packet,
sees a CourierQuery whose content is a CourierEnvelope with the
following layout:
struct courier_envelope {
    u8 intermediate_replicas[2];          // replica indices in the PKI
    u8 dek1[MKEM_DEK_SIZE];               // DEK encapsulation for replica 0
    u8 dek2[MKEM_DEK_SIZE];               // DEK encapsulation for replica 1
    u8 reply_index;                       // which replica's reply to prefer
    u64 epoch;                            // replica-epoch under which the
                                          // MKEM ciphertext was produced
    u16 sender_pubkey_len;
    u8 sender_pubkey[sender_pubkey_len];  // client's ephemeral hybrid NIKE pk
    u32 ciphertext_len;
    u8 ciphertext[ciphertext_len];        // MKEM-encrypted ReplicaInnerMessage
}
Notable points:
The ciphertext is opaque to the courier. It is an MKEM envelope
addressed to the pair of intermediate replicas; either replica can
decapsulate using its own DEK (dek1 or dek2 respectively).
The epoch field names the replica-epoch whose envelope keys were
used to produce the MKEM ciphertext. See the “Epochs” section above
for the tolerance window.
Prior to encryption, the inner ReplicaInnerMessage is zero-padded
to ReplicaInnerMessageWriteSize() so that reads, writes and
tombstones produce MKEM ciphertexts of identical length.
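To make the courier_envelope layout concrete, the following sketch encodes it by hand and computes its serialised size. It assumes the field order and widths given in the layout above and the big-endian integers stated for the trunnel encoding. The Go type and its encode method are hypothetical stand-ins, not the generated bindings in pigeonhole/trunnel_messages.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const mkemDEKSize = 60 // MKEM_DEK_SIZE from the size constants

// fixedOverhead counts every byte that is not sender_pubkey or ciphertext:
// replicas (2) + dek1 (60) + dek2 (60) + reply_index (1) + epoch (8)
// + sender_pubkey_len (2) + ciphertext_len (4).
const fixedOverhead = 2 + 2*mkemDEKSize + 1 + 8 + 2 + 4

// CourierEnvelope is a hypothetical in-memory mirror of the layout.
type CourierEnvelope struct {
	IntermediateReplicas [2]uint8
	Dek1, Dek2           [mkemDEKSize]byte
	ReplyIndex           uint8
	Epoch                uint64
	SenderPubkey         []byte
	Ciphertext           []byte
}

// encode serialises the fields in declaration order, big-endian,
// with explicit length prefixes before the variable-length fields.
func (e *CourierEnvelope) encode() []byte {
	out := make([]byte, 0, fixedOverhead+len(e.SenderPubkey)+len(e.Ciphertext))
	out = append(out, e.IntermediateReplicas[:]...)
	out = append(out, e.Dek1[:]...)
	out = append(out, e.Dek2[:]...)
	out = append(out, e.ReplyIndex)
	out = binary.BigEndian.AppendUint64(out, e.Epoch)
	out = binary.BigEndian.AppendUint16(out, uint16(len(e.SenderPubkey)))
	out = append(out, e.SenderPubkey...)
	out = binary.BigEndian.AppendUint32(out, uint32(len(e.Ciphertext)))
	out = append(out, e.Ciphertext...)
	return out
}

func main() {
	env := &CourierEnvelope{
		IntermediateReplicas: [2]uint8{2, 3},
		Epoch:                2900,
		SenderPubkey:         make([]byte, 192),  // hybrid NIKE public key size
		Ciphertext:           make([]byte, 1000), // arbitrary size for the demo
	}
	fmt.Println(fixedOverhead, len(env.encode())) // prints "137 1329"
}
```

Because the ciphertext is always padded to the same length, every envelope built with a given NIKE scheme serialises to the same size. This is what lets the size be computed from the Sphinx geometry.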
The ReplicaInnerMessage as seen by a replica
Once an intermediate replica decrypts the MKEM envelope, it obtains a
ReplicaInnerMessage — a discriminated union over message_type:
A ReplicaWrite with payload_len == 0 is a tombstone; see
“Tombstones” below.
Message types and interactions
Overview
A client sends a CourierQuery inside a Sphinx packet payload.
The courier’s reply travels back to the client by means of a SURB
the client also supplied.
A client always designates two intermediate replicas per
CourierEnvelope. The courier dispatches the corresponding pair of
ReplicaMessage wire commands, one to each intermediate, and
collects up to two ReplicaMessageReply results.
The reply_index field is a preference indicating which of the two
replica replies the client would like forwarded first. It is not a
strict selector: should the preferred slot still be empty when the
courier is ready to respond, the courier falls back to whichever
reply it does hold, and indicates in the CourierEnvelopeReply the
actual index that was served (see
courier/server/plugin.go).
Clients MUST resend identical CourierEnvelope bodies — same
sender_pubkey and ciphertext — until they receive a reply. The
courier deduplicates resends by EnvelopeHash. Only the Sphinx-layer
SURB is rotated between retransmissions.
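The reply_index preference described above can be sketched as follows. The function name selectReply and the use of a nil slice to mean "no reply yet" are assumptions for illustration. The authoritative logic lives in courier/server/plugin.go.

```go
package main

import "fmt"

// selectReply picks the reply to forward to the client.
// replies[i] == nil means replica i has not answered yet.
// ok == false means nothing can be served (the courier would send an ACK,
// or reject the envelope when preferred is out of range).
func selectReply(replies [2][]byte, preferred uint8) (reply []byte, served uint8, ok bool) {
	if preferred > 1 {
		return nil, 0, false // malformed: reply_index must be 0 or 1
	}
	if replies[preferred] != nil {
		return replies[preferred], preferred, true
	}
	other := 1 - preferred
	if replies[other] != nil {
		// Fallback: serve the reply we do hold and report its index.
		return replies[other], other, true
	}
	return nil, preferred, false
}

func main() {
	replies := [2][]byte{nil, []byte("reply from replica 1")}
	_, served, ok := selectReply(replies, 0)
	fmt.Println(served, ok) // prints "1 true"
}
```

Returning the served index alongside the reply mirrors the requirement that the CourierEnvelopeReply indicate which slot was actually used.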
CourierQuery
A CourierQuery is the top-level discriminated union that a client
places into a Sphinx packet payload for the courier:
struct courier_query {
    u8 query_type IN [0, 1];
    union content[query_type] {
        0: struct courier_envelope envelope;   // read or write a single box
        1: struct copy_command copy_command;   // AllOrNothing copy
    };
}
On receiving a CourierEnvelope, the courier:
Verifies epoch ∈ {current − 1, current, current + 1} per the
replica-epoch tolerance window, rejecting with
EnvelopeErrorInvalidEpoch otherwise.
Looks EnvelopeHash up in its dedup cache. If present, the courier
returns the cached reply (or an ACK if no reply has yet arrived)
without re-dispatching to replicas.
On a cache miss, the courier constructs two ReplicaMessage wire
commands — one bound for intermediate_replicas[0] carrying
dek1, the other for intermediate_replicas[1] carrying dek2
— and forwards them over the wire protocol.
The courier immediately sends an ACK reply to the client so it may
stop retransmitting.
ReplicaMessage (wire command)
ReplicaMessage is not a trunnel pigeonhole type; it is the
core/wire/commands command sent from a courier to a replica. Its
payload fields are copied verbatim from the matching CourierEnvelope:
// core/wire/commands
type ReplicaMessage struct {
    SenderEPubKey []byte               // copied from CourierEnvelope.sender_pubkey
    DEK           *[MKEM_DEK_SIZE]byte // dek1 or dek2, depending on destination
    Ciphertext    []byte               // copied from CourierEnvelope.ciphertext
}
The recipient replica decapsulates the MKEM envelope using the
replica-epoch key that corresponds to CourierEnvelope.epoch (trying
each of the three keys in its tolerance window), yielding a padded
ReplicaInnerMessage. The replica then dispatches on message_type
to ReplicaRead or ReplicaWrite handling.
ReplicaMessageReply (wire command)
In response to a ReplicaMessage, the courier expects an asynchronous
ReplicaMessageReply wire command from the replica:
// core/wire/commands
type ReplicaMessageReply struct {
    ErrorCode     uint8            // see the replica error-code table below
    EnvelopeHash  *[HASH_SIZE]byte // lets the courier demultiplex the reply
    EnvelopeReply []byte           // MKEM-encrypted ReplicaMessageReplyInnerMessage
}
The EnvelopeReply byte blob is produced by the replica via the MKEM
scheme’s EnvelopeReply() method (see
replica/handlers.go).
It carries a ReplicaMessageReplyInnerMessage — a discriminated union
over either a ReplicaReadReply or a ReplicaWriteReply — padded to
ReplicaReplyInnerMessageReadSize() so that read replies and write
replies are indistinguishable in size, and encrypted to the client’s
ephemeral NIKE public key under the replica’s envelope keypair.
CourierBookKeeping
For each outstanding EnvelopeHash, the courier maintains an in-memory
dedup entry. Its actual structure is:
// courier/server/plugin.go
type CourierBookKeeping struct {
    Epoch                uint64    // replica-epoch at cache insertion
    CreatedAt            time.Time // for TTL eviction (~5 minutes)
    QueryType            uint8     // the query_type that produced this entry
    IntermediateReplicas [2]uint8
    EnvelopeReplies      [2]*commands.ReplicaMessageReply
}
Note that the courier does not cache SURBs or SURB timestamps. A
client’s SURB is consumed by the Sphinx-layer plugin infrastructure at
the moment the courier emits a reply and is not retained by the
courier’s Pigeonhole state. Should the client not receive that reply,
it is expected to retransmit a fresh Sphinx packet carrying a fresh
SURB but the identical CourierEnvelope body; the courier, recognising
the EnvelopeHash, replies using the new SURB.
The dedup cache has a TTL of 5 minutes
(DedupCacheTTL in courier/server/plugin.go).
CourierEnvelopeReply
The courier’s reply to a CourierEnvelope has the following trunnel
layout:
A reply_type of ACK (0) indicates the courier has received the
envelope and dispatched it to the replicas but has not yet received a
reply for the requested index. A reply_type of PAYLOAD (1)
indicates the payload field carries the MKEM-encrypted
EnvelopeReply produced by a replica.
Embedded pigeonhole types
These trunnel structs are not carried on the wire in isolation; they
are embedded inside MKEM envelopes and their replies.
ReplicaRead
Embedded inside the MKEM-encrypted ReplicaInnerMessage a client
sends to a replica (via the courier) for a read operation.
ReplicaReadReply
Embedded inside the MKEM-encrypted ReplicaMessageReplyInnerMessage a
replica returns for a successful read. Padding is applied at the outer
ReplicaMessageReplyInnerMessage level; this struct carries no
padding of its own.
ReplicaWrite
Used both (a) embedded inside the MKEM-encrypted ReplicaInnerMessage
for a client write, and (b) carried directly as a core/wire/commands
command between replicas during replication.
A ReplicaWrite with payload_len == 0 is a tombstone; see the
“Tombstones” section below.
ReplicaWriteReply
Embedded inside a ReplicaMessageReplyInnerMessage on the client
reply path, and also used as the core/wire/commands reply to
inter-replica replication.
struct replica_write_reply {
    u8 error_code;
}
Tombstones
A ReplicaWrite whose payload_len == 0 is a tombstone: it marks
a Box as deleted without revealing that fact to the courier.
Replicas treat a tombstone write as an ordinary overwrite. An
existing Box at the same box_id is replaced by the tombstone,
and subsequent reads of that box_id return
ReplicaErrorTombstone (see “Replica error codes”).
ReplicaErrorTombstone is an expected outcome rather than a
failure: it positively confirms that the Box was deleted, as
distinct from ReplicaErrorBoxIDNotFound.
Because the inner ReplicaInnerMessage is zero-padded to
ReplicaInnerMessageWriteSize() before MKEM encryption (see “The
CourierEnvelope as seen by the courier”), a tombstone write produces
an MKEM ciphertext of exactly the same length as a non-empty write
or a read. A passive observer therefore cannot distinguish deletion
from any other operation.
Tombstones are also used by the AllOrNothing copy protocol: after
processing a CopyCommand, the courier overwrites every Box of the
temporary stream with tombstones (see “Pigeonhole AllOrNothing
protocol”).
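The length-hiding effect of zero-padding can be illustrated with a short sketch. The constant innerWriteSize is a hypothetical stand-in for the value returned by ReplicaInnerMessageWriteSize(), and padInner is an assumed helper name. The real sizes come from the Pigeonhole geometry.

```go
package main

import "fmt"

// innerWriteSize is a hypothetical stand-in for ReplicaInnerMessageWriteSize().
const innerWriteSize = 1024

// padInner zero-pads a serialised inner message to the fixed size.
// The payload_len field inside the message tells the receiver where the
// real data ends, so the padding needs no marker of its own.
func padInner(msg []byte) ([]byte, error) {
	if len(msg) > innerWriteSize {
		return nil, fmt.Errorf("inner message of %d bytes exceeds %d", len(msg), innerWriteSize)
	}
	padded := make([]byte, innerWriteSize)
	copy(padded, msg)
	return padded, nil
}

func main() {
	tombstone, _ := padInner(make([]byte, 100)) // header only, payload_len == 0
	write, _ := padInner(make([]byte, 900))     // header plus a real payload
	fmt.Println(len(tombstone) == len(write))   // prints "true"
}
```

Since both plaintexts have the same length before MKEM encryption, the resulting ciphertexts do too.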
EnvelopeHash
The EnvelopeHash uniquely identifies a CourierEnvelope for the
purposes of deduplication and reply demultiplexing. It is computed as:
EnvelopeHash = BLAKE2b-256(sender_pubkey || ciphertext)
where sender_pubkey and ciphertext are the corresponding fields of
the CourierEnvelope. The implementation is
CourierEnvelope.EnvelopeHash()
in pigeonhole/helpers.go.
A retransmitted CourierEnvelope carries the identical
sender_pubkey and ciphertext as the original, and therefore hashes
to the same value; only the surrounding Sphinx packet (and its SURB)
changes between attempts.
Replica error codes
Returned by a replica in ReplicaMessageReply.ErrorCode,
ReplicaReadReply.error_code, and ReplicaWriteReply.error_code.
Code | Name | Meaning
0 | ReplicaSuccess | Operation completed successfully
1 | ReplicaErrorBoxIDNotFound | Read miss (expected outcome)
2 | ReplicaErrorInvalidBoxID | Malformed box ID
3 | ReplicaErrorInvalidSignature | BACAP signature verification failed
4 | ReplicaErrorDatabaseFailure | Transient RocksDB error
5 | ReplicaErrorInvalidPayload | Malformed payload
6 | ReplicaErrorStorageFull | Storage capacity exceeded
7 | ReplicaErrorInternalError | Internal server error
8 | ReplicaErrorInvalidEpoch | Replica-epoch envelope key unavailable
9 | ReplicaErrorReplicationFailed | Replication to shard peer failed
10 | ReplicaErrorBoxAlreadyExists | Idempotent-write outcome (expected)
11 | ReplicaErrorTombstone | Read returned a tombstone (expected)
Codes 1, 10 and 11 are “expected outcomes” — they correspond to
normal protocol states rather than faults. The thin-client helper
thin.IsExpectedOutcome(err) treats these three codes as non-errors.
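The split between expected outcomes and faults can be sketched as below. The constant names and values are taken from the table above. The function isExpectedOutcome is a simplified stand-in that works on a raw code: the real helper thin.IsExpectedOutcome takes an error value, and its internals are not shown in this document.

```go
package main

import "fmt"

// Replica error codes, numbered as in the table above.
const (
	ReplicaSuccess uint8 = iota
	ReplicaErrorBoxIDNotFound
	ReplicaErrorInvalidBoxID
	ReplicaErrorInvalidSignature
	ReplicaErrorDatabaseFailure
	ReplicaErrorInvalidPayload
	ReplicaErrorStorageFull
	ReplicaErrorInternalError
	ReplicaErrorInvalidEpoch
	ReplicaErrorReplicationFailed
	ReplicaErrorBoxAlreadyExists
	ReplicaErrorTombstone
)

// isExpectedOutcome reports whether a non-zero code describes a normal
// protocol state (read miss, idempotent write, tombstone) and not a fault.
func isExpectedOutcome(code uint8) bool {
	switch code {
	case ReplicaErrorBoxIDNotFound, ReplicaErrorBoxAlreadyExists, ReplicaErrorTombstone:
		return true
	}
	return false
}

func main() {
	for _, code := range []uint8{ReplicaErrorBoxIDNotFound, ReplicaErrorStorageFull, ReplicaErrorTombstone} {
		fmt.Println(code, isExpectedOutcome(code))
	}
}
```

A caller polling an empty Box would typically keep waiting on ReplicaErrorBoxIDNotFound, but would surface ReplicaErrorStorageFull to the user.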
Courier envelope error codes
Returned by the courier in CourierEnvelopeReply.error_code.
Code | Name | Meaning
0 | EnvelopeErrorSuccess | Operation completed
1 | EnvelopeErrorInvalidEnvelope | Malformed envelope (e.g. reply_index > 1)
2 | EnvelopeErrorCacheCorruption | Internal cache inconsistency
3 | EnvelopeErrorPropagationError | Failed to dispatch to replicas
4 | EnvelopeErrorInvalidEpoch | CourierEnvelope.epoch outside tolerance window
Copy command status codes
Returned by the courier in the status field of a CopyCommandReply
(its layout is given in the “CopyCommandReply” section).
Code | Name | Meaning
0 | CopyStatusSucceeded | All destination writes completed
1 | CopyStatusInProgress | Courier has accepted the command; processing continues
2 | CopyStatusFailed | Aborted; see error_code + failed_envelope_index
Sharding and replica selection
For each Box, two designated (final) replicas are derived
deterministically from the Box ID using the Shard2 consistent-hash
algorithm (see
replica/common/shard.go):
for each online replica r with identity key k_r:
    h_r = BLAKE2b-256(k_r || box_id)
return the two replicas whose h_r are smallest
The two intermediate replicas chosen by the client for a given
CourierEnvelope are drawn independently of the designated replicas,
per pigeonhole.GetRandomIntermediateReplicas:
n ≥ 4 replicas: two intermediates are drawn uniformly at random
from the replicas that are not in the designated (shard) set —
preserving the disjointness invariant described in “Deployment
requirements” above.
n = 3: fallback — at least one intermediate must coincide with
a designated replica; this weakens the intermediate/final
disjointness guarantee but does not compromise Box unlinkability.
n < 3: rejected.
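The three cases above might be sketched as follows. The function name `pick_intermediates` is hypothetical, and the n = 3 fallback shown here (take the one non-designated replica and fill the remaining slot from the designated set) is an assumption; the text only requires that at least one intermediate coincide with a designated replica.

```python
import random


def pick_intermediates(all_replicas, designated, rng=None):
    """Pick two intermediate replicas for one CourierEnvelope."""
    rng = rng or random.SystemRandom()
    if len(all_replicas) < 3:
        raise ValueError("fewer than 3 replicas: request rejected")

    # Candidates outside the designated (shard) set.
    outside = [r for r in all_replicas if r not in designated]
    if len(outside) >= 2:
        # n >= 4: uniform draw, disjoint from the designated replicas.
        return rng.sample(outside, 2)

    # n == 3 fallback (assumed behaviour): overlap with the designated set.
    needed = 2 - len(outside)
    return outside + rng.sample(list(designated), needed)
```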
Intermediate replicas, upon accepting a ReplicaWrite, compute the
designated replicas themselves and forward the ReplicaWrite to them
via the ReplicaWrite command defined in core/wire/commands (see
replica/connector.go).
Fixed-throughput connections and decoy traffic
Per §5.3 of the Echomix paper, the courier-to-replica and
replica-to-replica wires are fixed-throughput Poisson streams paced
at LambdaR, the per-connection rate parameter published in the
dirauth Parameters block. An external network observer who can see
only encrypted wire timing must not be able to infer load from the
wire; the wire fires at the same constant rate whether or not real
user traffic is flowing, with the gaps filled by indistinguishable
decoy traffic.
The implementation realises this in two pieces, one on each end of a
connection.
Originator-side pacing with decoy fill
Each courier holds one outgoing connection per replica; each
replica holds one outgoing connection per peer replica. Each such
connection has a sender goroutine driven by an exponential
distribution at rate LambdaR. On every tick:
if the connection’s outbound queue holds a real command (a
ReplicaMessage from the courier-side, or a ReplicaWrite
replication command from the replica-side), the sender drains one
and forwards it;
otherwise, the sender constructs a fresh ReplicaDecoy and
forwards that.
The wire’s outbound rate is therefore Poisson at rate LambdaR at
all times, irrespective of how much real traffic the originator
happens to be carrying. Real and decoy commands are uniformly padded
and encrypted under the PQ Noise transport, so an observer cannot
distinguish them by size or content.
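The per-tick decision can be illustrated with a small simulation. This is a sketch, not the Go sender goroutine: `paced_sender` is a hypothetical name, commands are represented as plain strings, and time is simulated rather than slept.

```python
import random
from collections import deque


def paced_sender(queue: deque, lambda_r: float, ticks: int, rng: random.Random):
    """Simulate `ticks` sends; return a list of (send_time, command)."""
    now = 0.0
    sent = []
    for _ in range(ticks):
        # Exponential inter-send gaps give a Poisson stream at rate lambda_r.
        now += rng.expovariate(lambda_r)
        # Drain one real command if available, otherwise emit a decoy.
        command = queue.popleft() if queue else "ReplicaDecoy"
        sent.append((now, command))
    return sent
```

The send times are produced before the queue is consulted, so they carry no information about whether a given slot held real traffic or a decoy.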
The receiving end of each connection does not use a separately
paced sender. Each inbound command produces exactly one reply (a
real ReplicaWriteReply or ReplicaMessageReply for envelopes,
echoed ReplicaDecoy for decoys). The reply rate therefore equals
the inbound rate by construction, which is LambdaR Poisson.
Replies are NOT drained at a fixed rate. Instead, each reply is
held for an independently sampled uniform random delay, drawn from
Uniform[0, replyJitterMax], and inserted into a per-connection
min-heap keyed on its ready-at timestamp. A worker goroutine pops
the next-due reply and forwards it to the wire.
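A minimal sketch of such a per-reply delay scheduler follows. The class name `ReplyScheduler` and its methods are hypothetical, and the caller supplies the clock; the real implementation uses a worker goroutine rather than explicit polling.

```python
import heapq
import itertools
import random


class ReplyScheduler:
    """Holds each reply for an independent Uniform[0, reply_jitter_max] delay."""

    def __init__(self, reply_jitter_max: float, rng: random.Random):
        self._jitter_max = reply_jitter_max
        self._rng = rng
        self._heap = []                  # entries: (ready_at, seq, reply)
        self._seq = itertools.count()    # tie-breaker for equal timestamps

    def schedule(self, now: float, reply) -> float:
        ready_at = now + self._rng.uniform(0.0, self._jitter_max)
        heapq.heappush(self._heap, (ready_at, next(self._seq), reply))
        return ready_at

    def pop_due(self, now: float) -> list:
        """Return every reply whose ready-at time has passed, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```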
The per-reply random delay follows §5.4 of the paper:
Each reply is independently delayed, with delays sampled from a
uniform distribution to mitigate the courier’s ability to infer
links between responses that pertain to the same box.
Any strictly positive value of replyJitterMax satisfies the §5.4 independence
requirement. The reference implementation uses 50 ms, chosen to
exceed typical per-message processing-time variance on a healthy
replica while remaining small relative to the user-perceptible
round-trip latency. There is no hard constraint that the value be
identical across replicas in the same network. See “Wire-level
indistinguishability” below.
Wire-level indistinguishability
A Poisson stream composed with independent random shifts, of any
distribution, remains a Poisson stream at the same rate. This is the
property that protects the courier-replica and replica-replica
wires from a passive observer, as follows.
The reply rate from a replica equals the inbound command rate,
which is LambdaR Poisson.
Each reply is shifted by an independent random delay (from
Uniform[0, replyJitterMax] on the responder side, or an
exponential interval on the originator side).
The composition is still Poisson at rate LambdaR.
A passive observer who can see only encrypted wire timing therefore
sees an indistinguishable Poisson stream regardless of the
underlying shift distribution. Pairing a specific outbound event to
a specific inbound event by timing alone is not possible: the
shifts can arbitrarily reorder events relative to each other, so
many distinct inbound predecessors are equally plausible for any
given outbound.
Bounded responder queue depth
The per-item delay scheduler avoids the M/M/1 boundary case that
arose in earlier designs which paced the responder side at the same
rate as its peer’s command stream. By Little’s Law, the steady-state
heap depth on each inbound connection is the product of the arrival
rate and the mean reply delay:
expected depth = LambdaR × replyJitterMax / 2
For LambdaR = 5/s (one tick per ~200 ms) and replyJitterMax = 50 ms, the expected per-connection depth is 0.125 entries, well
below any operationally meaningful queue capacity. The heap depth
remains bounded over arbitrary time horizons because, unlike a
fixed-rate drainer, the per-item scheduler has no “consumer
underrun” failure mode in which the queue can accumulate against a
matched producer.
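The arithmetic above can be checked directly. The helper name `expected_heap_depth` is hypothetical; the formula is the one stated in this section, using the fact that Uniform[0, max] has mean max / 2.

```python
def expected_heap_depth(lambda_r: float, reply_jitter_max: float) -> float:
    """Little's Law: arrival rate (per second) times mean reply delay (seconds)."""
    mean_delay = reply_jitter_max / 2.0
    return lambda_r * mean_delay


depth = expected_heap_depth(lambda_r=5.0, reply_jitter_max=0.050)
print(f"expected per-connection depth: {depth:.3f} entries")  # 0.125
```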
Protocol sequence visualizations
For simplicity, the following diagrams omit replication while illustrating the Pigeonhole write and read operations.
Pigeonhole write operation
Pigeonhole read operation
Pigeonhole AllOrNothing protocol
The All Or Nothing delivery mechanism ensures that a set of
associated BACAP writes either succeeds or fails atomically from the
point of view of a replica or second-party client reader. This
behavior prevents an adversary from correlating
(A) the sending client’s failure to transmit multiple messages at once
with (B) a network interruption on the sending client’s side of the
network. Regardless of the number of messages in the set, the
adversary observes at most once that the sending client
interacted with the network.
The protocol works as follows.
Step 1
The client uploads a “temporary Pigeonhole stream”.
The stream conveys a sequence of CourierEnvelopes. Each
CourierEnvelope is serialised and prefixed with a single 4-byte
(u32) length field giving the size of that one envelope; the
resulting length-prefixed blobs are concatenated back-to-back into one
continuous byte stream.
That byte stream is then split across the BACAP Boxes of the temporary
stream. A serialised CourierEnvelope is strictly larger than the
maximum BACAP Box payload (it wraps a full Box payload plus its own
metadata), so a single envelope does not fit in one Box and the
concatenated stream necessarily spans several Boxes. Envelope
boundaries therefore do not align with Box boundaries; the precise
framing is given in the “Temporary Stream data format” section below.
Step 2
The client sends the “Copy” command to a randomly chosen courier. The
command encapsulates the write capability to the temporary Pigeonhole stream written in
Step 1 above. When the courier receives this copy command it derives
the read cap from the given write cap and uses it to read the stream
of data. The courier reads one box at a time and tries to extract 0
or 1 envelopes from each accumulation of stream segments.
After processing the command, the courier overwrites the
temporary stream with tombstones.
Temporary Stream data format
Each box in the temporary stream is a serialized CopyStreamElement.
Defined in trunnel as:
// CopyStreamElement - wraps a CourierEnvelope chunk with stream position flags.
// Overhead: 1 byte (flags) + 4 bytes (envelope_len) = 5 bytes
struct copy_stream_element {
// Flags: bit 0 = isStart, bit 1 = isFinal
u8 flags;
// The CourierEnvelope serialized bytes
u32 envelope_len;
u8 envelope_data[envelope_len];
}
The purpose of this format is to let the isStart and isFinal
flags tell the courier which boxes are the first and last of the stream to
process. The payloads carried in the EnvelopeData fields
of these CopyStreamElements together form a single stream of data that
contains 4-byte length-prefixed CourierEnvelopes.
A key property of this encoding is that envelope boundaries do not
align with box boundaries. Each BACAP box payload has a maximum size
of N bytes, but a serialized CourierEnvelope (which contains a full
box payload plus metadata) exceeds N bytes. Therefore envelopes are
serialized into a continuous byte stream and split across multiple
boxes in the temporary copy stream.
The courier reads the stream box by box, accumulating data until it
can extract complete envelopes. The isStart and isFinal flags on
the CopyStreamElement wrappers tell the courier where the stream
begins and ends.
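The two layers of framing can be sketched end to end. This is an illustration under stated assumptions: integers are encoded big-endian (trunnel's network byte order), the function names are hypothetical, envelopes are opaque byte strings, and BACAP encryption of each Box is omitted.

```python
import struct

FLAG_IS_START = 0x01  # bit 0
FLAG_IS_FINAL = 0x02  # bit 1


def frame_envelopes(envelopes):
    """Concatenate envelopes, each prefixed with its 4-byte (u32) length."""
    return b"".join(struct.pack(">I", len(e)) + e for e in envelopes)


def to_stream_elements(stream: bytes, chunk_size: int):
    """Split the byte stream into serialized CopyStreamElements, one per Box."""
    chunks = [stream[i:i + chunk_size] for i in range(0, len(stream), chunk_size)] or [b""]
    elements = []
    for i, chunk in enumerate(chunks):
        flags = (FLAG_IS_START if i == 0 else 0) | (FLAG_IS_FINAL if i == len(chunks) - 1 else 0)
        elements.append(struct.pack(">BI", flags, len(chunk)) + chunk)
    return elements


def decode_stream_elements(elements):
    """Accumulate chunks Box by Box and emit each envelope once it is complete."""
    buf, envelopes = bytearray(), []
    for element in elements:
        flags, length = struct.unpack_from(">BI", element)
        buf += element[5:5 + length]
        while len(buf) >= 4:
            (size,) = struct.unpack_from(">I", buf)
            if len(buf) < 4 + size:
                break  # envelope continues in the next Box
            envelopes.append(bytes(buf[4:4 + size]))
            del buf[:4 + size]
        if flags & FLAG_IS_FINAL:
            break
    return envelopes
```

With a chunk size smaller than an envelope, as is always the case for real Boxes, the decoder emits either zero or one envelope per Box read.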
Each embedded CourierEnvelope is processed as a normal write and
results in a ReplicaMessage being dispatched to the intermediate
replicas over the wire protocol.
The courier does NOT need to keep track of the EnvelopeHash for each
contained CourierEnvelope for the purpose of replying to the client
(the “client” of these envelopes is, in this case, the courier
itself), but it does need to keep resending them until the
intermediate replicas have ACKed them.
The courier MUST keep track of the hash of the CopyCommand (computed
as BLAKE2b-256(write_cap)) and MUST NOT process a given command more
than once. This dedup cache has a TTL of 30 minutes
(CopyDedupCacheTTL in courier/server/plugin.go).
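A dedup cache of this kind might look like the sketch below. The class and method names are hypothetical, and it is an assumption here that the TTL is measured from the moment the command is first accepted; the Go implementation may anchor it differently and also stores the terminal reply.

```python
import hashlib

COPY_DEDUP_CACHE_TTL = 30 * 60  # seconds


class CopyDedupCache:
    """Remembers which CopyCommands have been seen, keyed by BLAKE2b-256(write_cap)."""

    def __init__(self, ttl: float = COPY_DEDUP_CACHE_TTL):
        self._ttl = ttl
        self._expires = {}  # command hash -> expiry time

    @staticmethod
    def command_hash(write_cap: bytes) -> bytes:
        return hashlib.blake2b(write_cap, digest_size=32).digest()

    def should_process(self, write_cap: bytes, now: float) -> bool:
        """Return True the first time a command is seen within the TTL window."""
        key = self.command_hash(write_cap)
        expiry = self._expires.get(key)
        if expiry is not None and expiry > now:
            return False  # duplicate: must not be processed again
        self._expires[key] = now + self._ttl
        return True
```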
CopyCommand
Sent by a client to its chosen courier after the client has
successfully uploaded every Box of the temporary stream. The trunnel
layout is:
Consults its copy dedup cache. If an in-progress entry is found,
the courier responds immediately with CopyStatusInProgress. If a
completed entry within its TTL is found, the courier returns the
cached terminal reply.
Otherwise, the courier reconstructs the BACAP WriteCap from the
bytes, derives the corresponding ReadCap, and reads the
temporary stream Box by Box, feeding the decrypted BACAP payloads
into a CopyStreamEnvelopeDecoder.
Each complete CourierEnvelope emitted by the decoder is
dispatched to its two intermediate replicas; the courier waits for
acknowledgements with bounded retries and exponential backoff.
Upon processing the Box bearing the isFinal flag (or upon an
unrecoverable failure), the courier attempts — on a best-effort
basis — to overwrite every Box of the temporary stream with a
tombstone.
Replica error handling during a CopyCommand is classified as
transient or permanent by
courier/server/copy_errors.go:
transient errors (storage full, database failure, internal error,
replication failed, box-ID not found) trigger bounded retries against
the same shard before failover; permanent errors (invalid box ID,
invalid signature, invalid payload, invalid epoch, box already
exists, tombstone) cause immediate failover or abort.
CopyCommandReply
The courier’s reply to a CopyCommand has the following trunnel
layout:
struct copy_command_reply {
u8 status; // CopyStatus{Succeeded, InProgress, Failed}
u8 error_code; // replica error code (meaningful iff status == Failed)
u64 failed_envelope_index; // 1-based sequential position in the
// CourierEnvelope stream at which processing
// stopped (meaningful iff status == Failed)
}
The status codes are enumerated in “Copy command status codes”
above. failed_envelope_index counts envelopes within the stream,
not boxes: the first envelope in the first box of the temporary stream
is index 1.
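Decoding the ten-byte layout above is straightforward. The sketch below assumes big-endian integers (trunnel's network byte order) and ignores any trailing padding; the names `CopyCommandReply` and `parse_copy_command_reply` are hypothetical Python-side names, not the binding's API.

```python
import struct
from typing import NamedTuple

COPY_STATUS_SUCCEEDED = 0
COPY_STATUS_IN_PROGRESS = 1
COPY_STATUS_FAILED = 2


class CopyCommandReply(NamedTuple):
    status: int
    error_code: int              # meaningful only when status is FAILED
    failed_envelope_index: int   # 1-based; meaningful only when status is FAILED


def parse_copy_command_reply(raw: bytes) -> CopyCommandReply:
    # u8 status, u8 error_code, u64 failed_envelope_index = 10 bytes
    if len(raw) < 10:
        raise ValueError("copy_command_reply is at least 10 bytes")
    return CopyCommandReply(*struct.unpack_from(">BBQ", raw))
```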
A client receiving CopyStatusInProgress should continue polling —
i.e. resend the same CopyCommand via a fresh SURB after a short
interval (CopyPollInterval, currently 5 seconds) — until it receives
either CopyStatusSucceeded or CopyStatusFailed. Because the
courier’s copy dedup cache keys on BLAKE2b-256(write_cap), these
repeated polls will not cause the CopyCommand to be processed more
than once.
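The polling behaviour described above can be sketched as a small loop. Everything here is illustrative: `send_copy_command` is a caller-supplied, hypothetical function that resends the same CopyCommand and returns the status code from the reply, and `max_polls` is an added safety bound that the specification does not mandate.

```python
import time

COPY_STATUS_IN_PROGRESS = 1
COPY_POLL_INTERVAL = 5.0  # seconds


def poll_copy_command(send_copy_command, poll_interval=COPY_POLL_INTERVAL,
                      max_polls=None, sleep=time.sleep):
    """Resend the same CopyCommand until a terminal status is returned."""
    polls = 0
    while True:
        status = send_copy_command()
        if status != COPY_STATUS_IN_PROGRESS:
            return status  # CopyStatusSucceeded or CopyStatusFailed
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("copy command still in progress")
        sleep(poll_interval)
```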
Potential use cases of AllOrNothing
In no particular order:
Atomically writing to two or more boxes.
The boxes can reside on distinct streams (or not); the courier
doesn’t know anything about streams of CourierEnvelope.
Sending long messages that span more than one BACAP payload, like a
file / document / picture.
Group chat join uses AllOrNothing when adding a new member to the group:
The person introducing a new member writes to their group chat stream.
The person introducing a new member also writes to the existing conversation
stream that the new member can already read.
Group chat uses AllOrNothing in all cases where it needs to send long messages –
files, pictures, audio, long cryptographic keys such as the group
membership list, and so on.
Protocol narration example
Alice wants to send a message to Bob, who is already connected to Alice. That is, Alice will write to a box with ID 12345 that Bob is trying to read from.
1. Alice sends:
1.1. SPHINX packet containing:
SPHINX header
Routing commands to make it arrive at a courier (on a service provider)
SPHINX payload
Reply SURB
CourierEnvelope encrypted for a courier chosen at random. The
envelope tells the courier about two randomly chosen replicas
(“intermediate replicas”).
1.2. Alice doesn’t know if the packet makes it through the network. Until she
receives a reply, she keeps resending the CourierEnvelope to the same
courier.
2. The courier receives the CourierEnvelope from Alice.
2.1. The courier records the EnvelopeHash of the CourierEnvelope in
its CourierBookKeeping data structure.
2.2. If the courier has already seen that hash, GOTO Step 4.
3. The courier sends an ACK to Alice using the SURB on file for Alice, telling her to stop resending the CourierEnvelope. If there’s no SURB, the courier waits for the next re-sent CourierEnvelope from Alice matching the EnvelopeHash.
3.1. The courier constructs two ReplicaMessage objects and puts them in
its outgoing queue for the two intermediate replicas. (Note that each courier
maintains constant-rate traffic with all replicas.)
3.2. The courier keeps resending each ReplicaMessage until it
receives an ACK from the corresponding intermediate replica.
Alice’s actions are now complete.
4. The two intermediate replicas receive the two ReplicaMessage objects.
4.1. They decrypt them and compute the “designated replicas” based on
the BACAP box ID.
4.2. The replicas put ACKs in their outgoing queues for the courier saying “we
have received these messages” (see step 3.2), referencing the EnvelopeHash. This tells the courier to stop resending to the intermediate
replicas.
4.3. They put the decrypted contents (a ReplicaWrite BACAP tuple: box ID
12345; signature; encrypted payload) in their outgoing queues for the
“designated replicas”.
4.4. Each replica waits for a reply from its designated replica.
4.5. When an intermediate replica receives its ACK from the designated replica,
the intermediate replica has no more tasks.
Bob now wants to check if Alice has written a message at box 12345.
5. Similar to step 1, Bob sends a SPHINX packet to a courier chosen at random.
5.1. SPHINX packet containing:
SPHINX header
Routing commands to make it arrive at a courier (on a service provider).
SPHINX payload
CourierEnvelope encrypted for a courier chosen at random. Its
encrypted payload is a ReplicaRead command (reference box ID
12345).
The envelope tells the courier about two randomly chosen replicas
(“intermediate replicas”).
5.2. Bob doesn’t know if the packet makes it through the network. Until he
receives a reply, he keeps resending the CourierEnvelope
to the same courier.
6. The courier receives Bob’s CourierEnvelope (see step 2).
7. See step 3.
8. The two intermediate replicas receive the two ReplicaMessage objects containing Bob’s ReplicaRead.
8.1. (See 4.1.)
8.2. (See 4.2.)
8.3. (See 4.3.)
8.4. (See 4.4.)
8.5. When the intermediate replica receives its ACK from the designated replica,
the ACK includes the (perhaps confusingly named) ReplicaWrite (BACAP
tuple).
8.6. The intermediate replica wraps it in a ReplicaMessageReply encrypted
for Bob’s ReplicaMessage.EPubKey and puts it in its outgoing queue for the
courier.
9. The courier receives two ReplicaMessageReply wire commands from
the two intermediate replicas.
9.1. It matches ReplicaMessageReply.EnvelopeHash to its recorded
bookkeeping state (step 2.1).
9.2. It wraps the replica’s EnvelopeReply blob (the MKEM-encrypted
ReplicaMessageReplyInnerMessage) into a CourierEnvelopeReply
with reply_type = PAYLOAD and, if Bob’s latest retransmission
arrived with a fresh Sphinx SURB, forwards the resulting
CourierQueryReply back through the mixnet to Bob.
9.3. Either way the courier retains the ReplicaMessageReply in
its dedup cache for the cache TTL (5 minutes), so that a
subsequently-arriving retransmission of the same envelope can be
served from cache without re-dispatching to replicas.
10. Bob keeps resending his CourierEnvelope from step 5 until he
receives a CourierEnvelopeReply with reply_type = PAYLOAD.
10.1. Bob decapsulates the MKEM EnvelopeReply with the private
key corresponding to the sender_pubkey he put in his
CourierEnvelope, obtaining the padded
ReplicaMessageReplyInnerMessage.
10.2. Once unpadded, the inner message is either a
ReplicaReadReply carrying the BACAP tuple (box ID, signature,
ciphertext) or a non-success error_code. If the code is
ReplicaErrorBoxIDNotFound, Bob waits and polls again (i.e.
returns to step 5). If it is ReplicaErrorTombstone, Alice has
deleted the message. Otherwise Bob BACAP-decrypts and verifies
the payload.
10.3. Bob can now read Alice’s message.
3 - Build Katzenpost from source
Pinned versions of the Katzenpost stack and how to build each component from source.
This page is the canonical reference for the
pinned versions of the Katzenpost
stack, together with brief instructions for building and running
each component from source. It is intended for anyone who wishes to
run the software ahead of binary packages becoming available.
Pinned versions
The following git tags are the current recommended versions for
running the stack. Components that share a row, and therefore a
repository, should be built from the same tag.
Server-side components are listed for completeness; for full
deployment guidance, see the
Admin Guide.
Prerequisites
Go 1.23 or newer.
Rust stable (with
cargo).
Python 3.9 or newer (the
thin client supports 3.8+, but the venv tooling here assumes
3.9+).
Make,
git, and a C toolchain
(gcc or clang).
kpclientd (the client daemon)
The thin client libraries do not, by themselves, speak to the mix
network. They communicate over a local socket with the
kpclientd daemon, which performs all
cryptographic and network operations.
git clone https://github.com/katzenpost/katzenpost
cd katzenpost
git checkout v0.0.76
cd cmd/kpclientd
go build
The resulting kpclientd binary is run with a
TOML configuration file:
./kpclientd -c /path/to/client.toml
A configuration file is required. For testing, the
Docker test
mixnet generates one automatically; for joining a public
network, you would obtain the configuration from that network’s
operators.
Go thin client
The Go thin client is a library, imported as a Go module:
A decentralised group chat client built atop Qt. It depends solely
on the Katzenpost mix network and the Pigeonhole storage services.
No central server is involved. The underlying design is set out in
the Echomix
paper.
sudo apt install libxcb-cursor0 libegl1
git clone https://github.com/katzenpost/katzenqt
cd katzenqt
git checkout 0.0.2-rc6
make deps
make run
If make deps run does not produce a running
interface, the component-by-component sequence in
katzenqt’s README.md is the
recommended fallback.
Verifying the stack
Once kpclientd is running with a valid
configuration, a single test from the Python integration suite is
sufficient to exercise the full Pigeonhole round trip: Alice
writes a message to the storage replicas via the courier, and Bob
reads it back.
source thin_client/.venv/bin/activate
cd thin_client
pytest tests/test_new_pigeonhole_api.py::test_alice_sends_bob_complete_workflow
A successful run indicates that kpclientd is
connected, the PKI document has been retrieved, the network is
producing consensus, and the courier and replicas are reachable.
The remainder of the suite (pytest with no
arguments) covers tombstones, copy commands, and the various error
paths.
4 - Build and run katzenqt from source
How to build and run katzenqt, the Qt group chat client, from source on Debian or Ubuntu.
katzenqt is the Qt group chat client. It is a decentralised
application that runs over the Katzenpost mix network and the
Pigeonhole storage services. The design is set out in the
Echomix paper.
Warning. katzenqt is in active development and has not yet
been tagged for general release. It is not appropriate to rely on
the software for anonymity, security, or privacy at this stage.
Pre-built packages will be linked from the docs landing page
once a release is cut.
Prerequisites
These instructions assume an up-to-date Debian or Ubuntu Linux system
with the following packages installed:
sudo apt install -y git make libxcb-cursor0 libegl1
The Makefile does the rest: it provisions a Python virtual
environment via uv, builds
kpclientd, installs it as a systemd user service, and launches the
GUI.
Quick start
git clone https://github.com/katzenpost/katzenqt
cd katzenqt
make deps
make run
If make run opens the katzenqt window, you are ready.
Step by step (if the quick start fails)
If the two-command form does not yield a running interface, the
following sequence runs each phase of the setup independently and is
useful for diagnosing where things have gone awry:
make system-setup        # apt-installs the system dependencies
make setup-uv            # bootstraps the uv-managed venv
make setup               # installs Python dependencies into the venv
make test                # runs the test suite
make kpclientd           # builds the kpclientd binary
make install-kpclient    # installs kpclientd into ~/.local/bin
make kpclientd.service   # installs the systemd user unit
make status              # verifies the install
A successful make status produces something like the following:
backend: uv
venv: .venv
kpclientd(bin): /home/<user>/.local/bin/kpclientd
kpclientd(service): active
kpclientd(path): found
Once that is the case, make run should bring up the application.
Persistent state
katzenqt keeps all of its persistent data (keys, conversation logs,
BACAP capabilities, message indices) in a single SQLite file at
~/.local/share/katzenqt/katzen.sqlite3. The environment variable
KQT_STATE overrides the file name, which is useful when running two
instances on the same machine (one talking to the other):
KQT_STATE=alice make run # uses ~/.local/share/katzenqt/alice.sqlite3
Current caveats
katzenqt is currently developed against a debug branch of the
katzenpost repository (tb/debug2025-09-21) rather than the
pinned release tag listed on Build from source.
This will be reconciled when a katzenqt tag is cut.
A tag has not yet been published.
See also
Build from source: pinned versions of
the rest of the Katzenpost stack.
The katzenqt repository:
for issues, pull requests, and the latest README.md / HACKING.md.
5 - Python Thin Client API
Generated API reference for the Katzenpost Python thin client (katzenpost_thinclient).
This is the API reference for the katzenpost_thinclient Python
package, the Python binding of the Katzenpost thin client. The thin
client is an interface to the kpclientd daemon, which performs all
cryptographic and network operations; the binding itself does no
cryptography.
This page is generated by website/tools/python-api-gen/ from the
docstrings of the pinned katzenpost_thinclient release, using the
native Python documentation tool pydoc-markdown.
Do not edit it directly: changes belong in the binding docstrings (in
the thin_client repository) and will be overwritten by the next
generation pass.
This page documents the 0.0.15 release of the Python
binding (source,
PyPI). Symbols are
re-exported from katzenpost_thinclient, so application code may
import them directly, for example from katzenpost_thinclient import ThinClient, Config.
Maps error codes to exception instances for StartResendingEncryptedMessage.
This matches Go’s errorCodeToSentinel function in thin/pigeonhole.go.
The daemon passes through pigeonhole replica error codes (1-9) for replica-level errors.
For other errors (thin client errors like decryption failures), specific exceptions are raised.
Maps a StartResendingCopyCommandReply dict to an exception (or None on success).
Unlike error_code_to_exception(), this helper has access to the reply’s
diagnostic fields (replica_error_code, failed_envelope_index), which it
uses to construct a CopyCommandFailedError when the courier reports
THIN_CLIENT_ERROR_COPY_COMMAND_FAILED.
Arguments:
reply - The decoded start_resending_copy_command_reply dict.
Returns:
None if error_code is 0 (success); otherwise an Exception instance.
is_expected_outcome
def is_expected_outcome(exc: Exception) -> bool
Returns True for exceptions that represent completed operations rather than failures.
These errors should not trigger retries.
Geometry
class Geometry()
Geometry describes the geometry of a Sphinx packet.
NOTE: You must not try to compose a Sphinx Geometry yourself.
It must be programmatically generated by the Katzenpost
genconfig or gensphinx CLI utilities.
All of the Sphinx Geometry attributes are described below. However,
the only one you need in order to facilitate your thin client
message bounds checking is UserForwardPayloadLength, which indicates
the maximum size of a message that you can send to a mixnet service in
a single packet.
Attributes:
PacketLength (int) - The total length of a Sphinx packet in bytes.
NrHops (int) - The number of hops; determines the header’s structure.
HeaderLength (int) - The total size of the Sphinx header in bytes.
RoutingInfoLength (int) - The length of the routing information portion of the header.
PerHopRoutingInfoLength (int) - The length of routing info for a single hop.
SURBLength (int) - The length of a Single-Use Reply Block (SURB).
SphinxPlaintextHeaderLength (int) - The length of the unencrypted plaintext header.
PayloadTagLength (int) - The length of the tag used to authenticate the payload.
ForwardPayloadLength (int) - The size of the full payload including padding and tag.
UserForwardPayloadLength (int) - The usable portion of the payload intended for the recipient.
NextNodeHopLength (int) - Derived from the expected maximum routing info block size.
SPRPKeyMaterialLength (int) - The length of the key used for SPRP (Sphinx packet payload encryption).
NIKEName (str) - Name of the NIKE scheme (if used). Mutually exclusive with KEMName.
KEMName (str) - Name of the KEM scheme (if used). Mutually exclusive with NIKEName.
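The bounds check that UserForwardPayloadLength enables might be sketched as follows. The helper `check_payload_fits` is hypothetical and takes the limit as a plain integer; in an application the value would come from the Geometry supplied by the daemon.

```python
def check_payload_fits(payload: bytes, user_forward_payload_length: int) -> None:
    """Raise ValueError if the payload cannot fit in a single Sphinx packet."""
    if len(payload) > user_forward_payload_length:
        raise ValueError(
            f"payload is {len(payload)} bytes, but a single packet carries at most "
            f"{user_forward_payload_length} bytes"
        )
```

Checking before sending lets an application fail early, or split the message, rather than discover the limit from a daemon-side error.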
PigeonholeGeometry
class PigeonholeGeometry()
PigeonholeGeometry describes the geometry of a Pigeonhole envelope.
This provides mathematically precise geometry calculations for the
Pigeonhole protocol using trunnel’s fixed binary format.
It supports 3 distinct use cases:
Given MaxPlaintextPayloadLength → compute all envelope sizes
Given precomputed Pigeonhole Geometry → derive accommodating Sphinx Geometry
Given Sphinx Geometry constraint → derive optimal Pigeonhole Geometry
Attributes:
max_plaintext_payload_length (int) - The maximum usable plaintext payload size within a Box.
courier_query_read_length (int) - The size of a CourierQuery containing a ReplicaRead.
courier_query_write_length (int) - The size of a CourierQuery containing a ReplicaWrite.
courier_query_reply_read_length (int) - The size of a CourierQueryReply containing a ReplicaReadReply.
courier_query_reply_write_length (int) - The size of a CourierQueryReply containing a ReplicaWriteReply.
nike_name (str) - The NIKE scheme name used in MKEM for encrypting to multiple storage replicas.
signature_scheme_name (str) - The signature scheme used for BACAP (always “Ed25519”).
PigeonholeGeometry.validate
def validate() -> None
Validates that the geometry has valid parameters.
Raises:
ValueError - If the geometry is invalid.
PigeonholeGeometry.padded_payload_length
def padded_payload_length() -> int
Returns the payload size after adding length prefix.
Returns:
int - The padded payload length (max_plaintext_payload_length + 4).
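To make the "+ 4" concrete, the sketch below pads a plaintext to a fixed size behind a 4-byte length prefix. Only the resulting length (max_plaintext_payload_length + 4) is taken from the documentation above; the big-endian prefix, the zero-byte fill, and the function names are assumptions made for illustration.

```python
import struct

LENGTH_PREFIX_SIZE = 4


def pad_payload(plaintext: bytes, max_plaintext_payload_length: int) -> bytes:
    """Return a buffer of exactly max_plaintext_payload_length + 4 bytes."""
    if len(plaintext) > max_plaintext_payload_length:
        raise ValueError("plaintext exceeds max_plaintext_payload_length")
    prefix = struct.pack(">I", len(plaintext))  # assumed big-endian u32
    fill = b"\x00" * (max_plaintext_payload_length - len(plaintext))
    return prefix + plaintext + fill


def unpad_payload(padded: bytes) -> bytes:
    """Recover the original plaintext from a padded buffer."""
    (length,) = struct.unpack_from(">I", padded)
    return padded[LENGTH_PREFIX_SIZE:LENGTH_PREFIX_SIZE + length]
```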
ConfigFile
class ConfigFile()
ConfigFile represents everything loaded from a TOML file: only the
subtable-discriminated Dial transport config. The geometries are
supplied by the daemon over the handshake, not configured here.
Raises ConfigError eagerly on any structural problem: unknown
top-level sections (a leftover [SphinxGeometry] or
[PigeonholeGeometry] is now rejected here), missing required
sections, wrong types, or unknown / missing keys within the
Dial subtable. The intent is that a stale or drifted config
fails here at startup rather than producing a mysterious
runtime failure later.
pretty_print_obj
def pretty_print_obj(obj: "Any") -> str
Pretty-print a Python object using indentation and return the formatted string.
This function uses pprintpp to format complex data structures
(e.g., dictionaries, lists) in a readable, indented format.
Arguments:
obj (Any) - The object to pretty-print.
Returns:
str - The pretty-printed representation of the object.
ServiceDescriptor
class ServiceDescriptor()
Describes a mixnet service endpoint retrieved from the PKI document.
A ServiceDescriptor encapsulates the necessary information for communicating
with a service on the mix network. The service node’s identity public key’s hash
is used as the destination address along with the service’s queue ID.
Attributes:
recipient_queue_id (bytes) - The identifier of the recipient’s queue on the mixnet. (“Kaetzchen.endpoint” in the PKI)
mix_descriptor (dict) - A CBOR-decoded dictionary describing the mix node,
typically includes the ‘IdentityKey’ and other metadata.
Methods:
to_destination() - Returns a tuple of (provider_id_hash, recipient_queue_id),
where the provider ID is a 32-byte BLAKE2b hash of the IdentityKey.
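The destination tuple can be illustrated with the standard library alone. This standalone function is a sketch, not the binding's method: it assumes the IdentityKey is available as raw bytes and is hashed without any further encoding.

```python
import hashlib


def to_destination(identity_key: bytes, recipient_queue_id: bytes):
    """Return (provider_id_hash, recipient_queue_id) for a service endpoint."""
    provider_id_hash = hashlib.blake2b(identity_key, digest_size=32).digest()
    return provider_id_hash, recipient_queue_id
```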
Search the PKI document for services supporting the specified capability.
This function iterates over all service nodes in the PKI document,
deserializes each CBOR-encoded node, and looks for advertised capabilities.
If a service provides the requested capability, it is returned as a
ServiceDescriptor.
Arguments:
capability (str) - The name of the capability to search for (e.g., “echo”).
doc (dict) - The decoded PKI document as a Python dictionary,
which must include a “ServiceNodes” key containing CBOR-encoded descriptors.
Returns:
List[ServiceDescriptor] - A list of matching service descriptors that advertise the capability.
Raises:
KeyError - If the ‘ServiceNodes’ field is missing from the PKI document.
Config
class Config()
Configuration object for the ThinClient containing connection details and event callbacks.
The Config class loads network configuration from a TOML file and provides optional
callback functions that are invoked when specific events occur during client operation.
Attributes:
network (str) - Network type ('tcp', 'unix', etc.)
address (str) - Network address (host:port for TCP, path for Unix sockets)
on_connection_status (callable) - Callback for connection status changes
on_new_pki_document (callable) - Callback for new PKI documents
on_message_sent (callable) - Callback for message transmission confirmations
on_message_reply (callable) - Callback for received message replies
Example:
def handle_reply(event):
    # Process the received reply
    payload = event['payload']

config = Config("client.toml", on_message_reply=handle_reply)
client = ThinClient(config)
filepath (str) - Path to the TOML config file containing network, address, and geometry.
on_connection_status (callable, optional) - Callback invoked when the daemon’s connection
status to the mixnet changes. The callback receives a single argument:
event (dict): Connection status event with keys:
'is_connected' (bool): True if daemon is connected to mixnet, False otherwise
'err' (str, optional): Error message if connection failed, empty string if no error
Example - {'is_connected': True, 'err': ''}
on_new_pki_document (callable, optional) - Callback invoked when a new PKI document
is received from the mixnet. The callback receives a single argument:
event (dict): PKI document event with keys:
'payload' (bytes): CBOR-encoded PKI document data stripped of signatures
Example - {'payload': b'\xa5\x64Epoch\x00...'}
on_message_sent (callable, optional) - Callback invoked when a message has been
successfully transmitted to the mixnet. The callback receives a single argument:
event (dict): Message sent event with keys:
'message_id' (bytes): 16-byte unique identifier for the sent message
'surbid' (bytes, optional): SURB ID if message was sent with SURB, None otherwise
'sent_at' (str): ISO timestamp when message was sent
'reply_eta' (float): Expected round-trip time in seconds for reply
'err' (str, optional): Error message if sending failed, empty string if successful
All callbacks are optional. If not provided, the corresponding events will be ignored.
Callbacks should be lightweight and non-blocking as they are called from the client’s
event processing loop.
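One way to keep callbacks lightweight is to have each one do nothing but enqueue its event, leaving the real work to a separate task. The sketch below is only an illustration of that pattern and is not part of the thin client: `make_callbacks`, `drain`, and the queue layout are hypothetical names, and the event dictionaries are built by hand from the keys documented above rather than received from a daemon.

```python
import asyncio


def make_callbacks(inbox: asyncio.Queue) -> dict:
    """Build callbacks that only enqueue the event and return immediately."""

    def on_connection_status(event: dict) -> None:
        inbox.put_nowait(("connection_status", event))

    def on_message_sent(event: dict) -> None:
        inbox.put_nowait(("message_sent", event))

    return {
        "on_connection_status": on_connection_status,
        "on_message_sent": on_message_sent,
    }


async def drain(inbox: asyncio.Queue, count: int) -> list:
    """Do the slow processing here, outside the callbacks themselves."""
    handled = []
    for _ in range(count):
        kind, event = await inbox.get()
        if kind == "connection_status":
            handled.append("online" if event["is_connected"] else f"offline: {event['err']}")
        elif kind == "message_sent":
            handled.append("sent" if not event.get("err") else f"send failed: {event['err']}")
    return handled


async def demo() -> list:
    inbox: asyncio.Queue = asyncio.Queue()
    callbacks = make_callbacks(inbox)
    # Hand-built events using the documented keys (no daemon involved).
    callbacks["on_connection_status"]({"is_connected": True, "err": ""})
    callbacks["on_message_sent"]({
        "message_id": b"\x00" * 16,
        "surbid": None,
        "sent_at": "2024-01-01T00:00:00",
        "reply_eta": 1.5,
        "err": "",
    })
    return await drain(inbox, 2)


RESULT = asyncio.run(demo())
```

In real use the returned callables would be passed as the corresponding keyword arguments to Config.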
ThinClient
class ThinClient()
A minimal Katzenpost Python thin client for communicating with the local
Katzenpost client daemon over a UNIX or TCP socket.
The thin client is responsible for:
Establishing a connection to the client daemon.
Receiving and parsing PKI documents.
Sending messages to mixnet services (with or without SURBs).
Handling replies and events via user-defined callbacks.
All cryptographic operations are handled by the daemon, not by this client.
ThinClient.__init__
def __init__(config: Config) -> None
Initialize the thin client with the given configuration.
Arguments:
config (Config) - The configuration object containing socket details and callbacks.
Raises:
RuntimeError - If the network type is not recognized or config is incomplete.
Start the thin client: establish connection to the daemon, read initial events,
and begin the background event loop.
Arguments:
loop (asyncio.AbstractEventLoop) - The running asyncio event loop.
Exceptions:
BrokenPipeError
ThinClient.get_config
def get_config() -> Config
Returns the current configuration object.
Returns:
Config - The client configuration in use.
ThinClient.is_connected
def is_connected() -> bool
Returns True if the daemon is currently connected to the mixnet.
Note the distinction: this reflects the daemon’s mixnet
connectivity, not the local socket between this thin client and
the daemon. The daemon may be reachable while the mixnet itself
is unreachable; in that case the local socket is fine but this
method returns False, and send_message / blocking_send_message
will raise ThinClientOfflineError. The latest value is updated
from ConnectionStatusEvents pushed by the daemon.
Returns:
bool - True if the daemon is connected to the mixnet, False
otherwise (offline mode).
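Because is_connected reports the daemon’s mixnet connectivity, a caller may want to wait for it to become True before sending. The following is a minimal polling sketch under stated assumptions: `wait_until_online` and `FakeClient` are hypothetical names introduced here, and `FakeClient` stands in for a real ThinClient by exposing only is_connected(). Reacting to the on_connection_status callback is an alternative to polling.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in exposing only is_connected(), for illustration."""

    def __init__(self, offline_polls: int) -> None:
        self._offline_polls = offline_polls

    def is_connected(self) -> bool:
        # Report "offline" for the first few polls, then "online".
        if self._offline_polls > 0:
            self._offline_polls -= 1
            return False
        return True


async def wait_until_online(client, timeout: float, poll_interval: float = 0.01) -> bool:
    """Poll is_connected() until it returns True or the timeout expires."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while not client.is_connected():
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(poll_interval)
    return True


CAME_ONLINE = asyncio.run(wait_until_online(FakeClient(3), timeout=1.0))
STAYED_OFFLINE = asyncio.run(wait_until_online(FakeClient(10**9), timeout=0.05))
```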
ThinClient.stop
def stop() -> None
Gracefully shut down the client and close its socket.
Sends a thin_close message to the daemon so it can clean up
ARQ state for this connection before disconnecting.
ThinClient.disconnect
def disconnect() -> None
Close the connection without sending thin_close.
The daemon preserves all state for this client’s app ID, allowing
the client to reconnect and resume with the same session token.
Background task that listens for events and dispatches them.
Survives daemon disconnects by automatically reconnecting with exponential backoff.
Only stopping (from stop()) causes this task to exit.
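Exponential backoff, as mentioned above, doubles the wait between reconnection attempts up to some ceiling. The generator below is a generic sketch of that idea; the base delay, the cap, and the name `backoff_delays` are assumptions chosen for illustration, and the schedule the thin client actually uses is not documented here.

```python
from typing import Iterator


def backoff_delays(base: float, cap: float, attempts: int) -> Iterator[float]:
    """Yield one delay per attempt, doubling each time and never exceeding cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= 2


# Illustrative values only: start at half a second, never wait more than 8 s.
DELAYS = list(backoff_delays(base=0.5, cap=8.0, attempts=6))
```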
ThinClient.parse_status
def parse_status(event: "Dict[str, Any]") -> None
Parse a connection status event and update connection state.
ThinClient.pki_document
def pki_document() -> "Dict[str, Any] | None"
Return the most recent PKI consensus document the daemon has
forwarded to this thin client.
The document is a CBOR map describing the current mixnet topology,
the set of available services, and per-node public-key material.
Useful inputs include the PKI epoch, the list of mix nodes, the
list of service providers, and the replica descriptors consulted
by Pigeonhole.
Returns:
Dict[str, Any] | None: The parsed CBOR PKI document, or
None if the daemon has not yet forwarded one (most
commonly on a freshly-connected client, before the first
on_new_pki_document callback has fired).
Return the cert.Certificate-wrapped signed PKI document for the
requested epoch, with every directory authority signature intact.
The thin client receives the stripped PKI document by default
(via the on_new_pki_document callback, also available through
pki_document and pki_document_for_epoch);
the daemon nils the signature map before forwarding it. Use this
method when the caller wishes to verify the directory authority
signatures itself: the returned payload may be deserialized and
verified with the katzenpost core/pki.FromPayload routine
against the authorities listed in client.toml.
Arguments:
epoch (int) - Epoch for which the signed PKI document should
be returned. Pass 0 (the default) to request the
document the daemon believes is current.
Returns:
Tuple[bytes, int]: (payload, epoch) where payload is
the cert.Certificate-wrapped signed PKI document and
epoch is the epoch of the returned document. When 0
was passed in, epoch echoes the epoch the daemon
resolved to.
Raises:
Exception - If the daemon has no cached document for the
requested epoch, or any other error code is returned.
ThinClient.parse_pki_doc
def parse_pki_doc(event: "Dict[str, Any]") -> None
Parse and store a new PKI document received from the daemon.
Return every courier service advertised in the current PKI
document, each described by an (identity_hash, queue_id)
tuple. The list reflects only the couriers that the current
consensus regards as serving.
The principal caller is the nested-copy-command machinery, which
needs to choose particular couriers rather than accept the random
draw made on the caller’s behalf by
start_resending_copy_command; for simple cases where any
courier will do, the default routing path is usually preferable.
Returns:
list[tuple[bytes, bytes]]: List of (identity_hash, queue_id) tuples.
Draw n couriers uniformly at random from the list returned by
get_all_couriers, without replacement, so that no two entries
in the returned list refer to the same courier. This is the usual
building block for a nested copy command, every layer of which
must be carried by a different courier.
Arguments:
n (int) - Number of distinct couriers to return.
Returns:
list[tuple[bytes, bytes]]: List of (identity_hash, queue_id) tuples.
Raises:
Exception - If the current PKI document advertises fewer than
n couriers.
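Drawing without replacement can be sketched with the standard library’s sampling routine. This is an illustration of the selection rule only, not the library’s implementation: `pick_distinct_couriers` is a hypothetical name, the courier list is made up, and the sketch raises ValueError (a subclass of Exception) when too few couriers are available.

```python
import secrets
from typing import List, Tuple

Courier = Tuple[bytes, bytes]  # (identity_hash, queue_id)


def pick_distinct_couriers(couriers: List[Courier], n: int) -> List[Courier]:
    """Return n couriers chosen uniformly at random, without replacement."""
    if n > len(couriers):
        raise ValueError(f"need {n} couriers, only {len(couriers)} advertised")
    # sample() never returns the same list position twice.
    return secrets.SystemRandom().sample(couriers, n)


# Made-up courier list standing in for the output of get_all_couriers.
ADVERTISED = [(bytes([i]) * 32, b"courier") for i in range(5)]
CHOSEN = pick_distinct_couriers(ADVERTISED, 3)
```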
Send a message and block until a reply is received or timeout.
Arguments:
payload (bytes or str) - Message payload.
dest_node (bytes) - Destination node identity hash.
dest_queue (bytes) - Destination recipient queue ID.
timeout_seconds (float) - Timeout in seconds (default 30).
Returns:
bytes - Reply payload from the destination service.
Raises:
ThinClientOfflineError - If in offline mode.
asyncio.TimeoutError - If no reply within timeout.
ThinClient.new_message_id
@staticmethod
def new_message_id() -> bytes
Generate a new 16-byte random message ID.
Message IDs are used to correlate SendMessage requests with their
corresponding MessageSentEvent and (if a SURB is present)
MessageReplyEvent. Callers generally do not need to construct
one by hand — blocking_send_message does it internally — but
this helper is exposed for callers composing requests manually.
Randomness is drawn from os.urandom.
Returns:
bytes - Random 16-byte identifier.
ThinClient.new_surb_id
def new_surb_id() -> bytes
Generate a new random SURB ID.
SURB IDs identify which Single Use Reply Block a given
on_message_reply event corresponds to. Pass the returned bytes
as the surb_id argument to send_message, then watch the
callback for a matching reply. Randomness is drawn from
os.urandom.
Returns:
bytes - Random identifier of SURB_ID_SIZE bytes.
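The correlation described above can be sketched as a small table keyed by SURB ID. Everything here is an assumption made for illustration: `PendingReplies` is a hypothetical helper, the value 16 for `SURB_ID_SIZE` is a placeholder for the library’s constant, and the reply event is assumed to carry 'surbid' and 'payload' keys (only 'payload' appears in the Config example; 'surbid' is documented for the message-sent event).

```python
import os
from typing import Dict, Optional

SURB_ID_SIZE = 16  # placeholder for this sketch; use the library's constant in real code


def new_surb_id() -> bytes:
    """Local stand-in that draws a random ID from os.urandom."""
    return os.urandom(SURB_ID_SIZE)


class PendingReplies:
    """Track which SURB IDs are still awaiting a reply."""

    def __init__(self) -> None:
        self._waiting: Dict[bytes, str] = {}
        self.completed: Dict[str, bytes] = {}

    def expect(self, label: str) -> bytes:
        """Register a request and return the SURB ID to send with it."""
        surb_id = new_surb_id()
        self._waiting[surb_id] = label
        return surb_id

    def on_message_reply(self, event: dict) -> Optional[str]:
        """Match a reply event to its request; ignore unknown SURB IDs."""
        label = self._waiting.pop(event.get("surbid"), None)
        if label is not None:
            self.completed[label] = event["payload"]
        return label


pending = PendingReplies()
surb_id = pending.expect("echo-request")
MATCHED = pending.on_message_reply({"surbid": surb_id, "payload": b"pong"})
UNMATCHED = pending.on_message_reply({"surbid": b"\x00" * SURB_ID_SIZE, "payload": b"stray"})
```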
ThinClient.new_query_id
def new_query_id() -> bytes
Generate a new 16-byte random query ID.
Query IDs correlate requests and replies within the thin client ↔
daemon CBOR protocol (distinct from mix-network SURB IDs, which
identify replies within the mixnet itself). Most callers never
touch query IDs directly; they are used internally by the
Pigeonhole API helpers. Randomness is drawn from os.urandom.
Pretty-print a parsed PKI document with fully decoded CBOR nodes.
Arguments:
doc (dict) - Raw PKI document from the daemon.
katzenpost_thinclient.pigeonhole
Katzenpost Python Thin Client - New Pigeonhole API
This module provides the new capability-based Pigeonhole API methods.
These methods use WriteCap/ReadCap keypairs and provide direct
control over the Pigeonhole protocol.
KeypairResult
@dataclass
class KeypairResult()
Result from new_keypair containing the generated capabilities.
EncryptReadResult
@dataclass
class EncryptReadResult()
Result from encrypt_read containing the encrypted read request.
EncryptWriteResult
@dataclass
class EncryptWriteResult()
Result from encrypt_write containing the encrypted write request.
StartResendingResult
@dataclass
class StartResendingResult()
Result from start_resending_encrypted_message and its variants.
StartResendingResult.plaintext
Decrypted message for read operations, or empty bytes for writes.
StartResendingResult.courier_identity_hash
32-byte hash of the identity key of the courier that handled this message.
Callers can watch PKI document updates for this courier disappearing from
consensus and cancel+re-encrypt if needed.
StartResendingResult.courier_queue_id
Queue ID of the courier that handled this message.
Creates a new keypair for use with the Pigeonhole protocol.
This method generates a WriteCap and ReadCap from the provided seed using
the BACAP (Blinding-and-Capability) protocol. The WriteCap should be stored
securely for writing messages, while the ReadCap can be shared with others
to allow them to read messages.
Arguments:
seed - 32-byte seed used to derive the keypair.
Returns:
KeypairResult - Contains write_cap, read_cap, and first_message_index.
Raises:
Exception - If the keypair creation fails.
ValueError - If seed is not exactly 32 bytes.
Example:
import os

seed = os.urandom(32)
result = await client.new_keypair(seed)
# Share result.read_cap with Bob so he can read messages
# Store result.write_cap for sending messages
Encrypts a read operation for a given read capability.
This method prepares an encrypted read request that can be sent to the
courier service to retrieve a message from a pigeonhole box. The returned
ciphertext should be sent via start_resending_encrypted_message.
Arguments:
read_cap - Read capability that grants access to the channel.
message_box_index - Starting read position for the channel.
Returns:
EncryptReadResult - Contains message_ciphertext, envelope_descriptor,
and envelope_hash.
Raises:
Exception - If the encryption fails.
Example:
result = await client.encrypt_read(read_cap, message_box_index)
# Send result.message_ciphertext via start_resending_encrypted_message
Encrypts a write operation for a given write capability.
This method prepares an encrypted write request that can be sent to the
courier service to store a message in a pigeonhole box. The returned
ciphertext should be sent via start_resending_encrypted_message.
Plaintext Size Constraint:
The plaintext must not exceed PigeonholeGeometry.max_plaintext_payload_length
bytes. The daemon internally adds a 4-byte big-endian length prefix before
padding and encryption, so the actual wire format is:
[4-byte length][plaintext][zero padding].
If the plaintext exceeds the maximum size, the daemon will return
ThinClientErrorInvalidRequest.
Arguments:
plaintext - The plaintext message to encrypt. Must be at most
PigeonholeGeometry.max_plaintext_payload_length bytes.
write_cap - Write capability that grants access to the channel.
message_box_index - The message box index for this write operation.
Returns:
EncryptWriteResult - Contains message_ciphertext, envelope_descriptor,
and envelope_hash.
Raises:
Exception - If the encryption fails (including if plaintext is too large).
Example:
plaintext = b"Hello, Bob!"
result = await client.encrypt_write(plaintext, write_cap, message_box_index)
# Send result.message_ciphertext via start_resending_encrypted_message
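The framing described under Plaintext Size Constraint, [4-byte length][plaintext][zero padding], can be sketched as follows. The daemon performs this step itself, so callers never need to; the code only illustrates the layout. `frame_plaintext` and `unframe_plaintext` are hypothetical names, and padding up to the maximum plaintext length is an assumption about the final size.

```python
import struct

LENGTH_PREFIX_SIZE = 4  # 4-byte big-endian length prefix


def frame_plaintext(plaintext: bytes, max_plaintext_payload_length: int) -> bytes:
    """Prefix the plaintext with its length and zero-pad to a fixed size."""
    if len(plaintext) > max_plaintext_payload_length:
        raise ValueError("plaintext exceeds max_plaintext_payload_length")
    padding = bytes(max_plaintext_payload_length - len(plaintext))
    return struct.pack(">I", len(plaintext)) + plaintext + padding


def unframe_plaintext(framed: bytes) -> bytes:
    """Recover the plaintext by reading the length prefix and dropping padding."""
    (length,) = struct.unpack(">I", framed[:LENGTH_PREFIX_SIZE])
    return framed[LENGTH_PREFIX_SIZE:LENGTH_PREFIX_SIZE + length]


# A tiny maximum of 8 bytes keeps the example readable.
FRAMED = frame_plaintext(b"hi", max_plaintext_payload_length=8)
ROUND_TRIP = unframe_plaintext(FRAMED)
```

The length prefix is what lets a reader distinguish real trailing zero bytes in a message from padding.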
This method initiates automatic repeat request (ARQ) for an encrypted message,
which will be resent periodically until either:
A reply is received from the courier
The message is cancelled via cancel_resending_encrypted_message
The client is shut down
This is used for both read and write operations in the new Pigeonhole API.
The daemon implements a finite state machine (FSM) for handling the stop-and-wait ARQ protocol:
For default write operations (write_cap != None, read_cap == None,
no_idempotent_box_already_exists == False):
The method waits for an ACK from the courier and returns immediately.
The ACK confirms the courier received the envelope and will dispatch it
to both shard replicas. This requires only a single round-trip through
the mixnet.
For BoxAlreadyExists-aware writes (no_idempotent_box_already_exists == True):
The method waits for an ACK, then sends a second SURB to retrieve the
replica’s error code. This requires two round-trips through the mixnet.
For read operations (read_cap != None, write_cap == None):
The method waits for an ACK from the courier, then the daemon automatically
sends a new SURB to request the payload, and this method waits for the payload.
The daemon performs all decryption (MKEM envelope + BACAP payload) and returns
the fully decrypted plaintext.
Arguments:
read_cap - Read capability (can be None for write operations, required for reads).
write_cap - Write capability (can be None for read operations, required for writes).
message_box_index - Current message box index being operated on (required for reads).
reply_index - Index of the reply to use (typically 0 or 1).
envelope_descriptor - Serialized envelope descriptor for MKEM decryption.
message_ciphertext - MKEM-encrypted message to send (from encrypt_read or encrypt_write).
envelope_hash - Hash of the courier envelope.
no_retry_on_box_id_not_found - Controls how reads handle BoxIDNotFound. By default
(False), reads retry on BoxIDNotFound until the box is found or the operation is
cancelled, riding out replication lag; the retries are not capped. Set to True
to receive an immediate BoxIDNotFound error without retries.
no_idempotent_box_already_exists - Controls how writes handle BoxAlreadyExists. By
default (False), BoxAlreadyExists is treated as idempotent success (the write
already happened). Set to True to have it returned as an error, which reveals
whether this call actually performed the write or the box already existed.
Returns:
StartResendingResult - Contains plaintext (decrypted message for reads, empty for
writes), courier_identity_hash, and courier_queue_id.
Raises:
BoxIDNotFoundError - If no_retry_on_box_id_not_found=True and the box does not exist.
BoxAlreadyExistsError - If no_idempotent_box_already_exists=True and the box
already contains data.
Exception - If the operation fails. Check error_code for specific errors.
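The stop-and-wait retransmission described above runs inside the daemon, but its shape can be illustrated with a small self-contained loop: send, wait a bounded time for a reply, and resend until a reply arrives or the operation is cancelled. All names here (`stop_and_wait`, `lossy_send`, `ResendCancelled`) are hypothetical, the retry interval is arbitrary, and no SURBs or mixnet traffic are involved.

```python
import asyncio
from typing import Awaitable, Callable, List


class ResendCancelled(Exception):
    """Hypothetical error for this sketch, raised when the loop is cancelled."""


async def stop_and_wait(
    send_once: Callable[[int], Awaitable[bytes]],
    retry_interval: float,
    cancelled: asyncio.Event,
) -> bytes:
    """Resend one message until a reply arrives or cancellation is requested."""
    attempt = 0
    while not cancelled.is_set():
        attempt += 1
        try:
            return await asyncio.wait_for(send_once(attempt), timeout=retry_interval)
        except asyncio.TimeoutError:
            continue  # no reply in time: retransmit the same message
    raise ResendCancelled("cancelled before a reply arrived")


ATTEMPTS: List[int] = []


async def lossy_send(attempt: int) -> bytes:
    """Simulated channel: the first two replies never arrive in time."""
    ATTEMPTS.append(attempt)
    if attempt < 3:
        await asyncio.sleep(1.0)
    return b"ACK"


async def demo() -> bytes:
    return await stop_and_wait(lossy_send, retry_interval=0.01, cancelled=asyncio.Event())


REPLY = asyncio.run(demo())
```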
Behaves exactly like start_resending_encrypted_message save that
it raises BoxAlreadyExistsError when the replica reports the
destination box has already been written, rather than swallowing the
condition as idempotent success. Use this when one needs to
distinguish a fresh write from a repeat: for instance, when
implementing optimistic concurrency on top of the channel, or when
establishing whether a particular call actually caused a state
change at the replica.
Note that this variant costs an additional mixnet round trip: the
BoxAlreadyExists code is carried by the replica’s reply rather than
the courier’s ACK, so the daemon must dispatch a second SURB before
it can return the answer.
As with start_resending_encrypted_message, an in-flight call
can be cancelled from another task via
cancel_resending_encrypted_message.
Arguments:
read_cap - Read capability (can be None for write operations, required for reads).
write_cap - Write capability (can be None for read operations, required for writes).
message_box_index - Current message box index being operated on (required for reads).
reply_index - Index of the reply to use (typically 0 or 1).
envelope_descriptor - Serialized envelope descriptor for MKEM decryption.
message_ciphertext - MKEM-encrypted message to send (from encrypt_read or encrypt_write).
envelope_hash - Hash of the courier envelope.
Returns:
StartResendingResult - Contains plaintext, courier_identity_hash, and courier_queue_id.
Raises:
BoxAlreadyExistsError - If the box already contains data.
Exception - If the operation fails.
Example:
try:
    await client.start_resending_encrypted_message_return_box_exists(
        None, write_cap, None, None, env_desc, ciphertext, env_hash
    )
except BoxAlreadyExistsError:
    print("Box already has data; write was idempotent")
Behaves exactly like start_resending_encrypted_message save that
it disables the daemon’s automatic retry of BoxIDNotFoundError.
The caller learns at once that the box is absent rather than waiting
for replication to settle.
Use this when polling a box that may not yet have been written: for
instance, when a reader peeks ahead at a peer’s next message before
that peer has produced it. The regular variant would block until
the box appeared, which can be many round trips.
As with start_resending_encrypted_message, an in-flight call
can be cancelled from another task via
cancel_resending_encrypted_message.
Arguments:
read_cap - Read capability (can be None for write operations, required for reads).
write_cap - Write capability (can be None for read operations, required for writes).
message_box_index - Current message box index being operated on (required for reads).
reply_index - Index of the reply to use (typically 0 or 1).
envelope_descriptor - Serialized envelope descriptor for MKEM decryption.
message_ciphertext - MKEM-encrypted message to send (from encrypt_read or encrypt_write).
envelope_hash - Hash of the courier envelope.
Returns:
StartResendingResult - Contains plaintext, courier_identity_hash, and courier_queue_id.
Raises:
BoxIDNotFoundError - If the box does not exist (no automatic retries).
Exception - If the operation fails.
Example:
try:
    result = await client.start_resending_encrypted_message_no_retry(
        read_cap, None, message_box_index, reply_idx, env_desc, ciphertext, env_hash
    )
except BoxIDNotFoundError:
    print("Box not found; message not yet written")
Increments a MessageBoxIndex using the BACAP NextIndex method.
This method is used when sending multiple messages to different mailboxes using
the same WriteCap or ReadCap. It properly advances the cryptographic state by:
Incrementing the Idx64 counter
Deriving new encryption and blinding keys using HKDF
Updating the HKDF state for the next iteration
The daemon handles the cryptographic operations internally, ensuring correct
BACAP protocol implementation.
Arguments:
message_box_index - Current message box index to increment (as bytes).
Returns:
bytes - The next message box index.
Raises:
Exception - If the increment operation fails.
Example:
current_index = first_message_index
next_index = await client.next_message_box_index(current_index)
# Use next_index for the next message
Return the BACAP Idx64 counter embedded in a MessageBoxIndex.
Callers that persist MessageBoxIndex blobs across sessions can use this
to order or compare two indexes — e.g. to detect a duplicate ACK that
would otherwise regress a write-cap’s index — without having to peek at
the binary layout themselves. The layout (first 8 bytes little-endian)
is a BACAP implementation detail and must not be relied on outside the
daemon.
Arguments:
message_box_index - MessageBoxIndex blob (as bytes) whose counter
should be returned.
Returns:
int - The BACAP Idx64 value.
Raises:
Exception - If the daemon rejects the request.
Example:
current_idx = await client.get_message_box_index_counter(mbi_a)
next_idx = await client.get_message_box_index_counter(mbi_b)
if next_idx <= current_idx:
    print("skipping stale ACK")
Starts resending a copy command to a courier via ARQ.
This method instructs a courier to read data from a temporary channel
(identified by the write_cap) and write it to the destination channel.
The command is automatically retransmitted until acknowledged.
If courier_identity_hash and courier_queue_id are both provided,
the copy command is sent to that specific courier. Otherwise, a
random courier is selected.
Arguments:
write_cap - Write capability for the temporary channel containing the data.
courier_identity_hash - Optional identity hash of a specific courier to use.
courier_queue_id - Optional queue ID for the specified courier. Must be set
if courier_identity_hash is set.
Raises:
Exception - If the operation fails.
Example:
# Send copy command to a random courier
await client.start_resending_copy_command(temp_write_cap)

# Send copy command to a specific courier
await client.start_resending_copy_command(
    temp_write_cap, courier_identity_hash, courier_queue_id
)
Packs a payload of arbitrary size (up to 10 MB) into properly sized
CopyStreamElement chunks for one destination channel. Each chunk
is a serialised CopyStreamElement, ready to be written to a box
via encrypt_write followed by start_resending_encrypted_message;
the caller marks the boundaries of the stream with the is_start
and is_last flags.
This method is stateless: no daemon state is kept between calls,
each invocation runs a fresh encoder and flushes before returning.
The 10 MB cap guards against accidental memory exhaustion.
Once the chunks have been written to a temporary copy stream, a
copy command (start_resending_copy_command) is dispatched to a
courier with the write capability for that temporary stream; the
courier reads the chunks back and writes each envelope to its
destination box.
Multiple calls can target the same destination stream by passing
next_dest_index from the previous result as dest_start_index.
Arguments:
payload - The data to be encoded into courier envelopes (max 10MB).
dest_write_cap - Write capability for the destination channel.
dest_start_index - Starting index in the destination channel.
is_start - Whether this is the first call (sets IsStart flag on first element).
is_last - Whether this is the last call (sets IsFinal flag on last element).
Returns:
CreateEnvelopesResult - Contains envelopes and next_dest_index.
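How the is_start and is_last flags mark the boundaries of a chunked stream can be illustrated with a toy chunker. This is a sketch under stated assumptions: `Chunk` is a hypothetical stand-in for a serialised CopyStreamElement, `chunk_payload` and the 4-byte chunk size are made up, and real elements carry encrypted envelopes rather than raw slices of the payload.

```python
from dataclasses import dataclass
from typing import List

MAX_PAYLOAD = 10 * 1024 * 1024  # the 10 MB cap described above


@dataclass
class Chunk:
    """Hypothetical stand-in for a serialised CopyStreamElement."""
    data: bytes
    is_start: bool
    is_final: bool


def chunk_payload(payload: bytes, chunk_size: int, is_start: bool, is_last: bool) -> List[Chunk]:
    """Split payload into chunks, flagging only the first and last of the stream."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the 10 MB cap")
    pieces = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)] or [b""]
    return [
        Chunk(piece, is_start and i == 0, is_last and i == len(pieces) - 1)
        for i, piece in enumerate(pieces)
    ]


CHUNKS = chunk_payload(b"abcdefghij", chunk_size=4, is_start=True, is_last=True)
```

When a stream is built over several calls, only the first call passes is_start=True and only the last passes is_last=True, so the flags still appear exactly once each.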
Packs payloads bound for several destination channels into a single
stream of CopyStreamElement chunks. This is more space-efficient
than calling create_courier_envelopes_from_payload once per
destination, because the shared encoder runs all envelopes together
rather than padding the final box of each destination independently.
This method is stateless: the buffer argument carries any residual
encoder state across calls in place of daemon-side bookkeeping. Pass
None for buffer on the first call and the buffer returned
by the previous call thereafter; set is_last on the final call so
the encoder flushes its tail.
Arguments:
destinations - List of destination payloads, each a dict with:
"payload": bytes - The data to be written
"write_cap": bytes - Write capability for destination
"start_index": bytes - Starting index in destination
is_start - Whether this is the first call in the sequence.
When True, the first CopyStreamElement will have IsStart=true.
is_last - Whether this is the last set of payloads in the sequence.
When True, the final CopyStreamElement will have IsFinal=true.
buffer - Residual encoder buffer from a previous call, or None.
Returns:
CreateEnvelopesResult - Contains envelopes and buffer for next call.
Raises:
Exception - If the envelope creation fails.
Example:
destinations = [
    {"payload": data1, "write_cap": cap1, "start_index": idx1},
    {"payload": data2, "write_cap": cap2, "start_index": idx2},
]
result = await client.create_courier_envelopes_from_multi_payload(
    destinations, is_start=True, is_last=False
)
# Pass buffer to next call
result2 = await client.create_courier_envelopes_from_multi_payload(
    more_destinations, is_start=False, is_last=True, buffer=result.buffer
)
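The role of the buffer argument, carrying leftover encoder state from one stateless call to the next, can be illustrated with a toy packer. This is not the daemon’s encoder: `pack_boxes`, the 4-byte box size, and the zero-byte padding are assumptions chosen to keep the example small.

```python
from typing import List, Optional, Tuple


def pack_boxes(
    data: bytes, buffer: Optional[bytes], box_size: int, is_last: bool
) -> Tuple[List[bytes], bytes]:
    """Emit full boxes; return the unfilled remainder for the next call."""
    pending = (buffer or b"") + data
    full = len(pending) - len(pending) % box_size
    boxes = [pending[i:i + box_size] for i in range(0, full, box_size)]
    residual = pending[full:]
    if is_last and residual:
        # Flush the tail: pad the final partial box only once, at the end.
        boxes.append(residual.ljust(box_size, b"\x00"))
        residual = b""
    return boxes, residual


# First call: pass None; one byte is left over and handed back to the caller.
boxes1, buf = pack_boxes(b"abcde", None, box_size=4, is_last=False)
# Final call: pass the previous buffer and set is_last so the tail is flushed.
boxes2, buf2 = pack_boxes(b"fg", buf, box_size=4, is_last=True)
```

Packed separately, the two inputs would occupy three boxes (two for the first, one for the second); sharing the residual brings that down to two, which is the space saving the text describes.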
CreateEnvelopesResult
@dataclass
class CreateEnvelopesResult()
Result of creating courier envelopes.
CreateEnvelopesResult.envelopes
The serialized CopyStreamElements to send to the network.
CreateEnvelopesResult.buffer
The buffered data that hasn’t been output yet. Persist this for crash recovery.
Only populated by create_courier_envelopes_from_multi_payload.
CreateEnvelopesResult.next_dest_index
The next destination message box index after all boxes consumed by this call.
Only populated by create_courier_envelopes_from_payload.
CreateEnvelopesResult.next_dest_indices
The next destination indices for each destination, in request order.
Only populated by create_courier_envelopes_from_multi_payload.
Prepares the encrypted envelopes needed to tombstone a consecutive
range of pigeonhole boxes beginning at the supplied
MessageBoxIndex. A tombstone is a signed empty payload that the
replica recognises as a deletion marker; the daemon constructs one
by signing rather than encrypting whenever encrypt_write is
invoked with an empty plaintext.
This method does not itself touch the network: it returns the
envelopes for the caller to dispatch one by one, typically via
start_resending_encrypted_message. To tombstone a single box,
pass max_count=1.
Arguments:
write_cap - Write capability for the boxes.
start - Starting MessageBoxIndex.
max_count - Maximum number of boxes to tombstone.
Returns:
TombstoneRangeResult - Contains envelopes (list of TombstoneEnvelope) and
next (the next MessageBoxIndex after the last processed).
Packs tombstones for a consecutive range of destination boxes into
CopyStreamElement chunks. The chunks are written to a temporary
copy stream and then dispatched as a copy command; the courier
applies all the tombstones atomically, which is the natural way to
retire a range of boxes as part of the same copy transaction that
writes their successors.
This method is stateless: the buffer argument carries any residual
encoder state across calls in place of daemon-side bookkeeping. Pass
None for buffer on the first call and the buffer returned
by the previous call thereafter; set is_last on the final call so
the encoder flushes its tail.
Arguments:
dest_write_cap - Write capability for the destination channel.
dest_start_index - Starting index in the destination channel.
max_count - Number of tombstones to create.
is_start - Whether this is the first call in the sequence.
is_last - Whether this is the last call in the sequence.
buffer - Residual encoder buffer from a previous call, or None.
Returns:
CreateEnvelopesResult - Contains envelopes, buffer, and next_dest_index.
Raises:
Exception - If the operation fails.
Example:
result = await client.create_courier_envelopes_from_tombstone_range(
    write_cap, start_index, 10, is_start=True, is_last=True
)
for envelope in result.envelopes:
    # write envelope to temp copy stream channel
    pass
katzenpost_thinclient.transport.tcp
TCP transport for the thin-client.
TcpDialConfig
@dataclass
class TcpDialConfig()
Configures a TCP dialer.
address is in host:port form, e.g. "localhost:64331" or "[::1]:64331".
network is one of "tcp", "tcp4", "tcp6"; defaults to "tcp".
katzenpost_thinclient.transport
Transport abstraction for the Python thin-client.
Each concrete transport (unix, tcp; in future ssh / pipe / pigeonhole)
exposes a setup_socket() method that returns a ready-to-connect socket
and the server address in the form expected by asyncio’s
loop.sock_connect.
DialConfig is a discriminated-union container: exactly one of its
inner variants must be populated. Zero or multiple populated variants
is a configuration error.
DialConfig
@dataclass
class DialConfig()
Discriminated-union of dial transports. Exactly one subtable must be populated.
Parse a TOML [Dial] subtable (dict) into a DialConfig.
Rejects unknown subtables (typos, removed variants, future
names) and unknown keys inside a recognised subtable. Exactly
one of [Dial.Unix] / [Dial.Tcp] must be populated.
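The validation rules above (reject unknown subtables, reject unknown keys, require exactly one populated variant) can be sketched over the plain dictionary a TOML parser would produce for the [Dial] table. This is an illustration rather than the library’s parser: `parse_dial`, `DialConfigError`, and the socket path are hypothetical, and the key names Address and Network are taken from the configuration table in the API reference.

```python
from typing import Any, Dict, Tuple

ALLOWED_KEYS = {"Unix": {"Address"}, "Tcp": {"Address", "Network"}}
TCP_NETWORKS = {"tcp", "tcp4", "tcp6"}


class DialConfigError(Exception):
    """Hypothetical error type for this sketch."""


def parse_dial(table: Dict[str, Dict[str, Any]]) -> Tuple[str, Dict[str, Any]]:
    """Validate a parsed [Dial] table and return (variant, settings)."""
    unknown = set(table) - set(ALLOWED_KEYS)
    if unknown:
        raise DialConfigError(f"unknown [Dial] subtable(s): {sorted(unknown)}")
    populated = [name for name in ALLOWED_KEYS if table.get(name)]
    if len(populated) != 1:
        raise DialConfigError("exactly one of [Dial.Unix] / [Dial.Tcp] must be populated")
    name = populated[0]
    bad_keys = set(table[name]) - ALLOWED_KEYS[name]
    if bad_keys:
        raise DialConfigError(f"unknown key(s) in [Dial.{name}]: {sorted(bad_keys)}")
    settings = dict(table[name])
    if name == "Tcp":
        settings.setdefault("Network", "tcp")  # documented default
        if settings["Network"] not in TCP_NETWORKS:
            raise DialConfigError(f"unsupported Network: {settings['Network']!r}")
    return name, settings


UNIX = parse_dial({"Unix": {"Address": "/tmp/kpclientd.sock"}})  # made-up path
TCP = parse_dial({"Tcp": {"Address": "localhost:64331"}})
```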
katzenpost_thinclient.transport.unix
Unix-domain-socket transport for the thin-client.
UnixDialConfig
@dataclass
class UnixDialConfig()
Configures a unix-domain-socket dialer.
Exceptions
Error types raised by the thin client; each derives from the standard library Exception.
ConfigError
class ConfigError(Exception)
Raised when the thin-client TOML config is missing required sections,
contains unknown keys, or otherwise fails structural validation.
Every caller of ConfigFile.load / Config(…) should expect this
exception. It is raised eagerly at startup so that a stale or
drifted config produces a loud, early failure instead of surfacing
later as a mysterious runtime error during mixnet operations.
ReplicaError
class ReplicaError(Exception)
Base class for all replica errors.
BoxIDNotFoundError
class BoxIDNotFoundError(ReplicaError)
Box ID not found on the replica. Occurs when reading from a non-existent mailbox.
InvalidBoxIDError
class InvalidBoxIDError(ReplicaError)
Invalid box ID format.
InvalidSignatureError
class InvalidSignatureError(ReplicaError)
Signature verification failed.
DatabaseFailureError
class DatabaseFailureError(ReplicaError)
Replica encountered a database error.
InvalidPayloadError
class InvalidPayloadError(ReplicaError)
Payload data is invalid.
StorageFullError
class StorageFullError(ReplicaError)
Replica’s storage capacity has been exceeded.
ReplicaInternalError
class ReplicaInternalError(ReplicaError)
Internal error on the replica.
InvalidEpochError
class InvalidEpochError(ReplicaError)
Epoch is invalid or expired.
ReplicationFailedError
class ReplicationFailedError(ReplicaError)
Replication to other replicas failed.
BoxAlreadyExistsError
class BoxAlreadyExistsError(ReplicaError)
Box already contains data. Pigeonhole writes are immutable.
TombstoneError
class TombstoneError(ReplicaError)
Box contains a tombstone (intentional deletion). This is not a failure.
InvalidTombstoneSignatureError
class InvalidTombstoneSignatureError(Exception)
Tombstone signature verification failed (forgery or corruption).
MKEMDecryptionFailedError
class MKEMDecryptionFailedError(Exception)
MKEM envelope decryption failed with all replica keys.
BACAPDecryptionFailedError
class BACAPDecryptionFailedError(Exception)
BACAP payload decryption or signature verification failed.
StartResendingCancelledError
class StartResendingCancelledError(Exception)
StartResendingEncryptedMessage operation was cancelled.
CopyCommandFailedError
class CopyCommandFailedError(Exception)
StartResendingCopyCommand operation failed on the courier.
The courier aborted the Copy command because a replica rejected one of the
embedded writes. Inspect the diagnostic attributes to determine the cause:
Attributes:
replica_error_code (int) - The pigeonhole replica ErrorCode that triggered
the abort (e.g. REPLICA_ERROR_BOX_ALREADY_EXISTS). 0 if not reported.
failed_envelope_index (int) - 1-based sequential position in the copy
stream of the envelope whose write triggered the abort. 0 if not
applicable. This is NOT a BACAP message index.
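A caller might turn the two diagnostic attributes into a readable report as sketched below. `CopyCommandFailed` is a hypothetical local stand-in that mirrors only the documented attributes, `describe_failure` is a made-up helper, and the error code 7 is an arbitrary number, not a real replica code. Note the conversion from the 1-based stream position to a 0-based list index, and the special meaning of 0.

```python
class CopyCommandFailed(Exception):
    """Hypothetical stand-in carrying the two documented diagnostic attributes."""

    def __init__(self, replica_error_code: int = 0, failed_envelope_index: int = 0) -> None:
        super().__init__("copy command failed")
        self.replica_error_code = replica_error_code
        self.failed_envelope_index = failed_envelope_index


def describe_failure(err: CopyCommandFailed, envelopes: list) -> str:
    """Summarise which envelope failed and why, treating 0 as 'not reported'."""
    if err.replica_error_code:
        what = f"replica error code {err.replica_error_code}"
    else:
        what = "replica error code not reported"
    if err.failed_envelope_index == 0:
        where = "position not reported"
    else:
        # 1-based position in the copy stream -> 0-based index into our list.
        offset = err.failed_envelope_index - 1
        where = (
            f"envelope {err.failed_envelope_index} of {len(envelopes)} "
            f"(list index {offset})"
        )
    return f"{what}; {where}"


REPORT = describe_failure(CopyCommandFailed(7, 2), [b"a", b"b", b"c"])
EMPTY_REPORT = describe_failure(CopyCommandFailed(), [])
```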
6 -
Run a Katzenpost Mix server in a Docker container
Prerequisites
Access to the namenlos git repo
Preparing the host filesystem
mkdir katzenpost-mix
cd katzenpost-mix
mkdir {conf,data}
chmod 700 data
All further actions are performed from the katzenpost-mix directory.
Building the Docker image
Create Dockerfile:
FROM golang:bookworm AS builder
LABEL authors="<ops@cryptonymity.net>"
RUN \
    cd /go && \
    git clone https://github.com/katzenpost/katzenpost && \
    cd katzenpost && \
    go mod tidy && \
    cd cmd/server && go build

FROM debian:bookworm AS deploy
COPY --from=builder /go/katzenpost/cmd/server/server /usr/bin/server
EXPOSE 8181
ARG uid=1000
ARG gid=1000
RUN \
    mkdir -p /home/user && \
    echo "user:x:${uid}:${gid}:User,,,:/home/user:/bin/bash" >> /etc/passwd && \
    echo "user:x:${uid}:" >> /etc/group
USER user
ENV HOME=/home/user
ENTRYPOINT ["/usr/bin/server", "-f", "/conf/katzenpost.toml"]
cd <namenlos.repo>/configs
make
git commit -a
git push
Starting/stopping the server
cd katzenpost-mix
./service.sh [start|stop]
7 -
Complete API reference for the Katzenpost thin client libraries (Go, Rust, Python)
Thin Client API Reference
This is the complete API reference for the Katzenpost thin client. The
thin client is an interface to the kpclientd daemon, which handles all
cryptographic and network operations. The thin client communicates
with the daemon over a local socket using CBOR-encoded messages.
This document is generated. The canonical source is
website/tools/thin-client-api-gen/; edit binding docstrings (in the
source trees) or groups.yaml / overlay/*.md (in the generator) — do
not edit this file directly, as local changes will be overwritten by
the next generation pass.
There are three implementations: a Go reference (katzenpost/client/thin), a
Rust binding (thin_client/src), and a Python binding
(thin_client/katzenpost_thinclient).
The thin client is configured via a TOML file that specifies only how
to reach the local daemon. We usually name this configuration file
thinclient.toml.
[Dial] selects the daemon transport. Set exactly one of the two
forms:
| Key | Type | Meaning |
| --- | --- | --- |
| [Dial.Unix] Address | string | Filesystem path of the daemon’s Unix socket. |
| [Dial.Tcp] Address | string | host:port of the daemon’s TCP listener. |
| [Dial.Tcp] Network | string | Optional: "tcp", "tcp4", or "tcp6" (default "tcp"). |
Concurrency
The Go ThinClient is safe for concurrent use by multiple goroutines:
its connection state, current PKI document, and in-flight request
tracking are guarded internally, so the cancel-from-another-goroutine
patterns shown in the how-to guide are
sound. The Rust and Python bindings are async: an instance is driven
from its runtime (a Tokio task or an asyncio event loop) and follows
that runtime’s ordinary conventions rather than offering an
independent thread-safety guarantee.
Connection Management
Dial / new / start
Dial establishes a connection to the client daemon and initializes the client.
This method performs the complete connection handshake with the client daemon:
Establishes network connection (TCP or Unix socket)
Receives initial connection status from daemon
Receives initial PKI document
Starts background workers for event handling
The client supports both online and offline modes. In offline mode (when the
daemon is not connected to the mixnet), channel preparation operations will
work but actual message transmission will fail.
After successful connection, the client will automatically handle:
PKI document updates
Connection status changes
Event distribution to application code
The Rust binding folds the connect step into its constructor:
ThinClient::new returns an already-connected handle. Go and
Python construct the client first and connect afterwards via
Dial() / start(), allowing the application to set up event
sinks (in Go) or callbacks (in Python) before any traffic flows.
Close / stop
Close gracefully shuts down the thin client and closes the daemon connection.
This method performs a clean shutdown by:
Sending a close notification to the daemon
Closing the network connection
Stopping all background workers
After calling Close(), the ThinClient instance should not be used further.
Any ongoing operations will be interrupted and may return errors.
func (t *ThinClient) Close() error
pub async fn stop(&self)
def stop(self) -> None:
IsConnected / is_connected
IsConnected returns true if the client daemon is connected to the mixnet.
This indicates whether the daemon has an active connection to the mixnet
infrastructure. When false, the client is in “offline mode” where channel
operations (prepare operations) will work but actual message transmission
will fail.
func (t *ThinClient) IsConnected() bool
pub fn is_connected(&self) -> bool
def is_connected(self) -> bool:
Disconnect / disconnect
Disconnect closes the connection without sending ThinClose.
The daemon preserves all state for this client’s app ID, allowing
the client to reconnect and resume with the same session token.
func (t *ThinClient) Disconnect() error
pub async fn disconnect(&self)
def disconnect(self) -> None:
Events
The thin client emits events for connection status changes, PKI
document updates, and message replies. Go uses an event channel; Rust
uses a broadcast receiver; Python uses async callbacks supplied to the
Config constructor.
// Get a channel that receives all events
eventCh := client.EventSink()
defer client.StopEventSink(eventCh)
for ev := range eventCh {
    switch ev.(type) {
    case *thin.ConnectionStatusEvent:
        // ...
    case *thin.NewDocumentEvent:
        // ...
    case *thin.MessageReplyEvent:
        // ...
    }
}

// Get a receiver that yields all events as CBOR BTreeMaps
let mut event_rx = client.event_sink();
tokio::spawn(async move {
    while let Some(event) = event_rx.recv().await {
        // Inspect event["type"] and dispatch
    }
});

# Pass async callback functions to the Config constructor.
# Each callback receives a dict with event-specific keys.
# All callbacks are optional; omitted events are ignored.
async def on_connection_status(event):
    print(f"Connected: {event['is_connected']}")

async def on_message_reply(event):
    print(f"Reply for SURBID {event['surbid']!r}: {event['payload']!r}")

config = Config(
    "thinclient.toml",
    on_connection_status=on_connection_status,
    on_message_reply=on_message_reply,
)
client = ThinClient(config)
Event types
ConnectionStatusEvent — emitted when the daemon’s connection
to the mixnet changes. Fields (Go): IsConnected bool, Err error,
InstanceToken [16]byte. InstanceToken uniquely identifies the
daemon process and lets clients notice daemon restarts.
NewDocumentEvent — emitted when a new PKI consensus document
is received from the directory authorities. The Go binding exposes
the parsed document as Document *cpki.Document. (The lower-level
NewPKIDocumentEvent carrying a raw CBOR Payload []byte is used
internally between daemon and thin client; applications should
consume NewDocumentEvent.)
MessageSentEvent — emitted when a SendMessage request has
been transmitted by the daemon. Fields (Go):
MessageID *[MessageIDLength]byte, SURBID *[SURBIDLength]byte,
SentAt time.Time, ReplyETA time.Duration, Err string.
MessageReplyEvent — emitted when a reply to a SendMessage
call is received. Fields (Go):
MessageID *[MessageIDLength]byte, SURBID *[SURBIDLength]byte,
Payload []byte, ReplyIndex *uint8, ErrorCode uint8.
ReplyIndex identifies which of the box’s two replicas answered:
each box is sharded across K=2 replicas, and the value (0 or 1) is
the position within that pair of the replica whose response was
used. It is chiefly of interest for Pigeonhole channel reads and may
be nil when not applicable. The same value is accepted as the
reply_index parameter of StartResendingEncryptedMessage, where
it likewise selects the replica of the pair to address.
ShutdownEvent: emitted when the daemon signals that it is
shutting down. It carries no fields. It precedes the loss of the
local socket and is what causes the following
DaemonDisconnectedEvent to report IsGraceful = true. Treat it
as advance notice of the disconnect; no action is required, since
the thin client reconnects and replays in-flight requests on its
own.
DaemonDisconnectedEvent — emitted by the thin client (not the
daemon) when the local socket connection to the daemon is lost.
Fields (Go): IsGraceful bool, Err error. IsGraceful is true
precisely when a ShutdownEvent preceded the disconnect.
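The remark that InstanceToken lets clients notice daemon restarts can be sketched as follows. This is an illustration only: the class RestartDetector is hypothetical, and the dictionary key instance_token is an assumed Python rendering of the Go field, not a documented name.

```python
class RestartDetector:
    """Counts daemon restarts by watching the instance token change."""

    def __init__(self) -> None:
        self.last_token: bytes | None = None
        self.restarts = 0

    def on_connection_status(self, event: dict) -> bool:
        """Return True when this event comes from a new daemon process."""
        token = event["instance_token"]  # assumed key name for InstanceToken
        restarted = self.last_token is not None and token != self.last_token
        if restarted:
            self.restarts += 1
        self.last_token = token
        return restarted


detector = RestartDetector()
detector.on_connection_status({"instance_token": b"A" * 16})
if detector.on_connection_status({"instance_token": b"B" * 16}):
    print("daemon restarted; daemon-side state may need re-establishing")
```

The first event only records the token; a restart is reported from the second distinct token onwards.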
EventSink / event_sink
EventSink returns a buffered channel that receives all events from the thin client.
This method creates a new event channel that will receive copies of all events
generated by the thin client, including:
Connection status changes
PKI document updates
Message sent confirmations
Message replies
Channel operation results
Error notifications
The returned channel is buffered with capacity 1. Events are never
silently dropped: the fan-out worker blocks until the subscriber
accepts each event, matching the “no loss” contract the Rust and
Python thin clients uphold. Consequently an application that
stops consuming from its sink will stall the entire fan-out
(including events destined for other subscribers); applications
must drain promptly or call StopEventSink() to release their
subscription.
Important: Always call StopEventSink() when done with the channel to prevent
resource leaks and ensure proper cleanup.
Note: The event sink channel is NOT closed when the client shuts down.
Consumers should also select on HaltCh() to detect shutdown, or they
can check for a ShutdownEvent in the event stream.
The Rust binding returns an mpsc::Receiver carrying the same
event stream. The Python binding has no equivalent method:
Python applications instead register async callbacks on the
Config constructor and receive events through those.
func (t *ThinClient) EventSink() chan Event
pub fn event_sink(&self) -> EventSinkReceiver
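The back-pressure consequence described above (one stalled subscriber holds up every other subscriber) can be reproduced in miniature. The sketch below models the fan-out worker with asyncio queues of capacity 1; it is a hypothetical model of the behaviour, not the thin client's implementation, and the names fan_out and demo are invented for the example.

```python
import asyncio


async def fan_out(source: asyncio.Queue, sinks: list[asyncio.Queue]) -> None:
    """Copy every event to every sink, blocking until each sink accepts it."""
    while True:
        event = await source.get()
        if event is None:          # sentinel used by this sketch to stop
            return
        for sink in sinks:
            await sink.put(event)  # blocks while a capacity-1 sink is full


async def demo() -> tuple[list[str], int]:
    source: asyncio.Queue = asyncio.Queue()
    fast, stalled = asyncio.Queue(maxsize=1), asyncio.Queue(maxsize=1)
    for event in ("a", "b", "c", None):
        source.put_nowait(event)
    # fast is listed first, so it is offered each event before stalled.
    worker = asyncio.create_task(fan_out(source, [fast, stalled]))
    received = []
    try:
        while True:
            # Only the fast subscriber drains; stalled never does.
            received.append(await asyncio.wait_for(fast.get(), timeout=0.1))
    except asyncio.TimeoutError:
        pass
    worker.cancel()
    return received, source.qsize()


print(asyncio.run(demo()))
```

In this model the draining subscriber stops receiving once the worker blocks on the stalled sink, and the remaining events stay queued at the source. This is why a subscriber that no longer wants events should release its subscription rather than simply stop reading.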
StopEventSink (Go only)
StopEventSink stops sending events to the specified channel and cleans up resources.
This method removes the channel from the event distribution system and should
be called when the application is done processing events from a channel
returned by EventSink(). Failure to call this method may result in resource
leaks and continued event processing overhead.
Rust subscribers are released by dropping the
mpsc::Receiver, so the binding exposes no explicit teardown
method. Python’s callback model owns no per-subscriber
resources either, and so likewise needs no equivalent.
func (t *ThinClient) StopEventSink(ch chan Event)
PKI and Service Discovery
PKIDocument / pki_document
PKIDocument returns the thin client’s current PKI document.
The PKI document contains the current network topology, service information,
and cryptographic parameters for the current epoch. This document is
automatically updated when the client daemon receives new PKI information.
GetPKIDocumentRaw returns the cert.Certificate-wrapped signed PKI
document for the requested epoch, with every directory authority
signature intact. Pass epoch == 0 to request the document the daemon
believes is current.
The thin client receives the stripped PKI document by default (as
pushed in NewPKIDocumentEvent); use this method when the caller
needs to verify the directory authority signatures itself. The
payload can be deserialized and verified with core/pki.FromPayload.
GetService returns a randomly selected service matching the specified capability.
This method is a convenience wrapper around GetServices() that randomly
selects one service from all available services with the given capability.
This provides automatic load balancing across available service instances.
GetServices returns all services matching the specified capability name.
This method searches the current PKI document for services that provide
the specified capability. Services in Katzenpost are identified by their
capability names (e.g., “echo”, “courier”, “keyserver”).
The Rust binding exposes the same lookup as the free function
find_services in helpers.rs, rather than as a method on
ThinClient.
SendMessage sends a message with reply capability using the legacy API.
This method sends a message with a Single Use Reply Block (SURB) that allows
the destination to send a reply. The method is asynchronous - it only blocks
until the daemon receives the send request, not until the message is actually
transmitted or a reply is received.
To receive replies, applications must monitor events from EventSink() and
look for MessageReplyEvent instances with matching SURB IDs.
SendMessageWithoutReply sends a fire-and-forget message using the legacy API.
This method sends a message without any reply capability. The message is
encapsulated in a Sphinx packet and sent through the mixnet, but no response
can be received. This is suitable for notifications or one-way communication.
BlockingSendMessage sends a message and blocks until a reply is received.
This method provides a synchronous request-response pattern by automatically
generating a SURB ID, sending the message, and waiting for the reply. It
blocks until either a reply is received or the context times out.
This is convenient for simple request-response interactions but lacks the
advanced features of the Pigeonhole Channel API such as message ordering,
channel persistence, and offline operation support.
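The request-response pattern that BlockingSendMessage provides can be built on the asynchronous send plus reply events, as the following sketch suggests. ReplyRouter, fake_send and the 16-byte identifier length are assumptions made for this example; a real application would call the binding's send method and feed the router from its reply callback or event sink.

```python
import asyncio
import os


class ReplyRouter:
    """Matches reply events to waiting callers by SURB ID."""

    def __init__(self) -> None:
        self._waiting: dict[bytes, asyncio.Future] = {}

    async def blocking_send(self, send, payload: bytes, timeout: float) -> bytes:
        surb_id = os.urandom(16)          # fresh random ID (length assumed)
        future = asyncio.get_running_loop().create_future()
        self._waiting[surb_id] = future
        try:
            await send(surb_id, payload)  # returns once the request is handed over
            return await asyncio.wait_for(future, timeout)
        finally:
            self._waiting.pop(surb_id, None)

    def on_message_reply(self, surb_id: bytes, payload: bytes) -> None:
        """Deliver a reply to the caller waiting on this SURB ID, if any."""
        future = self._waiting.get(surb_id)
        if future is not None and not future.done():
            future.set_result(payload)


async def demo() -> bytes:
    router = ReplyRouter()

    async def fake_send(surb_id: bytes, payload: bytes) -> None:
        # Stand-in for the mixnet: answer shortly afterwards with the
        # payload in upper case.
        asyncio.get_running_loop().call_later(
            0.01, router.on_message_reply, surb_id, payload.upper())

    return await router.blocking_send(fake_send, b"ping", timeout=1.0)


print(asyncio.run(demo()))
```

Replies carrying an unknown SURB ID are ignored, and the timeout plays the part that the context deadline plays in the Go binding.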
NewKeypair creates a new keypair for use with the Pigeonhole protocol.
This method generates a WriteCap and ReadCap from the provided seed using
the BACAP (Blinding-and-Capability) protocol. The WriteCap should be stored
securely for writing messages, while the ReadCap can be shared with others
to allow them to read messages.
NextMessageBoxIndex increments a MessageBoxIndex using the BACAP NextIndex method.
This method is used when sending multiple messages to different mailboxes using
the same WriteCap or ReadCap. It properly advances the cryptographic state by:
Incrementing the Idx64 counter
Deriving new encryption and blinding keys using HKDF
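The two steps above follow a common ratchet pattern: bump a counter, then derive fresh key material from the current state with HKDF. The sketch below shows that pattern only. It is emphatically not BACAP's actual key schedule: the BoxIndex fields, the info label, and the output lengths are all invented for illustration, and real applications should let the daemon compute the next index.

```python
import hashlib
import hmac
from dataclasses import dataclass


def hkdf_sha256(key: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 HKDF (extract-then-expand) with an all-zero salt."""
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


@dataclass(frozen=True)
class BoxIndex:
    """Hypothetical index state: a counter plus a chaining key."""
    idx64: int
    chain_key: bytes


def next_index(cur: BoxIndex) -> tuple[BoxIndex, bytes, bytes]:
    """Advance the counter and derive fresh per-box keys from the chain key."""
    okm = hkdf_sha256(cur.chain_key, b"sketch-next-index", 96)
    nxt = BoxIndex(cur.idx64 + 1, okm[:32])
    # (next index, encryption key, blinding factor), names assumed
    return nxt, okm[32:64], okm[64:]


first = BoxIndex(0, b"\x01" * 32)
second, enc_key, blind = next_index(first)
print(second.idx64, enc_key != blind)
```

The point of the pattern is that each box gets keys unrelated to its neighbours' while both parties, holding the same starting state, arrive at the same sequence.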
EncryptRead encrypts a read operation for a given read capability.
This method prepares an encrypted read request that can be sent to the
courier service to retrieve a message from a pigeonhole box. The returned
ciphertext should be sent via StartResendingEncryptedMessage.
EncryptWrite encrypts a write operation for a given write capability.
This method prepares an encrypted write request that can be sent to the
courier service to store a message in a pigeonhole box. The returned
ciphertext should be sent via StartResendingEncryptedMessage.
StartResendingEncryptedMessage sends an encrypted message via ARQ and blocks until completion.
This method BLOCKS until a reply is received. CancelResendingEncryptedMessage is only
useful when called from another goroutine to interrupt this blocking call.
The message will be resent periodically until either:
A reply is received from the courier (this method returns)
The message is cancelled via CancelResendingEncryptedMessage (from another goroutine)
The client is shut down
This is used for both read and write operations in the new Pigeonhole API.
The daemon implements a finite state machine (FSM) for handling the stop-and-wait ARQ protocol:
For default write operations (writeCap != nil, readCap == nil,
noIdempotentBoxAlreadyExists == false):
The method waits for an ACK from the courier and returns immediately.
The ACK confirms the courier received the envelope and will dispatch it
to both shard replicas. This requires only a single round-trip through
the mixnet.
For BoxAlreadyExists-aware writes (noIdempotentBoxAlreadyExists == true):
The method waits for an ACK, then sends a second SURB to retrieve the
replica’s error code. This requires two round-trips through the mixnet.
For read operations (readCap != nil, writeCap == nil):
The method waits for an ACK from the courier, then the daemon automatically
sends a new SURB to request the payload, and this method waits for the payload.
The daemon performs all decryption (MKEM envelope + BACAP payload) and returns
the fully decrypted plaintext.
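The stop-and-wait discipline described above (transmit, wait one interval for the reply, retransmit on silence, stop on reply or cancellation) can be sketched in a few lines of asyncio. The function stop_and_wait and its parameters are hypothetical and model only the resend loop, not the daemon's state machine or its second-SURB steps.

```python
import asyncio


async def stop_and_wait(send, reply: asyncio.Future, interval: float) -> bytes:
    """Resend until a reply arrives; cancelling this task aborts the loop."""
    while True:
        await send()
        try:
            # shield() keeps the shared reply future alive across timeouts.
            return await asyncio.wait_for(asyncio.shield(reply), interval)
        except asyncio.TimeoutError:
            continue  # no reply in time: transmit the same message again


async def demo() -> tuple[bytes, int]:
    reply = asyncio.get_running_loop().create_future()
    attempts = 0

    async def send() -> None:
        nonlocal attempts
        attempts += 1
        if attempts == 3:          # pretend the third transmission gets through
            reply.set_result(b"ack")

    return await stop_and_wait(send, reply, interval=0.01), attempts


print(asyncio.run(demo()))
```

Cancelling the task that runs stop_and_wait is the asyncio analogue of calling CancelResendingEncryptedMessage from another goroutine.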
StartResendingEncryptedMessageReturnBoxExists behaves exactly like
StartResendingEncryptedMessage save that it returns
ErrBoxAlreadyExists when the replica reports that the destination
box has already been written, rather than swallowing the condition
as idempotent success. Use it when one needs to distinguish a
fresh write from a repeat: for instance, when implementing
optimistic concurrency on top of the channel, or when establishing
whether a particular call actually caused a state change at the
replica.
Note that this variant costs an additional mixnet round trip: the
BoxAlreadyExists code is carried by the replica’s reply rather than
the courier’s ACK, so the daemon must dispatch a second SURB before
it can return the answer.
As with StartResendingEncryptedMessage, an in-flight call may be
cancelled from another goroutine via CancelResendingEncryptedMessage.
StartResendingEncryptedMessageNoRetry behaves exactly like
StartResendingEncryptedMessage save that it disables the daemon’s
automatic retry of ErrBoxIDNotFound. The caller learns at once that
the box is absent rather than waiting for replication to settle.
Use it when polling a box that may not yet have been written, for
instance when a reader peeks ahead at a peer’s next message before
that peer has produced it; the regular variant would block until
the box appeared, which can be many round trips.
As with StartResendingEncryptedMessage, an in-flight call may be
cancelled from another goroutine via CancelResendingEncryptedMessage.
TombstoneRange prepares the encrypted envelopes needed to
tombstone a consecutive range of pigeonhole boxes beginning at the
supplied MessageBoxIndex. A tombstone is a signed empty payload
that the replica recognises as a deletion marker; the daemon
constructs one by signing rather than encrypting whenever
EncryptWrite is invoked with an empty plaintext.
This method does not itself touch the network: it returns the
envelopes for the caller to dispatch one by one, typically via
StartResendingEncryptedMessage. To tombstone a single box, pass
maxCount=1.
CreateCourierEnvelopesFromPayload packs a payload of arbitrary
size (up to 10 MB) into properly sized CopyStreamElement chunks
for one destination channel. Each chunk is a serialised
CopyStreamElement, ready to be written to a box via EncryptWrite
followed by StartResendingEncryptedMessage; the caller marks the
boundaries of the stream with the isStart and isLast flags.
This method is stateless: no daemon state is kept between calls,
each invocation runs a fresh encoder and flushes before returning.
The 10 MB cap guards against accidental memory exhaustion.
Once the chunks have been written to a temporary copy stream, a
copy command (StartResendingCopyCommand) is dispatched to a
courier with the WriteCap for that temporary stream; the courier
reads the chunks back and dispatches each envelope to its
destination box.
CreateCourierEnvelopesFromMultiPayload packs payloads bound for
several destination channels into a single stream of
CopyStreamElement chunks. This is more space-efficient than
calling CreateCourierEnvelopesFromPayload once per destination,
because the shared encoder runs all envelopes together rather than
padding the final box of each destination independently.
This method is stateless: the buffer argument carries any residual
encoder state across calls in place of daemon-side bookkeeping.
Pass nil for buffer on the first call and the Buffer returned by
the previous call thereafter; set isLast on the final call so that
the encoder flushes its tail.
CreateCourierEnvelopesFromTombstoneRange creates tombstone CourierEnvelopes for a range
of destination indices, encoded as copy stream elements ready to be written to a
temporary copy stream channel.
This combines the tombstone creation logic (SignBox with empty payload) with the
courier envelope wrapping and copy stream encoding of CreateCourierEnvelopesFromPayload.
The buffer parameter enables stateless continuation across multiple calls without
wasting space in the last box. Pass nil on the first call, then pass the returned
nextBuffer to the next call.
StartResendingCopyCommand sends a copy command via ARQ and blocks until completion.
This method BLOCKS until a reply is received. It uses the ARQ (Automatic Repeat reQuest)
mechanism to reliably send copy commands to the courier, automatically retrying if
the reply is not received in time.
The copy command instructs the courier to read from a temporary copy stream channel
and write the parsed envelopes to their destination channels. The courier:
Derives a ReadCap from the WriteCap
Reads boxes from the temporary channel
Parses boxes into CourierEnvelopes
Sends each envelope to intermediate replicas for replication
Writes tombstones to clean up the temporary channel
The Rust and Python bindings accept optional courier_identity_hash
and courier_queue_id arguments to pin the command to a particular
courier; the Go binding exposes that same behaviour through a
distinct method, StartResendingCopyCommandWithCourier.
StartResendingCopyCommandWithCourier behaves exactly like
StartResendingCopyCommand save that it dispatches the copy command
to a courier the caller has chosen, rather than to one selected at
random from the current PKI document. The courier is identified by
the (identity-hash, queue-id) pair returned by GetAllCouriers or
GetDistinctCouriers.
This is the building block for nested copy commands, in which the
outer command is sent to one courier and the inner commands carried
inside it reference a different courier. Staggering the two layers
across distinct couriers reduces the chance that any single
compromised courier observes both halves of the copy transaction
and can therefore link them.
In Rust and Python the same behaviour is reached not through a
separate method but by supplying the optional
courier_identity_hash and courier_queue_id arguments to
start_resending_copy_command.
GetAllCouriers returns every courier service advertised in the
current PKI document, each described by an (identity-hash,
queue-id) pair. The list reflects only the couriers that the
current consensus regards as serving.
The principal caller is the nested-copy-command machinery, which
needs to choose particular couriers rather than accept the random
draw made on the caller’s behalf by StartResendingCopyCommand; for
simple cases where any courier will do, the default routing path
is usually preferable.
GetDistinctCouriers draws n couriers uniformly at random from the
list returned by GetAllCouriers, without replacement, so that no
two entries in the returned slice refer to the same courier. This
is the usual building block for a nested copy command, every layer
of which must be carried by a different courier.
Returns an error if the current PKI document advertises fewer than
n couriers.
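Drawing without replacement, with an error when too few couriers are advertised, can be sketched as follows. The function mirrors the described behaviour only; the tuple shape follows the (identity-hash, queue-id) pair mentioned above, and the sample values are placeholders.

```python
import random

Courier = tuple[bytes, bytes]  # (identity hash, queue ID)


def get_distinct_couriers(couriers: list[Courier], n: int) -> list[Courier]:
    """Return n different couriers chosen uniformly at random."""
    if n > len(couriers):
        raise ValueError(
            f"need {n} couriers, PKI document advertises {len(couriers)}")
    return random.sample(couriers, n)  # sampling without replacement


# Placeholder couriers standing in for GetAllCouriers output.
advertised = [(bytes([i]) * 32, b"courier") for i in range(3)]
outer, inner = get_distinct_couriers(advertised, 2)
print(outer != inner)
```

Drawing both couriers in one call is what guarantees they differ; two independent single draws could return the same courier twice.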
Returns one courier destination, drawn uniformly at random from
the couriers advertised in the current PKI document, as the
(identity_hash, queue_id) pair the rest of the API expects. This
spares the caller from handling a list when one courier will do.
The principal use is the routine “pick a courier, send a copy
command to it” pattern; for the nested-copy-command case where two
distinct couriers are required, draw them with a single call to
the underlying service helpers in helpers.rs rather than calling
this method twice and risking the same draw.
Go and Python callers reach the same result by calling
GetDistinctCouriers(1) / get_distinct_couriers(1) and taking
the first element of the returned slice.
Returns the pigeonhole geometry the daemon supplied during the
connection handshake. This geometry defines the payload sizes and
envelope formats for the pigeonhole protocol.
Panics if called before the daemon’s first ConnectionStatusEvent
has been processed, or if the daemon did not supply the geometry
(an incompatible daemon).
Go callers retrieve the same value through
GetConfig().PigeonholeGeometry. The Python binding stores the
geometry internally but does not at present expose a public
accessor.
NewMessageID generates a new cryptographically random message identifier.
Message IDs are used to correlate requests with responses in both legacy
and channel APIs. Each message should have a unique ID to prevent
confusion and enable proper event correlation.
The Pigeonhole methods return structured results whose fields are
enumerated below. These are the Go reference structs from
katzenpost/client/thin; they are authoritative. The Rust and Python
bindings return the equivalent data through their own result types,
with the same fields rendered in snake_case (for example WriteCap
becomes write_cap, NextMessageBoxIndex becomes
next_message_box_index).
Two fields recur throughout and are protocol plumbing rather than
application data:
QueryID correlates a reply with the request that produced it; the
bindings manage it for you.
ErrorCode is zero on success and otherwise names the failure. The
bindings translate a non-zero code into the language-native error
documented under Replica and Courier
Errors; application code inspects the
raised error or returned sentinel rather than this byte directly.
NewKeypair result (Rust/Python: KeypairResult)
NewKeypairReply is the reply to a NewKeypair request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the NewKeypair request. |
| WriteCap | *bacap.WriteCap | WriteCap is the write capability that should be stored for channel |
| ReadCap | *bacap.ReadCap | ReadCap is the read capability that can be shared with others to allow them to read messages from this channel. |
| FirstMessageIndex | *bacap.MessageBoxIndex | FirstMessageIndex is the first message index that should be used when writing messages to the channel. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to create a new keypair if any. Otherwise it is set to zero for success. |
EncryptWrite result (Rust/Python: EncryptWriteResult)
EncryptWriteReply is the reply to an EncryptWrite request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the EncryptWrite request. |
| MessageCiphertext | []byte | MessageCiphertext is the encrypted message ciphertext that should be sent to the Courier service. |
| EnvelopeDescriptor | []byte | EnvelopeDescriptor contains the serialized EnvelopeDescriptor that contains the private key material needed to decrypt the envelope reply. |
| EnvelopeHash | *[32]byte | EnvelopeHash is the hash of the CourierEnvelope that was sent to the mixnet and is used to resume the write operation. |
| NextMessageBoxIndex | *bacap.MessageBoxIndex | NextMessageBoxIndex is the next message box index to use for subsequent write operations. This is computed by the daemon using BACAP’s NextIndex. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to encrypt the write if any. Otherwise it is set to zero for success. |
EncryptRead result (Rust/Python: EncryptReadResult)
EncryptReadReply is the reply to an EncryptRead request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the EncryptRead request. |
| MessageCiphertext | []byte | MessageCiphertext is the encrypted message ciphertext that should be sent to the Courier service. |
| EnvelopeDescriptor | []byte | EnvelopeDescriptor contains the serialized EnvelopeDescriptor that contains the private key material needed to decrypt the envelope reply. |
| EnvelopeHash | *[32]byte | EnvelopeHash is the hash of the CourierEnvelope that was sent to the mixnet and is used to resume the read operation. |
| NextMessageBoxIndex | *bacap.MessageBoxIndex | NextMessageBoxIndex is the next message box index to use for subsequent read operations. This is computed by the daemon using BACAP’s NextIndex. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to encrypt the read if any. Otherwise it is set to zero for success. |
StartResendingEncryptedMessage result (Rust/Python: StartResendingResult)
StartResendingEncryptedMessageReply is the reply to a StartResendingEncryptedMessage request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the StartResendingEncryptedMessage request. |
| Plaintext | []byte | Plaintext is the plaintext message that was read from the channel. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to start resending the encrypted message if any. Otherwise it is set to zero for success. |
| CourierIdentityHash | *[32]byte | CourierIdentityHash is the 32-byte hash of the identity key of the courier that was selected to handle this message. Callers can watch PKI document updates for this courier disappearing from consensus and cancel+re-encrypt if it does. |
| CourierQueueID | []byte | CourierQueueID is the queue ID of the courier that was selected. |
StartResendingCopyCommand result
StartResendingCopyCommandReply is the reply to a StartResendingCopyCommand request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the StartResendingCopyCommand request. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to execute the copy command if any. Otherwise it is set to zero for success. |
| ReplicaErrorCode | uint8 | ReplicaErrorCode is the pigeonhole replica ErrorCode that caused the Copy command to abort on the courier. Meaningful only when ErrorCode indicates a Copy failure and the courier identified a specific replica-side reason (e.g. ReplicaErrorBoxAlreadyExists). |
| FailedEnvelopeIndex | uint64 | FailedEnvelopeIndex is the 1-based sequential position in the copy stream of the envelope whose write triggered the abort. 0 if not applicable. Not a BACAP message index. |
NextMessageBoxIndex result
NextMessageBoxIndexReply is the reply to a NextMessageBoxIndex request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the NextMessageBoxIndex request. |
| NextMessageBoxIndex | *bacap.MessageBoxIndex | NextMessageBoxIndex is the incremented message box index. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to increment the index if any. Otherwise it is set to zero for success. |
GetMessageBoxIndexCounter result
GetMessageBoxIndexCounterReply is the reply to a GetMessageBoxIndexCounter request.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the GetMessageBoxIndexCounter request. |
| Counter | uint64 | Counter is the BACAP Idx64 value read out of the requested MessageBoxIndex. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to read the counter if any. Otherwise it is set to zero for success. |
GetPKIDocumentRaw result
GetPKIDocumentReply is the reply to a GetPKIDocument request. The
Payload field carries the cert.Certificate-wrapped signed PKI document
exactly as the daemon received it from the gateway, retaining every
directory authority signature so that callers may verify it
themselves.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the GetPKIDocument request. |
| Payload | []byte | Payload is the cert.Certificate-wrapped signed PKI document, or nil on failure. Use core/pki.FromPayload to deserialize and verify it against the directory authorities’ public keys. |
| Epoch | uint64 | Epoch is the epoch of the returned document. When the request asked for the current epoch this echoes the epoch the daemon believes is current. |
| ErrorCode | uint8 | ErrorCode indicates the reason for a failure to return a signed PKI document if any. Otherwise it is set to zero for success. |
CreateCourierEnvelopesFromPayload result
CreateCourierEnvelopesFromPayloadReply is sent in response to a CreateCourierEnvelopesFromPayload request.
It provides multiple serialized CopyStreamElements, one for each chunk of the payload.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the CreateCourierEnvelopesFromPayload request that created it. |
| Envelopes | [][]byte | Envelopes is a slice of serialized CopyStreamElements, one per chunk. |
| NextDestIndex | *bacap.MessageBoxIndex | NextDestIndex is the next destination message box index after all boxes consumed by this call. Use this as DestStartIndex in subsequent calls to continue writing to the same destination stream. |
| ErrorCode | uint8 | ErrorCode indicates the success or failure of the envelope creation. A value of ThinClientSuccess indicates successful creation. |
CreateCourierEnvelopesFromMultiPayload result
CreateCourierEnvelopesFromPayloadsReply is sent in response to a CreateCourierEnvelopesFromPayloads request.
It provides multiple serialized CopyStreamElements packed efficiently from multiple destination payloads.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the CreateCourierEnvelopesFromPayloads request that created it. |
| Envelopes | [][]byte | Envelopes is a slice of serialized CopyStreamElements containing all the courier envelopes from all destinations packed efficiently together. |
| Buffer | []byte | Buffer contains any data buffered by the encoder that hasn’t been output yet. This can be persisted for crash recovery and restored via SetStreamBuffer. |
| NextDestIndices | []*bacap.MessageBoxIndex | NextDestIndices contains the next destination message box index for each destination, in the same order as the destinations in the request. Use these as StartIndex in subsequent calls to continue writing to the same destination streams. |
| ErrorCode | uint8 | ErrorCode indicates the success or failure of the envelope creation. A value of ThinClientSuccess indicates successful creation. |
CreateCourierEnvelopesFromTombstoneRange result
CreateCourierEnvelopesFromTombstoneRangeReply is sent in response to a
CreateCourierEnvelopesFromTombstoneRange request. It provides serialized
CopyStreamElements containing tombstone courier envelopes.
| Field | Type | Description |
| --- | --- | --- |
| QueryID | *[QueryIDLength]byte | QueryID is used for correlating this reply with the request. |
| Envelopes | [][]byte | Envelopes is a slice of serialized CopyStreamElements. |
| Buffer | []byte | Buffer is the residual encoder buffer to pass to the next call. Nil when IsLast was true in the request. |
| NextDestIndex | *bacap.MessageBoxIndex | NextDestIndex is the next destination message box index after all tombstones created by this call. |
| ErrorCode | uint8 | ErrorCode indicates the success or failure of the operation. |
DestinationPayload (parameter)
DestinationPayload specifies a payload and its destination channel for multi-channel writes.
Passed into CreateCourierEnvelopesFromMultiPayload, one per destination channel.
| Field | Type | Description |
|---|---|---|
| Payload | []byte | Payload is the data to be written to this destination. |
| WriteCap | *bacap.WriteCap | WriteCap is the write capability for the destination channel. |
| StartIndex | *bacap.MessageBoxIndex | StartIndex is the starting index in the destination channel. |
Transport and Lifecycle Errors
These errors can in principle be raised by any method that performs
I/O against the daemon or the mixnet.
| Condition | Go | Rust | Python |
|---|---|---|---|
| Daemon not connected to mixnet | ad-hoc error with message “cannot send message in offline mode - daemon not connected to mixnet” (no sentinel; check IsConnected() first) | ThinClientError::OfflineMode(String) | ThinClientOfflineError |
| Operation timed out | context.DeadlineExceeded (from ctx.Err()) | ThinClientError::Timeout(String) | asyncio.TimeoutError |
| Operation cancelled by caller | context.Canceled (from ctx.Err()) | (no distinct variant; uses higher-level cancellation) | asyncio.CancelledError |
| Local socket to kpclientd lost | returned on the next I/O; thin client attempts reconnect with exponential backoff | ditto (receive DaemonDisconnectedEvent on the event sink) | ditto |
| CBOR (de)serialisation failure | wrapped error | ThinClientError::CborError(serde_cbor::Error) | serde-layer exception bubbles up |
The Go binding does not provide a named sentinel for offline mode.
Applications that must distinguish “daemon offline” from other errors
should test IsConnected() before sending, not compare error values
after the fact. The Rust and Python bindings provide proper sentinels
testable with matches! / isinstance.
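As a minimal sketch of the sentinel-based handling described above, the Python snippet below maps the exception types from the table onto coarse categories with isinstance. The ThinClientOfflineError class defined here is a local stand-in so that the example runs on its own; in a real application it would be imported from the thin client package. The function name classify_transport_error and the category strings are hypothetical.

```python
import asyncio


class ThinClientOfflineError(Exception):
    """Local stand-in for the binding's offline sentinel (assumption)."""


def classify_transport_error(exc: BaseException) -> str:
    """Map a transport or lifecycle exception to a coarse category."""
    if isinstance(exc, ThinClientOfflineError):
        return "offline"    # daemon not connected to the mixnet
    if isinstance(exc, asyncio.TimeoutError):
        return "timeout"    # operation timed out
    if isinstance(exc, asyncio.CancelledError):
        return "cancelled"  # cancelled by the caller
    return "other"          # anything else: surface it unchanged


print(classify_transport_error(ThinClientOfflineError("offline mode")))
print(classify_transport_error(asyncio.TimeoutError()))
```

An application would typically queue work on "offline", retry with a fresh deadline on "timeout", and propagate "cancelled" and "other" to the caller.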
Replica and Courier Errors
The errors below can be returned by StartResendingEncryptedMessage
and its variants. They are defined in
pigeonhole/errors.go.
Errors specific to reads (when readCap is set)
| Error | Go | Rust | Python |
|---|---|---|---|
| Box not found (retries exhausted) | ErrBoxIDNotFound | ThinClientError::BoxNotFound | BoxIDNotFoundError |
| MKEM decryption failed | ErrMKEMDecryptionFailed | ThinClientError::MkemDecryptionFailed | MKEMDecryptionFailedError |
| BACAP decryption failed | ErrBACAPDecryptionFailed | ThinClientError::BacapDecryptionFailed | BACAPDecryptionFailedError |
| Tombstone (box was deleted) | ErrTombstone | ThinClientError::Tombstone | TombstoneError |
Errors specific to writes (when writeCap is set)
| Error | Go | Rust | Python |
|---|---|---|---|
| Storage full | ErrStorageFull | ThinClientError::StorageFull | StorageFullError |
Errors on both reads and writes
| Error | Go | Rust | Python |
|---|---|---|---|
| Operation cancelled | ErrStartResendingCancelled | ThinClientError::StartResendingCancelled | StartResendingCancelledError |
| Invalid box ID | ErrInvalidBoxID | ThinClientError::InvalidBoxId | InvalidBoxIDError |
| Invalid signature | ErrInvalidSignature | ThinClientError::InvalidSignature | InvalidSignatureError |
| Invalid tombstone signature | ErrInvalidTombstoneSignature | ThinClientError::InvalidTombstoneSignature | InvalidTombstoneSignatureError |
| Database failure | ErrDatabaseFailure | ThinClientError::DatabaseFailure | DatabaseFailureError |
| Invalid payload | ErrInvalidPayload | ThinClientError::InvalidPayload | InvalidPayloadError |
| Invalid epoch | ErrInvalidEpoch | ThinClientError::InvalidEpoch | InvalidEpochError |
| Replication failed | ErrReplicationFailed | ThinClientError::ReplicationFailed | ReplicationFailedError |
| Replica internal error | ErrReplicaInternalError | ThinClientError::ReplicaInternalError | ReplicaInternalError |
| Box already exists (writes only, when non-idempotent variant used) | ErrBoxAlreadyExists | ThinClientError::BoxAlreadyExists | BoxAlreadyExistsError |
Copy-command failure
StartResendingCopyCommand can return a diagnostic error carrying the
underlying replica error code and the 1-based sequential envelope
index at which processing stopped:
| Binding | Error |
|---|---|
| Go | ErrCopyCommandFailed (see CopyCommandFailedError struct for fields) |
Some errors from StartResendingEncryptedMessage represent completed
operations, not failures. Use IsExpectedOutcome(err) (Go),
err.is_expected_outcome() (Rust), or is_expected_outcome(exc)
(Python) to distinguish them:
| Error | Why it may be expected |
|---|---|
| BoxIDNotFound / BoxNotFound | Polling for a message that hasn’t been written yet |
| BoxAlreadyExists | Retrying an idempotent write that already succeeded |
| Tombstone | Reading a box that was intentionally deleted |
These should generally not trigger retries in your application.
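The sketch below illustrates how an application might turn the expected outcomes listed above into ordinary results rather than failures. So that it is self-contained, it defines local stand-ins for the Python exception classes named in the tables and for is_expected_outcome; the real binding supplies its own versions, whose internals may differ. The describe_outcome helper is hypothetical.

```python
class BoxIDNotFoundError(Exception):
    """Stand-in: the box has not been written yet."""


class BoxAlreadyExistsError(Exception):
    """Stand-in: an earlier attempt of this write already landed."""


class TombstoneError(Exception):
    """Stand-in: the box was intentionally deleted."""


class StorageFullError(Exception):
    """Stand-in: a genuine failure, for contrast."""


_EXPECTED = (BoxIDNotFoundError, BoxAlreadyExistsError, TombstoneError)


def is_expected_outcome(exc: BaseException) -> bool:
    """Stand-in for the binding's helper of the same name."""
    return isinstance(exc, _EXPECTED)


def describe_outcome(exc: Exception) -> str:
    """Translate an expected outcome into a status; re-raise real failures."""
    if not is_expected_outcome(exc):
        raise exc
    if isinstance(exc, BoxIDNotFoundError):
        return "not written yet"
    if isinstance(exc, BoxAlreadyExistsError):
        return "write already stored"
    return "box was deleted"


print(describe_outcome(BoxIDNotFoundError()))
print(describe_outcome(TombstoneError()))
```

Only the "not written yet" status is worth polling on; a tombstone or an already-stored write is a final answer for that index.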
Task-oriented guides for using the Katzenpost thin client API
Thin Client How-to Guide
This guide shows how to accomplish specific tasks with the Katzenpost
thin client. Each section is self-contained: find the task you need
and follow the steps.
Throughout this guide and the API the words channel and stream
are used interchangeably: they denote one and the same thing.
Authoritative working examples
A word of caution before you proceed. The code fragments in this guide
are illustrative: they are written to teach one task at a time, and to
keep the reader’s eye on the matter at hand they omit imports, error
handling, and surrounding context. They are not compiled or run by
our continuous integration, and so, as the API evolves, an individual
snippet may fall out of step with it.
The integration tests below carry no such caveat. They are exercised on
every change by CI, so they are guaranteed to compile and to pass
against the code they accompany. When a fragment in this guide and a
test disagree, the test is correct. Treat these files as the
canonical, runnable companion to the prose:
These links track the main branch of each repository; should you be
working against a pinned release, consult the corresponding files at
that tag instead.
Connect to the kpclientd daemon and set up event handling:
Go:

cfg, err := thin.LoadFile("thinclient.toml")
if err != nil {
	log.Fatal(err)
}

logging := &config.Logging{Level: "INFO"}
client := thin.NewThinClient(cfg, logging)

err = client.Dial()
if err != nil {
	log.Fatal(err)
}
defer client.Close()

// Listen for events
eventCh := client.EventSink()
defer client.StopEventSink(eventCh)

for ev := range eventCh {
	switch v := ev.(type) {
	case *thin.ConnectionStatusEvent:
		fmt.Printf("Connected: %v\n", v.IsConnected)
	case *thin.NewDocumentEvent:
		fmt.Println("New PKI document received")
	case *thin.MessageReplyEvent:
		fmt.Printf("Reply received for SURB %x\n", v.SURBID)
	}
}

Rust:

let config = Config::new("thinclient.toml")?;
let client = ThinClient::new(config).await?;

// Listen for events
let mut event_rx = client.event_sink();
tokio::spawn(async move {
    while let Some(event) = event_rx.recv().await {
        // Process events
        println!("Event: {:?}", event);
    }
});

// ... do work ...
client.stop().await;

Python:

async def on_connection_status(event):
    print(f"Connected: {event.get('is_connected')}")

async def on_new_document(event):
    print("New PKI document received")

async def on_message_reply(event):
    print("Reply received")

config = Config("thinclient.toml")
config.on_connection_status = on_connection_status
config.on_new_pki_document = on_new_document
config.on_message_reply = on_message_reply

client = ThinClient(config)
loop = asyncio.get_running_loop()
await client.start(loop)

# ... do work ...
client.stop()
How to discover network services via the PKI document
NOTE that this isn’t necessary for using the Pigeonhole protocol
because kpclientd does courier service discovery automatically.
The PKI document lists all available mixnet services. Use
GetService to get a random instance of a named service:
Go:

doc := client.PKIDocument()
if doc == nil {
	log.Fatal("No PKI document available")
}

// Get a random echo service
desc, err := client.GetService("echo")
if err != nil {
	log.Fatal(err)
}

// Use desc.MixDescriptor.IdentityKey and desc.RecipientQueueID
// as the destination for SendMessage
destNode, destQueue := desc.ToDestination()

Rust:

let doc = client.pki_document().await?;

// Get a random echo service
let desc = client.get_service("echo").await?;

// Use the destination for send_message
let (dest_node, dest_queue) = desc.to_destination();

Python:

doc = client.pki_document()
if doc is None:
    raise Exception("No PKI document available")

# Get a random echo service
desc = client.get_service("echo")

# Use the destination for send_message
dest_node, dest_queue = desc.to_destination()
How to verify the PKI document yourself
In ordinary use you do not need this section. kpclientd already
verifies every PKI document against the directory authorities listed
in client.toml, and only after a sufficient threshold of authority
signatures has passed does it push the document on to the thin
client. The pki_document() method described above hands you the
post-verification document, and you inherit the daemon’s guarantee
without further work. The signature map is stripped before that
handoff precisely because the verification has already happened;
carrying the signatures through would only invite confusion about
whose trust root is in force.
get_pki_document_raw is the trapdoor for special applications and
integrations that want the signed document. The cases that come up
in practice include:
An application that wishes to anchor its own root of trust,
independently of kpclientd’s configuration, for instance when
shipping a hardened build with the authority keys compiled in.
A relay that forwards the signed document to a separate consumer
(for archival, audit, or out-of-band verification) which does not
itself speak the thin-client protocol.
A diagnostic or monitoring tool that wishes to display which
authorities signed which consensus, across time.
The method returns the cert.Certificate-wrapped signed document
together with the epoch the daemon resolved to. Pass 0 for the
requested epoch to mean “whatever the daemon currently believes is
the latest”.
The examples below verify the document against the post-quantum
hybrid signature scheme Falcon-padded-512-Ed25519, the recommended
production scheme published by
hpqc in both its Python and Go
forms. The authority public keys must come from the application’s
own trust store, never from the daemon: if the daemon supplied them,
the verification would establish only that the daemon was internally
consistent, not that the document was signed by the real
authorities.
Python:

import struct
from hashlib import blake2b

import cbor2
from hpqc.sign.hybrid import FalconPadded512Ed25519

# The directory authority public keys in the wire format expected by
# hpqc's hybrid scheme: the byte concatenation
# ``falcon_padded_512_pub || ed25519_pub`` (929 bytes per authority).
# These must be obtained out of band, typically baked into the
# application or carried in a separately signed bundle. The hex
# strings below are placeholders.
AUTHORITY_PUBLIC_KEYS = [
    bytes.fromhex("ab" * 929),  # auth1
    bytes.fromhex("cd" * 929),  # auth2
    bytes.fromhex("ef" * 929),  # auth3
]
THRESHOLD = len(AUTHORITY_PUBLIC_KEYS) // 2 + 1
SCHEME_NAME = "Falcon-padded-512-Ed25519"


def _signed_message(cert: dict) -> bytes:
    """Reconstruct the byte string the authorities signed.

    A deterministic little-endian concatenation of the Certificate
    fields preceding Signatures; see katzenpost/core/cert/cert.go for
    the canonical encoding.
    """
    return b"".join([
        struct.pack("<I", cert["Version"]),
        struct.pack("<Q", cert["Expiration"]),
        cert["KeyType"].encode("utf-8"),
        cert["Certified"],
    ])


async def fetch_and_verify_pki(client, epoch: int = 0) -> bytes:
    """Fetch the signed PKI document and verify it against the trust root.

    Returns the inner Certified payload (the CBOR-encoded Document)
    once a sufficient threshold of authority signatures has verified;
    raises ValueError otherwise.
    """
    payload, returned_epoch = await client.get_pki_document_raw(epoch)
    cert = cbor2.loads(payload)

    if cert["Version"] != 0:
        raise ValueError(f"unknown certificate version: {cert['Version']}")
    if cert["KeyType"] != SCHEME_NAME:
        raise ValueError(
            f"unexpected key type {cert['KeyType']!r}, "
            f"expected {SCHEME_NAME!r}"
        )

    msg = _signed_message(cert)
    signatures = cert.get("Signatures") or {}

    verified = 0
    for pubkey in AUTHORITY_PUBLIC_KEYS:
        key_hash = blake2b(pubkey, digest_size=32).digest()
        sig = signatures.get(key_hash)
        if sig is None:
            continue
        if FalconPadded512Ed25519.verify(pubkey, msg, sig["Payload"]):
            verified += 1

    if verified < THRESHOLD:
        raise ValueError(
            f"only {verified} of {len(AUTHORITY_PUBLIC_KEYS)} authority "
            f"signatures verified for epoch {returned_epoch}; threshold "
            f"is {THRESHOLD}"
        )

    # cert["Certified"] is the CBOR-encoded Document. Decode it with
    # cbor2.loads(cert["Certified"]) if the application needs the
    # contents themselves.
    return cert["Certified"]

Go:

package main

import (
	"encoding/hex"
	"fmt"
	"log"

	"github.com/katzenpost/hpqc/sign"
	"github.com/katzenpost/hpqc/sign/hybrid"
	"github.com/katzenpost/katzenpost/client/thin"
	"github.com/katzenpost/katzenpost/core/cert"
)

// AuthorityPublicKeysHex is the application's root of trust for the
// network's directory: the wire-format hybrid public keys of each
// authority, hex-encoded. They must be obtained out of band and
// never from the daemon. Replace these placeholders with your own.
var AuthorityPublicKeysHex = []string{
	"abab...", // auth1
	"cdcd...", // auth2
	"efef...", // auth3
}

// FetchAndVerifyPKI fetches the signed PKI document for the given
// epoch (pass 0 for "current") and verifies it against the
// authority public keys above using core/cert.VerifyThreshold.
func FetchAndVerifyPKI(client *thin.ThinClient, epoch uint64) ([]byte, error) {
	scheme := hybrid.FalconPadded512Ed25519

	verifiers := make([]sign.PublicKey, 0, len(AuthorityPublicKeysHex))
	for _, hexKey := range AuthorityPublicKeysHex {
		raw, err := hex.DecodeString(hexKey)
		if err != nil {
			return nil, fmt.Errorf("decoding authority key: %w", err)
		}
		pub, err := scheme.UnmarshalBinaryPublicKey(raw)
		if err != nil {
			return nil, fmt.Errorf("parsing authority key: %w", err)
		}
		verifiers = append(verifiers, pub)
	}

	payload, returnedEpoch, err := client.GetPKIDocumentRaw(epoch)
	if err != nil {
		return nil, fmt.Errorf("fetching signed PKI doc: %w", err)
	}

	threshold := len(verifiers)/2 + 1
	certified, good, _, err := cert.VerifyThreshold(verifiers, threshold, payload)
	if err != nil {
		return nil, fmt.Errorf(
			"threshold verification failed for epoch %d: %w",
			returnedEpoch, err,
		)
	}

	log.Printf("verified %d of %d authority signatures for epoch %d",
		len(good), len(verifiers), returnedEpoch)
	return certified, nil
}
Considerations:
The authority public keys must come from a trust root external to
the daemon. If the daemon supplied them, verification would prove
only that the daemon was internally consistent.
The threshold above (a simple majority) matches the policy that the
authorities themselves enforce when they admit a consensus. An
application may apply a stricter policy, but should not relax it.
Should the network ever be reconfigured to use a different
signature scheme, swap the hybrid for the corresponding hpqc
verifier and adjust the expected KeyType (Python) or the
hybrid.* selector (Go) accordingly. The KeyType field of the
certificate is what the authorities signed under, and is the
authoritative indicator of the scheme in force.
A Rust binding is not shown because hpqc does not yet publish a
Rust port; a Rust application can compose the verification with
the ed25519-dalek and falcon crates by the same wire layout
(the public key and signature are simple concatenations of the
two component halves).
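Two of the considerations above come down to simple arithmetic, sketched here under stated assumptions. The simple-majority threshold is n // 2 + 1. The split point of the hybrid public key is derived from the 929-byte length quoted in this guide and the fixed 32-byte length of an Ed25519 public key. The function names are hypothetical, and the split assumes the falcon_padded_512_pub || ed25519_pub order described earlier.

```python
ED25519_PUBLIC_KEY_LEN = 32   # fixed by Ed25519
HYBRID_PUBLIC_KEY_LEN = 929   # per-authority length quoted in this guide


def majority_threshold(n_authorities: int) -> int:
    """Smallest count that is a strict majority of n_authorities."""
    if n_authorities < 1:
        raise ValueError("need at least one authority")
    return n_authorities // 2 + 1


def split_hybrid_public_key(key: bytes) -> tuple[bytes, bytes]:
    """Split falcon_padded_512_pub || ed25519_pub into its two halves."""
    if len(key) != HYBRID_PUBLIC_KEY_LEN:
        raise ValueError(
            f"expected {HYBRID_PUBLIC_KEY_LEN} bytes, got {len(key)}"
        )
    cut = HYBRID_PUBLIC_KEY_LEN - ED25519_PUBLIC_KEY_LEN
    return key[:cut], key[cut:]


thresholds = {n: majority_threshold(n) for n in (3, 4, 5, 9)}
falcon_half, ed25519_half = split_hybrid_public_key(bytes(929))

print(thresholds)                           # {3: 2, 4: 3, 5: 3, 9: 5}
print(len(falcon_half), len(ed25519_half))  # 897 32
```

Note that with an even number of authorities the majority rule still requires more than half: four authorities need three valid signatures, not two.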
How to send a message to a mixnet service
NOTE that this API call is NOT used with the Pigeonhole protocol.
However it is still useful for writing other protocols and proving the
echo service.
Use BlockingSendMessage for simple request-response interactions
with non-Pigeonhole services (like the echo service):
NOTE that creating a Pigeonhole channel (stream) does NOT produce any
network traffic. It’s a local cryptographic operation only.
A channel is created from a 32-byte random seed.
The writer keeps the write cap; the reader receives the read cap and
first index out-of-band.
Writing is a two-step process: encrypt the message, then send it via
ARQ.
Go:

// Encrypt the message
ciphertext, envDesc, envHash, nextIndex, err := client.EncryptWrite(
	[]byte("hello"), writeCap, currentIndex,
)
if err != nil {
	log.Fatal(err)
}

// Send via ARQ (blocks until acknowledged)
_, err = client.StartResendingEncryptedMessage(
	nil, // readCap (nil for writes)
	writeCap,
	nil, // messageBoxIndex (nil for writes)
	nil, // replyIndex
	envDesc,
	ciphertext,
	envHash,
)
if err != nil {
	log.Fatal(err)
}

// Advance the index for the next write
currentIndex = nextIndex

Rust:

// Encrypt the message
let result = client.encrypt_write(
    b"hello", &write_cap, &current_index,
).await?;

// Send via ARQ (blocks until acknowledged)
client.start_resending_encrypted_message(
    None, // read_cap (None for writes)
    Some(&write_cap),
    None, // message_box_index
    None, // reply_index
    &result.envelope_descriptor,
    &result.message_ciphertext,
    &result.envelope_hash,
).await?;

// Advance the index for the next write
current_index = result.next_message_box_index;

Python:

# Encrypt the message
result = await client.encrypt_write(b"hello", write_cap, current_index)

# Send via ARQ (blocks until acknowledged)
await client.start_resending_encrypted_message(
    read_cap=None,
    write_cap=write_cap,
    next_message_index=None,
    reply_index=None,
    envelope_descriptor=result.envelope_descriptor,
    message_ciphertext=result.message_ciphertext,
    envelope_hash=result.envelope_hash,
)

# Advance the index for the next write
current_index = result.next_message_box_index
How to read a message from a Pigeonhole channel
Reading is also two steps: encrypt a read request, then send it via
ARQ. The reply contains the plaintext.
Go:

// Encrypt a read request
ciphertext, envDesc, envHash, nextIndex, err := client.EncryptRead(
	readCap, currentIndex,
)
if err != nil {
	log.Fatal(err)
}

// Send via ARQ (blocks until the message is retrieved)
result, err := client.StartResendingEncryptedMessage(
	readCap,
	nil, // writeCap (nil for reads)
	nil, // messageBoxIndex
	nil, // replyIndex
	envDesc,
	ciphertext,
	envHash,
)
if err != nil {
	log.Fatal(err)
}
plaintext := result.Plaintext

// Advance the index for the next read
currentIndex = nextIndex

Rust:

// Encrypt a read request
let read_result = client.encrypt_read(&read_cap, &current_index).await?;

// Send via ARQ (blocks until the message is retrieved)
let result = client.start_resending_encrypted_message(
    Some(&read_cap),
    None, // write_cap (None for reads)
    None, // message_box_index
    None, // reply_index
    &read_result.envelope_descriptor,
    &read_result.message_ciphertext,
    &read_result.envelope_hash,
).await?;
let plaintext = result.plaintext;

// Advance the index for the next read
current_index = read_result.next_message_box_index;

Python:

# Encrypt a read request
read_result = await client.encrypt_read(read_cap, current_index)

# Send via ARQ (blocks until the message is retrieved)
plaintext = await client.start_resending_encrypted_message(
    read_cap=read_cap,
    write_cap=None,
    next_message_index=None,
    reply_index=None,
    envelope_descriptor=read_result.envelope_descriptor,
    message_ciphertext=read_result.message_ciphertext,
    envelope_hash=read_result.envelope_hash,
)

# Advance the index for the next read
current_index = read_result.next_message_box_index
How to wait for a message that has not been written yet
Reads and writes are not coordinated: a reader routinely asks for an
index before the writer has filled it, and replication lag can briefly
hide a box that was in fact written. In both cases the daemon reports
BoxIDNotFound. This is the expected answer to “anything here yet?”,
not a failure. The correct pattern is a bounded poll: retry on the
expected outcome, with a short delay between attempts, until the data
appears or an application deadline elapses. Use IsExpectedOutcome to
tell a benign “not yet” apart from a real error, so that genuine
failures are not silently retried forever.
Go:

deadline := time.Now().Add(2 * time.Minute)
var plaintext []byte
for {
	ciphertext, envDesc, envHash, nextIndex, err := client.EncryptRead(
		readCap, currentIndex,
	)
	if err != nil {
		log.Fatal(err)
	}
	result, err := client.StartResendingEncryptedMessage(
		readCap, nil, nil, nil, envDesc, ciphertext, envHash,
	)
	if err == nil {
		plaintext = result.Plaintext
		currentIndex = nextIndex
		break
	}
	// BoxIDNotFound here just means "not written yet". Anything that
	// is not an expected outcome is a real failure worth surfacing.
	if !thin.IsExpectedOutcome(err) {
		log.Fatal(err)
	}
	if time.Now().After(deadline) {
		log.Fatal("gave up waiting for the message")
	}
	time.Sleep(3 * time.Second)
}

Python:

import asyncio, time
from katzenpost_thinclient import is_expected_outcome

deadline = time.monotonic() + 120
while True:
    read = await client.encrypt_read(read_cap, current_index)
    try:
        plaintext = await client.start_resending_encrypted_message(
            read_cap=read_cap,
            write_cap=None,
            next_message_index=None,
            reply_index=None,
            envelope_descriptor=read.envelope_descriptor,
            message_ciphertext=read.message_ciphertext,
            envelope_hash=read.envelope_hash,
        )
        current_index = read.next_message_box_index
        break
    except Exception as exc:
        # "not written yet" is expected; anything else is a real error.
        if not is_expected_outcome(exc):
            raise
        if time.monotonic() > deadline:
            raise
        await asyncio.sleep(3)
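The bounded poll can also be factored into a small reusable helper. The sketch below is one possible shape, not part of the thin client API: poll_until, NotYetWritten, and fake_read are hypothetical names, and the simulated reader stands in for the encrypt_read / start_resending_encrypted_message pair so that the example runs without a daemon.

```python
import asyncio
import time


class NotYetWritten(Exception):
    """Stand-in for the binding's "box not found" exception (assumption)."""


async def poll_until(attempt, is_expected, timeout_s: float, delay_s: float):
    """Call attempt() until it succeeds, retrying only on expected outcomes."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return await attempt()
        except Exception as exc:
            if not is_expected(exc):
                raise              # a real failure: surface it immediately
            if time.monotonic() > deadline:
                raise              # waited long enough: give up
            await asyncio.sleep(delay_s)


async def demo():
    calls = {"n": 0}

    async def fake_read():
        # Simulate a box that appears on the third attempt.
        calls["n"] += 1
        if calls["n"] < 3:
            raise NotYetWritten()
        return b"hello"

    data = await poll_until(
        fake_read,
        lambda exc: isinstance(exc, NotYetWritten),
        timeout_s=5.0,
        delay_s=0.01,
    )
    return data, calls["n"]


result = asyncio.run(demo())
print(result)  # (b'hello', 3)
```

Passing the "is this expected?" predicate as an argument keeps the retry policy separate from the error taxonomy, so the same helper serves any operation with a benign "not yet" answer.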
How to persist and restore channel state
The daemon keeps no per-application channel state. The write cap, the
read cap, and above all the current index belong to your
application, and if you lose the index across a restart you no longer
know where to append next (re-using a filled index earns
BoxAlreadyExists). Persist the index every time you advance it,
durably, before you treat the write as done.
In Go the capabilities and the index are typed; serialise them with
MarshalBinary and restore them with the bacap constructors. In
Rust and Python new_keypair already hands you the caps and index as
byte strings, so persistence is simply storing and reloading those
bytes.
Go:

import "github.com/katzenpost/hpqc/bacap"

// Save: marshal each artefact to bytes and write atomically to disk.
wcBytes, _ := writeCap.MarshalBinary()
rcBytes, _ := readCap.MarshalBinary()
idxBytes, _ := currentIndex.MarshalBinary()
saveState(wcBytes, rcBytes, idxBytes) // your durable, atomic write

// Restore after a restart:
writeCap, err := bacap.NewWriteCapFromBytes(wcBytes)
if err != nil {
	log.Fatal(err)
}
readCap, err := bacap.ReadCapFromBytes(rcBytes)
if err != nil {
	log.Fatal(err)
}
currentIndex, err := bacap.NewEmptyMessageBoxIndexFromBytes(idxBytes)
if err != nil {
	log.Fatal(err)
}

Rust:

// new_keypair already returns Vec<u8> for each artefact.
let kp = client.new_keypair(&seed).await?;
save_state(&kp.write_cap, &kp.read_cap, &kp.first_index);
let mut current_index = kp.first_index.clone();

// ... each time you advance, persist the new index bytes ...
save_index(&current_index);

// After a restart, the stored bytes are passed straight back into
// the API; no deserialisation step is required.
let (write_cap, read_cap, current_index) = load_state();

Python:

# new_keypair already returns bytes for each artefact.
kp = await client.new_keypair(seed)
save_state(kp.write_cap, kp.read_cap, kp.first_index)
current_index = kp.first_index

# ... each time you advance, persist the new index bytes ...
save_index(current_index)

# After a restart, the stored bytes are passed straight back into
# the API; no deserialisation step is required.
write_cap, read_cap, current_index = load_state()
The writer must persist currentIndex after every successful write,
the reader after every successful read. Persist the index before
acknowledging the message to the rest of your application, so that a
crash cannot leave you having processed a message whose index you
never recorded.
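The guide leaves the durable write itself to the application. One common way to make it atomic, sketched below with the Python standard library only, is to write the new index to a temporary file in the same directory, flush it to disk, and then rename it over the old file. The save_index and load_index signatures here (taking a path) are illustrative assumptions, not part of the thin client API.

```python
import os
import tempfile


def save_index(path: str, index_bytes: bytes) -> None:
    """Atomically replace the file at path with index_bytes."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(index_bytes)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the bytes to stable storage
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def load_index(path: str) -> bytes:
    """Read back the most recently saved index bytes."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as state_dir:
    index_path = os.path.join(state_dir, "alice.index")
    save_index(index_path, b"\x01" * 8)   # after the first write
    save_index(index_path, b"\x02" * 8)   # after the second write
    restored = load_index(index_path)
    leftovers = [n for n in os.listdir(state_dir) if n != "alice.index"]

print(restored == b"\x02" * 8, leftovers)
```

A reader of the file sees either the old index or the new one, never a partial write. For the strictest durability on POSIX systems an application may additionally fsync the containing directory after the rename; that step is omitted here for brevity.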
How to hold a two-way conversation
A stream has exactly one writer, so a conversation between two parties
is two streams: each party writes to its own and reads from the
other’s. The setup is symmetric: each creates a stream and shares its
read cap (and first index) with the other out-of-band. Thereafter each
party writes with its own write cap and polls the peer’s stream with
the peer’s read cap, advancing two independent indices.
Go:

// Alice's side. (Bob's is the mirror image.)
aliceWrite, aliceRead, aliceIdx, err := client.NewKeypair(aliceSeed)
if err != nil {
	log.Fatal(err)
}

// Exchange read caps out-of-band: Alice sends aliceRead+aliceIdx to
// Bob and receives bobRead+bobIdx from Bob.
sendOutOfBand(aliceRead, aliceIdx)
bobRead, bobIdx := receiveOutOfBand()

// Send on Alice's own stream.
ct, ed, eh, nextOut, _ := client.EncryptWrite([]byte("hello Bob"), aliceWrite, aliceIdx)
_, err = client.StartResendingEncryptedMessage(nil, aliceWrite, nil, nil, ed, ct, eh)
if err != nil {
	log.Fatal(err)
}
aliceIdx = nextOut // persist this

// Receive on Bob's stream, using the polling pattern shown above,
// reading with bobRead and advancing bobIdx.

Rust:

// Alice's side. (Bob's is the mirror image.)
let alice = client.new_keypair(&alice_seed).await?;

// Exchange read caps out-of-band.
send_out_of_band(&alice.read_cap, &alice.first_index);
let (bob_read, mut bob_idx) = receive_out_of_band();
let mut alice_idx = alice.first_index.clone();

// Send on Alice's own stream.
let w = client.encrypt_write(b"hello Bob", &alice.write_cap, &alice_idx).await?;
client.start_resending_encrypted_message(
    None, Some(&alice.write_cap), None, None,
    &w.envelope_descriptor, &w.message_ciphertext, &w.envelope_hash,
).await?;
alice_idx = w.next_message_box_index; // persist this

// Receive on Bob's stream with the polling pattern, reading with
// bob_read and advancing bob_idx.

Python:

# Alice's side. (Bob's is the mirror image.)
alice = await client.new_keypair(alice_seed)

# Exchange read caps out-of-band.
send_out_of_band(alice.read_cap, alice.first_index)
bob_read, bob_idx = receive_out_of_band()
alice_idx = alice.first_index

# Send on Alice's own stream.
w = await client.encrypt_write(b"hello Bob", alice.write_cap, alice_idx)
await client.start_resending_encrypted_message(
    read_cap=None,
    write_cap=alice.write_cap,
    next_message_index=None,
    reply_index=None,
    envelope_descriptor=w.envelope_descriptor,
    message_ciphertext=w.message_ciphertext,
    envelope_hash=w.envelope_hash,
)
alice_idx = w.next_message_box_index  # persist this

# Receive on Bob's stream with the polling pattern, reading with
# bob_read and advancing bob_idx.
How to prepare operations offline
The daemon distinguishes two kinds of work. Key generation and
envelope encryption (NewKeypair, EncryptWrite, EncryptRead,
TombstoneRange, and the copy-stream constructors) are local
cryptography and succeed even when the daemon is not connected to the
mixnet. Only StartResendingEncryptedMessage and
StartResendingCopyCommand require connectivity; called offline they
fail rather than block.
You can therefore prepare envelopes while offline, persist them, and
transmit once connectivity returns. Test IsConnected before
transmitting, or watch the connection event and flush a queue when it
turns true.
Go:

// Offline: this is pure local crypto and works regardless.
ciphertext, envDesc, envHash, nextIndex, err := client.EncryptWrite(
	[]byte("written while offline"), writeCap, currentIndex,
)
if err != nil {
	log.Fatal(err)
}
enqueue(envDesc, ciphertext, envHash) // persist for later

// Later, only transmit once the daemon is connected.
if client.IsConnected() {
	for _, e := range drainQueue() {
		_, err = client.StartResendingEncryptedMessage(
			nil, writeCap, nil, nil, e.desc, e.ct, e.hash)
		if err != nil {
			log.Fatal(err)
		}
	}
}

Rust:

// Offline: pure local crypto, works regardless.
let w = client.encrypt_write(b"written while offline", &write_cap, &current_index).await?;
enqueue(&w); // persist for later

// Later, only transmit once connected.
if client.is_connected() {
    for e in drain_queue() {
        client.start_resending_encrypted_message(
            None, Some(&write_cap), None, None,
            &e.envelope_descriptor, &e.message_ciphertext, &e.envelope_hash,
        ).await?;
    }
}

Python:

# Offline: pure local crypto, works regardless.
w = await client.encrypt_write(b"written while offline", write_cap, current_index)
enqueue(w)  # persist for later

# Later, only transmit once connected.
if client.is_connected():
    for e in drain_queue():
        await client.start_resending_encrypted_message(
            read_cap=None,
            write_cap=write_cap,
            next_message_index=None,
            reply_index=None,
            envelope_descriptor=e.envelope_descriptor,
            message_ciphertext=e.message_ciphertext,
            envelope_hash=e.envelope_hash,
        )
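The enqueue and drain_queue helpers in the fragments above are left to the application. A minimal in-memory sketch of such a queue follows. PreparedEnvelope and Outbox are hypothetical names, and the send callable stands in for start_resending_encrypted_message so that the example runs without a daemon. A real application would also persist the queue to disk.

```python
import asyncio
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class PreparedEnvelope:
    """The three artefacts produced by an offline encrypt_write call."""
    envelope_descriptor: bytes
    message_ciphertext: bytes
    envelope_hash: bytes


class Outbox:
    """FIFO queue of prepared envelopes, flushed once connectivity returns."""

    def __init__(self, send):
        self._send = send          # async callable taking a PreparedEnvelope
        self._pending = deque()

    def enqueue(self, env: PreparedEnvelope) -> None:
        self._pending.append(env)

    async def flush(self, is_connected: bool) -> int:
        """Send queued envelopes in order; return how many were sent."""
        sent = 0
        while is_connected and self._pending:
            env = self._pending[0]
            await self._send(env)      # on failure the envelope stays queued
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)


async def demo():
    delivered = []

    async def fake_send(env):
        delivered.append(env.message_ciphertext)

    outbox = Outbox(fake_send)
    outbox.enqueue(PreparedEnvelope(b"d1", b"first", b"h1"))
    outbox.enqueue(PreparedEnvelope(b"d2", b"second", b"h2"))

    while_offline = await outbox.flush(is_connected=False)
    once_online = await outbox.flush(is_connected=True)
    return while_offline, once_online, delivered, len(outbox)


outcome = asyncio.run(demo())
print(outcome)  # (0, 2, [b'first', b'second'], 0)
```

An envelope is removed from the queue only after its send returns, so a failure part-way through a flush leaves the unsent envelopes in place for the next attempt.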
A complete end-to-end example
The fragments above each show one task. Here they are assembled into a
single runnable program: Alice creates a stream, writes one message,
and Bob reads it back. This is the smallest complete program that
exercises the Pigeonhole path. As with every example in this guide it
omits production concerns (durable persistence, structured logging),
but it compiles into the shape of a real application; the
CI-verified tests are the
authority on exact, current usage.
Go:

package main

import (
	"log"

	"github.com/katzenpost/hpqc/rand"
	"github.com/katzenpost/katzenpost/client/thin"
	"github.com/katzenpost/katzenpost/core/config"
)

func main() {
	cfg, err := thin.LoadFile("thinclient.toml")
	if err != nil {
		log.Fatal(err)
	}
	client := thin.NewThinClient(cfg, &config.Logging{Level: "INFO"})
	if err := client.Dial(); err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Alice creates a stream.
	seed := make([]byte, 32)
	if _, err := rand.Reader.Read(seed); err != nil {
		log.Fatal(err)
	}
	writeCap, readCap, idx, err := client.NewKeypair(seed)
	if err != nil {
		log.Fatal(err)
	}

	// Alice writes one message.
	ct, ed, eh, _, err := client.EncryptWrite([]byte("hello from Alice"), writeCap, idx)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := client.StartResendingEncryptedMessage(nil, writeCap, nil, nil, ed, ct, eh); err != nil {
		log.Fatal(err)
	}

	// Bob reads it back (readCap would normally be shared out-of-band).
	rct, red, reh, _, err := client.EncryptRead(readCap, idx)
	if err != nil {
		log.Fatal(err)
	}
	result, err := client.StartResendingEncryptedMessage(readCap, nil, nil, nil, red, rct, reh)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Bob read: %s", result.Plaintext)
}

Rust:

use katzenpost_thin_client::{Config, ThinClient};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::new("thinclient.toml")?;
    let client = ThinClient::new(config).await?;

    // Alice creates a stream.
    let seed: [u8; 32] = rand::random();
    let kp = client.new_keypair(&seed).await?;

    // Alice writes one message.
    let w = client.encrypt_write(b"hello from Alice", &kp.write_cap, &kp.first_index).await?;
    client.start_resending_encrypted_message(
        None, Some(&kp.write_cap), None, None,
        &w.envelope_descriptor, &w.message_ciphertext, &w.envelope_hash,
    ).await?;

    // Bob reads it back (read_cap would normally be shared out-of-band).
    let r = client.encrypt_read(&kp.read_cap, &kp.first_index).await?;
    let result = client.start_resending_encrypted_message(
        Some(&kp.read_cap), None, None, None,
        &r.envelope_descriptor, &r.message_ciphertext, &r.envelope_hash,
    ).await?;
    println!("Bob read: {:?}", result.plaintext);

    client.stop().await;
    Ok(())
}

Python:

import asyncio, os
from katzenpost_thinclient import ThinClient, Config

async def main():
    config = Config("thinclient.toml")
    client = ThinClient(config)
    await client.start(asyncio.get_running_loop())

    # Alice creates a stream.
    seed = os.urandom(32)
    kp = await client.new_keypair(seed)

    # Alice writes one message.
    w = await client.encrypt_write(b"hello from Alice", kp.write_cap, kp.first_index)
    await client.start_resending_encrypted_message(
        read_cap=None,
        write_cap=kp.write_cap,
        next_message_index=None,
        reply_index=None,
        envelope_descriptor=w.envelope_descriptor,
        message_ciphertext=w.message_ciphertext,
        envelope_hash=w.envelope_hash,
    )

    # Bob reads it back (read_cap would normally be shared out-of-band).
    r = await client.encrypt_read(kp.read_cap, kp.first_index)
    plaintext = await client.start_resending_encrypted_message(
        read_cap=kp.read_cap,
        write_cap=None,
        next_message_index=None,
        reply_index=None,
        envelope_descriptor=r.envelope_descriptor,
        message_ciphertext=r.message_ciphertext,
        envelope_hash=r.envelope_hash,
    )
    print("Bob read:", plaintext)

    client.stop()

asyncio.run(main())
How to delete messages with tombstones
Use TombstoneRange to create tombstone envelopes, then send each
one via StartResendingEncryptedMessage. To tombstone a single box,
use maxCount=1.
Go:

// Create tombstone envelopes for 5 boxes
result, err := client.TombstoneRange(writeCap, startIndex, 5)
if err != nil {
	log.Fatal(err)
}

// Send each tombstone
for _, envelope := range result.Envelopes {
	_, err = client.StartResendingEncryptedMessage(
		nil, writeCap, nil, nil,
		envelope.EnvelopeDescriptor,
		envelope.MessageCiphertext,
		envelope.EnvelopeHash,
	)
	if err != nil {
		log.Fatal(err)
	}
}
// result.Next is the index after the last tombstoned box

Rust:

// Create tombstone envelopes for 5 boxes
let result = client.tombstone_range(&write_cap, &start_index, 5).await;

// Send each tombstone
for envelope in &result.envelopes {
    client.start_resending_encrypted_message(
        None, Some(&write_cap), None, None,
        &envelope.envelope_descriptor,
        &envelope.message_ciphertext,
        &envelope.envelope_hash,
    ).await?;
}
// result.next is the index after the last tombstoned box

Python:

# Create tombstone envelopes for 5 boxes
result = await client.tombstone_range(write_cap, start_index, 5)

# Send each tombstone
for envelope in result.envelopes:
    await client.start_resending_encrypted_message(
        read_cap=None,
        write_cap=write_cap,
        next_message_index=None,
        reply_index=None,
        envelope_descriptor=envelope.envelope_descriptor,
        message_ciphertext=envelope.message_ciphertext,
        envelope_hash=envelope.envelope_hash,
    )
# result.next is the index after the last tombstoned box
How to send to one channel atomically via copy command
A copy command writes data to a destination channel atomically via a
courier. The steps are:
Create a temporary channel and a destination channel.
Pack the payload into copy stream elements.
Write each element to the temporary channel.
Send a copy command referencing the temporary channel.
// Create destination channel
destSeed := make([]byte, 32)
rand.Reader.Read(destSeed)
destWriteCap, destReadCap, destFirstIndex, err := client.NewKeypair(destSeed)
if err != nil {
    log.Fatal(err)
}

// Create temporary channel
tempSeed := make([]byte, 32)
rand.Reader.Read(tempSeed)
tempWriteCap, _, tempFirstIndex, err := client.NewKeypair(tempSeed)
if err != nil {
    log.Fatal(err)
}

// Pack payload into copy stream elements
envelopes, _, err := client.CreateCourierEnvelopesFromPayload(
    payload, destWriteCap, destFirstIndex,
    true, // isStart
    true, // isLast
)
if err != nil {
    log.Fatal(err)
}

// Write each element to the temporary channel
tempIndex := tempFirstIndex
for _, chunk := range envelopes {
    ciphertext, envDesc, envHash, nextIdx, err := client.EncryptWrite(
        chunk, tempWriteCap, tempIndex,
    )
    if err != nil {
        log.Fatal(err)
    }
    _, err = client.StartResendingEncryptedMessage(
        nil, tempWriteCap, nil, nil,
        envDesc, ciphertext, envHash,
    )
    if err != nil {
        log.Fatal(err)
    }
    tempIndex = nextIdx
}

// Send the copy command (blocks until courier acknowledges)
err = client.StartResendingCopyCommand(tempWriteCap)
if err != nil {
    log.Fatal(err)
}
// Share destReadCap and destFirstIndex with the reader

// Create destination channel
let dest_seed: [u8; 32] = rand::random();
let dest = client.new_keypair(&dest_seed).await?;

// Create temporary channel
let temp_seed: [u8; 32] = rand::random();
let temp = client.new_keypair(&temp_seed).await?;

// Pack payload into copy stream elements
let envelopes_result = client.create_courier_envelopes_from_payload(
    &payload, &dest.write_cap, &dest.first_index,
    true, // is_start
    true, // is_last
).await?;

// Write each element to the temporary channel
let mut temp_index = temp.first_index.clone();
for chunk in &envelopes_result.envelopes {
    let write_result = client.encrypt_write(
        chunk, &temp.write_cap, &temp_index,
    ).await?;
    client.start_resending_encrypted_message(
        None, Some(&temp.write_cap), None, None,
        &write_result.envelope_descriptor,
        &write_result.message_ciphertext,
        &write_result.envelope_hash,
    ).await?;
    temp_index = write_result.next_message_box_index;
}

// Send the copy command (blocks until courier acknowledges)
client.start_resending_copy_command(&temp.write_cap, None, None).await?;
// Share dest.read_cap and dest.first_index with the reader

import os

# Create destination channel
dest_seed = os.urandom(32)
dest = await client.new_keypair(dest_seed)

# Create temporary channel
temp_seed = os.urandom(32)
temp = await client.new_keypair(temp_seed)

# Pack payload into copy stream elements
envelopes_result = await client.create_courier_envelopes_from_payload(
    payload, dest.write_cap, dest.first_index,
    is_start=True, is_last=True,
)

# Write each element to the temporary channel
temp_index = temp.first_index
for chunk in envelopes_result.envelopes:
    write_result = await client.encrypt_write(chunk, temp.write_cap, temp_index)
    await client.start_resending_encrypted_message(
        read_cap=None, write_cap=temp.write_cap,
        next_message_index=None, reply_index=None,
        envelope_descriptor=write_result.envelope_descriptor,
        message_ciphertext=write_result.message_ciphertext,
        envelope_hash=write_result.envelope_hash,
    )
    temp_index = write_result.next_message_box_index

# Send the copy command (blocks until courier acknowledges)
await client.start_resending_copy_command(temp.write_cap)
# Share dest.read_cap and dest.first_index with the reader
How to send to multiple channels atomically
Use CreateCourierEnvelopesFromMultiPayload to pack payloads for
different destinations into a single copy stream efficiently:
How to handle multi-call buffer passing for large copy streams
When building a copy stream across multiple calls (because you have
more data than fits in a single call, or data arrives incrementally),
pass the buffer from each result to the next call:
var buffer []byte // nil on first call

// First batch of destinations
result1, err := client.CreateCourierEnvelopesFromMultiPayload(
    batch1Destinations,
    true,  // isStart (first call)
    false, // isLast (more calls coming)
    buffer,
)
if err != nil {
    log.Fatal(err)
}
// Write result1.Envelopes to temp channel...
buffer = result1.Buffer // save for next call

// Persist buffer to disk for crash recovery
saveState(buffer)

// Second batch (final)
result2, err := client.CreateCourierEnvelopesFromMultiPayload(
    batch2Destinations,
    false, // isStart (not the first call)
    true,  // isLast (final call)
    buffer,
)
if err != nil {
    log.Fatal(err)
}
// Write result2.Envelopes to temp channel...

// On crash recovery, reload buffer from disk and continue
// with isStart=false

let mut buffer: Option<Vec<u8>> = None; // None on first call

// First batch
let result1 = client.create_courier_envelopes_from_multi_payload(
    batch1_destinations,
    true,  // is_start
    false, // is_last
    buffer,
).await?;
// Write result1.envelopes to temp channel...
buffer = result1.buffer; // save for next call

// Persist buffer to disk for crash recovery
save_state(&buffer);

// Second batch (final)
let result2 = client.create_courier_envelopes_from_multi_payload(
    batch2_destinations,
    false, // is_start
    true,  // is_last
    buffer,
).await?;
// Write result2.envelopes to temp channel...

buffer = None  # None on first call

# First batch
result1 = await client.create_courier_envelopes_from_multi_payload(
    batch1_destinations,
    is_start=True,   # first call
    is_last=False,   # more calls coming
    buffer=buffer,
)
# Write result1.envelopes to temp channel...
buffer = result1.buffer  # save for next call

# Persist buffer to disk for crash recovery
save_state(buffer)

# Second batch (final)
result2 = await client.create_courier_envelopes_from_multi_payload(
    batch2_destinations,
    is_start=False,  # not the first call
    is_last=True,    # final call
    buffer=buffer,
)
# Write result2.envelopes to temp channel...
How to tombstone a range via copy stream
Use CreateCourierEnvelopesFromTombstoneRange to atomically
tombstone boxes as part of a copy command. The courier performs the
tombstoning, so it either all succeeds or none of it is visible.
Both StartResendingEncryptedMessage and StartResendingCopyCommand
block until completion. You can cancel them individually, or stop
everything at once by closing the thin client.
To cancel a specific operation, call the corresponding cancel
method from another thread/task:
// Cancel an encrypted message operation
err := client.CancelResendingEncryptedMessage(envelopeHash)

// Cancel a copy command (needs blake2b-256 hash of the write cap)
writeCapBytes, _ := tempWriteCap.MarshalBinary()
writeCapHash := blake2b.Sum256(writeCapBytes)
err = client.CancelResendingCopyCommand(&writeCapHash)

// Cancel an encrypted message operation
client.cancel_resending_encrypted_message(&envelope_hash).await?;

// Cancel a copy command
use blake2::{Blake2b, Digest};
use digest::consts::U32;
let write_cap_hash: [u8; 32] = Blake2b::<U32>::digest(&temp_write_cap).into();
client.cancel_resending_copy_command(&write_cap_hash).await?;

# Cancel an encrypted message operation
await client.cancel_resending_encrypted_message(envelope_hash)

# Cancel a copy command
from hashlib import blake2b
write_cap_hash = blake2b(temp_write_cap, digest_size=32).digest()
await client.cancel_resending_copy_command(write_cap_hash)
To stop all in-flight operations at once, call Close() (Go),
stop() (Rust), or stop() (Python). This shuts down the thin
client entirely – all blocked callers receive an error, and the
daemon stops all ARQ retransmission loops for this client. This is
useful when your application is shutting down or when you want to
abandon all pending work without cancelling each operation
individually.
How to handle daemon disconnects and restarts
The thin client automatically reconnects when the daemon connection
is lost. It uses an instance token to detect whether it reconnected
to the same daemon or a new one:
Same instance token: The daemon still has its state. No action
needed.
Different instance token: The daemon is a new process. The thin
client automatically replays all in-flight
StartResendingEncryptedMessage and StartResendingCopyCommand
operations. Callers blocked on these methods are unaware of the
disconnect.
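The decision described in the two cases above can be modelled in a few lines. The following is a minimal sketch, not the thin client's actual internals: the class name `InFlightTracker`, its methods, and the use of string operation ids are all hypothetical, chosen only to illustrate how comparing instance tokens determines whether in-flight operations are replayed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InFlightTracker:
    """Toy model of reconnect handling (hypothetical, for illustration only)."""

    last_token: Optional[bytes] = None
    in_flight: Dict[str, str] = field(default_factory=dict)  # op id -> description

    def track(self, op_id: str, description: str) -> None:
        """Remember an operation that is blocked waiting for completion."""
        self.in_flight[op_id] = description

    def cancel(self, op_id: str) -> None:
        """Forget an operation, so it is never replayed after a reconnect."""
        self.in_flight.pop(op_id, None)

    def on_reconnect(self, token: bytes) -> List[str]:
        """Return the ids of operations that must be replayed to the daemon."""
        same_daemon = self.last_token is not None and token == self.last_token
        self.last_token = token
        if same_daemon:
            return []  # the daemon kept its state, nothing to resend
        return sorted(self.in_flight)  # new daemon process: replay everything
```

In this model, reconnecting with an unchanged token returns an empty list, while a changed token returns every operation that has not been cancelled.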
Applications do not need to manage reconnection or replay. You can
observe disconnect events to log or update UI state:
eventCh := client.EventSink()
defer client.StopEventSink(eventCh)

for ev := range eventCh {
    switch v := ev.(type) {
    case *thin.DaemonDisconnectedEvent:
        if v.IsGraceful {
            fmt.Println("Daemon shut down gracefully")
        } else {
            fmt.Printf("Daemon connection lost: %v\n", v.Err)
        }
        // No action needed -- thin client reconnects automatically
        // and replays in-flight requests if the daemon instance changed.
    case *thin.ConnectionStatusEvent:
        fmt.Printf("Connected: %v\n", v.IsConnected)
        // v.InstanceToken identifies the daemon process.
        // The thin client compares this internally on reconnect.
    }
}

// config must be mutable because the callback field is assigned below
let mut config = Config::new("thinclient.toml")?;

// Set disconnect callback during config
config.on_daemon_disconnected = Some(Box::new(|graceful, err_msg| {
    if graceful {
        println!("Daemon shut down gracefully");
    } else {
        println!("Daemon connection lost: {:?}", err_msg);
    }
    // No action needed -- thin client reconnects automatically
    // and replays in-flight requests if the daemon instance changed.
}));

let client = ThinClient::new(config).await?;

async def on_daemon_disconnected(event):
    if event.get("is_graceful"):
        print("Daemon shut down gracefully")
    else:
        print(f"Daemon connection lost: {event.get('error')}")
    # No action needed -- thin client reconnects automatically
    # and replays in-flight requests if the daemon instance changed.

async def on_connection_status(event):
    print(f"Connected: {event['is_connected']}")
    # event contains 'instance_token' identifying the daemon process.
    # The thin client compares this internally on reconnect.

config = Config(
    "thinclient.toml",
    on_daemon_disconnected=on_daemon_disconnected,
    on_connection_status=on_connection_status,
)
client = ThinClient(config)
If the thin client is disconnected when you cancel an operation, the
cancel just removes it from in-flight tracking – it will not be
replayed on reconnect:
// Safe to call while disconnected -- removes from tracking,
// no message sent to daemon since there is no connection.
err := client.CancelResendingEncryptedMessage(envelopeHash)
err = client.CancelResendingCopyCommand(&writeCapHash)

// Safe to call while disconnected -- removes from tracking,
// no message sent to daemon since there is no connection.
client.cancel_resending_encrypted_message(&envelope_hash).await?;
client.cancel_resending_copy_command(&write_cap_hash).await?;

# Safe to call while disconnected -- removes from tracking,
# no message sent to daemon since there is no connection.
await client.cancel_resending_encrypted_message(envelope_hash)
await client.cancel_resending_copy_command(write_cap_hash)
To terminate the thin client entirely, close it explicitly. All blocked
callers receive an error. A daemon disconnect on its own never terminates
the thin client:
err:=client.Close()
client.stop().await;
client.stop()
9 - Threat Model
The purpose and structure of this document
This threat model document is unique in the privacy technology
landscape for its detailed treatment of realistic adversary
capabilities. It is not a description of a superficial, theoretical
system, but rather of complex, real-life software that is being
interrogated and constantly re-designed to provide the best possible
security. We examine it from the points of view of theoretical
design, networking choices, and practical pitfalls.
And still, it is not and will likely never be comprehensive. Various
attacks and countermeasure strategies will be added to this document in
the future, as it keeps evolving. However, we feel that it already
provides a valuable, systematic view of the challenges faced by mixnet
technology.
There exists a rich body of academic work analyzing how one might
disrupt the functioning of a Mixnet or circumvent its security and
privacy guarantees. We have endeavored to compile these decades of research
and summarize the attacks in the table titled "A summary of theoretical
security concerns in a Mixnet". The table titled "Networking security
concerns in Katzenpost" focuses on networking security threats that are
specific to Katzenpost protocol choices.
We then delve into the countermeasures employed by Katzenpost and
discuss their efficacy. Special care is taken to discuss the details
of the post-quantum cryptographic primitives that we have introduced in
several places in the design.
Introducing the adversary
It is no longer controversial to say that in the modern world, we
face incredibly powerful surveillance adversaries. These could be state,
corporate or criminal actors, vying for our information to use as means
of making profit, manipulating us and others, gaining leverage,
strengthening their authority, or as means of persecution. In many
contexts, we have little hope for non-technical solutions due to lack of
sufficiently powerful pressure in favor of privacy.
And so in a quest for technical solutions, we need equally powerful
tools. In the case of communication tools, the Internet’s bread and
butter, we would like to allow users to interact and exchange
information with reasonable expectation of both the content and metadata
of their communication, and personal information such as a user’s social
graph, being protected from such adversaries. Therefore, we consider an
adversary capable of the following:
The adversary can see the connections of the entire global
internet and is capable of intricate statistical analysis of gathered
data.
The adversary can disable parts of the network.
The adversary can plant or take over some devices in the network
to inject malicious code and manipulate the functioning of the network
or to gain access to the information available to them. The takeover
could happen by technical means or by exercising force outside of the
network.
The adversary has very large (but not infinite) computational
resources, and is capable of cryptanalysis on par with frontier
research.
The adversary has access to a quantum computer, or will have
access to a quantum computer in the near future.
The adversary can supplement collected data with rich context of
already gathered data on all users from other sources.
If we hope for our work to be relevant in the modern world, we can no
longer settle for weak threat models. That is the bar we set for
ourselves at Katzenpost.
Katzenpost mixnet threat model summary
Firstly, assumptions about the user:
The user acts reasonably and in good faith.
The user obtains an authentic copy of the Katzenpost client and
the mixnet client configuration file.
Secondly, assumptions about the user’s computer:
The computer operates correctly and is not compromised by
malware.
Thirdly, assumptions about the mixnet:
The mixnet only provides internal services and does not have any
"exit nodes" or anything that resembles a proxy service or VPN.
All mixnet protocols are protocols which do not force
interaction.
All mixnet protocols are low bandwidth and latency
tolerant.
Finally, assumptions about the world:
The three core protocols of Katzenpost are configured to use modern
cryptographic primitives which are valid and considered impossible to
break, for example:
PKI Signature Scheme using Ed25519-Sphincs+
NIKE Sphinx using X25519
PQ Noise with pqXX pattern using Xwing
What the user’s Gateway can achieve, keeping in mind that a fair-sized
mixnet will typically have more than one Gateway node:
A Gateway node learns when a given client is online.
A Gateway node learns the client’s IP address.
A Gateway node learns how many messages the client sends and
receives.
A Gateway node does NOT learn the sent message destinations or
the received message origins.
A Gateway node does NOT learn if a given sent or received message
is a decoy or not.
A Gateway node can drop or corrupt any sent or received
message.
A Gateway node can spam a user with invalid messages.
A Gateway node can duplicate old messages. However, duplicate
outbound messages will be dropped by the first hop, as per its Sphinx packet
deduplication cache.
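The deduplication behaviour mentioned in the last item can be sketched as a set of previously seen tags. This is a simplified illustration under stated assumptions: the class `ReplayCache` and its methods are hypothetical, and hashing the whole packet is a stand-in for however a real mix derives its replay tag.

```python
import hashlib


class ReplayCache:
    """Toy deduplication cache (hypothetical, not Katzenpost's implementation)."""

    def __init__(self) -> None:
        self._seen = set()

    def accept(self, packet: bytes) -> bool:
        """Return True the first time a packet is seen, False for any duplicate."""
        # Assumption: a hash of the packet bytes stands in for the replay tag.
        tag = hashlib.blake2b(packet, digest_size=32).digest()
        if tag in self._seen:
            return False
        self._seen.add(tag)
        return True

    def rotate(self) -> None:
        """Forget all tags, for example once packets for an old mix key are no longer processed."""
        self._seen.clear()
```

A duplicated packet is accepted once and rejected on every later attempt until the cache is rotated.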
What a sufficiently global, passive adversary can achieve:
A GPA can learn who is using the mixnet and where their Gateway
nodes are located.
What a local network attacker can achieve:
A local network can observe when a user is using
Katzenpost.
A local network can block Katzenpost.
What a compromise of the user’s computer can achieve:
After an endpoint device is compromised, an attacker can impersonate
that user, receiving and sending messages. The attacker does NOT learn
the network locations of the user’s correspondents.
What a Service Node can achieve:
A Service Node on the mix network does not know whence its service
request message came. Therefore, in general, absent some clever attack,
Service Nodes learn nothing about the clients that interact with
them.
What a contact can achieve:
A contact can spam a user with messages.
A contact can, to some extent, prove to a third-party that a
message came from a user
A contact can retain messages from a user, forever.
What a random person on the Internet can achieve:
A random person can attempt to DoS the mix network or a specific service
on the mixnet.
A summary of theoretical security concerns in a Mixnet

| Mixnet attack type | Attack description | Necessary adversary capabilities |
| --- | --- | --- |
| Intersection, Statistical Disclosure Attacks | Over time, the adversary can glean statistical information that makes the probability distribution of who Alice is communicating with non-uniform. The Law of Large Numbers implies that, in the long run, the anonymity set tends to the set of clients with the same probability as the actual recipient. | The adversary must typically be able to see messages entering and leaving the network. This is customarily treated as a GPA, despite only requiring a view of the network’s perimeter. The adversary must be able to distinguish messages from dummy traffic, or observe when users are active. |
| n − 1 Attack | The adversary causes the mix to contain only messages sent by the adversary, except one. In the context of continuous time mixing such as with the Poisson mix, this means that the adversary drops or delays other messages until the mix is empty before the target message enters the mix. The adversary sees the target message exit the mix to its next destination. | The adversary must compromise routers which are upstream from a target mix node so as to be able to block incoming messages, send messages, and tell when a target message passes through them. |
| Epistemic Attack | The fact that a client is issued only a subset of the mix nodes’ directory and encryption keys can leak information to the adversary. | The adversary has knowledge of the target client’s view of the network which distinguishes them among clients. This could happen via a zero day or a design flaw such as not implementing PIR for discovery. |
| Denial of Service Attack | The adversary is able to disrupt the functioning of the service, often by overwhelming its resources. | The adversary has sufficient network and computational resources to overwhelm the network. |
| Sybil Attack | The adversary plants a large number of malicious nodes, and is therefore able to glean partial or complete information to follow a message through the mix and disrupt the network. | The adversary has sufficient resources to take over the network, and the network’s design allows for the creation of a large number of malicious nodes. |
| Compulsion Attack | The adversary compels enough honest node operators to disclose information to follow a message through the mix network. | The adversary has the necessary force to compel a sufficient number of honest actors to do the adversary’s bidding. |
| Timing Attack | An active adversary manipulates the timing of the packets passing through compromised routers, or a passive adversary exploits timing information that is leaked despite padding. | The passive attack could happen via a zero day or design flaw. The efficacy of the active attack needs to be analyzed with respect to the specific design. |
| Cryptographic Attacks | The adversary is able to forge a signature, generate a second hash preimage, decrypt ciphertext or do other damage assumed to be prevented by the use of cryptography. | The adversary can break the security of one or more cryptographic primitives through a cryptographic zero day or sufficient computational resources, or exploit a flaw in their implementation. |
| Endpoint Security Attacks | The adversary breaches the security of a user’s device via an attack not directly related to the mixnet. | The adversary is able to exploit a technical flaw in the user’s device or compel the user to grant them access. |
| Predecessor Attack | The adversary compromises at least one mix node in each routing topology layer. Eventually a client will randomly select a bad route where every mix node in the route is compromised. | The adversary must have the capability to operate or compromise mix nodes, at least one in each routing topology layer. See the countermeasure section for more details. |
Networking security concerns in Katzenpost

| Mixnet attack type | Attack description | Necessary adversary capabilities |
| --- | --- | --- |
| Tagging Attack | The adversary exploits some kind of cryptographic malleability property of the Sphinx packet format in order to violate the privacy notions of the mix network. | The adversary must be able to witness the Sphinx payload decryption to determine if it was tagged or not. This means compromising a Provider for forward packets and compromising a client’s endpoint device for SURB replies. |
| Replay Confirmation Attack | If a Sphinx packet can be replayed, then the adversary may send the packet many times concurrently in order to observe the traffic burst in another part of the network. | The mix nodes maintain Sphinx replay caches in order to prevent replays; the attack is therefore only possible if there is a replay cache malfunction. |
| SURB Confirmation Attack | If a client sends many SURBs[1] to another entity on the network, that entity may choose to send out ALL the SURBs at once in order to observe the traffic burst in another part of the network. | The adversary is a global passive observer of the network and a participant in the network; additionally the adversary must be in possession of multiple SURBs created by another entity on the network. |
| ARQ Confirmation Attack | The adversary’s goal is to find a specific ARQ[2] client who is currently interacting on the network by causing targeted outages of entry Providers after the target service receives a protocol message. To start, half of the entry Providers are allowed to receive messages. If the adversary observes a retransmission, then it confirms the client is in the group of entry Providers that the adversary blocked messages to. The adversary continues the binary search and finds the client’s entry Provider in O(log n) rounds, where n is the number of entry Providers. | The adversary must have access to a target mixnet service so as to distinguish a message transmission from a retransmission. The adversary must also be able to block messages from going to specific mixnet nodes, in this example, entry Providers. |
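The binary search behind the ARQ confirmation attack can be sketched as follows. This is an illustrative model only: the function name `find_entry_provider` and the oracle `retransmits_when_blocked` are hypothetical. The oracle stands in for the adversary blocking a set of entry Providers and then watching the target service for a retransmission.

```python
def find_entry_provider(providers, retransmits_when_blocked):
    """Locate the target's entry Provider by repeatedly blocking half the candidates.

    retransmits_when_blocked(blocked) is a hypothetical oracle that returns True
    when blocking that set of Providers leads to an observed ARQ retransmission.
    Returns the remaining Provider and the number of rounds used.
    """
    candidates = list(providers)
    if not candidates:
        raise ValueError("at least one entry Provider is required")
    rounds = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        blocked, allowed = candidates[:half], candidates[half:]
        rounds += 1
        # A retransmission means the client sits behind one of the blocked Providers.
        if retransmits_when_blocked(set(blocked)):
            candidates = blocked
        else:
            candidates = allowed
    return candidates[0], rounds
```

With 16 entry Providers the search finishes in 4 rounds, which is why the number of targeted outages grows only logarithmically with the number of entry Providers.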
Attack Countermeasures
Here we describe the attack countermeasures currently used by the
Katzenpost mix network software design.
Intersection Attacks
attack description:
Intersection attacks, also known as long-term statistical disclosure
attacks, have two basic categories:
The Adversary learns to whom Alice sends messages.
The Adversary learns who sends Alice messages.
Statistical disclosure attacks work to some extent on all anonymous
communication networks. The Katzenpost client and the Katzen messaging
protocol are designed to provide partial defense against long-term
intersection attacks as well as sufficient defense against short-term
timing correlation attacks.
The simplest form of this attack assumes a global passive adversary
who watches Alice’s interactions with the mix network. Whenever Alice
sends a message, a set of potential recipients are noted by observing
which clients receive a message shortly after Alice sends her message.
After many hours, days or weeks of noting these sets of potential
recipients, an intersection among these sets may reveal the set of
recipients Alice sends messages to.
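The simplest form of the attack described above amounts to intersecting sets. The sketch below simulates it under toy assumptions: Alice always writes to client 0, a fixed number of unrelated clients also receive a message in each round, and the observer sees every receiver. The function names and parameters are hypothetical.

```python
import random


def intersect_observations(rounds):
    """Intersect the sets of clients seen receiving shortly after Alice sends."""
    suspects = None
    for receivers in rounds:
        suspects = set(receivers) if suspects is None else suspects & set(receivers)
    return suspects if suspects is not None else set()


def simulate(num_clients=50, num_rounds=30, cover=10, seed=1):
    """Alice always writes to client 0; `cover` other random clients also receive."""
    rng = random.Random(seed)
    others = list(range(1, num_clients))
    rounds = [{0, *rng.sample(others, cover)} for _ in range(num_rounds)]
    return intersect_observations(rounds)
```

Each additional round can only shrink the suspect set, which is why repetitive behavior over many hours, days or weeks favors the adversary.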
The classical mix network literature has described intersection
attacks in terms of a mix network where a passive network observer can
watch individual clients receive messages. This assumption can be
otherwise stated that the adversary observes all the inputs and outputs
of the mix network and thus receives a high granularity of statistical
information.
countermeasure:
Katzenpost and the Katzen messaging protocol are designed to provide
partial defense against intersection attacks. Complete defense is not
practical because user behavior is often repetitive and users cannot stay
connected to the mixnet forever. Attack success depends largely on the
adversary’s ability to predict user behavior. If a user’s behavior is
overly repetitive, such attacks are more likely to succeed.
Although the Katzenpost continuous time mixing strategy provides
defense against short term timing correlation attacks, additional
defense mechanisms are required to defend against longer term
attacks:
async message queueing and retrieval at the network edge
traffic padded message retrieval
loop decoy traffic
uniform traffic patterns (all sent messages result in a SURB
reply)
The Katzenpost chat protocol, known as Katzen, uses an additional
network route to provide another indirection that protects the network
location of clients. In other words, while Katzen clients connect to the
mixnet using a randomly selected entry Provider, they retrieve messages
from a different Provider mix node on the network. Message retrieval is
done by means of a Sphinx SURB (single use reply block), which is sent to
the messaging queue service so that a reply containing a message payload
can be sent back to the client anonymously. All sent messages result in
a SURB reply being sent back to the client.
Katzenpost clients periodically send loop decoy messages; these
Sphinx packets are sent to a randomly selected Provider whose echo
service sends the client’s packet payload back to the client via the
attached SURB. Only the client that receives a loop decoy message can
distinguish it from a normal message; passive network observers cannot
tell the difference. These decoy loops are uniformly distributed among
all of the Providers (also known as service/exit mix nodes).
Whenever clients retrieve messages from their locally connected entry
Provider, they do so using a traffic-padded protocol that sends
them either 0 or 1 messages, where both outcomes are indistinguishable from the
perspective of a passive network observer.
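The idea of a padded retrieval can be sketched with a response that always has the same length. This is an illustration under stated assumptions, not the Katzenpost wire format: `PAYLOAD_SIZE`, the three-byte header, and both function names are hypothetical, and the response is assumed to be encrypted by the link layer so that an observer sees only its size.

```python
PAYLOAD_SIZE = 1024  # hypothetical fixed payload size in bytes
HEADER_SIZE = 3      # 1 flag byte + 2 length bytes (hypothetical layout)


def retrieval_response(queue):
    """Pop at most one message and return a response of constant length."""
    if queue and len(queue[0]) > PAYLOAD_SIZE:
        raise ValueError("message exceeds the fixed payload size")
    has_message = bool(queue)
    message = queue.pop(0) if has_message else b""
    padding = bytes(PAYLOAD_SIZE - len(message))
    header = bytes([int(has_message)]) + len(message).to_bytes(2, "big")
    return header + message + padding


def parse_response(blob):
    """Return the message, or None when the response was pure padding."""
    if blob[0] == 0:
        return None
    length = int.from_bytes(blob[1:3], "big")
    return blob[HEADER_SIZE:HEADER_SIZE + length]
```

Both the "0 messages" and "1 message" responses have the same length, so size alone does not reveal whether anything was waiting in the queue.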
n − 1 Attacks
attack description:
An n − 1 attack is a multi-stage attack where the adversary observes a
target message enter the mixnet and must perform the attack in order to
follow the message to the next hop. The n − 1 attack is performed
repeatedly for each hop in the route in order to discover the final
destination.
Although the adversary could simply compromise each mix node in the
route starting with the first hop, that belongs to the compulsion attack
category, which is distinct from the n − 1 attack category. The n − 1
attack is performed by the adversary compromising upstream routers so
that they have the capability of watching messages enter the target mix,
blocking any of those messages if they choose to, and sending messages
of their own into the target mix node. By using these capabilities the
adversary is able to manipulate mix nodes so that they only contain the
target message and messages sent by the adversary.
For a good introduction to n − 1 attacks, please see . In the
context of continuous time mixing strategies like "Stop and Go" and
Poisson , the n − 1 attack is
performed by the adversary blocking or delaying (although delaying
obviously wouldn’t work for Stop and Go) incoming messages ahead of time
so that they are reasonably certain the mix is empty before the target
message enters the mix.
When the target message enters the empty mix, it is artificially
delayed by the mixing strategy and then routed to the next hop. The
adversary gets to observe where the message is going for its next hop:
although the message exiting the mix is bitwise unlinkable because of
the cryptographic transformation, the adversary is reasonably sure that
it must be the same message.
countermeasure:
Katzenpost currently does not have any countermeasures in place for
n − 1 attacks. See the Future Countermeasures section below.
Epistemic Attack: route fingerprinting
attack description:
A route fingerprinting attack is when the adversary is able to
identify a client by the specific route being used.
countermeasure:
Katzenpost doesn’t allow clients to have a partial view of the
network. The directory authority system publishes the full network view
to be cached by the edge nodes, Providers, so that clients can retrieve
them.
Denial of Service
attack description:
Sending many packets into the mix network can cause the mix nodes to
become overwhelmed and begin dropping packets. The logical conclusion to
this scenario is that there is effectively a network outage until the
adversary stops sending so much traffic.
countermeasure:
Rate limiting individual clients is the current countermeasure.
However, this only stops a DoS attack conducted by a single
client entity. The adversary could still DoS the network by
using many clients to send packets.
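Per-client rate limiting, and its limitation against many clients, can be illustrated with a token bucket. The text above does not say which algorithm Katzenpost uses, so the token bucket here is an assumed technique, and the class names and parameters are hypothetical.

```python
class TokenBucket:
    """Allows `rate` packets per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = float(burst), now

    def allow(self, now):
        """Refill tokens for the elapsed time, then spend one if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one independent bucket per client identity."""

    def __init__(self, rate, burst):
        self.rate, self.burst, self.buckets = rate, burst, {}

    def allow(self, client_id, now):
        if client_id not in self.buckets:
            self.buckets[client_id] = TokenBucket(self.rate, self.burst, now)
        return self.buckets[client_id].allow(now)
```

Because every client identity gets its own bucket, a single client is throttled once its burst is spent, but an adversary who controls many identities multiplies the allowed traffic accordingly.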
Sybil Attack
attack description:
The adversary plants a large number of malicious nodes, and is
therefore able to glean partial or complete information to follow a
message through the mix network.
countermeasure:
We mitigate Sybil attacks by preventing mix nodes from automatically
joining the network. A prerequisite for joining the network is to have
all the directory authorities add the new mix node’s connection
information and public cryptographic key material to their
configuration. Please see the Future Countermeasures section below for a
discussion of additional directory authority features including a
reputation system.
Compulsion Attack
attack description:
The adversary compels enough honest node operators to disclose
information to follow a message through the mix network.
countermeasure:
Our current countermeasure for the compulsion attack is frequent mix
key rotation, every 20 minutes. See Future Countermeasures section
below.
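The effect of key rotation on a compulsion attack can be sketched as a keyring that destroys old secrets. Only the 20 minute interval comes from the text above. The epoch origin, the class `MixKeyring`, the policy of keeping just the current key, and the use of random bytes in place of a real mix private key are assumptions made for illustration.

```python
import secrets

EPOCH_SECONDS = 20 * 60  # rotation period stated above


def epoch_at(unix_time, origin=0):
    """Map a timestamp to an epoch number; `origin` is a hypothetical start time."""
    return int(unix_time - origin) // EPOCH_SECONDS


class MixKeyring:
    """Toy keyring that keeps only the current epoch's secret."""

    def __init__(self):
        self._keys = {}

    def rotate(self, unix_time):
        """Create the current epoch's key if needed and destroy all older keys."""
        current = epoch_at(unix_time)
        if current not in self._keys:
            self._keys[current] = secrets.token_bytes(32)  # stand-in for a mix key
        for epoch in [e for e in self._keys if e < current]:
            del self._keys[epoch]
        return current

    def get(self, epoch):
        """Return the secret for `epoch`, or None once it has been destroyed."""
        return self._keys.get(epoch)
```

Once a key has been destroyed, an operator who is compelled later has nothing left to hand over for packets from that epoch.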
Timing Attacks
attack description:
An active adversary manipulates the timing of the packets passing
through compromised routers, or passive adversary exploits timing
information that is leaked despite padding.
Currently, there are no known timing attacks against any Katzenpost
protocols. Timing correlation attacks are already covered in the
intersection attack category. And although all mix network protocols
leak statistical information no matter what countermeasures are used, we
posit that this leaked statistical information isn’t really the same
thing as traditional timing attacks against a cryptographic system. In
fact, the mix network actively prevents timing attacks by injecting
latency into the system.
countermeasure:
No known timing attacks and therefore no countermeasure.
Cryptographic Attacks
attack description:
There are no known cryptographic attacks against Katzenpost core
protocols (sphinx, noise, dirauth). However we explore theoretical
cryptographic attacks in the Cryptographic Protocols section below.
countermeasure:
All core Katzenpost protocols make use of hybrid post quantum
cryptographic constructions which in theory protect against active
quantum adversaries.
Endpoint Security Attacks
attack description:
The adversary breaches the security of a user’s device via an attack
not directly related to the mixnet.
countermeasure:
There are no countermeasures provided by Katzenpost for endpoint
security because it’s considered an orthogonal concern.
Tagging Attack
attack description:
The Sphinx cryptographic packet format allows for a one bit tagging
attack under certain circumstances. The reason for allowing the design
to have this security defect is to allow for the Single Use Reply Block.
The Sphinx header is MAC’ed but the packet body is not. Instead, the
body is encrypted with a wide-block cipher (an SPRP). This ensures that
an expected verification block in the beginning of the plaintext can be
used to verify the plaintext in the final decryption. If a bit in the
payload ciphertext gets flipped then the payload decryption will yield
garbled results and the expected verification block will not be present.
Therefore in order to make use of this to perform a tagging attack, the
adversary must have access to the result of the payload decryption as
well as the ability to tag the packet some number of hops earlier in the
route. We call this a one bit tagging attack because it yields one bit of
information: either the verification block was destroyed or it was not.
In Katzenpost there are two ways to use Sphinx to send a payload:
forward packets and SURB reply packets. Both of these Sphinx packet
types are susceptible to a one bit tagging attack:
tagging attack against forward Sphinx packets:
Clients send forward Sphinx packets to mixnet services which reply
via a SURB in the payload. Let’s say an adversary "tags" a forward
Sphinx packet sent by Alice. The adversary would have to compromise or
collude with the service Providers on the mixnet in order to witness the
forward packet payload decryption failure which indicates the tag.
tagging attack against SURB reply Sphinx packets:
If an adversary "tags" a SURB reply which a mixnet service sends to a
client, then only the client will be able to witness the packet payload
decryption failure. The adversary would have to compromise the client’s
endpoint device to witness this event (or to compromise the key
materials allowing them to compute the failed payload decryption
themselves).
countermeasure:
In the context of a forward Sphinx tagging attack on Katzenpost, the
adversary must compromise or collude with the destination service
Provider. If that’s the case then the attack allows the adversary to learn
which Provider node and service the packet was destined for. This
is valuable information in the context of the current Katzen
protocol. See the Future Countermeasures section below for a discussion
of how we plan to mitigate intersection attacks in the future, because
that mitigation also carries over to a much greater defense against this
forward payload tagging attack.
countermeasure:
We could encode the last hop’s Sphinx routing command inside the
Sphinx payload instead of the header. This would provide short term
plausible deniability in the sense that an adversary conducting a
tagging attack would destroy the routing information and so could
not know whether the packet was a decoy or not.
Replay Confirmation Attack
attack description:
If a Sphinx packet were allowed to be replayed then the adversary may
send the packet many times concurrently in order to observe the traffic
burst in another section of the network.
countermeasure:
Katzenpost mix nodes maintain a replay cache which prevents Sphinx
packets from being replayed. This cache doesn’t grow forever since it’s
only kept until the end of the epoch, which currently lasts only 20
minutes.
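The bounded cache described above can be sketched as follows. This is a minimal illustration, assuming each packet can be identified by an opaque replay tag and that the caller supplies the current epoch number. The class and method names are hypothetical and are not the Katzenpost server API, and the sketch ignores any overlap between the keys of consecutive epochs.

```python
class ReplayCache:
    """Toy per-epoch replay cache (hypothetical, not the Katzenpost API)."""

    def __init__(self):
        self.epoch = None
        self.seen = set()

    def check_and_add(self, epoch, tag):
        """Return True if the packet is fresh, False if it is a replay."""
        if epoch != self.epoch:
            # Assumption: tags from earlier epochs can be discarded at
            # rollover, which is what keeps the cache bounded.
            self.epoch = epoch
            self.seen = set()
        if tag in self.seen:
            return False
        self.seen.add(tag)
        return True


cache = ReplayCache()
first = cache.check_and_add(epoch=7, tag=b"tag-a")   # fresh packet
second = cache.check_and_add(epoch=7, tag=b"tag-a")  # same tag again: replay
cache.check_and_add(epoch=8, tag=b"tag-b")           # rollover clears old tags
size_after_rotation = len(cache.seen)
```

The design choice illustrated is that memory use is bounded by the traffic of a single epoch rather than by the lifetime of the node.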
SURB Confirmation Attack
attack description:
If a client sends many SURBs to another entity on the network, that
entity may choose to send out ALL the SURBs at once in order to observe
the traffic burst in another part of the network. This works as an entry
node discovery attack.
Although currently, all Katzenpost protocols only send one SURB at a
time, this attack still applies if the adversary accumulates enough
SURBs to form a visible traffic burst within the mix network.
countermeasure:
No countermeasure. See the Future Countermeasures section below for
a discussion of how to counter this attack.
ARQ Confirmation Attack
attack description:
See above table entry for ARQ confirmation attack description.
countermeasure:
Currently, no countermeasure.
Predecessor Attack
attack description:
A bad route is defined as a route in which every node is compromised.
The goal of such an attack is to link a given client with a specific
destination or service on the destination node. This attack is also
known as the Predecessor Attack and is detailed in the literature, with
many variations, for all the different types of anonymous communication
networks. In the
context of the Katzenpost mixnet, the Predecessor Attack is performed by
the adversary compromising at least one node in each routing topology
layer. Clients using the mixnet will eventually select a bad route.
countermeasure:
Fundamentally, we have two choices: either clients select a
new route for each message sent, or they select one route and use that
for some time duration. In the former, every message sent is another
chance of selecting a bad route, so the probability of eventually
selecting one increases. In the latter, a client that selects a bad route
uses it many times, but the probability of selecting a bad route at all
is reduced.
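The trade-off between the two choices can be made concrete with a short calculation. This is a sketch under stated assumptions: clients pick one node per layer uniformly and independently, and the compromised fractions (20% per layer) and the message count are hypothetical numbers chosen only for illustration.

```python
from math import prod


def bad_route_probability(compromised_fraction_per_layer):
    """Probability that a single randomly chosen route is fully compromised."""
    return prod(compromised_fraction_per_layer)


def at_least_one_bad_route(p_bad, messages):
    """Probability of hitting a bad route at least once when every
    message uses a freshly selected route."""
    return 1.0 - (1.0 - p_bad) ** messages


# Hypothetical network: three layers, 20% of each layer compromised.
layers = [0.2, 0.2, 0.2]
p = bad_route_probability(layers)

fresh_routes = at_least_one_bad_route(p, messages=1000)  # new route per message
fixed_route = p                                          # one route, reused
```

With a fresh route per message the chance of eventually using a bad route approaches certainty as messages accumulate, whereas a reused route is bad with the single-selection probability only, at the cost of exposing every message when it is.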
Yet another countermeasure is to design the mixnet protocols such
that they use a new destination for each message, using some kind of
private deterministic permutation that achieves a uniform distribution of
messages amongst the destination mixnet nodes and their message slots. We
have chosen this last countermeasure for Katzenpost and it will be
detailed elsewhere in our literature.
Future Countermeasures
Intersection Attacks
The new Katzen protocol is sometimes referred to as scatter queue.
Two communicating parties each exchange shared secrets which they use to
determine a new "mailbox" for each message. To be clear, this new
protocol is an improved revision of the previous Katzen protocol where
each party chooses their own "mailbox" (queue Provider + queue ID); the
difference here is that instead of the two parties exchanging mailbox
locations they exchange seeds which are used to deterministically generate
mailbox locations for each message.
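The idea of deriving a fresh mailbox per message from a shared seed can be sketched as below. This is only an illustration of the concept: the use of HMAC-SHA256, the counter encoding, the provider names, and the function name are all assumptions made for the example and are not the derivation used by the Katzen protocol.

```python
import hashlib
import hmac


def mailbox_for_message(seed, counter, providers):
    """Derive a (provider, queue ID) pair for message number `counter`.

    Hypothetical construction: HMAC-SHA256 keyed with the shared seed.
    """
    digest = hmac.new(seed, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    # The first 8 bytes pick the provider; the next 16 bytes name the queue.
    provider = providers[int.from_bytes(digest[:8], "big") % len(providers)]
    queue_id = digest[8:24].hex()
    return provider, queue_id


providers = ["provider-a", "provider-b", "provider-c"]  # illustrative names
seed = b"\x01" * 32                                     # exchanged out of band

alice_view = mailbox_for_message(seed, 0, providers)
bob_view = mailbox_for_message(seed, 0, providers)
next_mailbox = mailbox_for_message(seed, 1, providers)
```

Both parties compute the same location without ever transmitting it, and a Provider operator who lacks the seed sees a different, unrelated queue for each message.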
This new protocol still uses all four previously mentioned mechanisms
as countermeasures against intersection attacks; however, the new
"scatter queue" design drastically reduces the amount of metadata which
can be collected by the operators of the mailbox Provider mix nodes. We
think this is a huge improvement to the threat model. But it would be
great if we could quantify the improvement using various anonymity
metrics. Firstly, Shannon entropy seems applicable here because we can
make statements like "compared to the old protocol, scatter queue
increases the entropy on Providers where malicious adversaries are
trying to correlate communicating party sets with messages arriving at
specific mailboxes"; the new protocol makes this infeasible.
Therefore we can say that the new Katzen messaging protocol mitigates
or partially mitigates intersection attacks by means of five
mechanisms:
async message queueing and retrieval at the network edge
traffic padded message retrieval
loop decoy traffic
uniform traffic patterns (all sent messages result in a SURB
reply)
scatter queue
n − 1 Attacks
Here we will attempt to describe a partial countermeasure wherein
clients receive statistical information from the network which is
cryptographically signed by its authors. Clients use this data to decide
whether there’s an ongoing n − 1 attack; if there is, they disconnect
from the network and try again later.
There are two sources of information about n − 1 attacks:
mix loops
client loops
Mix loops vs client loops
In theory mix loops can detect n − 1 attacks in the context of a
continuous time mix. Such an attack means the adversary is dropping or
delaying messages before they enter the mix. Therefore the mix
originating loop decoy message can function as a sort of heartbeat
protocol that allows the mix to detect n − 1 attacks. Obviously this mix
loop decoy message might get dropped by the network for various reasons
that have nothing to do with an n − 1 attack. The red-green-black
heartbeat mixes paper (Danezis and Sassaman) suggests the countermeasure
of the individual mixes halting their routing of messages temporarily to
thwart the n − 1 attacks. This would
work but it would also probably create unnecessary outages. Instead we
want a system that lets the client software decide whether or not there
is an ongoing n − 1 attack.
Clients can also detect such attacks with their own end to end loop
decoy messages. However, we want the mixes to publish a signed
certificate containing their mix loop statistics. Clients will then
download these mix loop statistics from the providers and use
those statistics, along with their own client loop statistics, to make
decisions about the n − 1 status of the network.
TODO: add detailed description of client heuristics for
deciding if there’s an n-1 attack
Core Cryptographic Protocols
Katzenpost consists of three cryptographic protocols:
PKI/Dirauth
PQ Noise
Sphinx
Katzenpost is an overlay network, meaning that we aren’t trying to
replace IP (Internet Protocol). Overlay means we build protocol layers
that sit on top of existing Internet protocols. Currently Katzenpost
works over TCP/IP; however, in the future we plan to support QUIC/IP as
an optional transport that can be selected.
Katzenpost uses a PQ Noise based protocol known as the Katzenpost
wire protocol, which provides point to point transport security and
authorization. The wire protocol enforces the mix network’s topology
whereby the clients are only allowed to connect to gateway nodes,
gateway nodes are only allowed to send packets to layer 1 mixes, and
layer 1 mixes are only allowed to send packets to layer 2 mixes etc.
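The topology rule stated above can be expressed as a small predicate. This is a sketch only: the role strings and the function name are hypothetical, the real wire protocol authenticates peers by key rather than by label, and what the final mix layer may connect to is not covered by the passage and is therefore left out.

```python
def may_connect(sender, receiver):
    """Return True if `sender` may send packets to `receiver`.

    Roles are the hypothetical labels 'client', 'gateway', or
    'mix:<layer>' with layers numbered from 1.
    """
    if sender == "client":
        return receiver == "gateway"
    if sender == "gateway":
        return receiver == "mix:1"
    if sender.startswith("mix:") and receiver.startswith("mix:"):
        # A mix may only forward to the next layer, never sideways or back.
        return int(receiver[4:]) == int(sender[4:]) + 1
    return False


allowed = may_connect("mix:1", "mix:2")
skips_gateway = may_connect("client", "mix:1")
```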
Clients use the wire protocol to talk to gateway nodes, to whom they
send Sphinx packets. These Sphinx packets are encapsulated within the
encrypted PQ Noise messages and are therefore never exposed to passive
network observers, but if they were, there wouldn’t in principle be any
problem with that. This redundancy in security is often referred to as
"defense in depth".
Besides within the mixnet itself, the wire protocol is also used to
directly communicate with the directory authorities. Gateway nodes
retrieve the latest PKI document from the directory authorities and
cache the document for the epoch duration so that clients can download
the cached copy. This is a notably different use case: within the
mixnet the goal should be to pad all the wire protocol commands
to the same size, whereas gateway nodes downloading the consensus
are likely to receive PKI documents many times
bigger than our Sphinx packet size.
The PKI/Directory authority protocol stands apart from the rest
because it’s the root of all authority within the mix network. The PKI
provides the network participants with all the connection information
and key materials they need to use the other two protocols, PQ Noise and
Sphinx. It does so by publishing a PKI document every epoch (currently
20 minutes). This is necessary because the mixes destroy their old mix
keys and create new mix keys for each new epoch thereby reducing the
window for compulsion attacks to the epoch duration.
Both the PQ Noise based wire protocol AND our Sphinx protocols are
considered to be transport protocols. The dirauth, the third
cryptographic protocol, refers here to two aspects:
The client and mixnet interactions with the dirauth system. That
is, the PKI document itself is signed by a majority of the dirauth nodes
and contains the mix descriptor for each mix node in
the network. The document also specifies the topology. Mix nodes and
clients verify these cryptographic signatures.
The dirauth’s crash fault tolerant consensus protocol for
publishing new PKI documents every epoch.
Katzenpost PKI / Directory Authority
The dirauth system has voting protocol rounds where each
party exchanges votes with every other party.
The public key infrastructure (PKI) protocol for Katzenpost, also
known as the Directory Authority or dirauth, is a decentralized system
of nodes which vote for each epoch’s consensus document. If we used a
BFT protocol instead then the dirauth system would fail when 1/3 + 1
nodes failed. Therefore we can say that our crash fault tolerant system
is more robust because it will fail when 1/2 + 1 nodes fail.
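The comparison can be checked with a short calculation. This sketch assumes the textbook bounds: a BFT protocol needs n ≥ 3f + 1 nodes to tolerate f faulty ones, while a crash fault tolerant protocol needs only a strict majority of nodes to remain alive. The seven-node network is a hypothetical example, not a statement about any deployed dirauth set.

```python
def max_faults_bft(n):
    """Largest f with n >= 3f + 1 (textbook Byzantine bound)."""
    return (n - 1) // 3


def max_crashes_majority(n):
    """Largest number of crashed nodes that still leaves a strict majority."""
    return (n - 1) // 2


n = 7  # hypothetical number of directory authorities
bft_tolerated = max_faults_bft(n)          # the system fails at one more
cft_tolerated = max_crashes_majority(n)    # the system fails at one more
bft_fails_at = bft_tolerated + 1
cft_fails_at = cft_tolerated + 1
```

For seven nodes the failure points are 7 // 3 + 1 and 7 // 2 + 1 respectively, matching the "1/3 + 1" and "1/2 + 1" figures in the text.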
The Katzenpost PKI is the security root of the entire system because
all clients and network nodes depend on the PKI to sign the
consensus document for each epoch. Currently the epoch duration is 20
minutes. The consensus document is essentially a view of the network: it
contains all the connection information and all the public cryptographic
key materials and signatures. Each mix node signs its descriptor and
uploads it to the dirauth nodes. Each dirauth node signs the consensus.
When clients or nodes download the consensus document they are able to
verify the dirauth node signatures on the document.
Currently we use a hybrid signature scheme consisting of the
classical Ed25519 and the post quantum stateless hash based signature
scheme known as SPHINCS+, with the parameter set sphincs-shake-256f.
The Katzenpost Noise Protocol Layer
Early versions of Katzenpost used the Noise cryptographic protocol
framework with an HFS (hybrid forward secrecy) variation of the
XX handshake that added a post quantum KEM. It could not resist
active quantum adversaries, since the initial keys exchanged were
classical ECDH public keys. Such constructions offer protection against
current classical adversaries that record ciphertext transcripts in the
hope of breaking them in the future with a cryptographically relevant
quantum computer.
More recently, Katzenpost was changed to use PQ Noise, from the paper
entitled Post Quantum Noise. The paper shows that we can
algebraically transform existing classical Noise handshake patterns into
post quantum handshake patterns by replacing all usages of ECDH with
KEM. Some of these transformations imply additional network
interactions.
Our current hybrid KEM uses our security preserving KEM combiner and
the NIKE to KEM adapter (an ad hoc hashed ElGamal construction). Our
Noise protocol string is:
Noise_pqXX_Kyber768X25519_ChaChaPoly_BLAKE2b
Which means that our PQ Noise protocol uses the following
cryptographic primitives:
We use the PQ Noise handshake pattern known as pqXX
which is expressed in the PQ Noise pattern language like so:
-> e
<- ekem, s
-> skem, s
<- skem
Expressed as a sequence diagram, pqXX looks like this:
pqXX sequence
Client sends their ephemeral public key (e).
Server sends its static public key (s), encrypted, together with the KEM
ciphertext (ekem) keyed to the client’s ephemeral public key.
Client sends their static public key (s) encapsulated via KEM
ciphertext (skem) keyed to server’s static public key.
Server sends a KEM ciphertext (skem) encapsulated using the
client’s static public key.
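The protocol string quoted earlier packs the handshake pattern and the primitives into one name. The sketch below splits it into its parts, assuming the usual Noise naming layout of underscore-separated fields (pattern, key agreement, cipher, hash) with the KEM taking the place of the DH function. The function name and dictionary keys are illustrative.

```python
def parse_noise_name(name):
    """Split a Noise protocol name into its components.

    Assumes exactly five underscore-separated fields.
    """
    prefix, pattern, kem, cipher, hash_fn = name.split("_")
    if prefix != "Noise":
        raise ValueError("not a Noise protocol name")
    return {"pattern": pattern, "kem": kem, "cipher": cipher, "hash": hash_fn}


parts = parse_noise_name("Noise_pqXX_Kyber768X25519_ChaChaPoly_BLAKE2b")
```

Read this way, the string names the pqXX pattern, a hybrid of Kyber768 and X25519 for key encapsulation, ChaChaPoly as the AEAD, and BLAKE2b as the hash.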
future improvement, option 1:
Remove the "retrieve message" command which clients use to poll for
new messages. Instead the client-server Noise protocol should be
designed such that clients periodically receive messages from the server
without requesting or polling for them. If no message is present in the
message queue on the server then the server will send the client a decoy
message.
future improvement, option 2:
Replace the "retrieve message" command with a "send and retrieve"
command whereby every time the client sends a message they also receive a
message. As per usual, perhaps some of the messages sent and received
are decoy messages.
Classical Sphinx and Post Quantum Sphinx
The original Sphinx paper introduces the Sphinx nested encrypted
packet format using a NIKE (non-interactive key exchange). NIKE Sphinx
can be made a hybrid post quantum construction simply by using a hybrid
NIKE. Our Sphinx implementation can also optionally use a KEM (key
encapsulation mechanism) instead of a NIKE; the trade-off is that the
packet’s header incurs considerable overhead because it must store a KEM
ciphertext for each hop. Katzenpost has a completely configurable Sphinx
geometry which allows for any KEM or NIKE to be used.
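The header overhead can be estimated with simple arithmetic. This sketch counts only key material: it assumes a classical NIKE Sphinx header carries a single 32-byte X25519 group element that is re-blinded at each hop, and that a KEM Sphinx header carries one ciphertext per hop. The five-hop route is a hypothetical figure, and routing information and MACs are deliberately left out.

```python
X25519_ELEMENT = 32         # bytes, X25519 public key
KYBER768_CIPHERTEXT = 1088  # bytes, Kyber768 ciphertext


def nike_key_material(hops, element_size=X25519_ELEMENT):
    """One group element in the header, regardless of route length."""
    return element_size


def kem_key_material(hops, ciphertext_size):
    """One KEM ciphertext per hop."""
    return hops * ciphertext_size


hops = 5  # hypothetical route length
nike_bytes = nike_key_material(hops)
kyber_bytes = kem_key_material(hops, KYBER768_CIPHERTEXT)
# Hybrid KEM: a Kyber768 ciphertext plus an X25519 ephemeral key per hop.
hybrid_bytes = kem_key_material(hops, KYBER768_CIPHERTEXT + X25519_ELEMENT)
```

Under these assumptions the key material grows from 32 bytes to several kilobytes, which is why the choice of KEM feeds directly into the Sphinx geometry.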
The Sphinx cryptographic packet format also uses these additional
cryptographic primitives; the current Katzenpost selection is:
stream cipher: CTR-AES256
MAC: HMAC-SHA256
KDF: HKDF-SHA256
SPRP: AEZv5
In Katzenpost the dirauths select the Sphinx geometry; each dirauth
must agree with the other dirauths. They publish the hash of the Sphinx
Geometry in the PKI document so that the rest of the network entities
can validate their Sphinx Geometry. At the time of writing the namenlos
network still uses classical Sphinx with the following geometry:
In the Katzenpost implementation of Sphinx, we MAC an unencrypted two
byte region at the beginning of the Sphinx packet. This additional data
region is used to match Sphinx version numbers.
Mixnet Attack Trees
[Figure: attack tree. Root goal: compromise mix node. Branches: physical
access, compromise human operator, compromise software. Leaves: social
engineering, threat of violence, blackmail, large money bribe, legal
action, police action, military action, remote code execution
vulnerability, compromise software upgrade pipeline, malware USB stick,
mail interdiction, evil maid attack.]
attacker’s goal is to compromise a mix node
The above attack tree consists entirely of OR nodes: the
leaves are alternative ways to achieve the sub-goal expressed by their
branch, and the branches (physical access, compromise human operator,
compromise software) are in turn alternative ways to achieve the overall
goal of compromising the mix node.
A high-level introduction to the Pigeonhole protocol for application developers
Understanding Pigeonhole
Pigeonhole is the storage layer of the Katzenpost mix network. It
lets applications communicate anonymously using encrypted,
append-only streams. From a passive network observer’s perspective
there is no consistent stream access and instead everything looks like
randomly scattered queries across storage servers.
This is the document to read first. Having understood the concepts
here, proceed to the how-to guide for
task-oriented recipes, and consult the
API reference for the precise
signatures.
For the privacy properties and the adversary they are designed to
withstand, see the threat model and the
Echomix paper; a passive network
observer learns only that scattered, unlinkable queries traverse the
storage servers, never which messages belong to one stream nor who
reads them. For protocol details, see the
Pigeonhole specification and sections
4-5 of the paper. Note that the published threat model is an evolving
work in progress and does not yet incorporate the newer designs the
paper introduces.
Terminology
A short glossary of the terms used throughout this document. Each is
elaborated in its own section below.
Box. The unit of storage. A box is a fixed-size, encrypted, signed
ciphertext addressed by a pseudorandom identifier. Once written, a
box’s contents are immutable except by tombstoning.
Stream. An ordered, append-only sequence of boxes addressed by a
pair of capabilities; the analogue of a single-writer log. The terms
stream and channel are used interchangeably throughout these
documents and the API: they denote one and the same thing.
Write cap. The capability that grants the right to sign and write
messages to a stream and to derive the read cap from itself.
Read cap. The capability that grants the right to read and verify
a stream’s contents. It cannot write or tombstone.
Tombstone. A signed empty payload that overwrites a box’s
contents; the write cap holder uses it to retire prior messages.
BACAP. Blinding-And-Capability scheme. The Ed25519 key-blinding
construction that ties write caps, read caps, and indices to
unlinkable pseudorandom box identifiers.
MKEM. Multi-recipient KEM. The envelope encryption scheme that
delivers each request to all of a box’s replicas without revealing
the box ID to the courier.
Sphinx. The mixnet packet format used to route every request
through three layers of mix nodes before it reaches the courier.
SURB. Single-Use Reply Block. A pre-built Sphinx return path that
allows the courier to reply to a request without learning the
client’s location.
Courier. A service running on a Katzenpost service node. The
courier mediates between clients and replicas, maintaining a
fixed-throughput stream of envelopes so that traffic patterns cannot
be inferred from the wire.
Replica. A storage server. Each box is sharded across K=2 replicas
via consistent hashing; replicas mirror writes to their shard peer to
maintain redundancy.
Katzenpost epoch. The network’s wire-protocol epoch, lasting
20 minutes by default (the duration is a deployment parameter and a
given network operator may configure it otherwise). It governs the
directory authority consensus, mix key rotation, and Sphinx routing. It
does not govern how long a box’s contents survive, and a reader need
not concern themselves with it.
Replica epoch. A separate, much longer epoch lasting one week,
used solely by the storage replicas. Replicas rotate and garbage-collect
their storage keys on this weekly schedule; it is this epoch, not the
20-minute network epoch, that determines the lifetime of stored data.
Pigeonhole Streams
All communication happens through Pigeonhole streams (also called
channels). A stream is an ordered, append-only sequence of encrypted
messages, known as boxes. These boxes are stored in the storage servers
using a hash based sharding scheme. Boxes have a fixed-size maximum
payload and are padded; the exact size is set by the pigeonhole
geometry, described below.
Append-only and immutable. Once a message is written to a
box, it cannot be overwritten by another write – the replica
rejects the second write with BoxAlreadyExists. New messages
are appended at the next index. The only exception is tombstones:
a tombstone unconditionally replaces the box contents with an
empty signed payload, regardless of whether the box previously
held data.
Single-writer, multi-reader. One writer, any number of readers.
Nothing in the protocol inherently forbids multiple writers; the
constraint arises because each index addresses a box that may be
written only once (tombstones aside). Two writers therefore have no
agreed answer to the question of which index each should write to: were
they both to target the same index it would be a race, and whoever
writes first wins while the other’s write is rejected with
BoxAlreadyExists. Avoiding that requires out-of-band coordination, so
in practice a stream has a single writer. Two-way communication is
arranged as two streams, one in each direction.
Durable. Each message is replicated across multiple storage
nodes. Currently this is set to 2 storage nodes per shard.
Ephemeral. Storage is keyed to the replica epoch, which lasts
one week (not to be confused with the 20-minute Katzenpost network
epoch). Replicas retain the current and the preceding replica epoch and
garbage-collect anything older, so a box’s contents survive for roughly
one to two weeks; nothing written to a box outlives that window.
Per-replica-epoch storage. Capabilities are cryptographic
credentials independent of any epoch schedule and continue to work
indefinitely, but the data at any given box location does not carry
across a replica epoch transition (the weekly schedule, not the
20-minute network epoch): at the start of a new replica epoch the box
reads as empty, even though the capability that addresses it is
unchanged. Workflows that need data to outlive the replica epoch in
which it was written must arrange to re-emit it (the copy command,
described below, is the usual instrument for doing so atomically).
Unlinkable. Storage servers cannot tell which messages belong
to the same stream.
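The append-only and tombstone rules listed above can be modelled with a toy in-memory replica. This is a behavioural sketch only: it has no encryption, signatures, or sharding, the class is hypothetical, and the exception names simply reuse the error names from the text. How a real replica treats a normal write to an already tombstoned box is not stated here, so the toy rejects it like any other existing box.

```python
class BoxAlreadyExists(Exception):
    pass


class BoxIDNotFound(Exception):
    pass


TOMBSTONE = b""  # stand-in for a signed empty payload


class ToyReplica:
    """In-memory model of write-once boxes (hypothetical)."""

    def __init__(self):
        self.boxes = {}

    def write(self, box_id, payload):
        if payload == TOMBSTONE:
            # Tombstones unconditionally replace the box contents.
            self.boxes[box_id] = TOMBSTONE
            return
        if box_id in self.boxes:
            raise BoxAlreadyExists(box_id)
        self.boxes[box_id] = payload

    def read(self, box_id):
        if box_id not in self.boxes:
            raise BoxIDNotFound(box_id)
        return self.boxes[box_id]


replica = ToyReplica()
replica.write("box-0", b"hello")
try:
    replica.write("box-0", b"overwrite attempt")
    second_write_rejected = False
except BoxAlreadyExists:
    second_write_rejected = True
replica.write("box-0", TOMBSTONE)
after_tombstone = replica.read("box-0")
```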
Box size and the pigeonhole geometry
Every box carries a fixed-size payload, and a message is padded up to
that size before storage so that all boxes on the wire are
indistinguishable by length. A message larger than one box’s payload
must be split across several boxes; one smaller is padded.
The exact size is not a hard-coded constant but a property of the
pigeonhole geometry, a deployment parameter rather than something
the application chooses. It is the operators of the mix network who
set the geometries: they choose the network’s Sphinx geometry, and the
pigeonhole geometry is derived from, and must be consistent with, it.
A Pigeonhole request travels inside a Sphinx packet, so the usable box
payload is whatever the deployed Sphinx geometry allows once the
envelope and protocol overheads are subtracted. A network running a
larger Sphinx packet has a correspondingly larger box payload; one
running a smaller Sphinx geometry has a smaller one.
The kpclientd daemon’s configuration file must carry these same
correct Sphinx and pigeonhole geometries, matching the network it is
to connect to; a daemon configured with the wrong geometries cannot
speak to that network. The thin client itself holds no such
configuration. When a thin client connects to kpclientd, the daemon
sends it the geometries over the local socket during the connection
handshake. An application should therefore treat the box payload size
as a value obtained at runtime from the daemon, not a fixed number to
hard-code. See the API reference
for how each binding exposes it.
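Splitting a long message across fixed-size boxes might look like the sketch below. The framing is an assumption made for illustration (a 4-byte big-endian length prefix followed by zero padding) and is not the padding scheme the daemon applies; `box_payload_size` stands for the value an application would obtain from the daemon at runtime, and the figure of 104 bytes is arbitrary.

```python
def split_into_boxes(message, box_payload_size):
    """Split `message` into equal-length padded payloads.

    Hypothetical framing: 4-byte length prefix, data, zero padding.
    """
    chunk = box_payload_size - 4
    if chunk <= 0:
        raise ValueError("box payload too small for the length prefix")
    pieces = [message[i:i + chunk] for i in range(0, len(message), chunk)]
    if not pieces:
        pieces = [b""]  # an empty message still occupies one box
    return [
        len(p).to_bytes(4, "big") + p + bytes(chunk - len(p)) for p in pieces
    ]


def join_boxes(boxes):
    """Reverse split_into_boxes by honouring each length prefix."""
    out = b""
    for box in boxes:
        length = int.from_bytes(box[:4], "big")
        out += box[4:4 + length]
    return out


message = bytes(range(250))
boxes = split_into_boxes(message, box_payload_size=104)
recovered = join_boxes(boxes)
```

Every payload has the same length, so an observer cannot distinguish the final, partly filled box from the others by size.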
Cryptographic Capabilities
Pigeonhole is a messaging system built from couriers and storage
replicas. The capabilities described here come from BACAP, the
Blinding-And-Capability scheme it relies upon. (In the
Echomix paper, BACAP is the
subject of section 4 and Pigeonhole of section 5.)
Access to a stream is controlled by two BACAP capabilities:
The write cap can write messages, create tombstones, and derive
the read cap. Only the write cap holder can sign and encrypt messages, and
storage servers verify these signatures and store the signature and ciphertext.
The read cap can read, verify signatures and decrypt messages. It cannot
write or tombstone.
Creating a stream produces a write cap, a read cap, and a starting
index. The writer keeps the write cap. To grant someone read access,
share the read cap and index with them out-of-band. That is the
fundamental operation: create a stream, share the read cap.
Multiple readers can hold the same read cap independently.
A capability, once shared, cannot be revoked. There is no mechanism to
withdraw a read cap from someone who holds it: anyone in possession of
it can read every box the stream has produced and will produce, for as
long as that data survives. If a reader must lose access, the only
remedy is to abandon the stream entirely, create a fresh one, and
redistribute the new read cap to the readers who remain. Plan key
distribution with this permanence in mind.
A first interaction: Alice writes, Bob reads
To anchor the abstractions above, here is the smallest useful
interaction: one writer (Alice) and one reader (Bob).
1. Alice creates a stream. She generates a 32-byte random seed and
calls new_keypair (Go: NewKeypair). The thin client returns a
write cap, a read cap, and the first message index.
2. Alice shares the read cap with Bob. She hands him the read cap
and the first message index by any means independent of the mixnet,
for instance a QR code in a face-to-face meeting, or a separate
signal channel. Anyone with the read cap can decrypt the stream;
anyone without it sees only pseudorandom traffic.
3. Alice writes a message. She calls encrypt_write with her write
cap, the current index, and the plaintext, then sends the resulting
envelope via start_resending_encrypted_message. The daemon
dispatches the envelope through three mix layers, the courier
receives it, and the courier writes the box to both of its replicas.
Once the call returns Alice advances her local index.
4. Bob reads the message. He calls encrypt_read with his copy of
the read cap and his current index, dispatches the envelope, and
waits for the daemon to return the decrypted plaintext.
5. Alice writes again. She repeats step 3 with the next index. Bob
reads in step 4 at his own pace, advancing his own index. The two
indices are independent: Alice never blocks on Bob, and Bob can fall
behind without losing messages until the replica-epoch garbage
collection window (roughly one to two weeks) expires.
A two-way conversation is therefore two streams, one in each direction,
because every stream has exactly one writer. The
how-to guide shows the equivalent code in
Go, Rust, and Python.
What the client daemon does
Your application talks to a local daemon (kpclientd) through a thin
client library. The daemon handles all cryptography (BACAP
encryption and decryption, MKEM envelope encryption and decryption,
signature creation and verification, payload padding and unpadding),
Sphinx packet construction, courier selection, and automatic
retransmission (ARQ). Your application creates streams, tracks
indexes, and persists state for crash recovery.
Most API calls are local crypto operations with no network traffic.
Only StartResendingEncryptedMessage and StartResendingCopyCommand
touch the network.
Consistency and timing
Pigeonhole offers no read-after-write ordering guarantee across
participants, and an application that assumes one will misbehave.
Reading ahead of the writer. A reader who reads an index the
writer has not yet written to receives BoxIDNotFound. This is not
an error but the expected answer to “has anything been written here
yet?”: it simply means “not yet.” The reader should wait and ask
again rather than abandon the stream. The thin client exposes
IsExpectedOutcome precisely so that an application can tell this
benign outcome apart from a genuine failure; the
how-to guide gives the polling pattern.
Replication lag. Each box is written to both of its K=2
replicas, but the two are not updated in the same instant. A read
dispatched in the brief interval after a write reaches one replica
but before its peer has caught up may transiently return
BoxIDNotFound even though the write has in fact succeeded. The
remedy is the same: retry. A handful of retries spaced over a few
seconds is normally ample.
Read-delay randomisation. The courier maintains a
fixed-throughput connection to the replicas, decoupled from the
rate at which clients submit requests. A read therefore returns
after a delay that bears no exploitable relation to when the data
was written or requested. This randomised latency is a deliberate
privacy property, not a defect: do not design protocols that depend
on a read completing within a tight deadline.
The practical consequence is that every read should be written as a
poll with bounded retry, treating BoxIDNotFound as “try again
shortly” until either the data appears or an application-level timeout
elapses.
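The poll-with-bounded-retry shape described above can be sketched generically. Everything here is a stand-in: `read_once` is a hypothetical callable wrapping whatever read the application performs, and the `BoxIDNotFound` exception merely models the "not yet" outcome; the real thin client bindings report that outcome in their own way, for example through IsExpectedOutcome.

```python
import time


class BoxIDNotFound(Exception):
    """Stand-in for the benign 'nothing written here yet' outcome."""


def poll_read(read_once, attempts=5, delay=1.0, sleep=time.sleep):
    """Call `read_once` until it returns data or `attempts` is exhausted."""
    for attempt in range(attempts):
        try:
            return read_once()
        except BoxIDNotFound:
            # Not an error: wait and ask again, unless this was the last try.
            if attempt + 1 < attempts:
                sleep(delay)
    raise TimeoutError("box still empty after bounded retry")


# Usage with a fake reader that succeeds on the third attempt.
calls = {"count": 0}


def fake_read():
    calls["count"] += 1
    if calls["count"] < 3:
        raise BoxIDNotFound()
    return b"plaintext"


result = poll_read(fake_read, attempts=5, delay=0.0, sleep=lambda _: None)
```

Any other exception propagates immediately, which keeps genuine failures distinct from the expected empty-box answer.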
Copy commands
Copy commands exist for when you need to atomically write more than
one box to one or more streams that already exist and are already
known to other entities. The writer creates a temporary stream,
packs all the destination writes into it, then sends a single copy
command to a courier. The courier reads the temporary stream,
executes all the writes to their destination streams, then
overwrites the temporary stream with tombstones. Either all the
destination writes succeed or none of them are visible to readers.
Use cases: sending to a group, backfilling messages for an offline
reader, atomically tombstoning old messages while writing new ones.
Tombstones
A tombstone is a Pigeonhole box which is signed with an empty
ciphertext. Unlike normal writes, which are rejected if the box
already exists, a tombstone unconditionally replaces the box
contents. Only the write cap holder can create tombstones. Tombstones
can be sent directly or bundled into a copy command for atomic
deletion.
Protocol Composition
Many different protocols can be composed using these Pigeonhole streams.
For example, in our Group Chat Design each
group participant creates their own channel to write to. Each of the participants
shares their own channel’s ReadCap with the other group members. Therefore
each group member monitors and reads from all the other channels in the group.
Each of them can simply write to their own channel in order to write a message
to the group.
Fundamentally, protocols are composed by creating channels and sharing the read caps
to those channels. When writing to a channel the entity doing the writing must keep
track of the current index into the channel. This is why channels are single
writer, multi reader: without coordination, multiple writers would race to
write to a given index before any of the others.