This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Documentation

Creative Commons Attribution-ShareAlike 4.0 International License

These works are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

Supported by

Flag of the EU

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Grant Agreement No 653497, Privacy and Accountability in Networks via Optimized Randomized Mix-nets (Panoramix).

1 - Intro

These works are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

1.1 - Frequently Asked Questions

What is a mix network?

A mix network is an unreliable packet switching network which resists traffic analysis. Mix networks can be designed to provide various anonymity properties such as:

  • sender anonymity
  • receiver anonymity
  • sender and receiver anonymity with respect to third party observers

Can Katzenpost act as a drop-in TCP replacement?

No and furthermore you should not want a stream oriented interface for interacting with a message oriented protocol. If your application is message oriented then integration as a Katzenpost client is possible. Client protocol libraries are currently being developed!

Although decryption mixnets are inherently unreliable and offer unordered delivery, reliability and ordered delivery can be achieved by the protocol library and NOT by the mix network itself. That is to say, the mixnet client library should abstract away all the details of retransmissions and book keeping related to ordered delivery.

What kinds of applications can use Katzenpost?

Katzenpost can be used by various kinds of message oriented applications. Generally these fall into two categories:

  1. peer to peer: Alice can chat with Bob over the mixnet. In this case there’s a protocol library that let’s them send and receive messages with a Katzenpost specific addressing sytem. In this case the mixnet acts as a transport for Alice and Bob's interactions.
  2. client to server: Alice can interact with a service that listens on the mixnet for mixnet messages. This means there is a client and a server component and they use the mixnet as a transport for their interaction.

What are some examples of “client to server” mixnet applications?

  • Clients send URLs to a “retrieval service” on the mixnet. This service retrieves the URLs and sends the content back to the client.
  • Privacy preserving crypto currency wallet sends crypto currency transactions to the mixnet service. This service then submits the transaction to the blockchain.

What are some examples of “peer to peer” mixnet applications?

  • Encrypted chat applications can use the mixnet as the transport.
  • File exchange: Alice can send Bob a file using the mixnet as the transport.

What is Loopix?

Loopix is described in the paper “The Loopix Anonymity System” published at USENIX 2017, https://arxiv.org/pdf/1703.00536.pdf

Briefly, Loopix uses a collection of the best mix network designs to create a messaging system that has the property of sender and receiver anonymity with respect to third party observers. This particular meaning of "anonymity" is remarkably different than what Tor provides. Loopix does not have strong location hiding properties nor does it provide sender anonymity or receiver anonymity. That having been said it should be possible to create such systems based on the Loopix design.

The Loopix design is informed by over 15 years of mixnet literature and strives to reduce many kinds of metadata leakages that historically have made mix networks vulnerable to long term statistical disclosure attacks. Loopix has a defense against blending/n-1 attacks. Loopix explores the tradeoff between decoy traffic and latency, thus revitalizing mix networks with much lower latency for message transportation.

Loopix uses the Sphinx cryptographic packet format, the Poisson mix strategy, three kinds of decoy traffic and the stratified mix topology. It's Provider architecture forces long term statistical disclosure attacks to take place on the receiver's Provider, thus forcing such adversaries to actively compromise Providers instead of passively observing the mix network which is in contrast to historical mix network designs.

What is Katzenpost?

Katzenpost has the goal of implementing the Loopix designs with the additional property of message transport reliability. Reliability is achieved using a Stop and Wait ARQ which is half duplex and uses SURBs to create the reply channel for the ACKnowledgement messages.

Why is this a big deal? To our knowledge, no other mix network design has attempted to achieve reliability. We believe that the lack of reliability has been one of the major obstacles to the adoption of mix networks. “Would you want to use a messaging system which might not even transport your messages to their destination?”

How are mix networks different from Tor?

Tor is stream oriented. Mixnets are message oriented. Tor is low latency, easy to use, has a great primary application (Tor Browser), and functions as an extremely useful general purpose anonymity system. This is in contrast to mix networks which do not function well as general purpose anonymity systems. Instead mix networks are better suited to customization for specific applications, for example a mix network for instant messaging and a mix network for e-mail will have different traffic patterns and therefore require different decoy traffic patterns to achieve the desired traffic analysis resistant properties.

There are also many adversarial model differences between Tor and mix networks. For example, Tor can be easily deanonymized by statistical correlation attacks by a sufficiently global adversary whereas mixnets are not immediately vulnerable to these kinds of attacks if they correctly use mix strategies and decoy traffic.

Both Tor and mix networks can scale well with respect to increasing user traffic, however Tor requires route unpredictability to achieve it’s anonymity properties. Mix networks on the other hand do not require route unpredictability and therefore can achieve very strong anonymity properties with far fewer network nodes than Tor.

How do mix networks compare to Pond?

Pond doesn’t actually mix anything whereas mix networks specifically contain component mixes, each containing a mix queue which “mixes” messages together via some specific mix strategy before sending them to the next hop in the route. Pond uses a group signature scheme to prevent the server from learning to whom a message is being sent to. Pond uses Tor onion services as it’s transport while also using decoy traffic to prevent a passive network observer from determining when a user sends a message. Mix network designs can also use decoy traffic, however in the Loopix design there are three different kinds of decoy traffic that serve different purposes. Mix networks also scale much better with respect to increasing users and traffic whereas pond servers quickly become performance bottlenecks. This is in contrast to mix networks where additional mixes can be added to the network in order to efficiently process increases in user traffic.

How does Vuvuzela differ from Loopix/Katzenpost?

Vuvuzela uses the cascade mix topology which does not scale well with respect to an increase in user traffic. Loopix uses the stratified topology which scales very well. In Vuvuzela, messages cannot be received when a user is offline. In Loopix messages received while a user is offline are queued by their Provider. Vuvuzela operates in rounds whereas Loopix does not. Vuvuzela does not provide reliable message transportation whereas Katzenpost does.

How does AnonPOP differ from Loopix/Katzenpost?

AnonPOP operates in rounds and provides offline storage of messages. Loopix uses a continuous time mix strategy so that it avoids user synchronization issues. AnonPOP does not provide reliable message transportation whereas Katzenpost does.

1.2 - Glossary

Traffic Analysis Resistance

Traffic analysis resistance means hiding traffic metadata from passive network observers. Such metadata includes:

  • message sender
  • message receiver
  • size of the message
  • time of message transmission

Mix

A mix is the primary component used to compose a mix network. Mixes receive incoming messages, mix them via some specific mix strategy which incurs a delay and then output the messages after removing one layer of encryption. Bitwise unlinkability between input and output messages is achieved using the mix network cryptographic packet format. Mixes can also output their own decoy traffic which adds further entropy to the network as detailed in the Loopix paper.

Mixnet

Mixnet is short for mix network which is a network of mix. Fundamentally a mix network is a lossy packet switching network whose primary purpose is to achieve traffic analysis resistant properties such as location hiding, sender anonymity etc. See our FAQ for more information.

Node

A Mix or Provider instance.

User

An agent controlling a Client of the Katzenpost system.

Client

Software run by the User on its local device to participate in the Mixnet.

Provider

In the context of Loopix/Katzenpost, a Provider is a node in the mix network which is responsible for authenticating Client forwarding messages to the rest of the mix network on behalf of Client and queueing messages that can later be retrieved by Client

PKI

Stands for Public Key Infrastructure. In the context of Panoramix is also known as the Mix Directory Authority service. In Katzenpost, Network Authority or in short Authority is the server responsible to provide the Mix Directory Authority service.

It is explained in more detail in pki

Sphinx

The Sphinx cryptographic packet format is now the defacto standard for mix networks. The Mixminion mix network used SURBs to achieve sender anonymity. Mixminion inspired the design of the Sphinx packet format.

Katzenpost

A mixnet design based on the Loopix research with added message transport reliability using an ARQ protocol scheme.

Panoramix

A project funded by the European Union’s Horizon 2020 research and innovation programme to research mixnet for voting, statistics, and messaging, running from 2015 to 2019. See panoramix-project.eu.

Loopix

The Loopix mixnet design is described in the paper "The Loopix Anonymity System" published at USENIX 2017. Loopix uses a collection of the best mix network designs to create a messaging system that has the property of sender and receiver anonymity with respect to third party observers. Loopix uses the Sphinx{.interpreted-text role=“term”} cryptographic packet format, various kinds of decoy traffic{.interpreted-text role=“term”} and a stratified mix topology{.interpreted-text role=“term”}.

ARQ

ARQ means Automatic Repeat reQuest which is a protocol scheme that achieves reliability by means for ACKnowledgement protocol control messages and retransmissions. This concept comes from the packet switching network literature and is not generally associated with mix networks. There is no other way to achieve network reliability other than an ARQ scheme although there are many hybrid ARQ schemes for radio communication that use forward error correction for the purpose of performing retransmissions less frequently.

Stop and Wait ARQ

Stop and Wait ARQ is the simplest of all the ARQ protocol schemes. In the context of mix networks it also leaks the least amount of information. When comparing it to TCP, Stop and Wait ARQ has a congestion window of size one. This means that after a message is transmitted, a second message cannot be sent until the ACK for the first message is received. If the ACK message is not received within a particular time duration then the message is retransmitted.

SURB

SURB means Single Use Reply Block. SURBs are essentially a cryptographic delivery token with a short lifetime. In the Sphinx packet format SURBs have two categories of components, those used by the creator and those used by the sender. When Alice creates a SURB, she retains a decryption token and a SURB ID. Alice gives Bob a Sphinx header and a payload encryption token. Bob can use the payload encryption token to encrypt his message. Bob then attaches the Sphinx header to his ciphertext payload, thus forming a Sphinx packet which he sends through the network. Bob cannot know the destination or route of this Sphinx packet. Alice will receive the ciphertext payload and the SURB ID. She uses the SURB ID to identify which SURB decryption token to use for the ciphertext payload decryption.

SURBs have a short lifetime because mixes MUST rotate Sphinx routing keys frequently as the primary method of achieving forward secrecy. The other reason routing keys must be rotated is because each mix retains a replay cache which stores a unique tag for each Sphinx packet that traverses it. This replay cache can only be flushed after a key rotation.

Mixminion

A mix network software project whose design has been inspirational to the Katzenpost design. For more information see

2 - Getting Started

Getting started in Katzenpost development

This guide assumes you are familiar with golang, git and Unix like operating system environment.

Overview of our git repositories

We do some additional ticket tracking in:

  • mixnet_uprising - Repository for tracking open tasks for the Katzenpost mixnet framework.

Our specs, drafts and other documents are in:

  • docs - The documentation is here with the exception of the website. This includes the design specifications as well as the document you are currently reading.

The mix server and directory authority libraries both make use of our core library:

  • core - Core library
  • server - Mix server
  • authority - Mix PKI, nonvoting and voting Directory Authority servers and clients
  • tools - Tools are programs that we use for testing and debugging.

Our core library’s wire protocol depends on our fork of the golang noise library:

  • noise - The Katzenpost fork of flynn’s golang Noise crypto library implementation which has the ablity to use the New Hope Simple post quantum hybrid key exchange for forward secrecy.

Clients also use the core library:

  • minclient - Minimal client library which is the basis for all other mix clients.
  • mailproxy - High-level client library with optional reliability and optional SMTP and POP3 listeners.
  • client - Experimental client library implementing proper Loopix decoy traffic et cetera.

Our website:

  • website - The Katzenpost website

The rest of these repositories do not currently have a maintainer associated with them and it is likely they have bitrot:

  • bindings - Language bindings for Java and Python. STATUS: NOT finished.
  • nixops - NixOS automated configuration for Katzenpost server components.
  • nixpkg - NixOS packages for Katzenpost components.
  • pykatzenpost - Python client library for Katzenpost.
  • katzsim - Client simulator for load testing mix servers.
  • katzenpost-android-mail - A Katzenpost client, based on K-9 Mail.
  • pykatzen-auth - Optional Katzenpost server authentication module in python.

Development workflow

  1. Acquire a recent version of golang (go1.11 or later) so that you can use the go-modules features for dependency version pinning! Read about go-modules here:

  2. Setup proper golang environment variables such as GOPATH, GOROOT and GO111MODULE, for example:

    export GO111MODULE=on
    export GOPATH=/home/user/gopath
    export GOROOT=/home/user/code/go
    
  3. Checkout the latest master branch of the component you are interested in building:

    cd $GOPATH/src/github.com/katzenpost/
    git clone https://github.com/katzenpost/katzenpost/server.git
    cd server
    git checkout master
    
  4. Edit the source code and make it do cool stuff.

  5. Track a new dependency with go-modules:

    cd $GOPATH/src/github.com/katzenpost/katzenpost/server/
    go get github.com/foo/bar@v1.2.3
    

Or perhaps you’d like to build with the master branch of such a dependency:

    cd $GOPATH/src/github.com/katzenpost/katzenpost/server/
    go get github.com/foo/bar@master
  1. Build the binary

    cd $GOPATH/src/github.com/katzenpost/katzenpost/server/cmd/server
    go build
    

client and server internals

The Katzenpost server repository has several coding themes which you should become familiar with before making a contribution. The server must not have any unbounded resource consumption such as spawning new go routines for example.

the Worker type

Katzenpost is NOT crash-only software! Everything has a proper shutdown code path unlike many golang examples on the Internet. Struct types which act as worker goroutines MUST be a composite struct type with the Worker type which is defined in our "core" repository here:

Correctly implementing a composite Worker type means that your code uses the Worker's Go method to spawn new goroutines and will halt it's runtime loop upon receiving from the channel returned by the Worker's HaltCh method. There are plenty of examples of this in our code.

the Channels library

The Katzenpost mix server and mailproxy both use the EApache Channels library:

Channels API docs:

Channels code:

The extended functionality of these channels is well suited to building various kinds of computational pipelines. In particular throughout the code base you will see "infinite buffered channels" used as a queue connecting the schedulers of pipeline stages. More discussion on this pipeline model is below in the next section.

the SEDA model

The Katzenpost server is essentially a software based router and as such it utilizes three active queue management algorithms (AQMs). These queues are called the ingress queue, the mix strategy queue and the egress queue. We utilize a computational model called SEDA or Staged Even Driven Architecture where these three queues are pipelined together.

At each stage of the pipeline there is a thread pool of workers which perform the computation for that stage. Between each of these stages is an AQM which can drop work tasks and can have dynamic load shedding properties so that performance degrades gracefully with respect to increased work load.

If you'd like to learn more about the SEDA computation model we recommend reading:

  • “SEDA: An Architecture for Well-Conditioned, Scalable Internet Services”

http://www.sosp.org/2001/papers/welsh.pdf

the mix strategy

Currently Katzenpost only supports the Poisson mix strategy and therefore the mix strategy AQM is implemented using a priority queue. To learn more about the Poisson mix strategy you should read:

Mix Pipeline Diagram

.-----------.        .------------.       .---------.
| Listeners |  --->  |  incoming  | --->  |  crypto |
`-----------'        | connection |       | workers |
        ▲               |  workers   |       `---------'
        |               `------------'            |
        |                                         |
        |                                         V
        |               .------------.      .----------.
                        |  connector |      |   mix    |
    network link  <--- |   packet   | <--- | strategy |
                        | dispatcher |      |   AQM    |
                        `------------'      `----------'

Provider Pipeline Diagram

.-----------.        .------------.       .---------.       .----------.       .-------------.
| Listeners |  --->  |  incoming  | --->  |  crypto | --->  | provider | --->  | user spools |
`-----------'        | connection |       | workers |       |  packet  |       `-------------'
     ▲               |  workers   |       `---------'       | workers  |                  .-----------------.
     |               `------------'            |            `----------'      .-------->  | external plugin |
     |                                         |                 |  |         |           |     workers     |
     |                                         V                 |  '_        |           `-----------------'
     |               .------------.      .----------.            V    '-------|           .-----------------.
                     |  connector |      |   mix    |       .-----------.     |           | external plugin |
  network link <---  |   packet   | <--- | strategy |       | kaetzchen |     |-------->  |     workers     |    ....-----.
                     | dispatcher |      |   AQM    |       |  workers  |     |           `-----------------'              `\
                     `------------'      `----------'       `-----------'     |           .-----------------.                |
                                _                                 |           |           | external plugin |                |
                           _   |\                                 |           '-------->  |     workers     |                |
                          |\     \                               _'                       `-----------------'                |
                            \     '-----------------------------'                                                            |
                             \                                                                                               |
                              \                                                                                            _'
                               '------------------------------------------------------------------------------------------'

Exercising Katzenpost with your own private mixnet

For many circumstances it is easier and more appropriate to perform your integration testing on a mixnet deployed to a single machine, a remote server which could be a VM instance. In that case I would compile my katzenpost binaries locally and upload them to my remote server and then run a bash script to restart the services.

You will most likely want to turn on debug logging for all the mixnet services. Checking these debug log can help you determine if the behavior is correct. Certainly you could do all of this and add extra debug log statements to help track down a problem that would otherwise be very difficult to detect.

Making a code contribution

  1. Meet the Katzenpost developers

    Chat with the Katzenpost developers on irc: #katzenpost on the OFTC network or reach out to us on our mailing list: https://lists.mixnetworks.org/listinfo/katzenpost

    It is a good idea to discuss your code change with us before investing your time in writing the code.

    These days the IRC channel is not so active. Try contacting us over the mailing list.

  2. Write a specification document

    If your code change is complex or requires us to change any of our protocols, you will need to first propose a draft specification document. You can do this by forking our docs repository, creating a new git branch with your specification document and then submitting a pull-request.

  3. Document the work task

    Open a ticket to document your feature addition or code change using the repository’s issue tracker.

  4. Testing your code

    Your code should have unit tests. However you may wish to gain extra confidence in your code addition by using our kimchi tool.

  5. Request code review

    Finally you can submit a pull-request for your code changes or additions. We will review your code. There may be several rounds of code reviews until the code is of sufficient quality to be merged.

2.1 - Run a Test Network

Getting started in Katzenpost development

The following is steps to run an actual katzenpost network from a single computer for development testing purposes. The default is the following:

  • 3x Directory Authority Nodes
  • 6x Mix Nodes (1 provider node)
  • 2x Chat App

Installing

  1. Install the necessary dependencies as root user
apt install podman docker-compose

Go to a workspace directory or GOPATH or such

cd ~/code/
  1. Get the code for katzenpost server nodes
git clone https://github.com/katzenpost/katzenpost
cd katzenpost/
git fetch
git checkout -b devel
cd ../
  1. Get code for katzen GUI chat app
git clone https://github.com/katzenpost/katzen
cd katzen/
git fetch
git checkout -b devel
cd ../

Run: katzenpost servers

Run the following as non-root user and then build the code:

cd docker/
systemctl --user enable --now podman.socket 
make start-voting-testnet

If all builds correctly, it will start running multiple katzenpost nodes in the background. You should see things like this in your terminal

Creating voting_mixnet_auth4_1 ... done
Creating voting_mixnet_auth1_1 ... done
Creating voting_mixnet_mix5_1      ... done
Creating voting_mixnet_provider2_1 ... done
...

Wait a little for things to finish starting up, then run following:

tail -F voting_mixnet/auth1/katzenpost.log

Wait for the network to get consensus for about 2 minutes

Run: ping

Pop open another terminal and run:

cd ~/code/katzenpost/docker/
make run-ping

Observe that ping works correctly, then move on to testing katzen app

App: katzen

Backup any non-devel non-warped katzen binary first if this would overwrite it

Then build katzen binary with:

make warped=true docker-build-linux

Assuming that builds with no errors, you should have a katzen binary wit which you should run two instances of katzen GUI app (from two terminals) with:

./katzen -f ../katzenpost/docker/voting_mixnet/client/client.toml -s teststate1
./katzen -f ../katzenpost/docker/voting_mixnet/client/client.toml -s teststate2

Each app instance will prompt you to input a passphrase in each UI. Choose something easy like teststate1 or such.

Then add the same identity key or tatzen to each chat app.

Now see if you can talk to yourself from one app to another!


Stopping Servers & Cleaning Up

If you want to stop all katzenpost nodes from running, do the following

cd ~/code/katzenpost/
make clean-local

To delete the previous build (from same directory) run the following

make clean

2.2 - Dependencies

A Software Bill of Materials

Abstract

This document describes the Katzenpost software dependencies and their licenses for the core katzenpost server side components AND the catshadow anonymous messaging system with QT user interface. This document is meant to be useful for determining software license compliance and to assist in audits.

Dependencies and Licenses

We use go-modules in each golang git repository to pin dependencies. Therefore these dependencies can easily be derived from the go.mod file at the root of each git repo. Auditors wishing to quickly learn the transitive dependencies can do so by looking at these files.

Core

github.com/katzenpost/katzenpost/core with license AGPL-3.0

The Katzenpost Core library depends on:

  • git.schwanenlied.me/yawning/aez.git with license CC0 1.0 Universal
  • git.schwanenlied.me/yawning/bsaes.git with license
  • github.com/agl/ed25519 with license BSD-3-Clause
  • github.com/stretchr/testify with license MIT
  • github.com/ugorji/go/codec with license MIT
  • golang.org/x/crypto with license
  • gopkg.in/op/go-logging.v1 with license BSD-3-Clause

Forks of external dependencies:

  • github.com/katzenpost/chacha20 with license AGPL-3.0
  • github.com/katzenpost/noise with license

Noise

Noise Protocol Framework: Katzenpost fork of the flynn noise library Noise depends on:

Forks of external dependencies:

  • github.com/katzenpost/newhope

Newhope

  • github.com/katzenpost/newhope with license CC0 1.0 Universal

Fork of https://git.schwanenlied.me/yawning/newhope

Depends on: github.com/katzenpost/chacha20 with license AGPL-3.0 and golang.org/x/crypto with license

Chacha20

https://github.com/katzenpost/chacha20 with license AGPL-3.0 fork of https://git.schwanenlied.me/yawning/chacha20

depends on:

Authority

github.com/katzenpost/katzenpost/authority with license AGPL-3.0

The Katzenpost Authority depends on:

Forks of external dependencies:

  • github.com/katzenpost/chacha20 with license AGPL-3.0

Server

github.com/katzenpost/katzenpost/server with license AGPL-3.0

Server depends on:

  • github.com/katzenpost/katzenpost/core with license AGPL-3.0
  • github.com/katzenpost/katzenpost/authority with license AGPL-3.0
  • git.schwanenlied.me/yawning/aez.git with license CC0 1.0 Universal
  • git.schwanenlied.me/yawning/avl.git with license CC0 1.0 Universal
  • git.schwanenlied.me/yawning/bloom.git with license CC0 1.0 Universal
  • github.com/BurntSushi/toml with license MIT
  • go.etcd.io/bbolt with license MIT
  • github.com/jackc/pgx with license MIT
  • github.com/stretchr/testify with license: MIT
  • github.com/ugorji/go/codec with license MIT
  • golang.org/x/net with license https://github.com/golang/net/blob/master/LICENSE
  • golang.org/x/text with license https://github.com/golang/text/blob/master/LICENSE
  • gopkg.in/eapache/channels.v1 with license MIT
  • gopkg.in/op/go-logging.v1 with license BSD-3-Clause

Minclient

github.com/katzenpost/katzenpost/minclient with license AGPL-3.0

Minclient depends on:

  • github.com/katzenpost/katzenpost/core with license AGPL-3.0
  • github.com/stretchr/testify with license MIT
  • gopkg.in/op/go-logging.v1 with license BSD-3-Clause

Forks of external dependencies:

Client

github.com/katzenpost/katzenpost/client with license AGPL-3.0

Client depends on:

  • github.com/katzenpost/katzenpost/authority with license AGPL-3.0
  • github.com/katzenpost/katzenpost/core with license AGPL-3.0
  • github.com/katzenpost/kimchi with license AGPL-3.0
  • github.com/katzenpost/katzenpost/minclient with license AGPL-3.0
  • github.com/katzenpost/katzenpost/registration_client with license AGPL-3.0
  • github.com/BurntSushi/toml with license MIT
  • github.com/stretchr/testify with license MIT
  • golang.org/x/net with license https://github.com/golang/net/blob/master/LICENSE
  • golang.org/x/text with license https://github.com/golang/text/blob/master/LICENSE
  • gopkg.in/eapache/channels.v1 with license MIT
  • gopkg.in/op/go-logging.v1 with license BSD-3-Clause

Catshadow

github.com/katzenpost/katzenpost/catshadow with license AGPL-3.0

Client depends on:

  • github.com/katzenpost/katzenpost/core with license AGPL-3.0
  • github.com/katzenpost/katzenpost/client with license AGPL-3.0
  • github.com/katzenpost/kimchi with license AGPL-3.0
  • github.com/katzenpost/katzenpost/memspool with license AGPL-3.0
  • github.com/katzenpost/katzenpost/panda with license AGPL-3.0
  • github.com/katzenpost/doubleratchet with license https://github.com/katzenpost/doubleratchet/blob/master/LICENSE
  • github.com/BurntSushi/toml with license MIT
  • github.com/stretchr/testify with license MIT
  • github.com/ugorji/go/codec with license MIT
  • golang.org/x/crypto with license https://github.com/golang/crypto/blob/master/LICENSE
  • gopkg.in/eapache/channels.v1 with license MIT
  • gopkg.in/op/go-logging.v1 with license BSD-3-Clause

Forks of external dependencies:

Catchat

depends on:

  • QT, the C++ library with license LGPL-3.0
  • https://doc.qt.io/qt-5/opensourcelicense.html
  • github.com/therecipe/qt/core with license LGPL-3.0
  • github.com/katzenpost/katzenpost/catshadow with license AGPL-3.0
  • github.com/katzenpost/katzenpost/client with license AGPL-3.0
  • github.com/dustin/go-humanize with license MIT
  • github.com/BurntSushi/toml with license MIT
  • github.com/muesli/go-app-paths with license MIT
  • golang.org/x/crypto with license https://github.com/golang/crypto/blob/master/LICENSE

Double Ratchet

github.com/katzenpost/doubleratchet with license https://github.com/katzenpost/doubleratchet/blob/master/LICENSE

fork of double ratchet from @agl’s pond

depends on:

Memspool

https://github.com/katzenpost/katzenpost/memspool with license AGPL-3.0

depends on:

  • github.com/katzenpost/katzenpost/client with license AGPL-3.0
  • github.com/katzenpost/katzenpost/core with license AGPL-3.0
  • github.com/katzenpost/kimchi with license AGPL-3.0
  • github.com/katzenpost/katzenpost/server with license AGPL-3.0
  • go.etcd.io/bbolt with license MIT
  • github.com/stretchr/testify with license MIT
  • github.com/ugorji/go/codec with license MIT
  • gopkg.in/op/go-logging.v1 with license BSD-3-Clause

Registration Client

https://github.com/katzenpost/katzenpost/registration_client with license AGPL-3.0

This component will hopefully go away soon but we include it for completeness.

depends on:

2.3 - Katzenpost Improvement Proposals

The procedure for proposing improvements and protocol changes to the Katzenpost project.

If you have an idea to change or otherwise enhance the Katzenpost project at the protocol level, we require the proposal to be accompanied by a comprehensive and concise text outlining the specification that clearly explains engineering and/or cryptographic choices.

Examples Specs:

0. Discuss proposal with a developer

Perhaps it’s useful to discuss your idea before writing anything more elaborate. Perhaps it’s already being worked on under a different name? Perhaps it had been considered previously, but has undesirable attack vectors. It’s better to solicit minor feedback first.

1. Write rough draft of proposal

Once there is N > 0 other developers who validate your idea, scribble it down somewhere shareable. This can be in an Etherpad, Google Docs, a Nextcloud, or if possible, ideally a git commit (skip to Step 3).

2. Solicit feedback from other developers

If you prefer to get feedback before your proposal is committed to eternity in the git repo, share the URL with developers through preferred channel(s) and discuss accordingly.

3. Commit KIP to monorepo

Once you are ready for your proposal to be committed to eternity in git, please append the following “frontmatter” YAML snippet so that your

title: "Longer Form Explicite Name"
linkTitle: "Name"
description: ""
categories: [""]
tags: [""]
author: ["your-name"] 
version: 0
draft: true

The correct location to commit your KIP to is here:

katzenpost/docs/spec/

Name your file one or two words accordingly to it’s title name.md

Additionally

Additional concerns, steps to be added….

3 - Administrators Guide

Abstract

Thank you for interest in Katzenpost! This document describes how to use and configure the Katzenpost Mix Network software system. The target audience for this document is systems administrators. This document assumes you are familiar with using unix systems, git revision control system and building golang binaries.

Introduction

Katzenpost can be used as a message oriented transport for a variety of applications and is in no way limited to the e-mail use case demonstrated by the mailproxy client/library. Other possible applications of Katzenpost include but are not limited to: instant messenger applications, crypto currency transaction transport, bulletin board systems, file sharing and so forth.

The Katzenpost system has four component categories:

  • public key infrastructure
  • mixes
  • Providers
  • clients

Providers has a superset of mixes that fulfill two roles: 1. The initial hop in the route and therefore as an ingress hop this node authenticates clients and does per client rate limiting. 2. The terminal hop in the route and therefore can either store and forward or proxy to a Kaetzchen aka a mixnet service.

This handbook will describe how to use and deploy each of these. The build instructions in this handbook assume that you have a proper golang environment with at least golang 1.10 or later AND the git revision control system commandline installed.

Building the latest stable version of Katzenpost

NOTE: Find out what our latest stable version tag by looking at the releases.md file in the top-level of this repository.

  1. Make sure you have a recent version of Go that supports go modules.
  2. Follow the build instructions for each Katzenpost component you want to build.

There are two server infrastructure components:

There are several clients. Our latest work-in-progress:

The old client from the Panoramix EU 2020 grant deliverable:

Additionally HashCloak makes crypto currency clients that work with Katzenpost:

The Katzenpost Configuration File Format

Each Katzenpost component has a configuration file in the TOML format. This handbook will give you all the details you need to know to configure each of these configuration files. To learn more about the TOML format see: https://github.com/toml-lang/toml#toml

NOTE: # may be used at the beginning of a line to denote a comment instead of an effective configuration line.

Notes on Building a Test Mix Network

See our docker repo:

3.1 - Mailproxy Client Daemon

Overview

Mailproxy is one of many possible clients for using a Katzenpost mix network. It supports POP3 and SMTP for message retreival and message transmission respectively and is intended to run on a user's localhost to allow standard mail clients to send and receive mail over the mixnet.

Mailproxy is a daemon which runs in the background and periodically transmits and receives messages. Once it receives a message it will be queued locally and encrypted onto disk for later retreival via POP3.

Upon receiving the HUP signal, mailproxy will rescan it's recipients directory to check for new recipients. Other signals trigger a clean shutdown.

Configuration

The Proxy Section

The Proxy section contains mandatory proxy configuration, for example:

[Proxy]
    POP3Address = "127.0.0.1:2524"
    SMTPAddress = "127.0.0.1:2525"
    DataDir = "/home/user/.local/share/katzenpost
  • POP3Address is the IP address/port combination that the mail proxy will bind to for POP3 access. If omitted 127.0.0.1:2524 will be used.
  • SMTPAddress is the IP address/port combination that the mail proxy will bind to for SMTP access. If omitted 127.0.0.1:2525 will be used.
  • DataDir is the absolute path to mailproxy’s state files.
  • NoLaunchListeners is set to true to disable the SMTP and POP3 listeners.

The Logging Section

The Logging section controls the logging, for example:

[Logging]
    Disable = false
    File = "/home/user/.local/share/katzenpost/katzenpost.log"
    Level = "DEBUG"
  • Disable disables logging entirely if set to true
  • File specifies the log file, if omitted stdout will be sed.
  • Level specifies the log level out of
  • ERROR
  • WARNING
  • NOTICE
  • INFO
  • DEBUG

Warning: The DEBUG log level is unsafe for production use.

The NonvotingAuthority Section

The NonvotingAuthority section specifies one or more nonvoting directory authorities, for example:

[NonvotingAuthority]
  [NonvotingAuthority.TestAuthority]
    Address = "192.0.2.2:2323"
    PublicKey = "kAiVchOBwHVtKJVFJLsdCQ9UyN2SlfhLHYqT8ePBetg="

This configuration section supports multiple entries. In the above example, the entry is labelled as TestAuthority and is referred to later in the Account section of the mailproxy configuration.

  • Address is the IP address/port combination of the directory authority.
  • PublicKey is the directory authority’s public key is Base64 or Base16 format.

The Account Section

The Account section specifies account configuration(s), for example:

[[Account]]
    User = "alice"
    Provider = "example.com"
    ProviderKeyPin = "0AV1syaCdBbm3CLmgXLj6HdlMNiTeeIxoDc8Lgk41e0="
    Authority = "TestAuthority"
    InsecureKeyDiscovery = true
  • User is the account user name.
  • Provider is the provider identifier used by this account.
  • ProviderKeyPin is the optional pinned provider signing key in Base64 or Base16 format.
  • Authority is the authority configuration used by this account.
  • InsecureKeyDiscovery is set to true in order to allow unverified user identity key lookups to be used for end-to-end encryption of messages.

The Management section

The Management section specifies the management interface configuration, for example:

[Management]
    Enable = true
    Path = "/home/user/.local/share/katzenpost/management_sock"
  • Enable enables the management interface.
  • Path specifies the path to the management interface socket. If left empty it will use [management_sock]{.title-ref} under the DataDir.

Using the management interface

Several mailproxy management commands are supported:

  • GET_RECIPIENT - Returns the given user's public identity key. The syntax of the command is as follows:
GET_RECIPIENT username
  • SET_RECIPIENT - Sets the given user’s public identity key specified in hex or base64. The syntax of the command is as follows:
SET_RECIPIENT username X25519_public_key_in_hex_or_base64
  • REMOVE_RECIPIENT - Removes a given recipient. The syntax of the command is as follows:
REMOVE_RECIPIENT username
  • LIST_RECIPIENTS - Lists all the recipients. This command expects no arguments.

3.2 - Mix Server Infrastructure

Overview

A Katzenpost Provider is strictly a superset of the Katzenpost mix. Both of these components are provided for by the server binary. Each Provider and Mix MUST be white-listed by the Directory Authority (PKI) in order to participate in the network.

Install

See the mix server readme:

Configuration

A sample configuration file can be found in our docker repository, here:

Command Line Usage

The server command Line usage is as follows:

./server -h
Usage of ./server:
    -f string
        Path to the server config file. (default "katzenpost.toml")
        -g    Generate the keys and exit immediately.

The command output when generating keys looks like this:

./server -f my_katzenpost_mix_server.toml -g
22:51:55.377 NOTI server: Katzenpost is still pre-alpha.  DO NOT DEPEND ON IT FOR STRONG SECURITY OR ANONYMITY.
22:51:55.377 NOTI server: AEZv5 implementation is hardware accelerated.
22:51:55.377 NOTI server: Server identifier is: 'example.com'
22:51:55.379 NOTI server: Server identity public key is: 2628F87F2806048C95F060DA9CD3D8F9BE7550BFB9EE85F213381BC04C047650
22:51:55.379 NOTI server: Server link public key is: CCDC5C105E649D543DF1CF397A17638F812F95B7E572288F4602F8EC01EC4F3C

Note that if you choose to configure logging to a file one disk, you can implement log rotation by moving the log file and then sending the HUP to the authority server process. This will cause the daemon to rewrite the log file in the location specified by the config file.

Configuring Mixes and Providers

Katzenpost mixes and providers have identical configuration files except that the configuration for a provider has a Provider section AND the Server section specifies IsProvider = true.

Server section

The Server section contains mandatory information common to all nodes, for example:

[Server]
    Identifier = "example.com"
    Addresses = [ "192.0.2.1:29483", "[2001:DB8::1]:29483" ]
    DataDir = "/var/lib/katzenpost"
    IsProvider = true
  • Identifier is the human readable identifier for the node (eg: FQDN).
  • Addresses are the IP address/port combinations that the server will bind to for incoming connections. IPv4 and/or IPv6 may be specified.
  • DataDir is the absolute path to the server's state files.
  • IsProvider specifies if the server is a provider (vs a mix).

PKI section

The PKI section contains the directory authority configuration for the given mix or provider, for example:

[PKI]
  [PKI.Nonvoting]
    Address = "192.0.2.2:2323"
    PublicKey = "kAiVchOBwHVtKJVFJLsdCQ9UyN2SlfhLHYqT8ePBetg="
  • Nonvoting is a simple non-voting PKI for test deployments.
  • Address is the IP address/port combination of the directory authority.
  • PublicKey is the directory authority's public key in Base64 or Base16 format.

Logging section

The Logging section controls the logging, for example:

[Logging]
    Disable = false
    File = "/var/log/katzenpost.log"
    Level = "DEBUG"
  • Disable is used to disable logging if set to true.
  • File specifies the file to log to. If omitted then stdout is used.
  • Debug may be set to one of the following:
  • ERROR
  • WARNING
  • NOTICE
  • INFO
  • DEBUG

Warning: The DEBUG log level is unsafe for production use.

Management section

The management section specifies connectivity information for the Katzenpost control protocol which can be used to make configuration changes during run-time. An example configuration looks like this:

[Management]
    Enable = true
    Path = "/var/lib/katzenpost/thwack.sock"
  • Disable is used to disable the management interface if set to true.
  • Path specifies the path to the management interface socket. If left empty then [management_sock]{.title-ref} will be used under the DataDir.

Debug section

Debug is the Katzenpost server debug configuration for advanced tuning.

  • IdentityKey specifies the identity private key.
  • NumSphinxWorkers specifies the number of worker instances to use for inbound Sphinx packet processing.
  • NumProviderWorkers specifies the number of worker instances to use for provider specific packet processing.
  • NumKaetzchenWorkers specifies the number of worker instances to use for Kaetzchen specific packet processing.
  • SchedulerExternalMemoryQueue will enable the experimental external memory queue that is backed by disk.
  • SchedulerQueueSize is the maximum allowed scheduler queue size before random entries will start getting dropped. A value <= 0 is treated as unlimited.
  • SchedulerMaxBurst is the maximum number of packets that will be dispatched per scheduler wakeup event.
  • UnwrapDelay is the maximum allowed unwrap delay due to queueing in milliseconds.
  • ProviderDelay is the maximum allowed provider delay due to queueing in milliseconds.
  • KaetzchenDelay is the maximum allowed kaetzchen delay due to queueing in milliseconds.
  • SchedulerSlack is the maximum allowed scheduler slack due to queueing and or processing in milliseconds.
  • SendSlack is the maximum allowed send queue slack due to queueing and or congestion in milliseconds.
  • DecoySlack is the maximum allowed decoy sweep slack due to various external delays such as latency before a loop decoy packet will be considered lost.
  • ConnectTimeout specifies the maximum time a connection can take to establish a TCP/IP connection in milliseconds.
  • HandshakeTimeout specifies the maximum time a connection can take for a link protocol handshake in milliseconds.
  • ReauthInterval specifies the interval at which a connection will be reauthenticated in milliseconds.
  • SendDecoyTraffic enables sending decoy traffic. This is still experimental and untuned and thus is disabled by default. WARNING: This option will go away once decoy traffic is more concrete.
  • DisableRateLimit disables the per-client rate limiter. This option should only be used for testing.
  • GenerateOnly halts and cleans up the server right after long term key generation.

Provider section

The Provider section specifies the Provider configuration. This section of the configuration has sensible defaults for every field and can therefore be omitted unless you wish to deviate from the defaults.

The top-level Provider configuration parameters include:

  • AltAddresses is the map of extra transports and addresses at which the Provider is reachable by clients. The most useful alternative transport is likely tcp in core/pki.TransportTCP
  • EnableEphemeralClients if set to true allows ephemeral clients to be created when the Provider first receives a given user identity string.
  • TrustOnFirstUse if set to true the Provider will trust client’s wire protocol keys on first use.

Kaetzchen Configuration

Kaetzchen are a simple kind of Provider-side service which receives a request and replies with a response message. We here discuss built-in internal kaetzchen services. (see next section for external kaetzchen plugin system)

Consider the following simple configuration example where we configure the echo and keyserver services:

[Provider]
    [[Provider.Kaetzchen]]
    Capability = "echo"
    Endpoint = "+echo"
    Disable = false

    [[Provider.Kaetzchen]]
    Capability = "keyserver"
    Endpoint = "+keyserver"
    Disable = false

The Kaetzchen field is the list of configured Kaetzchen (auto-responder agents) for this provider. In the above example we configured two Kaetzchen, keyserver and echo which are required by the mailproxy client.

Lets review the Kaetzchen configuration parameters:

  • Capability is the capability exposed by the agent.
  • Endpoint is the provider side endpoint that the agent will accept requests at. While not required by the spec, this server only supports Endpoints that are lower-case local-parts of an e-mail address. By convention these endpoint strings begin with +.
  • Config is the extra per agent arguments to be passed to the agent’s initialization routine.
  • Disable disabled a configured agent.

External Kaetzchen Plugin Configuration

Currently the Katzenpost server external kaetzchen plugin system uses CBOR over HTTP over UNIX domain socket to communicate with plugin programs. That is to say, the katzenpost server will spin up each plugin program one or more times as specified by it’s MaxConcurrency parameter, connect to it as a HTTP client and pipeline Kaetzchen queries.

Here’s a configuration example for the external currency service:

[[Provider.CBORPluginKaetzchen]]
  Capability = "zec"
  Endpoint = "+zec"
  Disable = false
  Command = "/home/user/test_mixnet/bin/currency"
  MaxConcurrency = 10
[Provider.PluginKaetzchen.Config]
  log_dir = "/home/user/test_mixnet/zec_tx_logs"
  f = "/home/user/test_mixnet/currency_zec/curreny.toml"

We’ve written echo services in golang and rust as an example here:

Provider User Database Configuration

UserDB is the user database configuration. If left empty the simple BoltDB backed user database will be used with the default database. A simple configuration example:

[Provider.UserDB]
    Backend = "bolt"

    [Provider.UserDB.Bolt]
    UserDB = "my_users.db"
  • Backend is the active userdb backend. If left empty, the BoltUserDB backend will be used bolt

If the bolt backend is specified there is one configuration parameter available under this section:

  • UserDB is the path to the user database. If left empty it will use users.db under the DataDir.

Next we will examine a configuration example which demonstrates using a user database via HTTP:

[Provider.UserDB]
  [Provider.UserDB.ExternUserDB]
    ProviderURL = "http://localhost:8080/"
  • ExternUserDB is the external http user authentication mechanism.
  • ProviderURL is the base url used for the external provider authentication API.

Provider Spool Database Configuration

The Provider spool database stores received messages for later retreival by clients. A simple configuration example follows:

[Provider.SpoolDB]
    Backend = "bolt"

[Provider.SpoolDB.Bolt]
    SpoolDB = "my_spool.db"
  • SpoolDB is the path to the user message spool. If left empty, it will default to spool.db under the DataDir.

Using the Postgres SQL Database Backend

Lastly, we will explore how to use a SQL database as the backend for the user and spool databases, for example:

[Provider]
  [Provider.SQLDB]
    Backend = "pgx"
    DataSourceName = "postgresql://provider:s3cr3tp0stgr355@127.0.0.1:5433/katzenpost"
  [Provider.SpoolDB]
    Backend = "sql"
  [Provider.UserDB]
    Backend = "sql"

This configuration sample demonstrates how to use a Postgres database for both the user database and the spool database. The Backend parameter is set to pgx which means “use a postgresql database”.

  • DataSourceName is the SQL data source name or URI. The format of this parameter is dependent on the database driver being used.

Setup the Postgres SQL database backend:

Install postgres Postgres 9.5 or later is required. On a debian system you can install it like so:

apt install postgresql

Configure postgres access The pg_hba.conf file is the place to configure access to the databases. It’s parsed from top to bottom, first matching rule is applied. You probably need to add a rule for your provider user fairly early. On a debian system this file may be located here:

/etc/postgresql/9.6/main/pg_hba.conf

Start a shell as the postgres user. If you are superuser you can use su or sudo to start the shell as postgres like:

sudo -u postgres

or without sudo:

su - postgres

Add the database user provider:

createuser -U postgres provider

Add a database:

createdb -U postgres -O provider katzenpost

Start the postgres shell:

psql

Set the password for your new user:

ALTER USER provider WITH PASSWORD 's3cr3tp0stgr355';

Test to see if you can connect:

psql -U provider -h 127.0.0.1 katzenpost

If all goes fine, it’s time to load the SQL, that creates the Katzenpost database schema and stored procedures:

psql -U provider --password -d katzenpost -h 127.0.0.1 -f create_database-postgresql.sql

That SQL script is located in our server git repository, here:

  1. Start the Katzenpost server.

Runtime configuration changes with the management socket

The socat commandline utility can be use to connect to the management socket and issue commands. Connect with a commandline like so:

socat unix:/<path-to-data-dir>/management_sock STDOUT

The following commands are possible:

  • QUIT - Exit this management socket session.
  • SHUTDOWN - Cause the server to gracefully shutdown.
  • ADD_USER - Add a user and associate it with the given link key in either hex or base64. The syntax of the command is as follows:
ADD_USER alice X25519_public_key_in_hex_or_base64
  • UPDATE_USER - Update the link key of a given user. The syntax of the command is as follows:
UPDATE_USER alice X25519_public_key_in_hex_or_base64
  • REMOVE_USER - Remove a given user. The syntax of the command is as follows:
REMOVE_USER alice
  • SET_USER_IDENTITY - Set a given user’s identity key. The syntax of the command is as follows:
SET_USER_IDENTITY alice X25519_public_key_in_hex_or_base64
  • REMOVE_USER_IDENTITY - Remove a given user’s identity key. MUST be called before removing the user with the REMOVE_USER command. The syntax of this command is as follows:
REMOVE_USER_IDENTITY alice
  • USER_IDENTITY - Retrieve the identity key of the given user. The syntax of the command is as follows:
USER_IDENTITY alice
  • SEND_RATE - Sets the rate limiter to the given packets per minute rate:
SEND_RATE 30
  • SEND_BURST - Sets the rate limiter burst to the given maximum:
SEND_BURST 4

3.3 - Public Key Infrastructure

Overview

This is essentially a single point of failure. If this PKI system becomes compromised by an adversary it's game over for anonymity and security guarantees. Consider using the voting authority instead.

The Katzenpost voting Directory Authority system is a replacement for the non-voting Directory Authority. However it’s voting protocol is NOT byzantine fault tolerant. Therefore a Directory Authority server which is participating in the voting protocol can easily perform a denial of service attack for each voting round. This would cause the mix network to become totally unusable.

All Katzenpost PKI systems have two essential components:

  • A client library
  • Server infrastructure

Furthermore this client library has two types of users, namely mixes and clients. That is, mixes must use the library to upload/download their mix descriptors and clients use the library to download a network consensus document so that they can route messages through the mix network.

Install

See the authority readme:

Configuration

A sample configuration file can be found in our docker repository, here:

CLI usage of The Non-voting Directory Authority

The non-voting authority has the following command line usage:

./nonvoting --help
Usage of ./nonvoting:
    -f string
        Path to the authority config file. (default "katzenpost-authority.toml")
    -g    Generate the keys and exit immediately.

The -g option is used to generate the public and private keys for the Directory Authority. Clients of the PKI will use this public key to verify retrieved network consensus documents. However before invoking the authority with this command line option you MUST provide a valid configuration file. This file will specify a data directory where these keys will be written. Normal invocation will omit this -g option because the keypair should already be present.

A minimal configuration suitable for using with this -g option for generating the key pair looks like this:

[Authority]
    Addresses = [ "192.0.2.1:12345" ]
    DataDir = "/var/run/katzauth"

Example invocation command line:

./nonvoting -g -f my_authority_config.toml

However the invocation may fail if the permissions on the data directory are not restricted to the owning user:

./nonvoting -g -f my_authority_config.toml
Failed to spawn authority instance: authority: DataDir '/var/run/katzauth' has invalid permissions 'drwxr-xr-x'

Fix permissions like so:

chmod 700 /var/run/katzauth

A successful run will print output that looks like this:

14:47:43.141 NOTI authority: Katzenpost is still pre-alpha.  DO NOT DEPEND ON IT FOR STRONG SECURITY OR ANONYMITY.
14:47:43.142 NOTI authority: Authority identity public key is: 375F00F6EA20ACFB3F4CDCA7FDB50AE427BF02035B6A2F5789281DAA7290B2BB

Note that if you choose to configure logging to a file one disk, you can implement log rotation by moving the log file and then sending the HUP to the authority server process. This will cause the daemon to rewrite the log file in the location specified by the config file.

Configuring The Non-voting Directory Authority

Authority section

The Authority section contains information which is mandatory, for example:

[Authority]
    Addresses = [ "192.0.2.1:29483", "[2001:DB8::1]:29483" ]
    DataDir = "/var/lib/katzenpost-authority"
  • Addresses contains one or more IP addresses which correspond to local network interfaces to listen for connections on. These can be specified as IPv4 or IPv6 addresses.
  • DataDir specifies the absolute path to the server’s state files including the keypair use to sign network consensus documents.

Logging section

The logging section controls the logging, for example:

[Logging]
    Disable = false
    File = "/var/log/katzenpost.log"
    Level = "DEBUG"
  • Disable is used to disable logging if set to true.
  • File specifies the file to log to. If omitted then stdout is used.
  • Debug may be set to one of the following:
  • ERROR
  • WARNING
  • NOTICE
  • INFO
  • DEBUG

Parameters section

The Parameters section holds the network parameters, for example:

[Parameters]
    SendRatePerMinute = 30
    Mu = 0.00025
    MuMaxDelay = 9000
    LambdaP = 15.0
    SendShift = 3
    LambdaPMaxDelay = 3000
    LambdaL = 0.00025
    LambdaLMaxDelay = 9000
    LambdaD = 0.00025
    LambdaDMaxDelay = 9000
    LambdaM = 0.00025
    LambdaMMaxDelay = 9000
  • SendRatePerMinute is the rate limiter maximum allowed rate of packets per client.
  • Mu is the inverse of the mean of the exponential distribution that the Sphinx packet per-hop mixing delay will be sampled from.
  • MuMaxDelay is the maximum Sphinx packet per-hop mixing delay in milliseconds.
  • LambdaP LambdaP is the inverse of the mean of the exponential distribution that clients will sample to determine the time interval between sending messages from it's FIFO egress queue or drop decoy messages if the queue is empty.
  • LambdaPMaxDelay is the maximum send interval for LambdaP in milliseconds
  • LambdaL LambdaL is the inverse of the mean of the exponential distribution that clients will sample to determine the time interval between sending decoy loop messages.
  • LambdaLMaxDelay sets the maximum send interval for LambdaL in milliseconds.
  • LambdaD is the inverse of the mean of the exponential distribution that clients will sample to determine the time interval between sending decoy drop messages.
  • LambdaDMaxDelay is the maximum send interval in milliseconds.
  • LambdaM is the inverse of the mean of the exponential distribution that mixes will sample to determine send timing of mix loop decoy traffic.
  • LambdaMMaxDelay sets the maximum delay for LambdaM

Mixes section

The Mixes array defines the list of white-listed non-provider nodes, for example:

[[Mixes]]
IdentityKey = "kAiVchOBwHVtKJVFJLsdCQ9UyN2SlfhLHYqT8ePBetg="

[[Mixes]]
IdentityKey = "900895721381C0756D28954524BB1D090F54C8DD9295F84B1D8A93F1E3C17AD8"
  • IdentityKey is the node’s EdDSA signing key, in either Base16 or Base64 format.

Provider section

The Providers array defines the list of white-listed Provider nodes, for example:

[[Providers]]
Identifier = "provider1"
IdentityKey = "0AV1syaCdBbm3CLmgXLj6HdlMNiTeeIxoDc8Lgk41e0="

[[Providers]]
Identifier = "provider2"
IdentityKey = "375F00F6EA20ACFB3F4CDCA7FDB50AE427BF02035B6A2F5789281DAA7290B2BB"
  • Identifier is the human readable provider identifier, such as a FQDN.
  • IdentityKey is the provider’s EdDSA signing key, in either Base16 or Base64 format.

3.4 - Setup Your Own Mixnet

Katzenpost is still pre-alpha.DO NOT DEPEND ON IT FOR STRONG SECURITY OR ANONYMITY.

Mix networks are meant to be decentralized and therefore should be operated by multiple entities. You can of course be the only operator of a mix network for testing purposes.

Build Software

Take a look at our docker repo. This will explain how to configure and run a katzenpost mixnet.

A Katzenpost mix network has two binary programs, a PKI and a Mix Provider.

Katzenpost server side requires a recent golang. See golang install instructions: https://golang.org/doc/install

Follow the build instructions for each Katzenpost component repo.

The produced binaries are statically linked, so you can build the authority and the server code on one machine, and then distribute them to any Linux based machines to run.

Synchronize Clock

Each network component, the PKI and mixes/providers, MUST have the correct time. We recommend chrony for the purpose of time synchronization.

apt install chrony

Add Users to the Provider

This step might not need to be performed if you are using a client that auto-registers users with their Katzenpost Provider; such as catchat.

Add User to the Provider using the management interface:

socat unix:/<path-to-data-dir>/management_sock STDOUT
ADD_USER alice X25519_public_key_in_hex_or_base64

In case you want to use the automatic key discovery for mailproxy, the user identity key (identity.public.pem) also needs to be set:

SET_USER_IDENTITY alice X25519_public_key_in_hex_or_base64

3.5 - Torification of Katzenpost

Abstract

Tor and mixnets provide orthogonal anonymity properties and therefore it can be advantageous for clients to connect to their mixnet Provider and Directory Authority service(s) over Tor onion services. This document describes how to configure these services and a mailproxy client to use Tor onion services.

Introduction

This document assumes you have already installed Tor. You can either install Tor as part of the Tor Browser Bundle or you can install it standalone. Obviously only clients may be interested in using Tor Browser Bundle.

  1. Install Tor Browser Bundle: https://www.torproject.org/download/download-easy.html.en
  2. Install a standlone Tor in Debian/Ubuntu: https://www.torproject.org/docs/debian.html.en

Mailproxy

Here is a complete mailproxy configuration that uses only Tor onion services for it's communication with the mix network:

[Proxy]
    POP3Address = "127.0.0.1:2524"
    SMTPAddress = "127.0.0.1:2525" DataDir = "/home/user/.mailproxy"

[Logging]
    Disable = false Level = "NOTICE"

[NonvotingAuthority]
    [NonvotingAuthority.PlaygroundAuthority]
        Address = "lxqkz5d5e3pehagu.onion:61832"
        PublicKey = "o4w1Nyj/nKNwho5SWfAIfh7SMU8FRx52nMHGgYsMHqQ="

[[Account]]
    User = "alice"
    Provider = "playground"
    ProviderKeyPin = "imigzI26tTRXyYLXujLEPI9QrNYOEgC4DElsFdP9acQ="
    Authority = "PlaygroundAuthority"

[Management]
    Enable = false

[UpstreamProxy]
    PreferedTransports = ["onion"]
    Type = "tor+socks5"
    Network = "tcp" Address = "127.0.0.1:9050"

3.6 - Voting Directory Authority

Overview

Most of the mixnet papers are written with the assumption that the mixnet’s PKI exists and performs their functions perfectly. Our mixnet PKI is a so called Directory Authority design which is inspired by the Tor’s and Mixminion’s Directory Authority.

For more details about the design of the Katzenpost voting PKI you should see our specification document:

As with Katzenpost clients and mixes, this authority server uses our post quantum cryptographic noise based wire protocol as described in the specification document:

Each authority’s configuration has the public link layer key of each of it’s peers for authenticating the wire protocol connection.

Peers are also configured with each other’s public signing key so that they may verify each other’s signatures on votes. The voting system is used to create a document describing the collective view of the network. Mixnet clients download the consensus document so that they may utilize the network to route their Sphinx packets.

Install

See the authority readme:

CLI usage of The Voting Directory Authority

The voting authority has the following command line usage:

./voting -h
Usage of ./voting:
    -f string
        Path to the authority config file. (default "katzenpost-authority.toml")
    -g    Generate the keys and exit immediately.

The -g option is used to generate the public and private signing and link keys.

Configuring The Voting Directory Authority

A sample configuration file can be found in our docker repository, here:

Authority section

The Authority section contains information which is mandatory, for example:

[Authority]
    Addresses = [ "192.0.2.1:29483", "[2001:DB8::1]:29483" ]
    DataDir = "/var/lib/katzenpost-authority"
  • Addresses contains one or more IP addresses which correspond to local network interfaces to listen for connections on. These can be specified as IPv4 or IPv6 addresses.
  • DataDir specifies the absolute path to the server's state files including the keypair use to sign network consensus documents.

Logging section

The logging section controls the logging, for example:

[Logging]
    Disable = false
    File = "/var/log/katzenpost.log"
    Level = "DEBUG"
  • Disable is used to disable logging if set to true.
  • File specifies the file to log to. If omitted then stdout is used.
  • Debug may be set to one of the following:
  • ERROR
  • WARNING
  • NOTICE
  • INFO
  • DEBUG

Parameters section

The Parameters section holds the network parameters, for example:

[Parameters]
    SendRatePerMinute = 30
    Mu = 0.00025
    MuMaxDelay = 9000
    LambdaP = 15.0
    SendShift = 3
    LambdaPMaxDelay = 3000
    LambdaL = 0.00025
    LambdaLMaxDelay = 9000
    LambdaD = 0.00025
    LambdaDMaxDelay = 9000
    LambdaM = 0.00025
    LambdaMMaxDelay = 9000
  • SendRatePerMinute is the rate limiter maximum allowed rate of packets per client.
  • Mu is the inverse of the mean of the exponential distribution that the Sphinx packet per-hop mixing delay will be sampled from.
  • MuMaxDelay is the maximum Sphinx packet per-hop mixing delay in milliseconds.
  • LambdaP LambdaP is the inverse of the mean of the exponential distribution that clients will sample to determine the time interval between sending messages from it's FIFO egress queue or drop decoy messages if the queue is empty.
  • LambdaPMaxDelay is the maximum send interval for LambdaP in milliseconds
  • LambdaL LambdaL is the inverse of the mean of the exponential distribution that clients will sample to determine the time interval between sending decoy loop messages.
  • LambdaLMaxDelay sets the maximum send interval for LambdaL in milliseconds.
  • LambdaD is the inverse of the mean of the exponential distribution that clients will sample to determine the time interval between sending decoy drop messages.
  • LambdaDMaxDelay is the maximum send interval in milliseconds.
  • LambdaM is the inverse of the mean of the exponential distribution that mixes will sample to determine send timing of mix loop decoy traffic.
  • LambdaMMaxDelay sets the maximum delay for LambdaM

Debug Section

  • IdentityKey is this authority’s EdDSA signing key, in either Base16 OR Base64 format.
  • LinkKey is this authority’s ECDH link layer key, in either Base16 OR Base64 format.
  • Layers is the number of non-provider layers in the network topology.
  • MinNoderPerLayer is the minimum number of nodes per layer required to form a valid Document.
  • GenerateOnly if set to true causes the server to halt and clean up the data dir right after long term key generation.

Mixes Section

The Mixes configuration section looks like this :

[[Mixes]]
    IdentityKey = "kAiVchOBwHVtKJVFJLsdCQ9UyN2SlfhLHYqT8ePBetg="

[[Mixes]]
    IdentityKey = "900895721381C0756D28954524BB1D090F54C8DD9295F84B1D8A93F1E3C17AD8"
  • IdentityKey is the node's EdDSA signing key, in either Base16 OR Base64 format.

Providers Section

Configure like so: :

[[Providers]]
    Identifier = "example.com"
    IdentityKey = "0AV1syaCdBbm3CLmgXLj6HdlMNiTeeIxoDc8Lgk41e0="
  • Identifier is the human readable provider identifier, such as a FQDN.
  • IdentityKey is the provider's EdDSA signing key, in either Base16 OR Base64 format.

4 - Specifications

These works are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

The canonical versions of the specifications live in RFC plaintext format at:

Pull requests to fix formatting welcome!

4.1 - Certificate Format

Abstract

This document proposes a certificate format that Katzenpost mix server, directory authority server and clients will use.

1. Introduction

Mixes and Directory Authority servers need to have key agility in the sense of operational abilities such as key rotation and key revocation. That is, we wish for mixes and authorities to periodically utilize a long-term signing key for generating certificates for new short-term signing keys.

Yet another use-case for these certificate is to replace the use of JOSE RFC7515 in the voting Directory Authority system KATZMIXPKI for the multi-signature documents exchanged for voting and consensus.

1.1 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

1.2 Terminology

Tbw…

2. Document Format

The CBOR RFC7049 serialization format is used to serialize certificates:

Signature is a cryptographic signature which has an associated signer ID.

type Signature struct {
        // Identity is the identity of the signer.
        Identity []byte
        // Signature is the actual signature value.
        Signature []byte
}

Certificate structure for serializing certificates.

type certificate struct {
    // Version is the certificate format version.
    Version uint32

    // Expiration is seconds since Unix epoch.
    Expiration int64

    // KeyType indicates the type of key
    // that is certified by this certificate.
    KeyType string

    // Certified is the data that is certified by
    // this certificate.
    Certified []byte

    // Signatures are the signature of the certificate.
    Signatures []Signature
}

That is, one or more signatures sign the certificate. However the Certified field is not the only information that is signed. The Certified field along with the other non-signature fields are all concatenated together and signed. Before serialization the signatures are sorted by their identity so that the output is binary deterministic.

2.1 Certificate Types

The certificate type field indicates the type of certificate. So far we have only two types:

  • identity key certificate
  • directory authority certificate

Both mixes and directory authority servers have a secret, long-term identity key. This key is ideally stored encrypted and offline, it’s used to sign key certificate documents. Key certificates contain a medium-term signing key that is used to sign other documents. In the case of an “authority signing key”, it is used to sign vote and consensus documents whereas the “mix singing key” is used to sign mix descriptors which are uploaded to the directory authority servers.

2.2. Certificate Key Types

It’s more practical to continue using Ed25519 ED25519 keys but it’s also possible that in the future we could upgrade to a stateless hash based post quantum cryptographic signature scheme such as SPHINCS-256 or SPHINCS+. SPHINCS256

3. Golang API

Our golang implementation is agnostic to the specific cryptographic signature scheme which is used. Cert can handle single and multiple signatures per document and has a variety of helper functions that ease use for multi signature use cases.

4. Acknowledgments

This specification was inspired by Tor Project’s certificate format specification document:

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{KatzenCert,
title = {Certificate Format Specification},
author = {David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/master/docs/specs/certificate.rst},
year = {2018}
}

ED25519

KATZMIXPKI

Angel, Y., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network Public Key Infrastructure Specification",
December 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/pki.md

RFC2119

Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119

RFC7049

C. Bormannm, P. Hoffman,
"Concise Binary Object Representation (CBOR)",
Internet Engineering Task Force (IETF),
October 2013,
https://tools.ietf.org/html/rfc7049

RFC7515

Jones, M., Bradley, J., Sakimura, N.,
"JSON Web Signature (JWS)",
May 2015,
https://tools.ietf.org/html/rfc7515

RFC7693

Saarinen, M-J., Ed., and J-P. Aumasson,
"The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)",
RFC 7693, DOI 10.17487/RFC7693,
November 2015,
http://www.rfc-editor.org/info/rfc7693

SPHINCS256

Bernstein, D., Hopwood, D., Hulsing, A., Lange, T., Niederhagen, R., Papachristodoulou, L., Schwabe, P., Wilcox O' Hearn, Z.,
"SPHINCS: practical stateless hash-based signatures",
http://sphincs.cr.yp.to/sphincs-20141001.pdf

4.2 - KEMSphinx

Abstract

Here I present a modification of the Sphinx cryptographic packet format that uses a KEM instead of a NIKE whilst preserving the properties of bitwise unlinkability, constant packet size and route length hiding.

1. Introduction

We’ll express our KEM Sphinx header in pseudo code. The Sphinx body will be exactly the same as SPHINXSPEC. Our basic KEM API has three functions:

  • PRIV_KEY, PUB_KEY = GEN_KEYPAIR(RNG)
  • ct, ss = ENCAP(PUB_KEY) - Encapsulate generates a shared secret, ss, for the public key and encapsulates it into a ciphertext.
  • ss = DECAP(PRIV_KEY, ct) - Decapsulate computes the shared key, ss, encapsulated in the ciphertext, ct, for the private key.

Additional notation includes:

  • || = concatenate two binary blobs together
  • PRF = pseudo random function, a cryptographic hash function, e.g. Blake2b.

Therefore we must embed these KEM ciphertexts in the KEMSphinx header, one KEM ciphertext per mix hop.

2. Post Quantum Hybrid KEM

Special care must be taken in order correctly compose a hybrid post quantum KEM that is IND-CCA2 robust.

The hybrid post quantum KEMs found in Cloudflare’s circl library are suitable to be used with Noise or TLS but not with KEM Sphinx because they are not IND-CCA2 robust. Noise and TLS achieve IND-CCA2 security by mixing in the public keys and ciphertexts into the hash object and therefore do not require an IND-CCA2 KEM.

Firstly, our post quantum KEM is IND-CCA2 however we must specifically take care to make our NIKE to KEM adapter have semantic security. Secondly, we must make a security preserving KEM combiner.

2.1 NIKE to KEM adapter

We easily achieve our IND-CCA2 security by means of hashing together the DH shared secret along with both of the public keys:

func ENCAPSULATE(their_pubkey publickey) ([]byte, []byte) {
    my_privkey, my_pubkey = GEN_KEYPAIR(RNG)
    ss = DH(my_privkey, their_pubkey)
    ss2 = PRF(ss || their_pubkey || my_pubkey)
    return my_pubkey, ss2
}

func DECAPSULATE(my_privkey, their_pubkey) []byte {
    s = DH(my_privkey, their_pubkey)
    shared_key = PRF(ss || my_pubkey || their_pubkey)
    return shared_key
}

2.2 KEM Combiner

The KEM Combiners paper KEMCOMB makes the observation that if a KEM combiner is not security preserving then the resulting hybrid KEM will not have IND-CCA2 security if one of the composing KEMs does not have IND-CCA2 security. Likewise the paper points out that when using a security preserving KEM combiner, if only one of the composing KEMs has IND-CCA2 security then the resulting hybrid KEM will have IND-CCA2 security.

Our KEM combiner uses the split PRF design from the paper when combining two KEM shared secrets together we use a hash function to also mix in the values of both KEM ciphertexts. In this pseudo code example we are hashing together the two shared secrets from the two underlying KEMs, ss1 and ss2. Additionally the two ciphertexts from the underlying KEMs, cct1 and cct2, are also hashed together:

func SplitPRF(ss1, ss2, cct1, cct2 []byte) []byte {
    cct := cct1 || cct2
    return PRF(ss1 || cct) XOR PRF(ss2 || cct)
}

Which simplifies to:

SplitPRF := PRF(ss1 || cct2) XOR PRF(ss2 || cct1)

The Split PRF can be used to combine an arbitrary number of KEMs. Here’s what it looks like with three KEMs:

func SplitPRF(ss1, ss2, ss3, cct1, cct2, cct3 []byte) []byte {
    cct := cct1 || cct2 || cct3
    return PRF(ss1 || cct) XOR PRF(ss2 || cct) XOR PRF(ss3 || cct)
}

3. KEMSphinx Header Design

NIKE Sphinx header elements:

  1. Version number (MACed but not encrypted)
  2. Group element
  3. Encrypted per routing commands
  4. MAC for this hop (authenticates header fields 1 thru 4)

KEM Sphinx header elements:

  1. Version number (MACed but not encrypted)
  2. One KEM ciphertext for use with the next hop
  3. Encrypted per routing commands AND KEM ciphtertexts, one for each additional hop
  4. MAC for this hop (authenticates header fields 1 thru 4)

We can say that KEMSphinx differs from NIKE Sphinx by replacing the header’s group element (e.g. an X25519 public key) field with the KEM ciphertext. Subsequent KEM ciphertexts for each hop are stored inside the Sphinx header “routing information” section.

First we must have a data type to express a mix hop, and we can use lists of these hops to express a route:

type PathHop struct {
    public_key kem.PublicKey
    routing_commands Commands
}

Here’s how we construct a KEMSphinx packet header where [path]{.title-ref} is a list of PathHop, and indicates the route through the network:

  1. Derive the KEM ciphertexts for each hop.
route_keys = []
route_kems = []
for i := 0; i < num_hops; i++ {
    kem_ct, ss := ENCAP(path[i].public_key)
    route_kems += kem_ct
    route_keys += ss
}
  1. Derive the routing_information keystream and encrypted padding for each hop.

Same as in SPHINXSPEC except for the fact that each routing info slot is now increased by the size of the KEM ciphertext.

  1. Create the routing_information block.

Here we modify the Sphinx implementation to pack the next KEM ciphertext into each routing information block. Each of these blocks is decrypted for each mix mix hop which will decrypt the KEM ciphertext for the next hop in the route.

  1. Assemble the completed Sphinx Packet Header and Sphinx Packet Payload SPRP key vector. Same as in SPHINXSPEC except the kem_element field is set to the first KEM ciphertext, route_kems[0]:
var sphinx_header SphinxHeader
sphinx_header.additional_data = version
sphinx_header.kem_element = route_kems[0]
sphinx_header.routing_info = routing_info
sphinx_header.mac = mac

2. KEMSphinx Unwrap Operation

Most of the design here will be exactly the same as in SPHINXSPEC. However there are a few notable differences:

  1. The shared secret is derived from the KEM ciphertext instead of a DH.
  2. Next hop’s KEM ciphertext stored in the encrypted routing information.

3. Acknowledgments

I would like to thank Peter Schwabe for the original idea of simply replacing the Sphinx NIKE with a KEM and for answering all my questions. I’d also like to thank Bas Westerbaan for answering questions.

Appendix A. References

KEMCOMB

Federico Giacon, Felix Heuer, Bertram Poettering,
"KEM Combiners",
2018
https://link.springer.com/chapter/10.1007/978-3-319-76578-5_7>PKC

SPHINX09

Danezis, G., Goldberg, I.,
"Sphinx: A Compact and Provably Secure Mix Format\",
DOI 10.1109/SP.2009.15,
May 2009,
https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf

SPHINXSPEC

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Sphinx Mix Network Cryptographic Packet Format Specification"
July 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/sphinx.md

4.3 - LIONESS Wide-Block-Cipher

Abstract

This document defines the LIONESS Wide-Block-Cipher construct, and provides a concrete parameterization based around the BLAKE2b hash algorithm and ChaCha20 stream cipher.

This document does not introduce any new crypto, but is meant to serve as a stable reference and implementation guide.

1. Introduction

LIONESS is a provably secure wide-block-cipher proposed by Anderson and Biham in [AB96]{.citation}. Internally it is a four round unbalanced Feistel cipher comprised of a keyed hash function and a stream cipher. It supports processing large abitrary sized blocks, with a minimum block size imposed by the choice of primitives, and has the property such that each bit of the ciphertext is dependent on every single bit of the plaintext and vice versa.

1.1 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT", “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

  • x | y is the concatenation of x and y
  • x ^ y is the bitwise XOR of x and y
  • A byte is an 8-bit octet

x[a:b] is the sub-vector of x where a/b denote the start/end byte indexes (inclusive-exclusive). a/b may be omitted to signify the start/end of the vector x respectively.

2. The LIONESS Wide-Block-Cipher Construct

LIONESS is parameteriezed with a keyed hash function (MAC) and a stream cipher as follows.

H(k, m) is a keyed hash function (MAC) with the key k, for a message m of arbitrary size. Let HashLength denote the size of the output and the key in bytes.

H() MUST be collision resistant or preimage resistant, and implementations SHOULD pick algorithms that provide both properties.

S(k) is a pseudo random function (stream cipher) which given the HashLength sized input k will generate an output of arbitrary size.

S() MUST be key recovery resistant.

2.1 LIONESS Encryption

LIONESS-Encrypt(k1, k2, k3, k4, plaintext) -> ciphertext

Inputs:

k1,k2,k3,k4 Round keys, each HashLength bytes in size.

plaintext The plaintext to encrypt, greater than HashLength bytes in size.

Output:

ciphertext The resulting ciphertext.

The output of LIONESS-Encrypt is calculated as follows:

L = plaintext[0:HashLength]
R = plaintext[HashLength:]
R = R ^ S(L ^ k1)
L = L ^ H(k2, R)
R = R ^ S(L ^ k3)
L = L ^ H(k4, R)
ciphertext = L | R

2.2 LIONESS Decryption

LIONESS-Decrypt(k1, k2, k3, k4, ciphertext) -> plaintext

Inputs:

k1,k2,k3,k4 Round keys, each HashLength bytes in size.

ciphertext The ciphertext to decrypt, greater than HashLength bytes in size.

Output:

plaintext The resulting plaintext.

The output of LIONESS-Decrypt is calculated as follows:

L = ciphertext[0:HashLength]
R = ciphertext[HashLength:]
L = L ^ H(k4, R)
R = R ^ S(L ^ k3)
L = L ^ H(k2, R)
R = R ^ S(L ^ k1)
plaintext = L | R

3. LIONESS-BLAKE2b-ChaCha20

LIONESS-BLAKE2b-ChaCha20 is a concrete parameterization of LIONESS based around the BLAKE2b RFC7693 hash algorithm and ChaCha20 RFC7539 stream cipher. It provides a security level of at least 256 bits, and supports a per-call initialization vector.

Plaintext and Ciphertext MUST NOT exceed 32 + ((1 \<\< 32) \* 64) bytes.

For sections 3.1 and 3.2:

Let BLAKE2b(k, m) return the BLAKE2b digest calculated with key k, and message m, truncated to 32 bytes.

Let ChaCha20(k, n, m) return the ChaCha20 encrypted ciphertext with key k, nonce n, and message m, with the counter initialized to 0.

3.1 LIONESS-BLAKE2b-ChaCha20 Encryption

LIONESS-BLAKE2b-ChaCha20-Encrypt(key, iv, plaintext) -> ciphertext

Inputs:

key The key, 128 bytes in size. iv The initialization vector, 48 bytes in size. plaintext The plaintext to encrypt, greater than 32 bytes in size.

Output:

ciphertext The resulting ciphertext.

The output of LIONESS-BLAKE2b-ChaCha20-Encrypt is calculated as follows:

k1 = key[0:32]
k2 = key[32:64]
k3 = key[64:96]
k4 = key[96:128]
iv1 = iv[0:12]
iv2 = iv[12:24]
iv3 = iv[24:36]
iv4 = iv[36:48]

L = ciphertext[0:32]
R = ciphertext[32:]
R = ChaCha20(L ^ k1, iv1, R)
L = L ^ BLAKE2b(k2 | iv2, R)
R = ChaCha20(L ^ k3, iv3, R)
L = L ^ BLAKE2b(k4 | iv4, R)
ciphertext = L | R

3.2 LIONESS-BLAKE2b-ChaCha20 Decryption

LIONESS-BLAKE2b-ChaCha20-Decrypt(key, iv, ciphertext) -> plaintext

Inputs:

key The key, 128 bytes in size.

iv The initialization vector, 48 bytes in size.

ciphertext The ciphertext to decrypt, greater than 32 bytes in size.

Output:

plaintext The resulting plaintext.

The output of LIONESS-BLAKE2b-ChaCha20-Decrypt is calculated as follows:

k1 = key[0:32]
k2 = key[32:64]
k3 = key[64:96]
k4 = key[96:128]
iv1 = iv[0:12]
iv2 = iv[12:24]
iv3 = iv[24:36]
iv4 = iv[36:48]

L = ciphertext[0:32]
R = ciphertext[32:]
L = L ^ BLAKE2b(k4 | iv4, R)
R = ChaCha20(L ^ k3, iv3, R)
L = L ^ BLAKE2b(k2 | iv2, R)
R = ChaCha20(L ^ k1, iv1, R)
plaintext = L | R

4. Implementation Considerations

When choosing the underlying stream cipher or MAC, implementors may wish to consider the initialization overhead such as key scheduling, as the performance impact can be non-negligible depending on algorithm choice.

5. Security Considerations

When parameterizing the LIONESS construct care MUST be taken to pick cryptographic primitives that meet the requirements specified in Section 2.1. Depending on the primitive chosen for S(), there may be a maximum block size imposed by the maximum amount of data that S() may encrypt with a given key.

Care MUST be taken to avoid leaking sensitive information via side-channels, however this is primarily influenced by the algorithms and implementations selected for H() and S() than the LIONESS construct itself.

No claims are made regarding the security of LIONESS when the same key material is used to encrypt multiple blocks, beyond those made in MPRA11. Conservative users may wish to avoid this behavior, use LIONESS as the building block for standard block cipher constructs that take initialization vectors, or incorporate initialization vectors in the H() and S() calls.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. LIONESS-ChaCha20-BLAKE2b Test Vector

Appendix C. Citing This Document

Appendix C.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{LionessSpec,
title = {The LIONESS Wide-Block-Cipher Specification},
author = {Yawning Angel},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/lioness.md},
year = {2017}
}

AB96

Anderson, R., Biham, E.,
"Two Practical and Provably Secure Block Ciphers: BEAR and LION",
1996

MPRA11

Maines, L., Piva, M., Rimoldi, A., Sala, M.,
"On the provable security of BEAR and LION schemes",
arXiv:1105.0259,
May 2011,
https://arxiv.org/abs/1105.0259

RFC2119

Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119

RFC7539

Nir, Y. and A. Langley,
"ChaCha20 and Poly1305 for IETF Protocols",
RFC 7539, DOI 10.17487/RFC7539,
May 2015,
http://www.rfc-editor.org/info/rfc7539

RFC7693

Saarinen, M-J., Ed., and J-P. Aumasson,
"The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)",
RFC 7693, DOI 10.17487/RFC7693,
November 2015,
http://www.rfc-editor.org/info/rfc7693

4.4 - Mix Network Specification

Abstract

This document describes the high level architecture and detailed protocols and behavior required of mix nodes participating in the Katzenpost Mix Network.

1. Introduction

This specification provides the design of a mix network meant provide an anonymous messaging protocol between clients and public mixnet services.

Various system components such as client software, end to end messaging protocols, Sphinx cryptographic packet format and wire protocol are described in their own specification documents.

1.1 Terminology

  • A KiB is defined as 1024 8 bit octets.

  • Mixnet - A mixnet also known as a mix network is a network of mixes that can be used to build various privacy preserving protocols.

  • Mix - A cryptographic router that is used to compose a mixnet. Mixes use a cryptographic operation on messages being routed which provides bitwise unlinkability with respect to input versus output messages. Katzenpost is a decryption mixnet that uses the Sphinx cryptographic packet format.

  • Node - A Mix. Client's are NOT considered nodes in the mix network. However note that network protocols are often layered; in our design documents we describe "mixnet hidden services" which can be operated by mixnet clients. Therefore if you are using node in some adherence to methematical termonology one could conceivably designate a client as a node. That having been said, it would not be appropriate to the discussion of our core mixnet protocol to refer to the clients as nodes.

  • Entry mix, Entry node - An entry mix is a mix that has some additional features:

  1. An entry mix is always the first hop in routes where the message originates from a client.
  2. An entry mix authenticates client’s direct connections via the mixnet’s wire protocol.
  3. An entry mix queues reply messages and allows clients to retrieve them later.
  • Service mix - A service mix is a mix that has some additional features:
  1. A service mix is always the last hop in routes where the message originates from a client.
  2. A service mix runs mixnet services which use a Sphinx SURB based protocol.
  • User - An agent using the Katzenpost system.

  • Client - Software run by the User on its local device to participate in the Mixnet. Again let us reiterate that a client is not considered a "node in the network" at the level of analysis where we are discussing the core mixnet protocol in this here document.

  • Katzenpost - A project to design many improved decryption mixnet protocols.

    Classes of traffic - We distinguish the following classes of traffic:

  • SURB Replies (also sometimes referred to as ACKs)

  • Forward messages

  • Packet - Also known as a Sphinx packet. A nested encrypted packet that, is routed through the mixnet and cryptographically transformed at each hop. The length of the packet is fixed for every class of traffic. Packet payloads encapsulate messages.

  • Payload - The payload, also known as packet payload, is a portion of a Packet containing a message, or part of a message, to be delivered anonymously.

  • Message - A variable-length sequence of octets sent anonymously through the network. Short messages are sent in a single packet; long messages are fragmented across multiple packets.

  • MSL - Maximum Segment Lifetime, 120 seconds.

1.2 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119

2. System Overview

The presented system design is based on LOOPIX Below, we present the system overview.

The entry mixes are responsible for authenticating clients, accepting packets from the client, and forwarding them to the mix network, which then relays packets to the destination service mix. Our network design uses a strict topology where forward message traverse the network from entry mix to service mix. Service mixes can optionally reply if the forward message contained a Single Use Reply Block (see SPHINXSPEC.

The PKI system that handles the distribution of various network wide parameters, and information required for each participant to participate in the network such as IP address/port combinations that each node can be reached at, and cryptographic public keys. The specification for the PKI is beyond the scope of this document and is instead covered in KATZMIXPKI.

The mix network provides neither reliable nor in-order delivery semantics. The described mix network is neither a user facing messaging system nor is it an application. It is intended to be a low level protocol which can be composed to form more elaborate mixnet protocols with stronger more useful privacy notions.

2.1 Threat Model

Here we cannot present the threat model to the higher level mixnet protocols. However this low level core mixnet protocol does have it’s own threat model which we attempt to illucidate here.

We assume that the clients only talk to mixnet services. These services make use of a client provided delivery token known as a SURB (Single Use Reply Block) to send their replies to the client without knowing the client’s entry mix. This system guarantees third-party anonymity, meaning that no parties other than client and the service are able to learn that the client and service are communicating. Note that this is in contrast with other designs, such as Mixminion, which provide sender anonymity towards recipients as well as anonymous replies.

Mixnet clients will randomly select an entry node to use and may reconnect if disconnected for under a duration threshold. The entry mix can determine the approximate message volume originating from and destined to a given client. We consider the entry mix follows the protocol and might be an honest-but-curious adversary.

External local network observers can not determine the number of Packets traversing their region of the network because of the use of decoy traffic sent by the clients. Global observers will not be able to de-anonymize packet paths if there are enough packets traversing the mix network. Longer term statistical disclosure attacks are likely possible in order to link senders and receivers.

A malicious mix only has the ability to remember which input packets correspond to the output packets. To discover the entire path all of the mixes in the path would have to be malicious. Moreover, the malicious mixes can drop, inject, modify or delay the packets for more or less time than specified.

2.2 Network Topology

The Katzenpost Mix Network uses a layered topology consisting of a fixed number of layers, each containing a set of mixes. At any given time each Mix MUST only be assigned to one specific layer. Each Mix in a given layer N is connected to every other Mix in the previous and next layer, and or every participating Provider in the case of the mixes in layer 0 or layer N (first and last layer). :

Layer 0          Layer 1        Layer 2        Layer 3           Layer 4
+-----------+      +-------+      +-------+      +-------+      +-------------+
+-> | entry mix | -+-> |  Mix  | -+-> |  Mix  | -+-> |  Mix  | -+-> | service mix |
|   +-----------+  |   +-------+  |   +-------+  |   +-------+  |   +-------------+
|                  |              |              |              |
|   +-----------+  |   +-------+  |   +-------+  |   +-------+  |   +-------------+
+-> | entry mix | -+-> |  Mix  | -+-> |  Mix  | -+-> |  Mix  | -+-> | service mix |
|   +-----------+  |   +-------+  |   +-------+  |   +-------+  |   +-------------+
|                  |              |              |              |
|                  |   +-------+  |   +-------+  |   +-------+  |   +-------------+
|                  +-> |  Mix  | -+-> |  Mix  | -+-> |  Mix  | -+-> | service mix |
|                      +-------+      +-------+      +-------+  |   +-------------+
|                                                               |
+---------------------------------------------------------------+

Note: Multiple distinct connections are collapsed in the figure for sake of brevity/clarity.

The network topology MUST also maximize the number of security domains traversed by the packets. This can be achieved by not allowing mixes from the same security domain to be in different layers.

Requirements for the topology:

  • Should allow for non-uniform throughput of each mix (Get bandwidth weights from the PKI).
  • Should maximize distribution among security domains, in this case the mix descriptor specified family field would indicate the security domain or entity operating the mix.
  • Other legal jurisdictional region awareness for increasing the cost of compulsion attacks.

3. Packet Format Overview

For the packet format of the transported messages we use the Sphinx cryptographic packet format. The detailed description of the packet format, construction, processing and security / anonymity considerations see SPHINXSPEC, “The Sphinx Mix Network Cryptographic Packet Format Specification”.

As the Sphinx packet format is generic, the Katzenpost Mix Network must provide a concrete instantiation of the format, as well as additional Sphinx per-hop routing information commands.

3.1 Sphinx Cryptographic Primitives

For the current version of the Katzenpost Mix Network, let the following cryptographic primitives be used as described in the Sphinx specification.

  • H(M) - As the output of this primitive is only used locally to a Mix, any suitable primitive may be used.
  • MAC(K, M) - HMAC-SHA256 RFC6234, M_KEY_LENGTH of 32 bytes (256 bits), and MAC_LENGTH of 32 bytes (256 bits).
  • KDF(SALT, IKM) - HKDF-SHA256, HKDF-Expand only, with SALT used as the info parameter.
  • S(K, IV) - CTR-AES256 [SP80038A], S_KEY_LENGTH of 32 bytes (256 bits), and S_IV_LENGTH of 12 bytes (96 bits), using a 32 bit counter.
  • SPRP_Encrypt(K, M)/SPRP_Decrypt(K, M) - AEZv5 AEZV5, SPRP_KEY_LENGTH of 48 bytes (384 bits). As there is a disconnect between AEZv5 as specified and the Sphinx usage, let the following be the AEZv5 parameters:
    • nonce - 16 bytes, reusing the per-hop Sphinx header IV.
    • additional_data - Unused.
    • tau - 0 bytes.
  • EXP(X, Y) - X25519 RFC7748 scalar multiply, GROUP_ELEMENT_LENGTH of 32 bytes (256 bits), G is the X25519 base point.

3.2 Sphinx Packet Parameters

The following parameters are used as for the Katzenpost Mix Network instantiation of the Sphinx Packet Format:

  • AD_SIZE - 2 bytes.
  • SECURITY_PARAMETER - 32 bytes. (except for our SPRP which we plan to upgrade)
  • PER_HOP_RI_SIZE - (XXX/ya: Addition is hard, let's go shopping.)
  • NODE_ID_SIZE - 32 bytes, the size of the Ed25519 public key, used as Node identifiers.
  • RECIPIENT_ID_SIZE - 64 bytes, the maximum size of local-part component in an e-mail address.
  • SURB_ID_SIZE - Single Use Reply Block ID size, 16 bytes.
  • MAX_HOPS - 5, the ingress provider, a set of three mixes, and the egress provider.
  • PAYLOAD_SIZE - (XXX/ya: Subtraction is hard, let's go shopping.)
  • KDF_INFO - The byte string Katzenpost-kdf-v0-hkdf-sha256.

The Sphinx Packet Header additional_data field is specified as follows:

struct {
    uint8_t version;  /* 0x00 */
    uint8_t reserved; /* 0x00 */
} KatzenpostAdditionalData;

Double check to ensure that this causes the rest of the packet header to be 4 byte aligned, when wrapped in the wire protocol command and framing. This might need to have 3 bytes reserved instead.

All nodes MUST reject Sphinx Packets that have additional_data that is not as specified in the header.

Design decision.

  • We can eliminate a trial decryption step per packet around the epoch transitions by having a command that rewrites the AD on a per-hop basis and including an epoch identifier.

I am uncertain as to if the additional complexity is worth it for a situation that can happen for a few minutes out of every epoch.

3.3 Sphinx Per-hop Routing Information Extensions

The following extensions are added to the Sphinx Per-Hop Routing Information commands.

Let the following additional routing commands be defined in the extension RoutingCommandType range (0x80 - 0xff):

enum {
    mix_delay(0x80),
} KatzenpostCommandType;

The mix_delay command structure is as follows:

struct {
    uint32_t delay_ms;
} NodeDelayCommand;

4. Mix Node Operation

All Mixes behave in the following manner:

  • Accept incoming connections from peers, and open persistent connections to peers as needed Section 4.1 <4.1>.
  • Periodically interact with the PKI to publish Identity and Sphinx packet public keys, and to obtain information about the peers it should be communicating with, along with periodically rotating the Sphinx packet keys for forward secrecy Section 4.2 <4.2>.
  • Process inbound Sphinx Packets, delay them for the specified time and forward them to the appropriate Mix and or Provider Section 4.3 <4.3>.

All Nodes are identified by their link protocol signing key, for the purpose of the Sphinx packet source routing hop identifier.

All Nodes participating in the Mix Network MUST share a common view of time, via NTP or similar time synchronization mechanism.

All communication to and from participants in the Katzenpost Mix Network is done via the Katzenpost Mix Network Wire Protocol KATZMIXWIRE.

Nodes are responsible for establishing the connection to the next hop, for example, a mix in layer 0 will accept inbound connections from all Providers listed in the PKI, and will proactively establish connections to each mix in layer 1.

Nodes MAY accept inbound connections from unknown Nodes, but MUST not relay any traffic until they became known via listing in the PKI document, and MUST terminate the connection immediately if authentication fails for any other reason.

Nodes MUST impose an exponential backoff when reconnecting if a link layer connection gets terminated, and the minimum retry interval MUST be no shorter than 5 seconds.

Nodes MAY rate limit inbound connections as required to keep load and or resource use at a manageable level, but MUST be prepared to handle at least one persistent long lived connection per potentially eligible peer at all times.

4.2 Sphinx Mix and Provider Key Rotation

Each Node MUST rotate the key pair used for Sphinx packet processing periodically for forward secrecy reasons and to keep the list of seen packet tags short. The Katzenpost Mix Network uses a fixed interval (epoch), so that key rotations happen simultaneously throughout the network, at predictable times.

Let each epoch be exactly 10800 seconds (3 hours) in duration, and the 0th Epoch begin at 2017-06-01 00:00 UTC. For more details see our “Katzenpost Mix Network Public Key Infrastructure Specification” document. KATZMIXPKI

4.3 Sphinx Packet Processing

The detailed processing of the Sphinx packet is described in the Sphinx specification: “The Sphinx Mix Network Cryptographic Packet Format Specification”. Below, we present an overview of the steps which the node is performing upon receiving the packet:

  1. Records the time of reception.
  2. Perform a Sphinx_Unwrap operation to authenticate and decrypt a packet, discarding it immediately if the operation fails.
  3. Apply replay detection to the packet, discarding replayed packets immediately.
  4. Act on the routing commands.

All packets processed by Mixes MUST contain the following commands.

  • NextNodeHopCommand, specifying the next Mix or Provider that the packet will be forwarded to.
  • NodeDelayCommand, specifying the delay in milliseconds to be applied to the packet, prior to forwarding it to the Node specified by the NextNodeHopCommand, as measured from the time of reception.

Mixes MUST discard packets that have any commands other than a NextNodeHopCommand or a NodeDelayCommand. Note that this does not apply to Providers or Clients, which have additional commands related to recipient and SURB (Single Use Reply Block) processing.

Nodes MUST continue to accept the previous epoch’s key for up to 1MSL past the epoch transition, to tolerate latency and clock skew, and MUST start accepting the next epoch’s key 1MSL prior to the epoch transition where it becomes the current active key.

Upon the final expiration of a key (1MSL past the epoch transition), Nodes MUST securely destroy the private component of the expired Sphinx packet processing key along with the backing store used to maintain replay information associated with the expired key.

Nodes MAY discard packets at any time, for example to keep congestion and or load at a manageable level, however assuming the Sphinx_Unwrap operation was successful, the packet MUST be fed into the replay detection mechanism.

Nodes MUST ensure that the time a packet is forwarded to the next Node is around the time of reception plus the delay specified in NodeDelayCommand. Since exact millisecond processing is unpractical, implementations MAY tolerate a small window around that time for packets to be forwarded. That tolerance window SHOULD be kept minimal.

Nodes MUST discard packets that have been delayed for significantly more time than specified by the NodeDelayCommand.

5. Anonymity Considerations

5.1 Topology

Layered topology is used because it offers the best level of anonymity and ease of analysis, while being flexible enough to scale up traffic. Whereas most mixnet papers discuss their security properties in the context of a cascade topology, which does not scale well, or a free-route network, which quickly becomes intractable to analyze when the network grows, while providing slightly worse anonymity than a layered topology. MIXTOPO10

Important considerations when assigning mixes to layers, in order of decreasing importance, are:

  1. Security: do not allow mixes from one security domain to be in different layers to maximise the number of security domains traversed by a packet
  2. Performance: arrange mixes in layers to maximise the capacity of the layer with the lowest capacity (the bottleneck layer)
  3. Security: arrange mixes in layers to maximise the number of jurisdictions traversed by a packet (this is harder to do really well than it seems, requires understanding of legal agreements such as MLATs).

5.2 Mixing strategy

As a mixing technique the Poisson mix strategy LOOPIX and KESDOGAN98 is used, which REQUIRES that a packet at each hop in the route is delayed by some amount of time, randomly selected by the sender from an exponential distribution. This strategy allows to prevent the timing correlation of the incoming and outgoing traffic from each node. Additionally, the parameters of the distribution used for generating the delay can be tuned up and down depending on the amount of traffic in the network and the application for which the system is deployed.

6. Security Considerations

The source of all authority in the mixnet system comes from the Directory Authority system which is also known as the mixnet PKI. This system gives the mixes and clients a consistent view of the network while allowing human intervention when needed. All public mix key material and network connection information is distributed by this Directory Authority system.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{KatzMixnet,
title = {Katzenpost Mix Network Specification},
author = {Yawning Angel and George Danezis and Claudia Diaz and Ania Piotrowska and David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/mixnet.rst},
year = {2017}
}

AEZV5

Hoang, V., Krovetz, T., Rogaway, P.,
"AEZ v5: Authenticated Encryption by Enciphering",
March 2017,
http://web.cs.ucdavis.edu/~rogaway/aez/aez.pdf

KATZMIXE2E

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network End-to-end Protocol Specification", 
July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/old/end_to_end.md

KATZMIXPKI

Angel, Y., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network Public Key Infrastructure Specification",
December 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/pki.md

KATZMIXWIRE

Angel, Y., 
"Katzenpost Mix Network Wire Protocol Specification",
June 2017.
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/wire-protocol.md

KESDOGAN98

Kesdogan, D., Egner, J., and Büschkes, R.,
"Stop-and-Go-MIXes Providing Probabilistic Anonymity in an Open System."
Information Hiding, 1998,
https://www.freehaven.net/anonbib/cache/stop-and-go.pdf

LOOPIX

Piotrowska, A., Hayes, J., Elahi, T., Meiser, S., Danezis, G.,
"The Loopix Anonymity System",
USENIX, August, 2017
https://arxiv.org/pdf/1703.00536.pdf

MIXTOPO10

Diaz, C., Murdoch, S., Troncoso, C.,
"Impact of Network Topology on Anonymity and Overhead in Low-Latency Anonymity Networks",
PETS, July 2010,
https://www.esat.kuleuven.be/cosic/publications/article-1230.pdf

RFC2119

Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119

RFC5246

Dierks, T. and E. Rescorla,
"The Transport Layer Security (TLS) Protocol Version 1.2",
RFC 5246, DOI 10.17487/RFC5246,
August 2008,
https://www.rfc-editor.org/info/rfc5246

RFC6234

Eastlake 3rd, D. and T. Hansen,
"US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)\"
RFC 6234, DOI 10.17487/RFC6234,
May 2011,
https://www.rfc-editor.org/info/rfc6234

RFC7748

Langley, A., Hamburg, M., and S. Turner,
"Elliptic Curves for Security", 
RFC 7748,
January 2016.

SP80038A

Dworkin, M.,
"Recommendation for Block Cipher Modes of Operation",
SP800-38A, 10.6028/NIST.SP.800,
December 2001,
https://doi.org/10.6028/NIST.SP.800-38A

SPHINXSPEC

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Sphinx Mix Network Cryptographic Packet Format Specification"
July 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/sphinx.md

4.5 - Provider-side Autoresponder Extension

Abstract

1. Introduction

This interface is meant to provide support for various autoresponder agents “Kaetzchen” that run on Katzenpost provider instances, thus bypassing the need to run a discrete client instance to provide functionality. The use-cases for such agents include, but are not limited to, user identity key lookup, a discard address, and a loop-back responder for the purpose of cover traffic.

1.1 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

1.2. Terminology

SURB - “single use reply block” SURBs are used to achieve recipient anonymity, that is to say, SURBs function as a cryptographic delivery token that you can give to another client entity so that they can send you a message without them knowing your identity or location on the network. See SPHINXSPEC and SPHINX.

BlockSphinxPlaintext - The payload structure which is encapsulated by the Sphinx body. It is described in KATZMIXE2E, section

  1. Client and Provider processing of received packets

2. Extension Overview

Each Kaetzchen agent will register as a potential recipient on its Provider. When the Provider receives a forward packet destined for a Kaetzchen instance, it will hand off the fully unwrapped packet along with its corresponding SURB to the agent, which will then act on the packet and optionally reply utilizing the SURB.

3. Agent Requirements

  • Each agent operation MUST be idempotent.
  • Each agent operation request and response MUST fit within one Sphinx packet.
  • Each agent SHOULD register a recipient address that is prefixed with (Or another standardized delimiter, agreed to by all participating providers in a given mixnet).
  • Each agent SHOULD register a recipient address that consists of a
  • RFC5322 dot-atom value, and MUST register recipient addresses that are at most 64 octets in length.
  • The first byte of the agent's response payload MUST be 0x01 to allow clients to easily differentiate between SURB-ACKs and agent responses.

3.1 Mix Message Formats

Messages from clients to Kaetzchen use the following payload format in the forward Sphinx packet:

struct {
    uint8_t flags;
    uint8_t reserved; /* Set to 0x00. */
    select (flags) {
    case 0:
    opaque padding[sizeof(SphinxSURB)];
    case 1:
    SphinxSURB surb;
    }
    opaque plaintext[];
} KaetzchenMessage;

The plaintext component of a KaetzchenMessage MUST be padded by appending ‘0x00’ bytes to make the final total size of a KaetzchenMessage equal to that of a BlockSphinxPlaintext.

Messages (replies) from the Kaetzchen to client use the following payload format in the SURB generated packet::

struct {
    opaque plaintext[];
} KaetzchenReply;

The plaintext component of a KaetzchenReply MUST be padded by appending ‘0x00’ bytes to make the final total size of a KaetzchenReply equal to that of a BlockSphinxPlaintext

4. PKI Extensions

Each provider SHOULD publish the list of publicly accessible Kaetzchen agent endpoints in its MixDescriptor, along with any other information required to utilize the agent.

Provider should make this information available in the form of a map in which the keys are the label used to identify a given service, and the value is a map with arbitrary keys.

Valid service names refer to the services defined in extensions to this specification. Every service MUST be implemented by one and only one Kaetzchen agent.

For each service, the provider MUST advertise a field for the endpoint at which the Kaetzchen agent can be reached, as a key value pair where the key is endpoint, and the value is the provider side endpoint identifier.

{ "kaetzchen":
    { "keyserver" : {
            "endpoint": "+keyserver",
            "version" : 1 } },
    { "discard" : {
            "endpoint": "+discard", } },
    { "loop" : {
            "endpoint": "+loopback",
            "restrictedToUsers": true } },
}

5. Anonymity Considerations

In the event that the mix keys for the entire return path are compromised, it is possible for adversaries to unwrap the SURB and determine the final recipient of the reply.

Depending on what sort of operations a given agent implements, there may be additional anonymity impact that requires separate consideration.

Clients MUST NOT have predictable retranmission otherwise this makes active confirmations attacks possible which could be used to discover the ingress Provider of the client.

6. Security Considerations

It is possible to use this mechanism to flood a victim with unwanted traffic by constructing a request with a SURB destined for the target.

Depending on the operations implemented by each agent, the added functionality may end up being a vector for Denial of Service attacks in the form of CPU or network overload.

Unless the agent implements additional encryption, message integrity and privacy is limited to that which is provided by the base Sphinx packet format and parameterization.

7. Acknowledgments

The inspiration for this extension comes primarily from a design by Vincent Breitmoser.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{Kaetzchen,
title = {Katzenpost Provider-side Autoresponder Extension},
author = {Yawning Angel and Kali Kaneko and David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/kaetzchen.md},
year = {2018}
}

KATZMIXE2E

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network End-to-end Protocol Specification",
July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/old/end_to_end.md

KATZMIXPKI

Angel, Y., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network Public Key Infrastructure Specification",
December 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/pki.md

RFC2119

Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119

RFC5322

Resnick, P., Ed.,
"Internet Message Format",
RFC 5322, DOI 10.17487/RFC5322,
October 2008,
https://www.rfc-editor.org/info/rfc5322

SPHINX

Danezis, G., Goldberg, I.,
"Sphinx: A Compact and Provably Secure Mix Format",
DOI 10.1109/SP.2009.15,
May 2009,
http://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf

SPHINXSPEC

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Sphinx Mix Network Cryptographic Packet Format Specification"
July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx.md

4.6 - Public Key Infrastructure

Abstract

1. Introduction

Mixnets are designed with the assumption that a Public Key Infrastructure (PKI) exists and it gives each client the same view of the network. This specification is inspired by the Tor and Mixminion Directory Authority systems MIXMINIONDIRAUTH TORDIRAUTH whose main features are precisely what we need for our PKI. These are decentralized systems meant to be collectively operated by multiple entities.

The mix network directory authority system (PKI) is essentially a cooperative decentralized database and voting system that is used to produce network consensus documents which mix clients periodically retrieve and use for their path selection algorithm when creating Sphinx packets. These network consensus documents are derived from a voting process between the Directory Authority servers.

This design prevents mix clients from using only a partial view of the network for their path selection so as to avoid fingerprinting and bridging attacks FINGERPRINTING, BRIDGING, and LOCALVIEW.

The PKI is also used by Authority operators to specify network-wide parameters, for example in the Katzenpost Decryption Mix Network KATZMIXNET the Poisson mix strategy is used and, therefore, all clients must use the same lambda parameter for their exponential distribution function when choosing hop delays in the path selection. The Mix Network Directory Authority system, aka PKI, SHALL be used to distribute such network-wide parameters in the network consensus document that have an impact on security and performance.

1.1 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data structures for additional cryptographic wire protocol commands. KATZMIXWIRE

1.2 Terminology

PKI - Public Key Infrastructure

Directory Authority system - refers to specific PKI schemes used by

Mixminion and Tor

MSL - maximum segment lifetime

mix descriptor - A database record which describes a component mix

family - Identifier of security domains or entities operating one or more mixes in the network. This is used to inform the path selection algorithm.

nickname - simply a nickname string that is unique in the consensus document, see “Katzenpost Mix Network Specification” section “2.2. Network Topology”.

layer - The layer indicates which network topology layer a particular mix resides in.

Provider - A service operated by a third party that Clients communicate directly with to communicate with the Mixnet. It is responsible for Client authentication, forwarding outgoing messages to the Mixnet, and storing incoming messages for the Client. The Provider MUST have the ability to perform cryptographic operations on the relayed messages.

1.3 Security Properties Overview

This Directory Authority system has the following feature goals and security properties:

  • All Directory Authority servers must agree with each other on the set of Directory Authorities.
  • All Directory Authority servers must agree with each other on the set of mixes.
  • This system is intentionally designed to provide identical network consensus documents to each mix client. This mitigates epistemic attacks against the client path selection algorithm such as fingerprinting and bridge attacks FINGERPRINTING BRIDGING.
  • This system is NOT byzantine-fault-tolerant, it instead allows for manual intervention upon consensus fault by the Directory Authority operators. Further, these operators are responsible for expelling bad acting operators from the system.
  • This system enforces the network policies such as mix join policy wherein intentionally closed mixnets will prevent arbitrary hosts from joining the network by authenticating all descriptor signatures with a list of allowed public keys.
  • The Directory Authority system for a given mix network is essentially the root of all authority.

1.4 Differences from Tor and Mixminion Directory Authority systems

In this document we specify a Directory Authority system which is different from that of Tor's and Mixminion’s in a number of ways:

  • The list of valid mixes is expressed in an allowlist. For the time being there is no specified “bandwidth authority” system which verifies the health of mixes (Further research required in this area).
  • There’s no non-directory channel to inform clients that a node is down, so it will end up being a lot of packet loss, since clients will continue to include the missing node in their path selection until keys published by the node expire and it falls out of the consensus.
  • The schema of the mix descriptors is different from that used in Mixminion and Tor, including a change which allows our mix descriptor to express n Sphinx mix routing public keys in a single mix descriptor whereas in the Tor and Mixminion Directory Authority systems, n descriptors are used.
  • The serialization format of mix descriptors is different from that used in Mixminion and Tor.
  • The shared random number computation is performed every voting round, and is required for a vote to be accepted by each authority. The shared random number is used to deterministically generate the network topology.

2. Overview of Mix PKI Interaction

Each Mix MUST rotate the key pair used for Sphinx packet processing periodically for forward secrecy reasons and to keep the list of seen packet tags short. SPHINX09 SPHINXSPEC The Katzenpost Mix Network uses a fixed interval (epoch), so that key rotations happen simultaneously throughout the network, at predictable times.

Each Directory Authority server MUST use some time synchronization protocol in order to correctly use this protocol. This Directory Authority system requires time synchronization to within a few minutes.

Let each epoch be exactly 1200 seconds (20 minutes) in duration, and the 0th Epoch begin at 2017-06-01 00:00 UTC.

To facilitate smooth operation of the network and to allow for delays that span across epoch boundaries, Mixes MUST publish keys to the PKI for at least 3 epochs in advance, unless the mix will be otherwise unavailable in the near future due to planned downtime.

At an epoch boundary, messages encrypted to keys from the previous epoch are accepted for a grace period of 2 minutes.

Thus, at any time, keys for all Mixes for the Nth through N + 2nd epoch will be available, allowing for a maximum round trip (forward message + SURB) delay + transit time of 40 minutes. SURB lifetime is limited to a single epoch because of the key rotation epoch, however this shouldn’t present any useability problems since SURBs are only used for sending ACK messages from the destination Provider to the sender as described in KATZMIXE2E.

2.1 PKI Protocol Schedule

There are two main constraints to Authority schedule:

  1. There MUST be enough key material extending into the future so that clients are able to construct Sphinx packets with a forward and reply paths.

  2. All participants should have enough time to participate in the protocol; upload descriptors, vote, generate documents, download documents, establish connections for user traffic.

The epoch duration of 20 minutes is more than adequate for these two constraints.

NOTE: perhaps we should make it shorter? but first let’s do some scaling and bandwidth calculations to see how bad it gets…

2.1.1 Directory Authority Server Schedule

Directory Authority server interactions are conducted according to the following schedule, where T is the beginning of the current epoch, and P is the length of the epoch period.

  • T - Epoch begins
  • T + P/2 - Vote exchange
  • T + (5/8)*P - Reveal exchange
  • T + (6/8)*P - Tabulation and signature exchange
  • T + (7/8)*P - Publish consensus

2.1.2 Mix Schedule

Mix PKI interactions are conducted according to the following schedule, where T is the beginning of the current epoch.

T + P/8 - Deadline for publication of all mixes documents for the next epoch.

T + (7/8)*P - This marks the beginning of the period where mixes perform staggered fetches of the PKI consensus document.

T + (8/9)*P - Start establishing connections to the new set of relevant mixes in advance of the next epoch.

T + P - 1MSL - Start accepting new Sphinx packets encrypted to the next epoch’s keys.

T + P + 1MSL - Stop accepting new Sphinx packets encrypted to the previous epoch’s keys, close connections to peers no longer listed in the PKI documents and erase the list of seen packet tags.

Mix layer changes are controlled by the Directory Authorities and therefore a mix can be reassigned to a different layer in our stratified topology at any new epoch. Mixes will maintain incoming and outgoing connections to the various nodes until all mix keys have expired, iff the node is still listed anywhere in the current document.

3. Voting for Consensus Protocol

In our Directory Authority protocol, all the actors conduct their behavior according to a common schedule as outlined in section "2.1 PKI Protocol Schedule". The Directory Authority servers exchange messages to reach consensus about the network. Other tasks they perform include collecting mix descriptor uploads from each mix for each key rotation epoch, voting, shared random number generation, signature exchange and publishing of the network consensus documents.

3.1 Protocol Messages

There are only two document types in this protocol:

  • mix_descriptor: A mix descriptor describes a mix.
  • directory: A directory contains a list of descriptors and other information that describe the mix network.

Mix descriptor and directory documents MUST be properly signed.

3.1.1 Mix Descriptor and Directory Signing

Mixes MUST compose mix descriptors which are signed using their private identity key, an ed25519 key. Directories are signed by one or more Directory Authority servers using their authority key, also an ed25519 key. In all cases, signing is done using JWS RFC7515.

3.2 Vote Exchange

As described in section “2.1 PKI Protocol Schedule”, the Directory Authority servers begin the voting process 1/8 of an epoch period after the start of a new epoch. Each Authority exchanges vote directory messages with each other.

Authorities archive votes from other authorities and make them available for retreival. Upon receiving a new vote, the authority examines it for new descriptors and includes any valid descriptors in its view of the network.

Each Authority includes in its vote a hashed value committing to a choice of a random number for the vote. See section 4.3 for more details.

3.2.1 Voting Wire Protocol Commands

The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by Authorities to exchange votes. We define additional wire protocol commands for sending votes:

enum {

:   vote(22), vote_status(23),

} Command;

The structures of these commands are defined as follows:

struct {
:   uint64_t epoch_number; opaque public_key[ED25519_KEY_LENGTH];
    opaque payload[];

} VoteCommand;

struct {
:   uint8 error_code;

} VoteStatusCommand;

3.2.2 The vote Command

The vote command is used to send a PKI document to a peer Authority during the voting period of the PKI schedule.

The payload field contains the signed and serialized PKI document representing the sending Authority’s vote. The public_key field contains the public identity key of the sending Authority which the receiving Authority can use to verify the signature of the payload. The epoch_number field is used by the receiving party to quickly check the epoch for the vote before deserializing the payload.

Each authority MUST include its commit value for the shared random computation in this phase along with its signed vote. This computation is derived from the Tor Shared Random Subsystem, TORSRV.

3.2.3 The vote_status Command

The vote_status command is used to reply to a vote command. The error_code field indicates if there was a failure in the receiving of the PKI document.

enum {

:   vote_ok(0), /\* None error condition. */ vote_too_early(1), /*
    The Authority should try again later. */ vote_too_late(2), /*
    This round of voting was missed. \*/
}

The epoch_number field of the vote struct is compared with the epoch that is currently being voted on. vote_too_early and vote_too_late are replied back to the voter to report that their vote was not accepted.

3.3 Reveal Exchange

As described in section “2.1 PKI Protocol Schedule”, the Directory Authority servers exchange the reveal values after they have exchanged votes which contain a commit value. Each Authority exchanges reveal messages with each other.

3.3.1 Reveal Wire Protocol Commands

The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by Authorities to exchange reveal values previously commited to in their votes. We define additional wire protocol commands for exchanging reveals:

enum {
:   reveal(25), reveal_status(26),
} Command;

The structures of these commands are defined as follows:

struct {
:   uint64_t epoch_number; opaque public_key[ED25519_KEY_LENGTH];
    opaque payload[];

} RevealCommand;

struct {
:   uint8 error_code;

} RevealStatusCommand;

3.3.2 The reveal Command

The reveal command is used to send a reveal value to a peer authority during the reveal period of the PKI schedule.

The payload field contains the signed and serialized reveal value. The public_key field contains the public identity key of the sending Authority which the receiving Authority can use to verify the signature of the payload. The epoch_number field is used by the receiving party to quickly check the epoch for the reveal before deserializing the payload.

3.3.3 The reveal_status Command

The reveal_status command is used to reply to a reveal command. The error_code field indicates if there was a failure in the receiving of the shared random reveal value.

enum {

:   reveal_ok(8), /* None error condition. */ reveal_too_early(9), 
    /* The Authority should try again later. */
    reveal_not_authorized(10), /* The Authority was rejected. */
    reveal_already_received(11), /* The Authority has already revealed
    this round. */ reveal_too_late(12) /* This round of revealing was
    missed. */

} Errorcodes;

The epoch_number field of the reveal struct is compared with the epoch that is currently being voted on. reveal_too_early and reveal_too_late are replied back to the authority to report their reveal was not accepted. The status code reveal_not_authorized is used if the Authority is rejected. The reveal_already_received is used to report that a valid reveal command was already received for this round.

3.4 Cert Exchange

The Cert command is the same as a Vote but contains the set of Reveal values as seen by the voting peer. In order to ensure that a misconfigured or malicious Authority operator cannot amplify their ability to influence the threshold voting process, after Reveal messages have been exchanged, Authorities vote again, including the Reveals seen by them. Authorities may not introduce new MixDescriptors at this phase in the protocol.

Otherwise, a consensus partition can be obtained by witholding Reveal values from a threshold number of Peers. In the case of an even-number of Authorities, a denial of service by a single Authority was observed.

3.5 Vote Tabulation for Consensus Computation

The main design constraint of the vote tabulation algorithm is that it MUST be a deterministic process that produces the same result for each directory authority server. This result is known as a network consensus file.

A network consensus file is a well formed directory struct where the status field is set to consensus and contains 0 or more descriptors, the mix directory is signed by 0 or more directory authority servers. If signed by the full voting group then this is called a fully signed consensus.

  1. Validate each vote directory:
  • that the liveness fields correspond to the following epoch
  • status is vote
  • version number matches ours
  1. Compute a consensus directory:

Here we include a modified section from the Mixminion PKI spec MIXMINIONDIRAUTH:

For each distinct mix identity in any vote directory:

  • If there are multiple nicknames for a given identity, do not include any descriptors for that identity.

  • If half or fewer of the votes include the identity, do not include any descriptors for the identity. This also guarantees that there will be only one identity per nickname.

  • If we are including the identity, then for each distinct descriptor that appears in any vote directory:

    • Do not include the descriptor if it will have expired on the date the directory will be published.
    • Do not include the descriptor if it is superseded by other descriptors for this identity.
    • Do not include the descriptor if it not valid in the next epoch.
    • Otherwise, include the descriptor.
  • Sort the list of descriptors by the signature field so that creation of the consensus is reproducible.

  • Set directory status field to consensus.

  1. Compute a shared random number from the values revealed in the “Reveal” step. Authorities whose reveal value does not verify their commit value MUST be excluded from the consensus round. Authorities ensure that their peers MUST participate in Commit-and-Reveal, and MUST use correct Reveal values obtained from other Peers as part of the “Cert” exchange.

  2. Generate or update the network topology using the shared random number as a seed to a deterministic random number generator that determines the order that new mixes are placed into the topology.

3.6 Signature Collection

Each Authority signs their view of consensus, and exchanges detached Signatures with each other. Upon receiving each Signature it is added to the signatures on the Consensus if it validates the Consensus. The Authority SHOULD warn the administrator if network partition is detected.

If there is disagreement about the consensus directory, each authority collects signatures from only the servers which it agrees with about the final consensus.

// TODO: consider exchanging peers votes amongst authorities (or hashes thereof) to // ensure that an authority has distributed one and only unique vote amongst its peers.

3.7 Publication

If the consensus is signed by a majority of members of the voting group then it's a valid consensus and it is published.

4. PKI Protocol Data Structures

4.1 Mix Descriptor Format

Note that there is no signature field. This is because mix descriptors are serialized and signed using JWS. The IdentityKey field is a public ed25519 key. The MixKeys field is a map from epoch to public X25519 keys which is what the Sphinx packet format uses.

::: note ::: title Note :::

XXX David: replace the following example with a JWS example: :::

{
    "Version": 0,
    "Name": "",
    "Family": "",
    "Email": "",
    "AltContactInfo":"",
    "IdentityKey": "",
    "LinkKey":"",
    "MixKeys": {
       "Epoch": "EpochPubKey",
    },
    "Addresses": ["IP:Port"],
    "Layer": 0,
    "LoadWeight":0,
    "AuthenticationType":""
}

4.1.1 Scheduling Mix Downtime

Mix operators can publish a half empty mix descriptor for future epochs to schedule downtime. The mix descriptor fields that MUST be populated are:

  • Version
  • Name
  • Family
  • Email
  • Layer
  • IdentityKey
  • MixKeys

The map in the field called "MixKeys" should reflect the scheduled downtime for one or more epochs by not have those epochs as keys in the map.

4.2 Directory Format

Note: replace the following example with a JWS example

{
    "Signatures": [],
    "Version": 0,
    "Status": "vote",
    "Lambda" : 0.274,
    "MaxDelay" : 30,
    "Topology" : [],
    "Providers" : [],
}

4.3 Shared Random Value structure

Katzenpost’s Shared Random Value computation is inspired by Tor’s Shared Random Subsystem TORSRV.

Each voting round a commit value is included in the votes sent to other authorities. These are produced as follows:

H = blake2b-256

COMMIT = Uint64(epoch) | H(REVEAL) REVEAL = Uint64(epoch) | H(RN)

After the votes are collected from the voting round, and before signature exchange, the Shared Random Value field of the consensus document is the output of H over the input string calculated as follows:

  1. Validated Reveal commands received including the authorities own reveal are sorted by reveal value in ascending order and appended to the input in format IdentityPublicKeyBytes_n | RevealValue_n

However instead of the Identity Public Key bytes we instead encode the Reveal with the blake2b 256 bit hash of the public key bytes.

  1. If a SharedRandomValue for the previous epoch exists, it is appended to the input string, otherwise 32 NUL (x00) bytes are used.
REVEALS = ID_a \| R_a \| ID_b \| R_b \| \... SharedRandomValue =
H("shared-random" | Uint64(epoch) | REVEALS | PREVIOUS_SRV)

5. PKI Wire Protocol

The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by both clients and by Directory Authority peers. In the following section we describe additional wire protocol commands for publishing mix descriptors, voting and consensus retrieval.

5.1 Mix Descriptor publication

The following commands are used for publishing mix descriptors and setting mix descriptor status:

enum {
      /* Extending the wire protocol Commands. */
      post_descriptor(20),
      post_descriptor_status(21),
}

The structures of these command are defined as follows:

struct {
   uint64_t epoch_number;
   opaque payload[];
} PostDescriptor;

struct {
   uint8 error_code;
} PostDescriptorStatus;

5.1.1 The post_descriptor Command

The post_descriptor command allows mixes to publish their descriptors.

5.1.2 The post_descriptor_status Command

The post_descriptor_status command is sent in response to a post_descriptor command, and uses the following error codes:

enum {
   descriptor_ok(0),
   descriptor_invalid(1),
   descriptor_conflict(2),
   descriptor_forbidden(3),
} ErrorCodes;

5.2 Voting

The following commands are used by Authorities to exchange votes:

enum {
   vote(22),
   vote_status(23),
   get_vote(24),
} Command;

The structures of these commands are defined as follows:

struct {
    uint64_t epoch_number;
    opaque public_key[ED25519_KEY_LENGTH];
    opaque payload[];
} VoteCommand;

struct {
   uint8 error_code;
} VoteStatusCommand;

5.2.1 The vote Command

The vote command is used to send a PKI document to a peer Authority during the voting period of the PKI schedule.

The payload field contains the signed and serialized PKI document representing the sending Authority’s vote. The public_key field contains the public identity key of the sending Authority which the receiving Authority can use to verify the signature of the payload. The epoch_number field is used by the receiving party to quickly check the epoch for the vote before deserializing the payload.

5.2.2 The vote_status Command

The vote_status command is used to reply to a vote command. The error_code field indicates if there was a failure in the receiving of the PKI document.

enum {
   vote_ok(0),               /* None error condition. */
   vote_too_early(1),        /* The Authority should try again later. */
   vote_too_late(2),         /* This round of voting was missed. */
   vote_not_authorized(3),   /* The voter's key is not authorized. */
   vote_not_signed(4),       /* The vote signature verification failed */
   vote_malformed(5),        /* The vote payload was invalid */
   vote_already_received(6), /* The vote was already received */
   vote_not_found(7),        /* The vote was not found */
}

The epoch_number field of the vote struct is compared with the epoch that is currently being voted on. vote_too_early and vote_too_late are replied back to the voter to report that their vote was not accepted.

5.2.3 The get_vote Command

The get_vote command is used to request a PKI document (vote) from a peer Authority. The epoch field contains the epoch from which to request the vote, and the public_key field contains the public identity key of the Authority of the requested vote. A successful query is responded to with a vote command, and queries that fail are responded to with a vote_status command with error_code vote_not_found(7).

5.3 Retrieval of Consensus

Providers in the Katzenpost mix network system KATZMIXNET may cache validated network consensus files and serve them to clients over the mix network's link layer wire protocol KATZMIXWIRE. We define additional wire protocol commands for requesting and sending PKI consensus documents:

enum {
   /* Extending the wire protocol Commands. */
   get_consensus(18),
   consensus(19),
} Command;

The structures of these commands are defined as follows:
struct {
    uint64_t epoch_number;
} GetConsensusCommand;

struct {
   uint8 error_code;
   opaque payload[];
} ConsensusCommand;

5.3.1 The get_consensus Command

The get_consensus command is a command that is used to retrieve a recent consensus document. If a given get_consensus command contains an Epoch value that is either too big or too small then a reply consensus command is sent with an empty payload. Otherwise if the consensus request is valid then a consensus command containing a recent consensus document is sent in reply.

Initiators MUST terminate the session immediately upon reception of a get_consensus command.

5.3.2 The consensus Command

The consensus command is a command that is used to send a recent consensus document. The error_code field indicates if there was a failure in retrieval of the PKI consensus document.

enum {
   consensus_ok(0),        /* None error condition and SHOULD be accompanied with
                              a valid consensus payload. */
   consensus_not_found(1), /* The client should try again later. */
   consensus_gone(2),      /* The consensus will not be available in the future. */
} ErrorCodes;

5.4.1 The Cert Command

The cert command is used to send a PKI document to a peer Authority during the voting period of the PKI schedule. It is the same as the vote command, but must contain the set of SharedRandomCommit and SharedRandomReveal values as seen by the Authority during the voting process.

5.4.2 The CertStatus Command

The cert_status command is the response to a cert command, and is the same as a vote_status response, other than the command identifier. Responses are CertOK, CertTooEarly, CertNotAuthorized, CertNotSigned, CertAlreadyReceived, CertTooLate

5.5 Signature Exchange

Signatures exchange is the final round of the consensus protocol and consists of the Sig and SigStatus commands.

5.5.1 The Sig Command

The sig command contains a detached Signature from PublicKey of Consensus for Epoch.

5.5.2 The SigStatus Command

The sig_status command is the response to a sig command. Responses are SigOK, SigNotAuthorized, SigNotSigned, SigTooEarly, SigTooLate, SigAlreadyReceived, and SigInvalid.

6. Scalability Considerations

TODO: notes on scaling, bandwidth usage etc.

7. Future Work

  • byzantine fault tolerance
  • PQ crypto signatures for all PKI documents: mix descriptors and directories. SPHINCS256 could be used, we already have a golang implementation: https://github.com/Yawning/sphincs256/
  • Make a Bandwidth Authority system to measure health of the network. Also perform load balancing as described in PEERFLOW?
  • Implement byzantine attack defenses as described in MIRANDA and MIXRELIABLE where mix link performance proofs are recorded and used in a reputation system.
  • Choose a different serialization/schema language?
  • Use a append only merkle tree instead of this voting protocol.

8. Anonymity Considerations

  • This system is intentionally designed to provide identical network consensus documents to each mix client. This mitigates epistemic attacks against the client path selection algorithm such as fingerprinting and bridge attacks FINGERPRINTING, BRIDGING.
  • If consensus has failed and thus there is more than one consensus file, clients MUST NOT use this compromised consensus and refuse to run.
  • We try to avoid randomizing the topology because doing so splits the anonymity sets on each mix into two. That is, packets belonging to the previous topology versus the current topology are trivially distinguishable. On the other hand if enough mixes fall out of consensus eventually the mixnet will need to be rebalanced to avoid an attacker compromised path selection. One example of this would be the case where the adversary controls the only mix is one layer of the network topology.

9. Security Considerations

  • The Directory Authority / PKI system for a given mix network is essentially the root of all authority in the system. The PKI controls the contents of the network consensus documents that mix clients download and use to inform their path selection. Therefore if the PKI as a whole becomes compromised then so will the rest of the system in terms of providing the main security properties described as traffic analysis resistance. Therefore a decentralized voting protocol is used so that the system is more resiliant when attacked, in accordance with the principle of least authority. SECNOTSEP{.citation}
  • Short epoch durations make it is more practical to make corrections to network state using the PKI voting rounds.
  • Fewer epoch keys published in advance is a more conservative security policy because it implies reduced exposure to key compromise attacks.
  • A bad acting Directory Authority who lies on each vote and votes inconsistently can trivially cause a denial of service for each voting round.

10. Acknowledgements

We would like to thank Nick Mathewson for answering design questions and thorough design review.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

    @online{KatzMixPKI,
    title = {Katzenpost Mix Network Public Key Infrastructure Specification},
    author = {Yawning Angel and Ania Piotrowska and David Stainton},
    url= {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/pki.rst},
    year = {2017}
    }

BRIDGING

Danezis, G., Syverson, P., “Bridging and Fingerprinting: Epistemic Attacks on Route Selection”, In the Proceedings of PETS 2008, Leuven, Belgium, July 2008, https://www.freehaven.net/anonbib/cache/danezis-pet2008.pdf

FINGERPRINTING

Danezis, G., Clayton, R., “Route Finger printing in Anonymous Communications”, https://www.cl.cam.ac.uk/~rnc1/anonroute.pdf

KATZMIXE2E

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D., “Katzenpost Mix Network End-to-end Protocol Specification”, July 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/old/end_to_end.md

KATZMIXNET

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D., “Katzenpost Mix Network Specification”, June 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/mixnet.md

KATZMIXWIRE

Angel, Y. “Katzenpost Mix Network Wire Protocol Specification”, June 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/wire-protocol.md

LOCALVIEW

Gogolewski, M., Klonowski, M., Kutylowsky, M., “Local View Attack on Anonymous Communication”, https://www.freehaven.net/anonbib/cache/esorics05-Klonowski.pdf

MIRANDA

Leibowitz, H., Piotrowska, A., Danezis, G., Herzberg, A., 2017, “No right to ramain silent: Isolating Malicious Mixes” https://eprint.iacr.org/2017/1000.pdf

MIXMINIONDIRAUTH

Danezis, G., Dingledine, R., Mathewson, N., “Type III (Mixminion) Mix Directory Specification”, December 2005, https://www.mixminion.net/dir-spec.txt

MIXRELIABLE

Dingledine, R., Freedman, M., Hopwood, D., Molnar, D., 2001 “A Reputation System to Increase MIX-Net Reliability”, In Information Hiding, 4th International Workshop https://www.freehaven.net/anonbib/cache/mix-acc.pdf

PEERFLOW

Johnson, A., Jansen, R., Segal, A., Syverson, P., “PeerFlow: Secure Load Balancing in Tor”, Preceedings on Privacy Enhancing Technologies, July 2017, https://petsymposium.org/2017/papers/issue2/paper12-2017-2-source.pdf

RFC2119

Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, https://www.rfc-editor.org/info/rfc2119

RFC5246

Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246

RFC7515

Jones, M., Bradley, J., Sakimura, N., “JSON Web Signature (JWS)”, May 2015, https://tools.ietf.org/html/rfc7515

SECNOTSEP

Miller, M., Tulloh, B., Shapiro, J., “The Structure of Authority: Why Security Is not a Separable Concern”, http://www.erights.org/talks/no-sep/secnotsep.pdf

SPHINCS256

Bernstein, D., Hopwood, D., Hulsing, A., Lange, T., Niederhagen, R., Papachristodoulou, L., Schwabe, P., Wilcox O’ Hearn, Z., “SPHINCS: practical stateless hash-based signatures”, http://sphincs.cr.yp.to/sphincs-20141001.pdf

SPHINX09

Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix Format”, DOI 10.1109/SP.2009.15, May 2009, http://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf

SPHINXSPEC

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D., “Sphinx Mix Network Cryptographic Packet Format Specification” July 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx.md

TORDIRAUTH

“Tor directory protocol, version 3”, https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt

TORSRV

“Tor Shared Random Subsystem Specification”, https://gitweb.torproject.org/torspec.git/tree/srv-spec.txt

4.7 - Sphinx Cryptographic Packet Format

Abstract

This document defines the Sphinx cryptographic packet format for decryption mix networks, and provides a parameterization based around generic cryptographic primitives types. This document does not introduce any new crypto, but is meant to serve as an implementation guide.

1. Introduction

The Sphinx cryptographic packet format is a compact and provably secure design introduced by George Danezis and Ian Goldberg SPHINX09. It supports a full set of security features: indistinguishable replies, hiding the path length and relay position, detection of tagging attacks and replay attacks, as well as providing unlinkability for each leg of the packet’s journey over the network.

1.1 Terminology

  • Message - A variable-length sequence of octets sent anonymously through the network.
  • Packet - A fixed-length sequence of octets transmitted anonymously through the network, containing the encrypted message and metadata for routing.
  • Header - The packet header consisting of several components, which convey the information necessary to verify packet integrity and correctly process the packet.
  • Payload - The fixed-length portion of a packet containing an encrypted message or part of a message, to be delivered anonymously.
  • Group - A finite set of elements and a binary operation that satisfy the properties of closure, associativity, invertability, and the presence of an identity element.
  • Group element - An individual element of the group.
  • Group generator - A group element capable of generating any other element of the group, via repeated applications of the generator and the group operation.

1.2 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data structures, except for cryptographic attributes, which are specified as opaque byte vectors.

  • x | y denotes the concatenation of x and y.
  • x ^ y denotes the bitwise XOR of x and y.
  • byte an 8-bit octet.
  • x[a:b] denotes the sub-vector of x where a/b denote the start/end byte indexes (inclusive-exclusive); a/b may be omitted to signify the start/end of the vector x respectively.
  • x[y] denotes the y'th element of list x.
  • x.len denotes the length of list x.
  • ZEROBYTES(N) denotes N bytes of 0x00.
  • RNG(N) denotes N bytes of cryptographic random data.
  • LEN(N) denotes the length in bytes of N.
  • CONSTANT_TIME_CMP(x, y) denotes a constant time comparison between the byte vectors x and y, returning true iff x and y are equal.

2. Cryptographic Primitives

This specification uses the following cryptographic primitives as the foundational building blocks for Sphinx:

  • H(M) - A cryptographic hash function which takes an octet array M to produce a digest consisting of a HASH_LENGTH byte octet array. H(M) MUST be pre-image and collision resistant.

  • MAC(K, M) - A cryptographic message authentication code function which takes a M_KEY_LENGTH byte octet array key K and arbitrary length octet array message M to produce an authentication tag consisting of a MAC_LENGTH byte octet array.

  • KDF(SALT, IKM) - A key derivation function which takes an arbitrary length octet array salt SALT and an arbitrary length octet array initial key IKM, to produce an octet array of arbitrary length.

  • S(K, IV) - A pseudo-random generator (stream cipher) which takes a S_KEY_LENGTH byte octet array key K and a S_IV_LENGTH byte octet array initialization vector IV to produce an octet array key stream of arbitrary length.

  • SPRP_Encrypt(K, M)/SPRP_Decrypt(K, M) - A strong pseudo-random permutation (SPRP) which takes a SPRP_KEY_LENGTH byte octet array key K and arbitrary length message M, and produces the encrypted ciphertext or decrypted plaintext respectively.

    When used with the default payload authentication mechanism, the SPRP MUST be "fragile" in that any amount of modifications to M results in a large number of unpredictable changes across the whole message upon a SPRP_Encrypt() or SPRP_Decrypt() operation.

  • EXP(X, Y) - An exponentiation function which takes the GROUP_ELEMENT_LENGTH byte octet array group elements X and Y, and returns X ^^ Y as a GROUP_ELEMENT_LENGTH byte octet array.

    Let G denote the generator of the group, and EXP_KEYGEN() return a GROUP_ELEMENT_LENGTH byte octet array group element usable as private key.

    The group defined by G and EXP(X, Y) MUST satisfy the Decision Diffie-Hellman problem.

  • EXP_KEYGEN() - Returns a new "suitable" private key for EXP().

2.1 Sphinx Key Derivation Function

Sphinx Packet creation and processing uses a common Key Derivation Function (KDF) to derive the required MAC and symmetric cryptographic keys from a per-hop shared secret.

The output of the KDF is partitioned according to the following structure:

struct {
    opaque header_mac[M_KEY_LENGTH];
    opaque header_encryption[S_KEY_LENGTH];
    opaque header_encryption_iv[S_IV_LENGTH];
    opaque payload_encryption[SPRP_KEY_LENGTH]
    opaque blinding_factor[GROUP_ELEMENT_LENGTH];
} SphinxPacketKeys;

Sphinx_KDF( info, shared_secret ) -> packet_keys

Inputs:

  • info The optional context and application specific information.
  • shared_secret The per-hop shared secret derived from the Diffie-Hellman key exchange.

Outputs:

  • packet_keys The SphinxPacketKeys required to handle packet creation or processing.

The output packet_keys is calculated as follows:

kdf_out = KDF( info, shared_secret )
packet_keys = kdf_out[:LEN( SphinxPacketKeys )]

3. Sphinx Packet Parameters

3.1 Sphinx Parameter Constants

The Sphinx Packet Format is parameterized by the implementation based on the application and security requirements.

  • AD_LENGTH - The constant amount of per-packet unencrypted additional data in bytes.
  • PAYLOAD_TAG_LENGTH - The length of the message payload authentication tag in bytes. This SHOULD be set to at least 16 bytes (128 bits).
  • PER_HOP_RI_LENGTH - The length of the per-hop Routing Information (Section 4.1.1 <4.1.1>{.interpreted-text role=“ref”}) in bytes.
  • NODE_ID_LENGTH - The node identifier length in bytes.
  • RECIPIENT_ID_LENGTH - The recipient identifier length in bytes.
  • SURB_ID_LENGTH - The Single Use Reply Block (Section 7 <7.0>{.interpreted-text role=“ref”}) identifier length in bytes.
  • MAX_HOPS - The maximum number of hops a packet can traverse.
  • PAYLOAD_LENGTH - The per-packet message payload length in bytes, including a PAYLOAD_TAG_LENGTH byte authentication tag.
  • KDF_INFO - A constant opaque byte vector used as the info parameter to the KDF for the purpose of domain separation.

3.2 Sphinx Packet Geometry

The Sphinx Packet Geometry is derived from the Sphinx Parameter Constants Section 3.1. These are all derived parameters, and are primarily of interest to implementors.

  • ROUTING_INFO_LENGTH - The total length of the "routing information" Sphinx Packet Header component in bytes:
ROUTING_INFO_LENGTH = PER_HOP_RI_LENGTH * MAX_HOPS
  • HEADER_LENGTH - The length of the Sphinx Packet Header in bytes:
HEADER_LENGTH = AD_LENGTH + GROUP_ELEMENT_LENGTH + ROUTING_INFO_LENGTH + MAC_LENGTH
  • PACKET_LENGTH - The length of the Sphinx Packet in bytes:
PACKET_LENGTH = HEADER_LENGTH + PAYLOAD_LENGTH

4. The Sphinx Cryptographic Packet Structure

Each Sphinx Packet consists of two parts: the Sphinx Packet Header and the Sphinx Packet Payload:

struct {
    opaque header[HEADER_LENGTH];
    opaque payload[PAYLOAD_LENGTH];
} SphinxPacket;
  • header - The packet header consists of several components, which convey the information necessary to verify packet integrity and correctly process the packet.
  • payload - The application message data.

4.1 Sphinx Packet Header

The Sphinx Packet Header refers to the block of data immediately preceding the Sphinx Packet Payload in a Sphinx Packet.

The structure of the Sphinx Packet Header is defined as follows:

struct {
    opaque additional_data[AD_LENGTH]; /* Unencrypted. */
    opaque group_element[GROUP_ELEMENT_LENGTH];
    opaque routing_information[ROUTING_INFO_LENGTH];
    opaque MAC[MAC_LENGTH];
} SphinxHeader;
  • additional_data - Unencrypted per-packet Additional Data (AD) that is visible to every hop. The AD is authenticated on a per-hop basis.

    As the additional_data is sent in the clear and traverses the network unaltered, implementations MUST take care to ensure that the field cannot be used to track individual packets.

  • group_element - An element of the cyclic group, used to derive the per-hop key material required to authenticate and process the rest of the SphinxHeader and decrypt a single layer of the Sphinx Packet Payload encryption.

  • routing_information - A vector of per-hop routing information, encrypted and authenticated in a nested manner. Each element of the vector consists of a series of routing commands, specifying all of the information required to process the packet.

    The precise encoding format is specified in Section 4.1.1 <4.1.1>{.interpreted-text role=“ref”}.

  • MAC - A message authentication code tag covering the additional_data, group_element, and routing_information.

4.1.1 Per-hop routing information

The routing_information component of the Sphinx Packet Header contains a vector of per-hop routing information. When processing a packet, the per hop processing is set up such that the first element in the vector contains the routing commands for the current hop.

The structure of the routing information is as follows:

struct {
    RoutingCommand routing_commands<1..2^8-1>; /* PER_HOP_RI_LENGTH bytes */
    opaque encrypted_routing_commands[ROUTING_INFO_LENGTH - PER_HOP_RI_LENGTH];
} RoutingInformation;

The structure of a single routing command is as follows:

struct {
    RoutingCommandType command;
    select (RoutingCommandType) {
        case null:               NullCommand;
        case next_node_hop:      NextNodeHopCommand;
        case recipient:          RecipientCommand;
        case surb_reply:         SURBReplyCommand;
    };
} RoutingCommand;

The following routing commands are currently defined:

enum {
    null(0),
    next_node_hop(1),
    recipient(2),
    surb_reply(3),

    /* Routing commands between 0 and 0x7f are reserved. */

    (255)
} RoutingCommandType;

The null routing command structure is as follows:

struct {
    opaque padding<0..PER_HOP_RI_LENGTH-1>;
} NullCommand;

The next_node_hop command structure is as follows:

struct {
    opaque next_hop[NODE_ID_LENGTH];
    opaque MAC[MAC_LENGTH];
} NextNodeHopCommand;

The recipient command structure is as follows:

struct {
    opaque recipient[RECIPEINT_ID_LENGTH];
} RecipientCommand;

The surb_reply command structure is as follows:

struct {
    opaque id[SURB_ID_LENGTH];
} SURBReplyCommand;

While the NullCommand padding field is specified as opaque, implementations SHOULD zero fill the padding. The choice of 0x00 as the terminal NullCommand is deliberate to ease implementation, as ZEROBYTES(N) produces a valid NullCommand RoutingCommand, resulting in “appending zero filled padding” producing valid output.

Implementations MUST pad the routing_commands vector so that it is exactly PER_HOP_RI_LENGTH bytes, by appending a terminal NullCommand if necessary.

Every non-terminal hop’s routing_commands MUST include a NextNodeHopCommand.

4.2 Sphinx Packet Payload

The Sphinx Packet Payload refers to the block of data immediately following the Sphinx Packet Header in a Sphinx Packet.

For most purposes the structure of the Sphinx Packet Payload can be treated as a single contiguous byte vector of opaque data.

Upon packet creation, the payload is repeatedly encrypted (unless it is a SURB Reply, see Section 7.0 via keys derived from the Diffie-Hellman key exchange between the packet's group_element and the public key of each node in the path.

Authentication of packet integrity is done by prepending a tag set to a known value to the plaintext prior to the first encrypt operation. By virtue of the fragile nature of the SPRP function, any alteration to the encrypted payload as it traverses the network will result in an irrecoverably corrupted plaintext when the payload is decrypted by the recipient.

5. Sphinx Packet Creation

For the sake of brevity, the pseudocode for all of the operations will take a vector of the following PathHop structure as a parameter named path[] to specify the path a packet will traverse, along with the per-hop routing commands and per-hop public keys.

struct {
    /* There is no need for a node_id here, as
       routing_commands[0].next_hop specifies that
       information for all non-terminal hops. */
    opaque public_key[GROUP_ELEMENT_LENGTH];
    RoutingCommand routing_commands<1...2^8-1>;
} PathHop;

It is assumed that each routing_commands vector except for the terminal entry contains at least a RoutingCommand consisting of a partially assembled NextNodeHopCommand with the next_hop element filled in with the identifier of the next hop.

5.1 Create a Sphinx Packet Header

Both the creation of a Sphinx Packet and the creation of a SURB requires the generation of a Sphinx Packet Header, so it is specified as a distinct operation.

Sphinx_Create_Header( additional_data, path[] ) -> sphinx_header,
                                                   payload_keys

Inputs:

  • additional_data The Additional Data that is visible to every node along the path in the header.
  • path The vector of PathHop structures in hop order, specifying the node id, public key, and routing commands for each hop.

Outputs: sphinx_header The resulting Sphinx Packet Header.

  • payload_keys The vector of SPRP keys used to encrypt the Sphinx Packet Payload, in hop order.

The Sphinx_Create_Header operation consists of the following steps:

  1. Derive the key material for each hop.
num_hops = route.len
route_keys = [ ]
route_group_elements = [ ]
priv_key = EXP_KEYGEN()

/* Calculate the key material for the 0th hop. */
group_element = EXP( G, priv_key )
route_group_elements += group_element
shared_secret = EXP( path[0].public_key, priv_key )
route_keys += Sphinx_KDF( KDF_INFO, shared_secret )
blinding_factor = keys[0].blinding_factor

/* Calculate the key material for rest of the hops. */
for i = 1; i < num_hops; ++i:
    shared_secret = EXP( path[i].public_key, priv_key )
    for j = 0; j < i; ++j:
        shared_secret = EXP( shared_secret, keys[j].blinding_factor )
    route_keys += Sphinx_KDF( KDF_INFO, shared_secret )
    group_element = EXP( group_element, keys[i-1].blinding_factor )
    route_group_elements += group_element

At the conclusion of the derivation process:

  • route_keys - A vector of per-hop SphinxKeys.
  • route_group_elements - A vector of per-hop group elements.
  1. Derive the routing_information keystream and encrypted padding for each hop.
ri_keystream = [ ]
ri_padding = [ ]

for i = 0; i < num_hops; ++i:
    keystream = ZEROBYTES( ROUTING_INFO_LENGTH + PER_HOP_RI_LENGTH ) ^
                  S( route_keys[i].header_encryption,
                     route_keys[i].header_encryption_iv )
    ks_len = LEN( keystream ) - (i + 1) * PER_HOP_RI_LENGTH

    padding = keystream[ks_len:]
    if i > 0:
        prev_pad_len = LEN( ri_padding[i-1] )
        padding = padding[:prev_pad_len] ^ ri_padding[i-1] |
            padding[prev_pad_len]

    ri_keystream += keystream[:ks_len]
    ri_padding += padding

At the conclusion of the derivation process:
   ri_keystream - A vector of per-hop routing_information
                  encryption keystreams.
   ri_padding   - The per-hop encrypted routing_information
                  padding.
  1. Create the routing_information block.
/* Start with the terminal hop, and work backwards. */
i = num_hops - 1

/* Encode the terminal hop's routing commands. As the
   terminal hop can never have a NextNodeHopCommand, there
   are no per-hop alterations to be made. */
ri_fragment = path[i].routing_commands |
   ZEROBYTES( PER_HOP_RI_LENGTH - LEN( path[i].routing_commands ) )

/* Encrypt and MAC. */
ri_fragment ^= ri_keystream[i]
mac = MAC( route_keys[i].header_mac, additional_data |
               route_group_elements[i] | ri_fragment |
               ri_padding[i-1] )
routing_info = ri_fragment
if num_hops < MAX_HOPS:
    pad_len = (MAX_HOPS - num_hops) * PER_HOP_RI_LENGTH
    routing_info = routing_info | RNG( pad_len )

/* Calculate the routing info for the rest of the hops. */
for i = num_hops - 2; i >= 0; --i:
    cmds_to_encode = [ ]

    /* Find and finalize the NextNodeHopCommand. */
    for j = 0; j < LEN( path[i].routing_commands; j++:
        cmd = path[i].routing_commands[j]
        if cmd.command == next_node_hop:
          /* Finalize the NextNodeHopCommand. */
          cmd.MAC = mac
        cmds_to_encode = cmds_to_encode + cmd /* Append */

    /* Append a terminal NullCommand. */
    ri_fragment = cmds_to_encode |
        ZEROBYTES( PER_HOP_RI_LENGTH - LEN( cmds_to_encode ) )

    /* Encrypt and MAC */
    routing_info = ri_fragment | routing_info /* Prepend. */
    routing_info ^= ri_keystream[i]
    if i > 0:
        mac = MAC( route_keys[i].header_mac, additional_data |
                   route_group_elements[i] | routing_info |
                   ri_padding[i-1] )
    else:
        mac = MAC( route_keys[i].header_mac, additional_data |
                   route_group_elements[i] | routing_info )

At the conclusion of the derivation process:
   routing_info - The completed routing_info block.
   mac          - The MAC for the 0th hop.
  1. Assemble the completed Sphinx Packet Header and Sphinx Packet Payload SPRP key vector.
/* Assemble the completed Sphinx Packet Header. */
SphinxHeader sphinx_header
sphinx_header.additional_data = additional_data
sphinx_header.group_element = route_group_elements[0] /* From step 1. */
sphinx_header.routing_info = routing_info   /* From step 3. */
sphinx_header.mac = mac                     /* From step 3. */

/* Preserve the Sphinx Payload SPRP keys, to return to the
   caller. */
payload_keys = [ ]
for i = 0; i < nr_hops; ++i:
    payload_keys += route_keys[i].payload_encryption

At the conclusion of the assembly process:
   sphinx_header - The completed sphinx_header, to be returned.
   payload_keys  - The vector of SPRP keys, to be returned.

5.2 Create a Sphinx Packet

Sphinx_Create_Packet( additional_data, path[], payload ) -> sphinx_packet

Inputs:

  • additional_data The Additional Data that is visible to every node along the path in the header.
  • path The vector of PathHop structures in hop order, specifying the node id, public key, and routing commands for each hop.
  • payload The packet payload message plaintext.

Outputs:

  • sphinx_packet The resulting Sphinx Packet.

The Sphinx_Create_Packet operation consists of the following steps:

  1. Create the Sphinx Packet Header and SPRP key vector.
sphinx_header, payload_keys =
    Sphinx_Create_Header( additional_data, path )
  1. Prepend the authentication tag, and append padding to the payload.
payload = ZERO_BYTES( PAYLOAD_TAG_LENGTH ) | payload
payload = payload | ZERO_BYTES( PAYLOAD_LENGTH - LEN( payload ) )
  1. Encrypt the payload.
for i = nr_hops - 1; i >= 0; --i:
    payload = SPRP_Encrypt( payload_keys[i], payload )
  1. Assemble the completed Sphinx Packet.
SphinxPacket sphinx_packet
sphinx_packet.header = sphinx_header
sphinx_packet.payload = payload

6. Sphinx Packet Processing

Mix nodes process incoming packets first by performing the Sphinx_Unwrap operation to authenticate and decrypt the packet, and if applicable prepare the packet to be forwarded to the next node.

If Sphinx_Unwrap returns an error for any given packet, the packet MUST be discarded with no additional processing.

After a packet has been unwrapped successfully, a replay detection tag is checked to ensure that the packet has not been seen before. If the packet is a replay, the packet MUST be discarded with no additional processing.

The routing commands for the current hop are interpreted and executed, and finally the packet is forwarded to the next mix node over the network or presented to the application if the current node is the final recipient.

6.1 Sphinx_Unwrap Operation

The Sphinx_Unwrap operation is the majority of the per-hop packet processing, handling authentication, decryption, and modifying the packet prior to forwarding it to the next node.

Sphinx_Unwrap( routing_private_key, sphinx_packet ) -> sphinx_packet,
                                                      routing_commands,
                                                      replay_tag

Inputs:

  • private_routing_key A group element GROUP_ELEMENT_LENGTH bytes in length, that serves as the unwrapping Mix’s private key.
  • sphinx_packet A Sphinx packet to unwrap.

Outputs:

  • error Indicating a unsuccessful unwrap operation if applicable.
  • sphinx_packet The resulting Sphinx packet.
  • routing_commands A vector of RoutingCommand, specifying the post unwrap actions to be taken on the packet.
  • replay_tag A tag used to detect whether this packet was processed before.

The Sphinx_Unwrap operation consists of the following steps:

  1. (Optional) Examine the Sphinx Packet Header’s Additional Data.

If the header’s additional_data element contains information required to complete the unwrap operation, such as specifying the packet format version or the cryptographic primitives used examine it now.

Implementations MUST NOT treat the information in the additional_data element as trusted until after the completion of Step 3 (“Validate the Sphinx Packet Header”).

  1. Calculate the hop's shared secret, and replay_tag.
hdr = sphinx_packet.header
shared_secret = EXP( hdr.group_element, private_routing_key )
replay_tag = H( shared_secret )
  1. Derive the various keys required for packet processing.
keys = Sphinx_KDF( KDF_INFO, shared_secret )
  1. Validate the Sphinx Packet Header.
derived_mac = MAC( keys.header_mac, hdr.additional_data |
                  hdr.group_element |
                  hdr.routing_information )
if !CONSTANT_TIME_CMP( derived_mac, hdr.MAC):
    /* MUST abort processing if the header is invalid. */
    return ErrorInvalidHeader
  1. Extract the per-hop routing commands for the current hop.
/* Append padding to preserve length-invariance, as the routing
    commands for the current hop will be removed. */
padding = ZEROBYTES( PER_HOP_RI_LENGTH )
B = hdr.routing_information | padding

/* Decrypt the entire routing_information block. */
B = B ^ S( keys.header_encryption, keys.header_encryption_iv )
  1. Parse the per-hop routing commands.
cmd_buf = B[:PER_HOP_RI_LENGTH]
new_routing_information = B[PER_HOP_RI_LENGTH:]

next_mix_command_idx = -1
routing_commands = [ ]
for idx = 0; idx < PER_HOP_RI_LENGTH {
     /* WARNING: Bounds checking omitted for brevity. */
     cmd_type = b[idx]
     cmd = NULL
     switch cmd_type {
        case null: goto done  /* No further commands. */

        case next_node_hop:
            cmd = RoutingCommand( B[idx:idx+1+LEN( NextNodeHopCommand )] )
            next_mix_command_idx = i /* Save for step 7. */
            idx += 1 + LEN( NextNodeHopCommand )
            break

        case recipient:
            cmd = RoutingCommand( B[idx:idx+1+LEN( FinalDestinationCommand )] )
            idx += 1 + LEN( RecipientCommand )
            break

        case surb_reply:
            cmd = RoutingCommand( B[idx:idx+1+LEN( SURBReplyCommand )] )
            idx += 1 + LEN( SURBReplyCommand )
            break

      default:
            /* MUST abort processing on unrecognized commands. */
            return ErrorInvalidCommand
    }
    routing_commands += cmd /* Append cmd to the tail of the list. */
}
done:

At the conclusion of the parsing step:

  • routing_commands - A vector of SphinxRoutingCommand, to be applied at this hop.
  • new_routing_information - The routing_information block to be sent to the next hop if any.
  1. Decrypt the Sphinx Packet Payload.
payload = sphinx_packet.payload
payload = SPRP_Decrypt( key.payload_encryption, payload )
sphinx_packet.payload = payload
  1. Transform the packet for forwarding to the next mix, if the routing commands vector included a NextNodeHopCommand.
if next_mix_command_idx != -1:
    cmd = routing_commands[next_mix_command_idx]
    hdr.group_element = EXP( hdr.group_element, keys.blinding_factor )
    hdr.routing_information = new_routing_information
    hdr.mac = cmd.MAC
    sphinx_packet.hdr = hdr

6.2 Post Sphinx_Unwrap Processing

Upon the completion of the Sphinx_Unwrap operation, implementations MUST take several additional steps. As the exact behavior is mostly implementation specific, pseudocode will not be provided for most of the post processing steps.

  1. Apply replay detection to the packet.

The replay_tag value returned by Sphinx_Unwrap MUST be unique across all packets processed with a given private_routing_key.

The exact specifics of how to detect replays is left up to the implementation, however any replays that are detected MUST be discarded immediately.

  1. Act on the routing commands, if any.

The exact specifics of how implementations chose to apply routing commands is deliberately left unspecified, however in general:

  • If there is a NextNodeHopCommand, the packet should be forwarded to the next node based on the next_hop field upon completion of the post processing.

    The lack of a NextNodeHopCommand indicates that the packet is destined for the current node.

  • If there is a SURBReplyCommand, the packet should be treated as a SURBReply destined for the current node, and decrypted accordingly (See Section 7.2)

  • If the implementation supports multiple recipients on a single node, the RecipientCommand command should be used to determine the correct recipient for the packet, and the payload delivered as appropriate.

    It is possible for both a RecipientCommand and a NextNodeHopCommand to be present simultaneously in the routing commands for a given hop. The behavior when this situation occurs is implementation defined.

  1. Authenticate the packet if required.

If the packet is destined for the current node, the integrity of the payload MUST be authenticated.

The authentication is done as follows:

derived_tag = sphinx_packet.payload[:PAYLOAD_TAG_LENGTH]
expected_tag = ZEROBYTES( PAYLOAD_TAG_LENGTH )
if !CONSTANT_TIME_CMP( derived_tag, expected_tag ):
    /* Discard the packet with no further processing. */
    return ErrorInvalidPayload

Remove the authentication tag before presenting the payload to the application.

sphinx_packet.payload = sphinx_packet.payload[PAYLOAD_TAG_LENGTH:]

7. Single Use Reply Block (SURB) Creation

A Single Use Reply Block (SURB) is a delivery token with a short lifetime, that can be used by the recipient to reply to the initial sender.

SURBs allow for anonymous replies, when the recipient does not know the sender of the message. Usage of SURBs guarantees anonymity properties but also makes the reply messages indistinguishable from forward messages both to external adversaries as well as the mix nodes.

When a SURB is created, a matching reply block Decryption Token is created, which is used to decrypt the reply message that is produced and delivered via the SURB.

The Sphinx SURB wire encoding is implementation defined, but for the purposes of illustrating creation and use, the following will be used:

struct {
    SphinxHeader sphinx_header;
    opaque first_hop[NODE_ID_LENGTH];
    opaque payload_key[SPRP_KEY_LENGTH];
} SphinxSURB;

7.1 Create a Sphinx SURB and Decryption Token

Structurally a SURB consists of three parts, a pre-generated Sphinx Packet Header, a node identifier for the first hop to use when using the SURB to reply, and cryptographic keying material by which to encrypt the reply’s payload. All elements must be securely transmitted to the recipient, perhaps as part of a forward Sphinx Packet's Payload, but the exact specifics on how to accomplish this is left up to the implementation.

When creating a SURB, the terminal routing_commands vector SHOULD include a SURBReplyCommand, containing an identifier to ensure that the payload can be decrypted with the correct set of keys (Decryption Token). The routing command is left optional, as it is conceivable that implementations may chose to use trial decryption, and or limit the number of outstanding SURBs to solve this problem.

Sphinx_Create_SURB( additional_data, first_hop, path[] ) ->
                                                 sphinx_surb,
                                                 decryption_token

Inputs:

  • additional_data The Additional Data that is visible to every node along the path in the header.
  • first_hop The node id of the first hop the recipient must use when replying via the SURB.
  • path The vector of PathHop structures in hop order, specifying the node id, public key, and routing commands for each hop.

Outputs:

  • sphinx_surb The resulting Sphinx SURB.
  • decryption_token The Decryption Token associated with the SURB.

The Sphinx_Create_SURB operation consists of the following steps:

  1. Create the Sphinx Packet Header and SPRP key vector.
sphinx_header, payload_keys =
      Sphinx_Create_Header( additional_data, path )
  1. Create a key for the final layer of encryption.
final_key = RNG( SPRP_KEY_LENGTH )
  1. Build the SURB and Decryption Token.
SphinxSURB sphinx_surb;
sphinx_surb.sphinx_header = sphinx_header
sphinx_surb.first_hop = first_hop
sphinx_surb.payload_key = final_key

decryption_token = final_key + payload_keys /* Prepend */

7.2 Decrypt a Sphinx Reply Originating from a SURB

A Sphinx Reply packet that was generated using a SURB is externally indistinguishable from a forward Sphinx Packet as it traverses the network. However, the recipient of the reply has an additional decryption step, the packet starts off unencrypted, and accumulates layers of Sphinx Packet Payload decryption as it traverses the network.

Determining which decryption token to use when decrypting the SURB reply can be done via the SURBReplyCommand’s id field, if one is included at the time of the SURB’s creation.

Sphinx_Decrypt_SURB_Reply( decryption_token, payload ) -> message

Inputs:

  • decryption_token The vector of keys allowing a client to decrypt the reply ciphertext payload. This decryption_token is generated when the SURB is created.
  • payload The Sphinx Packet ciphertext payload.

Outputs:

  • error Indicating a unsuccessful unwrap operation if applicable.
  • message The plaintext message.

The Sphinx_Decrypt_SURB_Reply operation consists of the following steps:

  1. Encrypt the message to reverse the decrypt operations the payload acquired as it traversed the network.
for i = LEN( decryption_token ) - 1; i > 0; --i:
    payload = SPRP_Encrypt( decryption_token[i], payload )
  1. Decrypt and authenticate the message ciphertext.
message = SPRP_Decrypt( decryption_token[0], payload )

derived_tag = message[:PAYLOAD_TAG_LENGTH]
expected_tag = ZEROBYTES( PAYLOAD_TAG_LENGTH )
if !CONSTANT_TIME_CMP( derived_tag, expected_tag ):
    return ErrorInvalidPayload

message = message[PAYLOAD_TAG_LENGTH:]

8. Single Use Reply Block Replies

The process for using a SURB to reply anonymously is slightly different from the standard packet creation process, as the Sphinx Packet Header is already generated (as part of the SURB), and there is an additional layer of Sphinx Packet Payload encryption that must be performed.

Sphinx_Create_SURB_Reply( sphinx_surb, payload ) -> sphinx_packet

Inputs:

  • sphinx_surb The SphinxSURB structure, decoded from the implementation defined wire encoding.
  • payload The packet payload message plaintext.

The Sphinx_Create_SURB_Reply operation consists of the following steps:

  1. Prepend the authentication tag, and append padding to the payload.
payload = ZERO_BYTES( PAYLOAD_TAG_LENGTH ) | payload
payload = payload | ZERO_BYTES( PAYLOAD_LENGTH - LEN( payload ) )
  1. Encrypt the payload.
payload = SPRP_Encrypt( sphinx_surb.payload_key, payload )
  1. Assemble the completed Sphinx Packet.
SphinxPacket sphinx_packet
sphinx_packet.header = sphinx_surb.sphinx_header
sphinx_packet.payload = payload

The completed sphinx_packet MUST be sent to the node specified via sphinx_surb.node_id, as the entire reply sphinx_packet’s header is pre-generated.

9. Anonymity Considerations

9.1 Optional Non-constant Length Sphinx Packet Header Padding

Depending on the mix topology, there is no hard requirement that the per-hop routing info is padded to one fixed constant length.

For example, assuming a layered topology (referred to as stratified topology in the literature) MIXTOPO10, where the layer of any given mix node is public information, as long as the following two invariants are maintained, there is no additional information available to an adversary:

  1. All packets entering any given mix node in a certain layer are uniform in length.
  2. All packets leaving any given mix node in a certain layer are uniform in length.

The only information available to an external or internal observer is the layer of any given mix node (via the packet length), which is information they are assumed to have by default in such a design.

9.2 Additional Data Field Considerations

The Sphinx Packet Construct is crafted such that any given packet is bitwise unlinkable after a Sphinx_Unwrap operation, provided that the optional Additional Data (AD) facility is not used. This property ensures that external passive adversaries are unable to track a packet based on content as it traverses the network. As the on-the-wire AD field is static through the lifetime of a packet (ie: left unaltered by the Sphinx_Unwrap operation), implementations and applications that wish to use this facility MUST NOT transmit AD that can be used to distinctly identify individual packets.

9.3 Forward Secrecy Considerations

Each node acting as a mix MUST regenerate their asymmetric key pair relatively frequently. Upon key rotation the old private key MUST be securely destroyed. As each layer of a Sphinx Packet is encrypted via key material derived from the output of an ephemeral/static Diffie-Hellman key exchange, without the rotation, the construct does not provide Perfect Forward Secrecy. Implementations SHOULD implement defense-in-depth mitigations, for example by using strongly forward-secure link protocols to convey Sphinx Packets between nodes.

This frequent mix routing key rotation can limit SURB usage by directly reducing the lifetime of SURBs. In order to have a strong Forward Secrecy property while maintaining a higher SURB lifetime, designs such as forward secure mixes SFMIX03 could be used.

9.4 Compulsion Threat Considerations

Reply Blocks (SURBs), forward and reply Sphinx packets are all vulnerable to the compulsion threat, if they are captured by an adversary. The adversary can request iterative decryptions or keys from a series of honest mixes in order to perform a deanonymizing trace of the destination.

While a general solution to this class of attacks is beyond the scope of this document, applications that seek to mitigate or resist compulsion threats could implement the defenses proposed in COMPULS05 via a series of routing command extensions.

9.5 SURB Usage Considerations for Volunteer Operated Mix Networks

Given a hypothetical scenario where Alice and Bob both wish to keep their location on the mix network hidden from the other, and Alice has somehow received a SURB from Bob, Alice MUST not utilize the SURB directly because in the volunteer operated mix network the first hop specified by the SURB could be operated by Bob for the purpose of deanonymizing Alice.

This problem could be solved via the incorporation of a “cross-over point” such as that described in MIXMINION, for example by having Alice delegating the transmission of a SURB Reply to a randomly selected crossover point in the mix network, so that if the first hop in the SURB’s return path is a malicious mix, the only information gained is the identity of the cross-over point.

10. Security Considerations

10.1 Sphinx Payload Encryption Considerations

The payload encryption’s use of a fragile (non-malleable) SPRP is deliberate and implementations SHOULD NOT substitute it with a primitive that does not provide such a property (such as a stream cipher based PRF). In particular there is a class of correlation attacks (tagging attacks) targeting anonymity systems that involve modification to the ciphertext that are mitigated if alterations to the ciphertext result in unpredictable corruption of the plaintext (avalanche effect).

Additionally, as the PAYLOAD_TAG_LENGTH based tag-then-encrypt payload integrity authentication mechanism is predicated on the use of a non-malleable SPRP, implementations that substitute a different primitive MUST authenticate the payload using a different mechanism.

Alternatively, extending the MAC contained in the Sphinx Packet Header to cover the Sphinx Packet Payload will both defend against tagging attacks and authenticate payload integrity. However, such an extension does not work with the SURB construct presented in this specification, unless the SURB is only used to transmit payload that is known to the creator of the SURB.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{SphinxSpec,
title = {Sphinx Mix Network Cryptographic Packet Format Specification},
author = {Yawning Angel and George Danezis and Claudia Diaz and Ania Piotrowska and David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/master/docs/specs/sphinx.rst},
year = {2017}
}

COMPULS05

Danezis, G., Clulow, J., “Compulsion Resistant Anonymous Communications”, Proceedings of Information Hiding Workshop, June 2005, https://www.freehaven.net/anonbib/cache/ih05-danezisclulow.pdf

MIXMINION

Danezis, G., Dingledine, R., Mathewson, N., “Mixminion: Design of a Type III Anonymous Remailer Protocol”, https://www.mixminion.net/minion-design.pdf

MIXTOPO10

Diaz, C., Murdoch, S., Troncoso, C., “Impact of Network Topology on Anonymity and Overhead in Low-Latency Anonymity Networks”, PETS, July 2010, https://www.esat.kuleuven.be/cosic/publications/article-1230.pdf

RFC2119

Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119

RFC5246

Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, http://www.rfc-editor.org/info/rfc5246

SFMIX03

Danezis, G., “Forward Secure Mixes”, Proceedings of 7th Nordic Workshop on Secure IT Systems, 2002, https://www.freehaven.net/anonbib/cache/Dan:SFMix03.pdf

SPHINX09

Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix Format”, DOI 10.1109/SP.2009.15, May 2009, https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf

4.8 - Sphinx Packet Replay Detection

Abstract

This document defines the replay detection for any protocol that uses Sphinx cryptographic packet format. This document is meant to serve as an implementation guide and document the existing replay protect for deployed mix networks.

1. Introduction

The Sphinx cryptographic packet format is a compact and provably secure design introduced by George Danezis and Ian Goldberg SPHINX09. Although it supports replay detection, the exact mechanism of replay detection is neither described in SPHINX09 nor is it described in our SPHINXSPEC. Therefore we shall describe in detail how to efficiently detect Sphinx packet replay attacks.

1.1 Terminology

  • Epoch - A fixed time interval defined in section “4.2 Sphinx Mix and Provider Key Rotation” of KATZMIXNET.
  • Packet - A fixed-length sequence of bytes transmitted through the network, containing the encrypted message and metadata for routing.
  • Header - The packet header consisting of several components, which convey the information necessary to verify packet integrity and correctly process the packet.
  • Payload - The fixed-length portion of a packet containing an encrypted message or part of a message, to be delivered.
  • Group - A finite set of elements and a binary operation that satisfy the properties of closure, associativity, invertability, and the presence of an identity element.
  • Group element - An individual element of the group.
  • Group generator - A group element capable of generating any other element of the group, via repeated applications of the generator and the group operation.

SEDA - Staged Event Driven Architecture. SEDA 1. A highly parallelizable computation model. 2. A computational pipeline composed of multiple stages connected by queues utilizing active queue management algorithms that can evict items from the queue based on dwell time or other criteria where each stage is a thread pool. 3. The only correct way to efficiently implement a software based router on general purpose computing hardware.

1.2 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

2. Sphinx Cryptographic Primitives

This specification borrows the following cryptographic primitives constants from our SPHINXSPEC:

  • H(M) - A cryptographic hash function which takes an byte array M to produce a digest consisting of a HASH_LENGTH byte array. H(M) MUST be pre-image and collision resistant.

  • EXP(X, Y) - An exponentiation function which takes the GROUP_ELEMENT_LENGTH byte array group elements X and Y, and returns X ^^ Y as a GROUP_ELEMENT_LENGTH byte array.

Let G denote the generator of the group, and EXP_KEYGEN() return a GROUP_ELEMENT_LENGTH byte array group element usable as private key.

The group defined by G and EXP(X, Y) MUST satisfy the Decision Diffie-Hellman problem.

2.1 Sphinx Parameter Constants

  • HASH_LENGTH - 32 bytes. Katzenpost currently uses SHA-512/256. RFC6234
  • GROUP_ELEMENT_LENGTH - 32 bytes. Katzenpost currently uses X25519. RFC7748

3. System Overview

Mixes as currently deployed, have two modes of operation:

  1. Sphinx routing keys and replay caches are persisted to disk
  2. Sphinx routing keys and replay caches are persisted to memory

These two modes of operation fundamentally represent a tradeoff between mix server availability and notional compulsion attack resistance. Ultimately it will be the mix operator’s decision to make since they affect the security and availability of their mix servers. In particular since mix networks are vulnerable to the various types of compulsion attacks (see SPHINXSPEC section 9.4 Compulsion Threat Considerations) and therefore there is some advantage to NOT persisting the Sphinx routing keys to disk. The mix operator can simply poweroff the mix server before seizure rather than physically destroying the disk in order to prevent capture of the Sphinx routing keys. An argument can be made for the use of full disk encryption, however this may not be practical for servers hosted in remote locations.

On the other hand, persisting Sphinx routing keys and replay caches to disk is useful because it allows mix operators to shutdown their mix server for maintenance purposes without loosing these Sphinx routing keys and replay caches. This means that as soon as the maintenance operation is completed the mix server is able to rejoin the network. Our current PKI system KATZMIXPKI does NOT provide a mechanism to notify Directory Authorities of such an outage or maintenance period. Therefore if there is loss of Sphinx routing keys this results in a mix outage until the next epoch.

The two modes of operation both completely prevent replay attacks after a system restart. In the case of the disk persistence, replay attacks are prevented because all packets traversing the mix have their replay tags persisted to disk cache. This cache is therefore once again used to prevent replays after a system restart. In the case of memory persistence replays are prevented upon restart because the Sphinx routing keys are destroyed and therefore the mix will not participant in the network until at least the next epoch rotation. However availability of the mix may require two epoch rotations because in accordance with KATZMIXPKI mixes publish future epoch keys so that Sphinx packets flowing through the network can seamlessly straddle the epoch boundaries.

4. Sphinx Packet Replay Cache

4.1 Sphinx Replay Tag Composition

The following excerpt from our SPHINXSPEC shows how the replay tag is calculated.

hdr = sphinx_packet.header
shared_secret = EXP( hdr.group_element, private_routing_key )
replay_tag = H( shared_secret )

However this tag is not utilized in replay detection until the rest of the Sphinx packet is fully processed and it’s header MAC verified as described in SPHINXSPEC.

4.2 Sphinx Replay Tag Caching

It would be sufficient to use a key value store or hashmap to detect the presence of a duplicate replay tag however we additionaly employ a bloom filter to increase performance. Sphinx keys must periodically be rotated and destroyed to mitigate compulsion attacks and therefore our replay caches must likewise be rotated. This kind of key erasure scheme limits the window of time that an adversary can perform a compulsion attack. See our PKI specification KATZMIXPKI for more details regarding epoch key rotation and the grace period before and after the epoch boundary.

We tune our bloom filter for line-speed; that is to say the bloom filter for a given replay cache is tuned for the maximum number of Sphinx packets that can be sent on the wire during the epoch duration of the Sphinx routing key. This of course has to take into account the size of the Sphinx packets as well as the maximum line speed of the network interface. This is a conservative tuning heuristic given that there must be more than this maximum number of Sphinx packets in order for there to be duplicate packets.

Our bloomfilter with hashmap replay detection cache looks like this:

replay cache

Note that this diagram does NOT express the full complexity of the replay caching system. In particular it does not describe how entries are entered into the bloom filter and hashmap. Upon either bloom filter mismatch or hashmap mismatch both data structures must be locked and the replay tag inserted into each.

For the disk persistence mode of operation the hashmap can simply be replaced with an efficient key value store. Persistent stores may use a write back cache and other techniques for efficiency.

4.3 Epoch Boundaries

Since mixes publish future epoch keys (see KATZMIXPKI) so that Sphinx packets flowing through the network can seamlessly straddle the epoch boundaries, our replay detection forms a special kind of double bloom filter system. During the epoch grace period mixes perform trial decryption of Sphinx packets. The replay cache used will be the one that is associated with the Sphinx routing key which was successfully used to decrypt (unwrap transform) the Sphinx packet. This is not a double bloom filter in the normal sense of this term since each bloom filter used is distinct and associated with it’s own cache, furthermore, replay tags are only ever inserted into one cache and one bloom filter.

4.4 Cost Of Checking Replays

The cost of checking a replay tag from a single replay cache is the sum of the following operations:

  1. Sphinx packet unwrap operation
  2. A bloom filter lookup
  3. A hashmap or cache lookup

Therefore these operations are roughly O(1) in complexity. However Sphinx packets processed near epoch boundaries will not be constant time due to trial decryption with two Sphinx routing keys as mentioned above in section “3.3 Epoch Boundaries”.

5. Concurrent Processing of Sphinx Packet Replay Tags

The best way to implement a software based router is with a SEDA computational pipeline. We therefore need a mechanism to allow multiple threads to reference our rotating Sphinx keys and associated replay caches. Here we shall describe a shadow memory system which the mix server uses such that the individual worker threads shall always have a reference to the current set of candidate mix keys and associates replay caches.

5.1 PKI Updates

The mix server periodically updates it’s knowledge of the network by downloading a new consensus document as described in KATZMIXPKI. The individual threads in the “cryptoworker” thread pool which process Sphinx packets make use of a MixKey data structure which consists of:

  1. Sphinx routing key material (public and private X25519 keys)
  2. Replay Cache
  3. Reference Counter

Each of these “cryptoworker” thread pool has it’s own hashmap associating epochs to a reference to the MixKey. The mix server PKI threat maintains a single hashmap which associates the epochs with the corresponding MixKey. We shall refer to this hashmap as MixKeys. After a new MixKey is added to MixKeys, a “reshadow” operation is performed for each “cryptoworker” thread. The “reshadow” operation performs two tasks:

  1. Removes entries from each “cryptoworker” thread's hashmap that are no longer present in MixKeys and decrements the MixKey reference counter.
  2. Adds entries present in MixKeys but are not present in the thread’s hashmap and increments the MixKey reference counter.

Once a given MixKey reference counter is decremented to zero, the MixKey and it’s associated on disk data is purged. Note that we do not discuss synchronization primitives, however it should be obvious that updating the replay cache should likely make use of a mutex or similar primitive to avoid data races between “cryptoworker” threads.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{SphinxReplay,
title = {Sphinx Packet Replay Detection Specification},
author = {David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx_replay_detection.rst},
year = {2019}
}

COMPULS05

Danezis, G., Clulow, J., “Compulsion Resistant Anonymous Communications”, Proceedings of Information Hiding Workshop, June 2005, https://www.freehaven.net/anonbib/cache/ih05-danezisclulow.pdf

KATZMIXNET

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D., “Katzenpost Mix Network Specification”, June 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/mixnet.md

KATZMIXPKI

Angel, Y., Piotrowska, A., Stainton, D., “Katzenpost Mix Network Public Key Infrastructure Specification”, December 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/pki.md

RFC2119

Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, http://www.rfc-editor.org/info/rfc2119

RFC6234

Eastlake 3rd, D. and T. Hansen, “US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)”, RFC 6234, DOI 10.17487/RFC6234, May 2011, https://www.rfc-editor.org/info/rfc6234

RFC7748

Langley, A., Hamburg, M., and S. Turner, “Elliptic Curves for Security”, RFC 7748, January 2016.

SEDA

Welsh, M., Culler, D., Brewer, E., “SEDA: An Architecture for Well-Conditioned, Scalable Internet Services”, ACM Symposium on Operating Systems Principles, 2001, http://www.sosp.org/2001/papers/welsh.pdf

SPHINX09

Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix Format”, DOI 10.1109/SP.2009.15, May 2009, https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf

SPHINXSPEC

Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D., “Sphinx Mix Network Cryptographic Packet Format Specification” July 2017, https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx.md

4.9 - Threat Model

Abstract

Here we describe the threat model of the katzenpost mix network transport protocol and also discuss threat models for addition protocol layers.

Overview of Mix Network Threat Model

Katzenpost is a decryption mixnet with no verification. In this context we can talk about individual mix nodes as a specialized kind of cryptographic router that has two properties other than routing messages to their destination:

  1. bitwise unlinkability
  2. latency

Bitwise unlinkability means that the adversary would not be able to link an output message with an input message based on the bits in the message. Likewise adding latency also helps prevent a network observer from correlating input messages with output messages on a mix node.

When mixes are composed in a network we get the Anytrust property which means that if a path (aka route) consists of at least one honest mix then there is still uncertainty for a passive network observer for the correlation between sent and received messages. There are many caveats here, see attacks below.

There is a hierarchy of security notions that are used by cryptographers to compare cryptographic primitives. Likewise, a hierarchy of privacy notions help us compare anonymous communication protocols. See NOTIONS for an in depth discussion and algebraic analsysis of privacy notions for anonymous communication protocols.

At the time of writing it is my understanding that the mixnet protocols we want to see in the world would have these privacy notions:

  1. Sender Unobservability
  2. Receiver Unobservability
  3. Sender Receiver Unlinkability

The unobservability notions are a result of using decoy traffic whose use has limitations expressed in the anonymity trilemma.

Our threat model will consider the following eight categories of attacks:

  1. n-1 attacks
  2. epistemic attacks
  3. compulsion attacks
  4. tagging attacks
  5. statistical disclosure attacks
  6. denial of service attacks
  7. timing attacks
  8. cryptographic attacks (including considerations regarding cryptographic attacks by a sufficiently powerful quantum computational adversary)

As of the time of this writing, ALL mixnet attacks that we know about fit into one or more of the above categories or composite attacks, a combination of several of the above attacks.

1. statistical disclosure attacks

Adversary capability: In general for this particular attack category we assume that the adversary is a global passive adversary. However all of these attacks are actually possible with only a view of the perimeter of the mix network.

Adversary strategy: The adversary’s goal is to capture as much of the social graph as they can by collecting sets of possible recipients of each sender and likewise sets of possible senders for each recipient.

Primary Mitigation tactic:

The best defense against intersection attacks is to use end to end decoy traffic which is sent and received by clients.

Imagine there is a mixnet with 10 Providers and only Alice and Bob are currently connected and sending messages. Alice does not send any decoy traffic. Mixes send loop decoys to themselves. Alice only sends messages to a Provider which Bob can retrieve messages from.

In this example we can say two things for certain:

  1. mix decoy loops will be uniformly distributed over all 10 Providers
  2. Alice’s sent messages will go to 1 of 10 Providers

Therefore, without the use of end to end decoy traffic, a global adversary would be able to keep track of their observations of the network perimeter and learn statistical information about the social graph. In this example the uniform distribution of messages is perturbed by Alice's behavior on the network.

Mitigation tactic #1

In the context of Katzenpost/Loopix these classical style set intersection attacks don’t work at full granularity because destination messages are queued on edge nodes (known as Providers) along with many other received messages to and from other users of the mix network.

Mitigation tactic #2

Many of our future protocols will scatter message segments across an ever changing set of Providers. This seems likely to increase uncertainty for adversaries.

Mitigation tactic #3

Clients will send decoy traffic such that the rate of all messages arriving on all of the Providers will be uniformly distributed even when the client is sending legit messages to a subset of Providers.

Mitigation tactic #4

Providers will modulate their decoy traffic send rate to be inversely proportional to the sum of all the rates of incoming messages from all clients directly connected to that Provider. In other words, when clients send zero messages the Provider sends a constant rate of decoy traffic. The Provider reduces it's decoy send rate when clients increase their send rate such that the total rate of messages coming out of the Provider remains the same if measured over a large enough period of time.

Conclusion

The sucess of a statistical disclosure attacks often has a lot to do with the advesary’s ability to predict user behavior. Likewise if user behavior is very repetative and predictable then that might increase the probability that a statistical disclosure attack would work. These attacks could in theory take days/weeks or even months to perform depending on how much statistical information is leaked.

Statistical disclosure attacks such as short term timing correlation that the Tor network is known to be trivially vulnerable against do not in general apply to mix networks due to the added latency. However as latency is decreased we find ourselves pondering the Anonymity Trilemma ANONTRILEMMA which clearly states that Strong Anonymity is in opposition to low latency unless we send lots of decoy traffic. We need a formal methodogy for tuning the mixnet AND making the numerical calculations of the various tradeoffs that are the result of the mixnet tuning.

2. epistemic attacks

An epistemic attack refers to an attack where the adversary uses their knowledge of a mixnet client’s knowledge of the network to their advantage. For example if Alice only learns of a subset of the network nodes then the adversary who knows this about Alice (or perhaps caused Alice to have partial knowledge) will be able to at least state some obvious conclusions such as: “messages sent along these routes are more likely to have come from Alice than any other client”.

In general we mitigate this attack category by designing our key management and distribution (aka the dirauth system aka PKI) such that it shares the same information with all the clients.

3. compulsion attacks

Adversary capability: The adversary uses forceful means to procure the information they are after: violence, legal system, remotely compromising mix nodes using a zero day from the black market etc.

Conclusion

Reply Blocks (SURBs), forward and reply Sphinx packets SPHINX09 are all vulnerable to the compulsion threat, if they are captured by an adversary. The adversary can request iterative decryptions or keys from a series of honest mixes in order to perform a deanonymizing trace of the destination.

While a general solution to this class of attacks is beyond the scope of this document, applications that seek to mitigate or resist compulsion threats could implement the defenses proposed in COMPULS05 via a series of routing command extensions.

4. tagging attacks

There are many different types of tagging attacks. This is the only one I could think of that applies to Katzenpost, in an albeit contrived scenario.

Adversary capability

If the adversary is allowed to view the final payload decryption and can mutate the packet during it’s transit then a 1 bit tagging attack is possible.

Adversary strategy:

Flipping a bit during transit would cause lots of bits to be flipped in each subsequent decryption set and thus the final payload integrity tag would be destroyed. So for the adversary, either the interity tag is intact or it is destroyed; this attack leaks 1 bit of information to the advesary.

Conclusion

This is Sphinx payload tagging attack is a result of the Sphinx design. However it’s a very contrived example and we have trouble imagining it would apply in the real world.

Sphinx Payload Encryption Considerations

The payload encryption’s use of a fragile (non-malleable) SPRP is deliberate and implementations SHOULD NOT substitute it with a primitive that does not provide such a property (such as a stream cipher based PRF). In particular there is a class of correlation attacks (tagging attacks) targeting anonymity systems that involve modification to the ciphertext that are mitigated if alterations to the ciphertext result in unpredictable corruption of the plaintext (avalanche effect).

Additionally, as the PAYLOAD_TAG_LENGTH based tag-then-encrypt payload integrity authentication mechanism is predicated on the use of a non-malleable SPRP, implementations that substitute a different primitive MUST authenticate the payload using a different mechanism.

Alternatively, extending the MAC contained in the Sphinx Packet Header to cover the Sphinx Packet Payload will both defend against tagging attacks and authenticate payload integrity. However, such an extension does not work with the SURB construct presented in this specification, unless the SURB is only used to transmit payload that is known to the creator of the SURB.

5. denial of service attacks

We don’t have much defense against DOS attacks. Currently the Provider has a per client rate limiter that can be tuned by the dirauth system.

6. timing attacks

We probably have potential for many many timing attacks. Can we enumerate some of the more obvious and powerful timing attacks here?

7. n-1 Attacks

Adversary capability: Adversary is active and can send messages into the mix network AND the adversary can drop or delay messages sent to the mix network. Therefore the adversary has compromised the upstream routers for each of the perimeter mix nodes.

Adversary strategy: There are many variations of n-1 attacks and the one that works on Poisson mix strategy is this:

The adversary must delay or drop input messages to a given mix until they are reasonably certain the mix is empty before allowing the target message to enter and then exit the mix. The result of this attack is that the adversary learns where the target message is being sent.

Primary Mitigation tactic: Our theoretical defense is:

Each mix node uses a loop decoy heartbeat protocol to detect when an adversary is delaying or dropping input messages; that is, if the mix node doesn’t receive it's own heartbeat loop message then it has detected this attack. A real world implementation would probably add some additional heuristics for example, the n-1 attack is detected when 3 heartbeats in a row were not received.

Our current status is:

  • Mix loop decoy traffic is only implemented on interior mixes but it should also be implemented on Provders.
  • The status of the decoy replies is ignored. Instead it should do bookkeeping and stop routing messages for some duration if certain heuristics are matched which include a threshold number of heartbeat messages not being recently received.

8. cryptographic attacks

This category should include not only merely breaking cryptographic primitives but also breaking the cryptographic protocols on a higher level of abstraction. One great example of this is the following attack on SURB usage described below.

SURB Usage Considerations for Volunteer Operated Mix Networks

Given a hypothetical scenario where Alice and Bob both wish to keep their location on the mix network hidden from the other, and Alice has somehow received a SURB from Bob, Alice MUST not utilize the SURB directly because in the volunteer operated mix network the first hop specified by the SURB could be operated by Bob for the purpose of deanonymizing Alice.

This problem could be solved via the incorporation of a “cross-over point” such as that described in MIXMINION, for example by having Alice delegating the transmission of a SURB Reply to a randomly selected crossover point in the mix network, so that if the first hop in the SURB’s return path is a malicious mix, the only information gained is the identity of the cross-over point.

Appendix A. References

Appendix A.1 Normative References

ANONTRILEMMA

Das, D., Meiser, S., Mohammadi, E., Kate, A., “Anonymity Trilemma: Strong Anonymity, Low Bandwidth Overhead, Low Latency—Choose Two”, IEEE Symposium on Security and Privacy, 2018, https://eprint.iacr.org/2017/954.pdf

COMPULS05

Danezis, G., Clulow, J., “Compulsion Resistant Anonymous Communications”, Proceedings of Information Hiding Workshop, June 2005, https://www.freehaven.net/anonbib/cache/ih05-danezisclulow.pdf

MIXMINION

Danezis, G., Dingledine, R., Mathewson, N., “Mixminion: Design of a Type III Anonymous Remailer Protocol”, https://www.mixminion.net/minion-design.pdf

NOTIONS

Christiane Kuhn, Martin Beck, Stefan Schiffner, Eduard Jorswieck and Thorsten Strufe, PETS 2019, https://petsymposium.org/2019/files/papers/issue2/popets-2019-0022.pdf

SPHINX09

Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix Format”, DOI 10.1109/SP.2009.15, May 2009, https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf

4.10 - Wire Protocol

Abstract

This document defines the Katzenpost Mix Network Wire Protocol for use in all network communications to, from, and within the Katzenpost Mix Network.

1. Introduction

The Katzenpost Mix Network Wire Protocol (KMNWP) is the custom wire protocol for all network communications to, from, and within the Katzenpost Mix Network. This protocol provides mutual authentication, and an additional layer of cryptographic security and forward secrecy.

1.1 Conventions Used in This Document

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data structures, except for cryptographic attributes, which are specified as opaque byte vectors.

x | y denotes the concatenation of x and y.

1.2 Kyber Key Encapsulation Mechanism

This protocol uses the Kyber Key Encapsulation Mechanism KYBER (NIST round 3). Kyber is one of the finalists in the NIST post-quantum cryptography project. Please see the Kyber project page for more information:

2. Core Protocol

The protocol is based on Kyber and Trevor Perrin’s Noise Protocol Framework NOISE along with “Post Quantum Noise” paper PQNOISE. Older previous versions of our transport were based on NOISEHFS.

Our transport protocol begins with a prologue, Noise handshake, followed by a stream of Noise Transport messages in a minimal framing layer, over a TCP/IP connection.

Our Noise protocol string:

Noise_pqXX_Kyber768X25519_ChaChaPoly_BLAKE2b

The protocol string is a very condensed description of our protocol. We use the pqXX two way Noise pattern which is described as follows:

pqXX: -> e <- ekem, s -> skem, s <- skem

The next part of the protocol string specifies the KEM, Kyber768X25519 which is a hybrid KEM where the share secret outputs of both X25519 and Kyber768 are combined.

Finally the ChaChaPoly_BLAKE2b parts of the protocol string indicate which stream cipher and hash function we are using.

As a non-standard modification to the Noise protocol, the 65535 byte message length limit is increased to 1300000 bytes. We send very large messages over our Noise protocol because of our using the Sphincs+ signature scheme which has signatures that are about 49k bytes.

It is assumed that all parties using the KMNWP protocol have a fixed long or short lived Kyber768X25519 keypair (KYBER and RFC7748), the public component of which is known to the other party in advance. How such keys are distributed is beyond the scope of this document.

2.1 Handshake Phase

All sessions start in the Handshake Phase, in which an anonymous authenticated handshake is conducted.

The handshake is a unmodified Noise handshake, with a fixed prologue prefacing the initiator's first Noise handshake message. This prologue is also used as the prologue input to the Noise HandshakeState Initialize() operation for both the initiator and responder.

The prologue is defined to be the following structure:

struct {
    uint8_t protocol_version; /* 0x03 */
} Prologue;

As all Noise handshake messages are fixed sizes, no additional framing is required for the handshake.

Implementations MUST preserve the Noise handshake hash [h] for the purpose of implementing authentication (Section 2.3).

Implementations MUST reject handshake attempts by terminating the session immediately upon any Noise protocol handshake failure and when, as a responder, they receive a Prologue containing an unknown protocol_version value.

Implementations SHOULD impose reasonable timeouts for the handshake process, and SHOULD terminate sessions that are taking too long to handshake.

2.1.1 Handshake Authentication

Mutual authentication is done via exchanging fixed sized payloads as part of the pqXX handshake consisting of the following structure:

struct {
    uint8_t ad_len;
    opaque additional_data[ad_len];
    opaque padding[255 - ad_len];
    uint32_t unix_time;
} AuthenticateMessage;

Where:

  • ad_len - The length of the optional additional data.
  • additional_data - Optional additional data, such as a username, if any.
  • unix_time - 0 for the initiator, the approximate number of seconds since 1970-01-01 00:00:00 UTC for the responder.

The initiator MUST send the AuthenticateMessage after it has received the peer's response (so after -> s, se in Noise parlance).

The contents of the optional additional_data field is deliberately left up to the implementation, however it is RECOMMENDED that implementations pad the field to be a consistent length regardless of contents to avoid leaking information about the authenticating identity.

To authenticate the remote peer given an AuthenticateMessage, the receiving peer must validate the s component of the Noise handshake (the remote peer's long term public key) with the known value, along with any of the information in the additional_data field such as the user name, if any.

If the validation procedure succeeds, the peer is considered authenticated. If the validation procedure fails for any reason, the session MUST be terminated immediately.

Responders MAY add a slight amount (+- 10 seconds) of random noise to the unix_time value to avoid leaking precise load information via packet queueing delay.

2.2 Data Transfer Phase

Upon successfully concluding the handshake the session enters the Data Transfer Phase, where the initiator and responder can exchange KMNWP messages.

A KMNWP message is defined to be the following structure:

enum {
    no_op(0),
    disconnect(1),
    send_packet(2),

    (255),
} Command;

struct {
    Command command;
    uint8_t reserved;    /* MUST be '0x00' */
    uint32_t msg_length; /* 0 <= msg_length <= 1048554) */
    opaque message[msg_length];
    opaque padding[];    /* length is implicit */
} Message;

Notes:

  • The padding field, if any MUST be padded with '0x00' bytes.

All outgoing Message(s) are encrypted and authenticated into a pair of Noise Transport messages, each containing one of the following structures:

struct {
    uint32_t message_length;
} CiphertextHeader;

struct {
    uint32_t message[ciphertext_length-16];
} Ciphertext;

Notes:

  • The ciphertext_length field includes the Noise protocol overhead of 16 bytes, for the Noise Transport message containing the Ciphertext.

All outgoing Message(s) are preceded by a Noise Transport Message containing a CiphertextHeader, indicating the size of the Noise Transport Message transporting the Message Ciphertext. After generating both Noise Transport Messages, the sender MUST call the Noise CipherState Rekey() operation.

To receive incoming Ciphertext messages, first the Noise Transport Message containing the CiphertextHeader is consumed off the network, authenticated and decrypted, giving the receiver the length of the Noise Transport Message containing the actual message itself. The second Noise Transport Message is consumed off the network, authenticated and decrypted, with the resulting message being returned to the caller for processing. After receiving both Noise Transport Messages, the receiver MUST call the Noise CipherState Rekey() operation.

Implementations MUST immediately terminate the session any of the DecryptWithAd() operations fails.

Implementations MUST immediately terminate the session if an unknown command is received in a Message, or if the Message is otherwise malformed in any way.

Implementations MAY impose a reasonable idle timeout, and terminate the session if it expires.

3. Predefined Commands

3.1 The no_op Command

The no_op command is a command that explicitly is a No Operation, to be used to implement functionality such as keep-alives and or application layer padding.

Implementations MUST NOT send any message payload accompanying this command, and all received command data MUST be discarded without interpretation.

3.2 The disconnect Command

The disconnect command is a command that is used to signal explicit session termination. Upon receiving a disconnect command, implementations MUST interpret the command as a signal from the peer that no additional commands will be sent, and destroy the cryptographic material in the receive CipherState.

While most implementations will likely wish to terminate the session upon receiving this command, any additional behavior is explicitly left up to the implementation and application.

Implementations MUST NOT send any message payload accompanying this command, and MUST not send any further traffic after sending a disconnect command.

3.3 The send_packet Command

The send_packet command is the command that is used by the initiator to transmit a Sphinx Packet over the network. The command’s message is the Sphinx Packet destined for the responder.

Initiators MUST terminate the session immediately upon reception of a send_packet command.

4. Anonymity Considerations

Adversaries being able to determine that two parties are communicating via KMNWP is beyond the threat model of this protocol. At a minimum, it is trivial to determine that a KMNWP handshake is being performed, due to the length of each handshake message, and the fixed positions of the various public keys.

5. Security Considerations

It is imperative that implementations use ephemeral keys for every handshake as the security properties of the Kyber KEM are totally lost if keys are ever reused.

Kyber was chosen as the KEM algorithm due to it’s conservative parameterization, simplicty of implementation, and high performance in software. It is hoped that the addition of a quantum resistant algorithm will provide forward secrecy even in the event that large scale quantum computers are applied to historical intercepts.

6. Acknowledgments

I would like to thank Trevor Perrin for providing feedback during the design of this protocol, and answering questions regarding Noise.

Appendix A. References

Appendix A.1 Normative References

Appendix A.2 Informative References

Appendix B. Citing This Document

Appendix B.1 Bibtex Entry

Note that the following bibtex entry is in the IEEEtran bibtex style as described in a document called “How to Use the IEEEtran BIBTEX Style”.

@online{KatzMixWire,
title = {Katzenpost Mix Network Wire Protocol Specification},
author = {Yawning Angel},
url = {https://github.com/katzenpost/katzenpost/blob/master/docs/specs/wire-protocol.rst},
year = {2017}
}

KYBER

Joppe Bos, Léo Ducas, Eike Kiltz, Tancrède Lepoint, Vadim Lyubashevsky, John M. Schanck, Peter Schwabe, Gregor Seiler, Damien Stehlé “CRYSTALS – Kyber: a CCA-secure module-lattice-based KEM”, https://cryptojedi.org/papers/kyber-20180716.pdf

NOISE

Perrin, T., “The Noise Protocol Framework”, May 2017, https://noiseprotocol.org/noise.pdf

NOISEHFS

Weatherley, R., “Noise Extension: Hybrid Forward Secrecy”, https://github.com/noiseprotocol/noise_hfs_spec/blob/master/output/noise_hfs.pdf

PQNOISE

Yawning Angel, Benjamin Dowling, Andreas Hülsing, Peter Schwabe and Florian Weber, “Post Quantum Noise”, 2022, https://eprint.iacr.org/2022/539.pdf

RFC2119

Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, https://www.rfc-editor.org/info/rfc2119

RFC5246

Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008, https://www.rfc-editor.org/info/rfc5246

RFC7748

Langley, A., Hamburg, M., and S. Turner, “Elliptic Curves for Security”, RFC 7748, DOI 10.17487/RFC7748, January 2016, http://www.rfc-editor.org/info/rfc7748

5 - Mixnet Academy

These works are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

5.1 - Syllabus, Learn Mix Networks for Great Good

(maybe the second title should be: Prevent Murder using Mathematics)

Course Syllabus And Reading List

There are no good introductory papers on mix networks. Instead, the approach is to read all the really important academic papers on mix networks. These papers are roughly organized into several categories such as:

Missing from this list are verified shuffles. These are specialized mix strategies which at times are very useful for specific use cases such as voting.

In a few of these mixnet sections I have included youtube videos I've made to help explain some of the fundamental mixnet concepts. As you read these mixnet papers keep in mind that decryption mixnets have the following attack categories:

  • tagging attacks
  • n-1 attacks
  • compulsion attacks
  • statistical disclosure attacks
  • epistemic attacks

After all this mix network literature we turn to the Classical Packet Switching Network Literature below in the next major section of reading. Many of these important papers happen to not be academic papers but rather come from industry / IETF and are RFCs. Why read these? Aren’t mixnet papers enough? Yes if you want to only publish papers on mix networks then reading about only mix networks may be enough.

However if you want to design real world mix network systems then understanding the mathematical limitations of the packet switching networking design space is extremely important! You must read about the early Internet design mistakes to understand what not to do in your mix network designs. In your mix network designs you must take care to avoid such fatal conditions such as Congestion Collapse.

Have questions? Sit on them for a week and voraciously read papers. If you still have questions then do feel free to ask me. We have a mailing list and IRC channel for such things:

Mix Network Fundamentals

Mix Strategies

Mix Network Topology

Compulsion Attacks And Packet Format

Note that Jeff Burdges has designed but not completely specified a new forward secure mix design that uses Post Quantum cryptographic ratchets. You can learn more about this here:

Statistical Disclosure Attacks and Decoy Traffic

Epistemic Attacks

Modern Mix Network Designs

Classical Packet Switching Network Literature

Congestion Control

Automatic Repeat Request Protocol Considerations

NOTE: many more papers by Milica Stojanovic about underwater acoustic network protocols can be found here:

Router Scheduling (for general purpose computers)

Active Queue Management

Attacks on Congestion Control

Congestion Control with Explicit Signaling

NOTE: for more reading on this subject refer to Dr. Sally Floyd’s ECN reading list