A detailed design specification for our PQ Noise based wire protocol, which is used for transport encryption between all the mix nodes and dirauth nodes.
This section of the Katzenpost technical documentation provides an introduction to the
software components that make up Katzenpost and guidance on how to configure each
component. The intended reader is a system administrator who wants to implement a working,
production Katzenpost network.
For information about the theory and design of this software, see ???. For a quickly deployable,
non-production test network (primarily for use by developers), see Configuring Katzenpost.
Understanding the Katzenpost components
The core of Katzenpost consists of two program executables, dirauth and server.
Running the dirauth command runs a directory authority node, or dirauth, that functions
as part of the mixnet's public-key infrastructure (PKI). Running the server command runs
either a mix node, a gateway node, or a service node, depending on the configuration.
Configuration settings are provided in an associated katzenpost-authority.toml or
katzenpost.toml file, respectively.
In addition to the server components, Katzenpost also supports connections to
client applications hosted externally to the mix network and communicating with it
through gateway nodes.
A model mix network is shown in Figure 1.
Figure 1. The pictured element types correspond to discrete client and server programs that
Katzenpost requires to function.
The mix network contains an n-layer topology of mix-nodes, with
three nodes per layer in this example. Sphinx packets traverse the network in one
direction only. The gateway nodes allow clients to interact with the mix network. The
service nodes provide mix network services that mix network clients can interact with.
All messages sent by clients are handed to a connector daemon
hosted on the client system, passed across the Internet to a gateway, and then relayed
to a service node by way of the mix node layers. The service node sends its reply back
across the mix-node layers to a gateway, which transmits it across the Internet to be
received by the targeted client. The mix, gateway, and service nodes send mix
descriptors to the dirauths and retrieve a consensus
document from them, described below.
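As an illustration of the traffic flow described above, the following minimal Python sketch picks one forward path through a layered topology. The node names, the pick_route helper, and the uniform random choice of one mix node per layer are assumptions made for illustration; they are not taken from the Katzenpost code base.

```python
import random

def pick_route(gateways, layers, services, rng=random):
    """Return one forward path: a gateway, one mix per layer, then a service node."""
    path = [rng.choice(gateways)]
    # Layers are traversed in order, matching the one-directional topology.
    path.extend(rng.choice(layer) for layer in layers)
    path.append(rng.choice(services))
    return path

# Hypothetical topology: three layers with three mix nodes each.
layers = [[f"mix{layer}-{slot}" for slot in range(3)] for layer in range(3)]
route = pick_route(["gw1", "gw2"], layers, ["svc1"])
print(route)
```

A reply would follow the same pattern in the forward direction of the layers, ending at a gateway instead of a service node.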
In addition to the server components, Katzenpost supports connections to client
applications hosted externally to the mix network and communicating with it through
gateway nodes and, in some cases, a client connector.
Directory authorities (dirauths)
Dirauths compose the decentralized public key infrastructure (PKI) that serves as
the root of security for the entire mix network. Clients, mix nodes, gateway nodes,
and service nodes rely on the PKI/dirauth system to maintain and sign an up-to-date
consensus document, providing a view of the network including connection information
and public cryptographic key materials and signatures.
Every 20 minutes (the current value for an epoch), each mix,
gateway, and service node signs a mix descriptor and uploads it to the dirauths. The
dirauths then vote on a new consensus document. If consensus is reached, each
dirauth signs the document. Clients and nodes download the document as needed and
verify the signatures. Consensus fails when 1/2 + 1 nodes fail, which yields greater
fault tolerance than, for example, Byzantine Fault Tolerance, which fails when 1/3 +
1 of the nodes fail.
The PKI signature scheme is fully configurable by the dirauths. Our recommendation
is to use a hybrid signature scheme consisting of classical Ed25519 and the
post-quantum, stateless, hash-based signature scheme known as Sphincs+ (with the
parameters: "sphincs-shake-256f"), which is designated in Katzenpost
configurations as "Ed25519 Sphincs+". Examples are provided below.
Mix nodes
The mix node is the fundamental building block of the mix network.
Katzenpost mix nodes are arranged in a layered topology to achieve the best
levels of anonymity and ease of analysis while being flexible enough to scale with
traffic demands.
Gateway nodes
Gateway nodes provide external client access to the mix network. Because gateways
are uniquely positioned to identify clients, they are designed to have as little
information about client behavior as possible. Gateways are randomly selected and
have no persistent relationship with clients and no knowledge of whether a client's
packets are decoys or not. When client traffic through a gateway is slow, the node
additionally generates decoy traffic.
Service nodes
Service nodes provide functionality requested by clients. They are
logically positioned at the deepest point of the mix network, with incoming queries
and outgoing replies both needing to traverse all n layers of
mix nodes. A service node's functionality may involve storing messages, publishing
information outside of the mixnet, interfacing with a blockchain node, and so on.
Service nodes also process decoy packets.
Clients
Client applications should be designed so that the following conditions are
met:
Separate service requests from a client are unlinkable. Repeating the same
request may lead to linkability.
Service nodes and clients have no persistent relationship.
Clients generate a stream of packets addressed to random or pseudorandom
services regardless of whether a real service request is being made. Most of
these packets will be decoy traffic.
Traffic from a client to a service node must be correctly coupled with
decoy traffic. This can mean that the service node is chosen independently
from traffic history, or that the transmitted packet replaces a decoy packet
that was meant to go to the desired service.
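The coupling of real and decoy traffic can be pictured with a small sketch. It assumes a client that emits exactly one packet per send tick and lets a queued request take the slot that a decoy would otherwise have used. The next_packet helper, the dictionary layout, and the service names are hypothetical.

```python
import random
from collections import deque

def next_packet(queue, services, rng=random):
    """Choose what to send at one send tick.

    A queued request takes the slot; otherwise a decoy addressed to a
    randomly chosen service is sent, so the send schedule never changes.
    """
    if queue:
        service, payload = queue.popleft()
        return {"service": service, "payload": payload, "decoy": False}
    return {"service": rng.choice(services), "payload": b"", "decoy": True}

services = ["echo", "storage"]           # hypothetical service names
pending = deque([("echo", b"hello")])    # one real request waiting
first = next_packet(pending, services)
second = next_packet(pending, services)
print(first["decoy"], second["decoy"])
```

Because a packet leaves on every tick either way, an observer of the send schedule alone cannot tell which ticks carried a real request.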
Katzenpost currently includes several client applications. All applications
make extensive use of Sphinx single-use reply blocks (SURBs), which enable service
nodes to send replies without knowing the location of the client. Newer clients
require a connection through the client connector, which
provides multiplexing and privilege separation with a consequent reduction in
processing overhead. These clients also implement the Pigeonhole storage and BACAP
protocols detailed in Place-holder for research paper link.
The following client applications are available.
Table 1. Katzenpost clients
Name | Needs connector | Description | Code
Ping | no | The mix network equivalent of an ICMP ping utility, used for network testing. |
This section documents the configuration parameters for each type of Katzenpost
server node. Each node has its own configuration file in TOML format.
Configuring directory authorities
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/auth1/authority.toml. In a
real-world mixnet, the component hosts would not be sharing a single IP address. For
more information about the test mixnet, see ???.
Specifies the human-readable identifier for a node, which must be unique
per mixnet. The identifier can be an FQDN but does not have to
be.
Type: string
Required: Yes
WireKEMScheme
Specifies the key encapsulation mechanism (KEM) scheme for the PQ Noise-based wire
protocol (link layer) that nodes use to communicate with each other. PQ Noise is a
post-quantum variation of the Noise protocol framework, which algebraically transforms
ECDH handshake patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
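The idea behind the NIKE-to-KEM conversion mentioned in the note can be sketched as follows. This is a toy illustration only: it uses modular Diffie-Hellman over a small prime in place of X25519 or X448, and the choice of SHA-256 and of the exact values fed into the hash are assumptions, not a description of the Katzenpost implementation.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus for illustration, NOT a secure group
G = 3

def keygen():
    """Return a (private, public) NIKE key pair."""
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def _derive(shared, eph_pub, recipient_pub):
    # Hash the NIKE output together with both public keys.
    data = b"".join(v.to_bytes(16, "big") for v in (shared, eph_pub, recipient_pub))
    return hashlib.sha256(data).digest()

def encapsulate(recipient_pub):
    """KEM encapsulation: the ciphertext is a fresh ephemeral public key."""
    eph_sk, eph_pub = keygen()
    shared = pow(recipient_pub, eph_sk, P)
    return eph_pub, _derive(shared, eph_pub, recipient_pub)

def decapsulate(sk, pk, ciphertext):
    """KEM decapsulation: redo the NIKE with the received ephemeral key."""
    shared = pow(ciphertext, sk, P)
    return _derive(shared, ciphertext, pk)

sk, pk = keygen()
ct, sender_key = encapsulate(pk)
print(decapsulate(sk, pk, ct) == sender_key)
```

The sender and recipient arrive at the same 32-byte shared secret without any interaction beyond the single ciphertext, which is what lets a NIKE stand in wherever a KEM is expected.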
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth
nodes similarly sign PKI documents using the same scheme.
The following signature schemes are supported: "ed25519", "ed448",
"Ed25519 Sphincs+", "Ed448-Sphincs+", "Ed25519-Dilithium2",
"Ed448-Dilithium3"
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains
the transport protocol, IP address, and port number that the node will
bind to for incoming connections. Katzenpost supports URLs that
start with either "tcp://" or "quic://" such as:
["tcp://192.168.1.1:30001"] and ["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
DataDir
Specifies the absolute path to a node's state directory. This is
where persistence.db is written to disk and
where a node stores its cryptographic key materials when started with
the "-g" command-line option.
Type: string
Required: Yes
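The constraints described for these settings can be checked mechanically before a node is started. The sketch below works on an already-parsed configuration section represented as a Python dictionary. The helper names and the example DataDir value are hypothetical, and the checks cover only what the text above states (required fields, "tcp://" or "quic://" address URLs, and an absolute DataDir).

```python
import posixpath
from urllib.parse import urlsplit

REQUIRED = ("WireKEMScheme", "PKISignatureScheme", "Addresses", "DataDir")

def address_ok(addr):
    """True if addr looks like tcp://host:port or quic://host:port."""
    try:
        url = urlsplit(addr)
        return url.scheme in ("tcp", "quic") and bool(url.hostname) and url.port is not None
    except ValueError:  # raised for a malformed port or host
        return False

def check_server_section(cfg):
    """Return a list of problems found in a parsed Server section."""
    errors = [f"missing {key}" for key in REQUIRED if not cfg.get(key)]
    errors += [f"bad address {a!r}" for a in cfg.get("Addresses", []) if not address_ok(a)]
    data_dir = cfg.get("DataDir", "")
    if data_dir and not posixpath.isabs(data_dir):
        errors.append("DataDir must be an absolute path")
    return errors

example = {
    "WireKEMScheme": "x25519",
    "PKISignatureScheme": "Ed25519 Sphincs+",
    "Addresses": ["tcp://192.168.1.1:30001"],
    "DataDir": "/var/lib/katzenpost-authority",  # hypothetical path
}
print(check_server_section(example))
```

An empty list means that none of the checked constraints were violated; it does not prove that the configuration is otherwise valid.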
Dirauth: Authorities section
An Authorities section is configured for each peer authority. We
recommend using TOML's multi-line string style for key materials.
Specifies the human-readable identifier for the node, which must be
unique per mixnet. The identifier can be an FQDN but does not have to
be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme used by all directory
authority nodes. PKISignatureScheme must match the scheme
specified in the Server section of the configuration.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
Specifies the key encapsulation mechanism (KEM) scheme for the PQ Noise-based wire
protocol (link layer) that nodes use to communicate with each other. PQ Noise is a
post-quantum variation of the Noise protocol framework, which algebraically transforms
ECDH handshake patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
Addresses
Specifies a list of one or more address URLs in a format that contains
the transport protocol, IP address, and port number that the node will
bind to for incoming connections. Katzenpost supports URLs that
start with either "tcp://" or "quic://" such as:
["tcp://192.168.1.1:30001"] and ["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
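Two of the rules stated for peer authorities, unique identifiers and a PKISignatureScheme that matches the Server section, lend themselves to a simple consistency check. The sketch below assumes the Authorities entries have already been parsed into dictionaries; the check_peers helper and the "Identifier" key name used for the identifier field are assumptions for illustration.

```python
def check_peers(local_scheme, peers):
    """Return problems found in a list of parsed Authorities entries."""
    problems = []
    seen = set()
    for peer in peers:
        name = peer["Identifier"]  # assumed key name for the identifier field
        if name in seen:
            problems.append(f"duplicate identifier {name}")
        seen.add(name)
        if peer["PKISignatureScheme"] != local_scheme:
            problems.append(f"{name}: PKISignatureScheme differs from Server section")
    return problems

peers = [  # hypothetical peer entries
    {"Identifier": "auth2", "PKISignatureScheme": "Ed25519 Sphincs+"},
    {"Identifier": "auth3", "PKISignatureScheme": "ed25519"},
]
print(check_peers("Ed25519 Sphincs+", peers))
```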
Dirauth: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
Specifies the maximum allowed rate of packets per client per gateway
node. Rate limiting is done on the gateway nodes.
Type: uint64
Required: Yes
Mu
Specifies the inverse of the mean of the exponential distribution from
which the Sphinx packet per-hop mixing delay will be sampled.
Type: float64
Required: Yes
MuMaxDelay
Specifies the maximum Sphinx packet per-hop mixing delay in
milliseconds.
Type: uint64
Required: Yes
LambdaP
Specifies the inverse of the mean of the exponential distribution that
clients sample to determine the time interval between sending messages,
whether actual messages from the FIFO egress queue or decoy messages if
the queue is empty.
Type: float64
Required: Yes
LambdaPMaxDelay
Specifies the maximum send delay interval for LambdaP in
milliseconds.
Type: uint64
Required: Yes
LambdaL
Specifies the inverse of the mean of the exponential distribution that
clients sample to determine the delay interval between loop
decoys.
Type: float64
Required: Yes
LambdaLMaxDelay
Specifies the maximum send delay interval for LambdaL in
milliseconds.
Type: uint64
Required: Yes
LambdaD
LambdaD is the inverse of the mean of the exponential distribution
that clients sample to determine the delay interval between decoy drop
messages.
Type: float64
Required: Yes
LambdaDMaxDelay
Specifies the maximum send delay interval for LambdaD in milliseconds.
Type: uint64
Required: Yes
LambdaM
LambdaM is the inverse of the mean of the exponential distribution
that mix nodes sample to determine the delay between mix loop
decoys.
Type: float64
Required: Yes
LambdaG
LambdaG is the inverse of the mean of the exponential distribution
that gateway nodes sample to determine the delay between gateway node
decoys.
Warning
Do not set this value manually in the TOML configuration file. The
field is used internally by the dirauth server state machine.
Type: float64
Required: Yes
LambdaMMaxDelay
Specifies the maximum delay for LambdaM in milliseconds.
Type: uint64
Required: Yes
LambdaGMaxDelay
Specifies the maximum delay for LambdaG in milliseconds.
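Each rate parameter above is paired with a maximum delay. A minimal sketch of how such a pair might be used is shown below. It assumes that the sampled value is interpreted in milliseconds and that samples above the maximum are clamped to it; both points, the sample_delay_ms helper, and the numeric values are assumptions for illustration.

```python
import random

def sample_delay_ms(rate, max_delay_ms, rng=random):
    """Sample an exponential delay with mean 1/rate, capped at max_delay_ms."""
    delay = rng.expovariate(rate)  # rate is the inverse of the mean
    return min(round(delay), max_delay_ms)

# Hypothetical values: a rate of 0.005 gives a mean delay of 200 ms.
rng = random.Random(7)
delays = [sample_delay_ms(0.005, 1000, rng) for _ in range(5)]
print(delays)
```

A smaller rate yields longer average delays, so the maximum delay setting bounds the worst case regardless of how the rate is tuned.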
Specifies the human-readable identifier for a node, which must be unique
per mixnet. The identifier can be an FQDN but does not have to
be.
Type: string
IdentityPublicKeyPem
Path and file name of a mix node's public identity signing key, also
known as the identity key, in PEM format.
Type: string
Required: Yes
Dirauth: SphinxGeometry section
Sphinx is an encrypted nested-packet format designed primarily for mixnets.
The original Sphinx paper described a non-interactive key exchange
(NIKE) employing classical encryption. The Katzenpost implementation
strongly emphasizes configurability, supporting key encapsulation mechanisms
(KEMs) as well as NIKEs, and enabling the use of either classical or hybrid
post-quantum cryptography. Hybrid constructions offset the newness of
post-quantum algorithms by offering heavily tested classical algorithms as a
fallback.
Note
Sphinx, the nested-packet format, should not be confused with Sphincs or Sphincs+, which
are post-quantum signature schemes.
Katzenpost Sphinx also relies on the following classical cryptographic
primitives:
CTR-AES256, a stream cipher
HMAC-SHA256, a message authentication code (MAC) function
HKDF-SHA256, a key derivation function (KDF)
AEZv5, a strong pseudorandom permutation (SPRP)
All dirauths must be configured to use the same SphinxGeometry
parameters. Any geometry not advertised by the PKI document will fail. Each
dirauth publishes the hash of its SphinxGeometry parameters in the
PKI document for validation by its peer dirauths.
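Comparing geometries by hash can be sketched as follows. The serialization and hash function that Katzenpost actually uses are not described in this section, so canonical JSON and SHA-256 serve here only as stand-ins, and the numeric values are hypothetical.

```python
import hashlib
import json

def geometry_digest(geometry):
    """Hash a geometry dict in a key-order-independent way."""
    canonical = json.dumps(geometry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Hypothetical values for two dirauths that should agree.
local = {"HeaderLength": 476, "SURBLength": 572, "NIKEName": "x25519"}
peer = {"NIKEName": "x25519", "SURBLength": 572, "HeaderLength": 476}
print(geometry_digest(local) == geometry_digest(peer))
```

Any difference in a single parameter changes the digest, which is why mismatched geometries can be detected without exchanging the full parameter set.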
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The settings in this section are generated by the gensphinx
utility, which computes the Sphinx geometry based on the following user-supplied
directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
The output in TOML should then be pasted unchanged into the node's configuration
file, as shown below. For more information, see ???.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest routing-information
block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a Recipient section, both of which
are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually exclusive.
Type: string
Required: Yes
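The mutual exclusivity of NIKEName and KEMName can be expressed as a small check on a parsed SphinxGeometry section. The sketch assumes that the unused field is left as an empty string; that convention and the check_scheme_choice helper are assumptions for illustration.

```python
def check_scheme_choice(geometry):
    """Return ("NIKE", name) or ("KEM", name); reject zero or two choices."""
    nike = geometry.get("NIKEName", "")
    kem = geometry.get("KEMName", "")
    if bool(nike) == bool(kem):
        raise ValueError("set exactly one of NIKEName or KEMName")
    return ("NIKE", nike) if nike else ("KEM", kem)

print(check_scheme_choice({"NIKEName": "x25519", "KEMName": ""}))
```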
Configuring mix nodes
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/mix1/katzenpost.toml. In a
real-world mixnet, the component hosts would not be sharing a single IP address. For
more information about the test mixnet, see ???.
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
WireKEM
WireKEM specifies the key encapsulation mechanism (KEM) scheme for the PQ Noise-based
wire protocol (link layer) that nodes use to communicate with each other. PQ Noise is a
post-quantum variation of the Noise protocol framework, which algebraically transforms
ECDH handshake patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
BindAddresses
If true, allows setting of listener
addresses that the server will bind to and accept connections on. These
addresses are not advertised in the PKI.
Type: bool
Required: No
MetricsAddress
Specifies the address/port to bind the Prometheus metrics endpoint
to.
Type: string
Required: No
DataDir
Specifies the absolute path to a node's state directory. This is where
persistence.db is written to disk and where a node stores its cryptographic
key materials when started with the "-g" command-line option.
Type: string
Required: Yes
IsGatewayNode
If true, the server is a gateway
node.
Type: bool
Required: No
IsServiceNode
If true, the server is a service
node.
Type: bool
Required: No
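The way the two boolean settings select a node type can be summarized in a few lines. The sketch assumes that both flags default to false and that setting both at once is an error; the text above does not state either point, so treat them, and the node_role helper, as assumptions.

```python
def node_role(server_cfg):
    """Map the IsGatewayNode / IsServiceNode flags to a node type."""
    is_gateway = server_cfg.get("IsGatewayNode", False)
    is_service = server_cfg.get("IsServiceNode", False)
    if is_gateway and is_service:
        # Assumed rule: a server is exactly one of the three node types.
        raise ValueError("IsGatewayNode and IsServiceNode are both set")
    if is_gateway:
        return "gateway"
    if is_service:
        return "service"
    return "mix"

print(node_role({}), node_role({"IsGatewayNode": True}))
```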
Mix node: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
The name of the wire protocol key-encapsulation mechanism (KEM) to use.
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
Mix node: Management section
The Management section specifies connectivity information for the Katzenpost control
protocol, which can be used to make run-time configuration changes. A configuration
resembles the following:
Specifies the path to the management interface socket. If left empty, management_sock
is located in the configured DataDir.
Type: string
Required: No
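The fallback rule for the socket location can be written out as a one-line helper. The function name and the example DataDir value below are hypothetical.

```python
import posixpath

def management_socket_path(path, data_dir):
    """Use the configured path, or fall back to management_sock in DataDir."""
    return path or posixpath.join(data_dir, "management_sock")

# Hypothetical DataDir value.
print(management_socket_path("", "/var/lib/katzenpost"))
```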
Mix node: SphinxGeometry section
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The settings in this section are generated by the gensphinx
utility, which computes the Sphinx geometry based on the following user-supplied
directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
The output in TOML should then be pasted unchanged into the node's configuration
file, as shown below. For more information, see ???.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest routing-information
block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a Recipient section, both of which
are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually exclusive.
Type: string
Required: Yes
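The linear relationship between hop count and header size noted earlier in this section can be illustrated with a simplified model in which the header consists of a fixed part plus one block of per-hop routing information for each hop. The model, the header_length helper, and the byte counts are assumptions for illustration; real values must come from gensphinx.

```python
def header_length(fixed_overhead, per_hop_routing_info, hops):
    """Simplified model: fixed part plus one routing block per hop."""
    routing_info = per_hop_routing_info * hops
    return fixed_overhead + routing_info

# Hypothetical byte counts: each added hop costs one per-hop block.
for hops in (5, 6, 7):
    print(hops, header_length(50, 82, hops))
```

This interdependence is one reason why a single geometry value cannot be edited by hand without invalidating the others.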
Mix node: Debug section
The Debug section is the Katzenpost server debug configuration
for advanced tuning.
Specifies the number of worker instances to use for inbound Sphinx
packet processing.
Type: int
Required: No
NumProviderWorkers
Specifies the number of worker instances to use for provider-specific
packet processing.
Type: int
Required: No
NumKaetzchenWorkers
Specifies the number of worker instances to use for Kaetzchen-specific
packet processing.
Type: int
Required: No
SchedulerExternalMemoryQueue
If true, the experimental disk-backed external memory
queue is enabled.
Type: bool
Required: No
SchedulerQueueSize
Specifies the maximum scheduler queue size before random entries will
start getting dropped. A value less than or equal to zero is treated as
unlimited.
Type: int
Required: No
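The drop behavior described for SchedulerQueueSize can be pictured with a short sketch. It assumes that one uniformly chosen entry is dropped each time the limit is exceeded; the actual scheduler may select entries differently, and the enqueue helper is hypothetical.

```python
import random

def enqueue(queue, packet, max_size, rng=random):
    """Append packet; if the limit is exceeded, drop one random entry.

    A max_size less than or equal to zero is treated as unlimited.
    Returns the dropped entry, or None if nothing was dropped.
    """
    queue.append(packet)
    if max_size > 0 and len(queue) > max_size:
        return queue.pop(rng.randrange(len(queue)))
    return None

bounded, unbounded = [], []
for n in range(5):
    enqueue(bounded, n, 3)
    enqueue(unbounded, n, 0)
print(len(bounded), len(unbounded))
```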
SchedulerMaxBurst
Specifies the maximum number of packets that will be dispatched per
scheduler wakeup event.
Type:
Required: No
UnwrapDelay
Specifies the maximum unwrap delay due to queueing in
milliseconds.
Type: int
Required: No
GatewayDelay
Specifies the maximum gateway node worker delay due to queueing in milliseconds.
Type: int
Required: No
ServiceDelay
Specifies the maximum provider delay due to queueing in
milliseconds.
Type: int
Required: No
KaetzchenDelay
Specifies the maximum kaetzchen delay due to queueing in
milliseconds.
Type: int
Required: No
SchedulerSlack
Specifies the maximum scheduler slack due to queueing and/or
processing in milliseconds.
Type: int
Required: No
SendSlack
Specifies the maximum send-queue slack due to queueing and/or
congestion in milliseconds.
Type: int
Required: No
DecoySlack
Specifies the maximum decoy sweep slack due to external
delays such as latency before a loop decoy packet will be considered
lost.
Type: int
Required: No
ConnectTimeout
Specifies the maximum time a connection can take to establish a
TCP/IP connection in milliseconds.
Type: int
Required: No
HandshakeTimeout
Specifies the maximum time a connection can take for a link-protocol
handshake in milliseconds.
Type: int
Required: No
ReauthInterval
Specifies the interval at which a connection will be reauthenticated
in milliseconds.
Type: int
Required: No
SendDecoyTraffic
If true, decoy traffic is enabled.
This parameter is experimental and untuned,
and is disabled by default.
Note
This option will be removed once decoy traffic is fully implemented.
Type: bool
Required: No
DisableRateLimit
If true, the per-client rate limiter is disabled.
Note
This option should only be used for testing.
Type: bool
Required: No
GenerateOnly
If true, the server immediately halts
and cleans up after long-term key generation.
Type: bool
Required: No
Configuring gateway nodes
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/gateway1/katzenpost.toml.
In a real-world mixnet, the component hosts would not be sharing a single IP
address. For more information about the test mixnet, see ???.
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
WireKEM
WireKEM specifies the key encapsulation mechanism (KEM) scheme for the PQ Noise-based
wire protocol (link layer) that nodes use to communicate with each other. PQ Noise is a
post-quantum variation of the Noise protocol framework, which algebraically transforms
ECDH handshake patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
BindAddresses
If true, allows setting of listener
addresses that the server will bind to and accept connections on. These
addresses are not advertised in the PKI.
Type: bool
Required: No
MetricsAddress
Specifies the address/port to bind the Prometheus metrics endpoint
to.
Type: string
Required: No
DataDir
Specifies the absolute path to a node's state directory. This is where
persistence.db is written to disk and where a node stores its cryptographic
key materials when started with the "-g" command-line option.
Type: string
Required: Yes
IsGatewayNode
If true, the server is a gateway
node.
Type: bool
Required: No
IsServiceNode
If true, the server is a service
node.
Type: bool
Required: No
Gateway node: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
The Gateway section of the configuration is required for configuring a Gateway
node. The section must contain UserDB and SpoolDB
definitions. Bolt is an
embedded database library for the Go programming language that Katzenpost
has used in the past for its user and spool databases. Because Katzenpost
currently persists data on Service nodes instead of Gateways, these databases
will probably be deprecated in favour of in-memory concurrency structures. In
the meantime, it remains necessary to configure a Gateway node as shown below,
only changing the file paths as needed:
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
The name of the wire protocol key-encapsulation mechanism (KEM) to use.
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
Gateway node: Management section
The Management section specifies connectivity information for the Katzenpost control
protocol, which can be used to make run-time configuration changes. A configuration
resembles the following:
Specifies the path to the management interface socket. If left empty, management_sock
is located in the configured DataDir.
Type: string
Required: No
Gateway node: SphinxGeometry section
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The settings in this section are generated by the gensphinx
utility, which computes the Sphinx geometry based on the following user-supplied
directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
The output in TOML should then be pasted unchanged into the node's configuration
file, as shown below. For more information, see ???.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest routing-information
block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a Recipient section, both of which
are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually exclusive.
Type: string
Required: Yes
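Because NIKEName and KEMName are mutually exclusive, a configuration check can reject a geometry that names both schemes or neither. The sketch below is an illustration only; the geometryScheme type and its validate method are hypothetical and do not reproduce the Katzenpost geometry type. It assumes that the unused field is left as an empty string.

```go
package main

import (
	"errors"
	"fmt"
)

// geometryScheme holds only the two scheme-name fields discussed above.
// It is a hypothetical stand-in for the full SphinxGeometry section.
type geometryScheme struct {
	NIKEName string
	KEMName  string
}

// validate enforces that exactly one of NIKEName and KEMName is set.
func (g geometryScheme) validate() error {
	switch {
	case g.NIKEName != "" && g.KEMName != "":
		return errors.New("NIKEName and KEMName are mutually exclusive")
	case g.NIKEName == "" && g.KEMName == "":
		return errors.New("one of NIKEName or KEMName must be set")
	}
	return nil
}

func main() {
	for _, g := range []geometryScheme{
		{NIKEName: "x25519"},
		{KEMName: "Xwing"},
		{NIKEName: "x25519", KEMName: "Xwing"},
	} {
		fmt.Printf("%+v -> %v\n", g, g.validate())
	}
}
```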
Gateway node: Debug section
The Debug section is the Katzenpost server debug configuration
for advanced tuning.
Specifies the number of worker instances to use for inbound Sphinx
packet processing.
Type: int
Required: No
NumProviderWorkers
Specifies the number of worker instances to use for provider-specific
packet processing.
Type: int
Required: No
NumKaetzchenWorkers
Specifies the number of worker instances to use for Kaetzchen-specific
packet processing.
Type: int
Required: No
SchedulerExternalMemoryQueue
If true, the experimental disk-backed external memory
queue is enabled.
Type: bool
Required: No
SchedulerQueueSize
Specifies the maximum scheduler queue size before random entries will
start getting dropped. A value less than or equal to zero is treated as
unlimited.
Type: int
Required: No
SchedulerMaxBurst
Specifies the maximum number of packets that will be dispatched per
scheduler wakeup event.
Type: int
Required: No
UnwrapDelay
Specifies the maximum unwrap delay due to queueing in
milliseconds.
Type: int
Required: No
GatewayDelay
Specifies the maximum gateway node worker delay due to queueing in milliseconds.
Type: int
Required: No
ServiceDelay
Specifies the maximum provider delay due to queueing in
milliseconds.
Type: int
Required: No
KaetzchenDelay
Specifies the maximum Kaetzchen delay due to queueing in
milliseconds.
Type: int
Required: No
SchedulerSlack
Specifies the maximum scheduler slack due to queueing and/or
processing in milliseconds.
Type: int
Required: No
SendSlack
Specifies the maximum send-queue slack due to queueing and/or
congestion in milliseconds.
Type: int
Required: No
DecoySlack
Specifies the maximum decoy sweep slack due to external
delays such as latency before a loop decoy packet will be considered
lost.
Type: int
Required: No
ConnectTimeout
Specifies the maximum time a connection can take to establish a
TCP/IP connection in milliseconds.
Type: int
Required: No
HandshakeTimeout
Specifies the maximum time a connection can take for a link-protocol
handshake in milliseconds.
Type: int
Required: No
ReauthInterval
Specifies the interval at which a connection will be reauthenticated
in milliseconds.
Type: int
Required: No
SendDecoyTraffic
If true, decoy traffic is enabled.
This parameter is experimental and untuned,
and is disabled by default.
Note
This option will be removed once decoy traffic is fully implemented.
Type: bool
Required: No
DisableRateLimit
If true, the per-client rate limiter is disabled.
Note
This option should only be used for testing.
Type: bool
Required: No
GenerateOnly
If true, the server immediately halts
and cleans up after long-term key generation.
Type: bool
Required: No
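Several Debug options in this section, such as ConnectTimeout, HandshakeTimeout, and ReauthInterval, are integers expressed in milliseconds. Tooling written in Go that reads these values typically converts them to time.Duration before use. The following sketch shows that conversion; the helper name msToDuration and the sample values are hypothetical and are not Katzenpost defaults.

```go
package main

import (
	"fmt"
	"time"
)

// msToDuration converts a configuration value given in milliseconds
// into a time.Duration.
func msToDuration(ms int) time.Duration {
	return time.Duration(ms) * time.Millisecond
}

func main() {
	// Hypothetical sample values, chosen only for illustration.
	connectTimeout := 60000
	handshakeTimeout := 30000

	fmt.Println("connect timeout:", msToDuration(connectTimeout))
	fmt.Println("handshake timeout:", msToDuration(handshakeTimeout))
}
```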
Configuring service nodes
The following configuration is drawn from the reference implementation in
katzenpost/docker/dirauth_mixnet/servicenode1/authority.toml.
In a real-world mixnet, the component hosts would not be sharing a single IP
address. For more information about the test mixnet, see ???.
Specifies the human-readable identifier for a node, and must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
WireKEM
WireKEM specifies the key encapsulation mechanism (KEM) scheme
for the PQ
Noise-based wire protocol (link layer) that nodes use
to communicate with each other. PQ Noise is a post-quantum variation of
the Noise protocol
framework, which algebraically transforms ECDH handshake
patterns into KEM encapsulate/decapsulate operations.
This configuration option supports the optional use of
hybrid post-quantum cryptography to strengthen security. The following KEM
schemes are supported:
Classical: "x25519", "x448"
Note
X25519 and X448 are actually non-interactive key-exchanges
(NIKEs), not KEMs. Katzenpost uses
a hashed ElGamal cryptographic construction
to convert them from NIKEs to KEMs.
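The NIKE-to-KEM conversion mentioned in the note can be illustrated with a generic hashed ElGamal construction over X25519. This is a sketch of the general technique, not the Katzenpost implementation: the use of SHA-256 and the exact inputs to the hash are assumptions made for the example, and the function names are hypothetical. The KEM ciphertext is an ephemeral public key, and the shared secret is a hash over the Diffie-Hellman output.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// deriveSecret hashes the DH output together with the ciphertext and the
// recipient public key. The hash and its inputs are assumptions.
func deriveSecret(dh, ct, pk []byte) []byte {
	h := sha256.New()
	h.Write(dh)
	h.Write(ct)
	h.Write(pk)
	return h.Sum(nil)
}

// encapsulate creates an ephemeral key pair and returns the ephemeral
// public key as the ciphertext, plus the derived shared secret.
func encapsulate(pub *ecdh.PublicKey) (ct, ss []byte, err error) {
	eph, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	dh, err := eph.ECDH(pub)
	if err != nil {
		return nil, nil, err
	}
	ct = eph.PublicKey().Bytes()
	return ct, deriveSecret(dh, ct, pub.Bytes()), nil
}

// decapsulate recovers the same shared secret from the ciphertext.
func decapsulate(priv *ecdh.PrivateKey, ct []byte) ([]byte, error) {
	ephPub, err := ecdh.X25519().NewPublicKey(ct)
	if err != nil {
		return nil, err
	}
	dh, err := priv.ECDH(ephPub)
	if err != nil {
		return nil, err
	}
	return deriveSecret(dh, ct, priv.PublicKey().Bytes()), nil
}

// roundTrip reports whether both sides derive the same secret.
func roundTrip() (bool, error) {
	recipient, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return false, err
	}
	ct, ssSender, err := encapsulate(recipient.PublicKey())
	if err != nil {
		return false, err
	}
	ssRecipient, err := decapsulate(recipient, ct)
	if err != nil {
		return false, err
	}
	return bytes.Equal(ssSender, ssRecipient), nil
}

func main() {
	ok, err := roundTrip()
	fmt.Println("secrets match:", ok, "error:", err)
}
```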
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
BindAddresses
If true, allows setting of listener
addresses that the server will bind to and accept connections on. These
addresses are not advertised in the PKI.
Type: bool
Required: No
MetricsAddress
Specifies the address/port to bind the Prometheus metrics endpoint
to.
Type: string
Required: No
DataDir
Specifies the absolute path to a node's state directory. This is where
persistence.db is written to disk and where a node stores its cryptographic
key materials when started with the "-g" command-line option.
Type: string
Required: Yes
IsGatewayNode
If true, the server is a gateway
node.
Type: bool
Required: No
IsServiceNode
If true, the server is a service
node.
Type: bool
Required: No
Service node: Logging section
The Logging configuration section controls logging behavior across Katzenpost.
The ServiceNode section contains configurations for each network
service that Katzenpost supports.
Services, termed Kaetzchen, can be divided into built-in and external services.
External services are provided through the CBORPlugin, a Go programming language implementation of the Concise Binary Object
Representation (CBOR), a binary data serialization format. While
built-in services simply need to be activated, external services are invoked by a
separate command and connected to the mixnet over a Unix socket. The plugin
allows mixnet services to be added in any programming language.
Specifies the protocol capability exposed by the agent.
Type: string
Required: Yes
Endpoint
Specifies the provider-side Endpoint where the agent will accept
requests. While not required by the specification, this server only
supports Endpoints that are
lower-case
local parts of an email address.
Type: string
Required: Yes
Command
Specifies the full path to the external plugin program that implements
this Kaetzchen service.
Type: string
Required: Yes
MaxConcurrency
Specifies the number of worker goroutines to start for this
service.
Type: int
Required: Yes
Config
Specifies extra per-agent arguments to be passed to the agent's
initialization routine.
Type: map[string]interface{}
Required: Yes
Disable
If true, disables a configured
agent.
Type: bool
Required: No
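The Endpoint restriction described above (a lower-case local part of an email address) can be approximated with a simple check. This sketch is an assumption-laden illustration rather than the server's actual normalization logic: the helper name validEndpoint is hypothetical, and it only rejects empty strings, upper-case characters, whitespace, and the "@" separator.

```go
package main

import (
	"fmt"
	"strings"
)

// validEndpoint reports whether s looks like a lower-case local part
// of an email address. It is an approximation for illustration only.
func validEndpoint(s string) bool {
	if s == "" || s != strings.ToLower(s) {
		return false
	}
	// A local part contains neither the "@" separator nor whitespace.
	return !strings.ContainsAny(s, "@ \t\r\n")
}

func main() {
	for _, e := range []string{"echo", "Echo", "echo@example.org"} {
		fmt.Printf("%q valid: %v\n", e, validEndpoint(e))
	}
}
```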
Per-service parameters:
echo
The internal echo service must be enabled on every
service node of a production mixnet for decoy traffic to work
properly.
spool
The spool service supports the catshadow
storage protocol,
which
is required by the Katzen chat client. The
example configuration above shows spool enabled with the setting:
Disable = false
Note
Spool, properly memspool, should
not be confused with the spool database on gateway
nodes.
data_store
Specifies the full path to the service database
file.
Type: string
Required: Yes
log_dir
Specifies the path to the node's log directory.
Type: string
Required: Yes
pigeonhole
The pigeonhole courier service supports the
Blinding-and-Capability scheme (BACAP)-based unlinkable messaging
protocols detailed in Place-holder for research paper link. Most of our future protocols
will use the pigeonhole courier service.
db
Specifies the full path to the service database
file.
Type: string
Required: Yes
log_dir
Specifies the path to the node's log directory.
Type: string
Required: Yes
panda
The panda storage and authentication service
currently does not work properly.
fileStore
Specifies the full path to the service database
file.
The http service is completely optional, but allows
the mixnet to be used as an HTTP proxy. This may be useful for
integrating with existing software systems.
Specifies the human-readable identifier for a node, which must be unique per mixnet.
The identifier can be an FQDN but does not have to be.
Type: string
Required: Yes
IdentityPublicKey
String containing the node's public identity key in PEM format.
IdentityPublicKey is the node's permanent identifier
and is used to verify cryptographic signatures produced by its private
identity key.
Type: string
Required: Yes
PKISignatureScheme
Specifies the cryptographic signature scheme that will be used by all
components of the mix network when interacting with the PKI system. Mix
nodes sign their descriptors using this signature scheme, and dirauth nodes
similarly sign PKI documents using the same scheme.
Type: string
Required: Yes
LinkPublicKey
String containing the peer's public link-layer key in PEM format.
LinkPublicKey must match the specified
WireKEMScheme.
Type: string
Required: Yes
WireKEMScheme
The name of the wire protocol key-encapsulation mechanism (KEM) to use.
Type: string
Required: Yes
Addresses
Specifies a list of one or more address URLs in a format that contains the
transport protocol, IP address, and port number that the server will bind to
for incoming connections. Katzenpost supports URLs that start with
either "tcp://" or "quic://" such as: ["tcp://192.168.1.1:30001"] and
["quic://192.168.1.1:40001"].
Type: []string
Required: Yes
Service node: Management section
The Management section specifies
connectivity information for the Katzenpost control protocol which can be used to make run-time
configuration changes. A configuration resembles the following:
Specifies the path to the management interface socket. If left empty, then management_sock
is located in the configuration's defined DataDir.
Type: string
Required: No
Service node: SphinxGeometry section
The SphinxGeometry section defines parameters for the Sphinx
encrypted nested-packet format used internally by Katzenpost.
Warning
The values in the SphinxGeometry configuration section must
be programmatically generated by gensphinx. Many of the
parameters are interdependent and cannot be individually modified. Do not
modify these values by hand.
The settings in this section are generated by the gensphinx
utility, which computes the Sphinx geometry based on the following user-supplied
directives:
The number of mix node layers (not counting gateway and service
nodes)
The length of the application-usable packet payload
The selected NIKE or KEM scheme
The output in TOML should then be pasted unchanged into the node's configuration
file, as shown below. For more information, see ???.
The number of hops a Sphinx packet takes through the mixnet. Because
packet headers hold destination information for each hop, the size of the
header increases linearly with the number of hops.
Type: int
Required: Yes
HeaderLength
The total length of the Sphinx packet header in bytes.
Type: int
Required: Yes
RoutingInfoLength
The total length of the routing information portion of the Sphinx packet
header.
Type: int
Required: Yes
PerHopRoutingInfoLength
The length of the per-hop routing information in the Sphinx packet
header.
Type: int
Required: Yes
SURBLength
The length of a single-use reply block (SURB).
Type: int
Required: Yes
SphinxPlaintextHeaderLength
The length of the plaintext Sphinx packet header.
Type: int
Required: Yes
PayloadTagLength
The length of the payload tag.
Type: int
Required: Yes
ForwardPayloadLength
The total size of the payload.
Type: int
Required: Yes
UserForwardPayloadLength
The size of the usable payload.
Type: int
Required: Yes
NextNodeHopLength
The NextNodeHopLength is derived from the largest routing-information
block that we expect to encounter. Other packets have
NextNodeHop + NodeDelay sections, or a Recipient section, both of which
are shorter.
Type: int
Required: Yes
SPRPKeyMaterialLength
The length of the strong pseudo-random permutation (SPRP) key.
Type: int
Required: Yes
NIKEName
The name of the non-interactive key exchange (NIKE) scheme used by Sphinx
packets.
NIKEName and KEMName are mutually
exclusive.
Type: string
Required: Yes
KEMName
The name of the key encapsulation mechanism (KEM) used by Sphinx
packets.
NIKEName and KEMName are mutually exclusive.
Type: string
Required: Yes
Service node: Debug section
The Debug section is the Katzenpost server debug configuration
for advanced tuning.
Specifies the number of worker instances to use for inbound Sphinx
packet processing.
Type: int
Required: No
NumProviderWorkers
Specifies the number of worker instances to use for provider-specific
packet processing.
Type: int
Required: No
NumKaetzchenWorkers
Specifies the number of worker instances to use for Kaetzchen-specific
packet processing.
Type: int
Required: No
SchedulerExternalMemoryQueue
If true, the experimental disk-backed external memory
queue is enabled.
Type: bool
Required: No
SchedulerQueueSize
Specifies the maximum scheduler queue size before random entries will
start getting dropped. A value less than or equal to zero is treated as
unlimited.
Type: int
Required: No
SchedulerMaxBurst
Specifies the maximum number of packets that will be dispatched per
scheduler wakeup event.
Type: int
Required: No
UnwrapDelay
Specifies the maximum unwrap delay due to queueing in
milliseconds.
Type: int
Required: No
GatewayDelay
Specifies the maximum gateway node worker delay due to queueing in milliseconds.
Type: int
Required: No
ServiceDelay
Specifies the maximum provider delay due to queueing in
milliseconds.
Type: int
Required: No
KaetzchenDelay
Specifies the maximum Kaetzchen delay due to queueing in
milliseconds.
Type: int
Required: No
SchedulerSlack
Specifies the maximum scheduler slack due to queueing and/or
processing in milliseconds.
Type: int
Required: No
SendSlack
Specifies the maximum send-queue slack due to queueing and/or
congestion in milliseconds.
Type: int
Required: No
DecoySlack
Specifies the maximum decoy sweep slack due to external
delays such as latency before a loop decoy packet will be considered
lost.
Type: int
Required: No
ConnectTimeout
Specifies the maximum time a connection can take to establish a
TCP/IP connection in milliseconds.
Type: int
Required: No
HandshakeTimeout
Specifies the maximum time a connection can take for a link-protocol
handshake in milliseconds.
Type: int
Required: No
ReauthInterval
Specifies the interval at which a connection will be reauthenticated
in milliseconds.
Type: int
Required: No
SendDecoyTraffic
If true, decoy traffic is enabled.
This parameter is experimental and untuned,
and is disabled by default.
Note
This option will be removed once decoy traffic is fully implemented.
Type: bool
Required: No
DisableRateLimit
If true, the per-client rate limiter is disabled.
Note
This option should only be used for testing.
Type: bool
Required: No
GenerateOnly
If true, the server immediately halts
and cleans up after long-term key generation.
Katzenpost provides a ready-to-deploy Docker
image for developers who need a non-production test environment for developing
and testing client applications and server-side plugins. By running this image on a single computer, you avoid the
need to build and manage a complex multi-node mixnet. The image can also be run using Podman.
The test mix network includes the following components:
If both Docker and Podman are present on your system, Katzenpost uses
Podman. Podman is a drop-in daemonless equivalent to Docker that does not
require superuser privileges to run.
On Debian, these software requirements can be installed with the following commands
(running as superuser). Apt will pull in the needed
dependencies.
Complete the following procedure to obtain, build, and deploy the Katzenpost test
network.
Install the Katzenpost code repository, hosted at https://github.com/katzenpost. The main Katzenpost
repository contains code for the server components as well as the docker image.
Clone the repository with the following command (your directory location may
vary):
Set up and start the Podman server (as superuser).
$ podman system service -t 0 $DOCKER_HOST &
$ systemctl --user enable --now podman.socket
Operating the test mixnet
Navigate to katzenpost/docker. The Makefile
contains target operations to create, manage, and test the self-contained Katzenpost
container network. To invoke a target, run a command using the following
pattern:
~/katzenpost/docker$ make target
Running make with no target specified returns a list of available
targets.
Table 1. Makefile targets
Target               Description
[none]               Display this list of targets.
start                Run the test network in the background.
stop                 Stop the test network.
wait                 Wait for the test network to have consensus.
watch                Display live log entries until Ctrl-C.
status               Show test network consensus status.
show-latest-vote     Show latest consensus vote.
run-ping             Send a ping over the test network.
clean-bin            Stop all components and delete binaries.
clean-local          Stop all components, delete binaries, and delete data.
clean-local-dryrun   Show what clean-local would delete.
clean                Same as clean-local, but also deletes go_deps image.
Starting and monitoring the mixnet
The first time that you run make start, the Docker image is
downloaded, built, installed, and started. This takes several minutes. When the
build is complete, the command exits while the network remains running in the
background.
~/katzenpost/docker$ make start
Subsequent runs of make start either start or restart the
network without building the components from scratch. The exception to this is when
you delete any of the Katzenpost binaries (dirauth.alpine, server.alpine, etc.).
In that case, make start rebuilds just the parts of the network
dependent on the deleted binary. For more information about the files created during
the Docker build, see the section called “Network topology and components”.
Note
When running make start, be aware of the following
considerations:
If you intend to use Docker, you need to run make
as superuser. If you are using sudo to elevate your
privileges, you need to edit
katzenpost/docker/Makefile to prepend
sudo to each command contained in it.
If you have Podman installed on your system and you nonetheless want
to run Docker, you can override the default behavior by adding the
argument docker=docker to the command as in the
following:
~/katzenpost/docker$ make start docker=docker
After the make start command exits, the mixnet runs in the
background, and you can run make watch to display a live log of
the network activity.
~/katzenpost/docker$ make watch
...
<output>
...
When installation is complete, the mix servers vote and reach a consensus. You can
use the wait target to wait for the mixnet to get consensus and
be ready to use. This can also take several minutes:
~/katzenpost/docker$ make wait
...
<output>
...
You can confirm that installation and configuration are complete by issuing the
status command from the same or another terminal. When the
network is ready for use, status begins returning consensus
information similar to the following:
~/katzenpost/docker$ make status
...
00:15:15.003 NOTI state: Consensus made for epoch 1851128 with 3/3 signatures: &{Epoch: 1851128 GenesisEpoch: 1851118
...
Testing the mixnet
At this point, you should have a locally running mix network. You can test whether
it is working correctly by using run-ping, which launches a
packet into the network and watches for a successful reply. Run the following
command:
~/katzenpost/docker$ make run-ping
If the network is functioning properly, the resulting output contains lines
similar to the following:
19:29:53.541 INFO gateway1_client: sending loop decoy
!19:29:54.108 INFO gateway1_client: sending loop decoy
19:29:54.632 INFO gateway1_client: sending loop decoy
19:29:55.160 INFO gateway1_client: sending loop decoy
!19:29:56.071 INFO gateway1_client: sending loop decoy
!19:29:59.173 INFO gateway1_client: sending loop decoy
!Success rate is 100.000000 percent 10/10)
If run-ping fails to receive a reply, it eventually times out
with an error message. If this happens, try the command again.
Note
If you attempt to use run-ping too quickly after
starting the mixnet, and consensus has not been reached, the utility may crash
with an error message or hang indefinitely. If this happens, issue (if
necessary) a Ctrl-C key sequence to abort, check the
consensus status with the status command, and then retry
run-ping.
Shutting down the mixnet
The mix network continues to run in the terminal where you started it until you
issue a Ctrl-C key sequence, or until you issue the following
command in another terminal:
~/katzenpost/docker$ make stop
When you stop the network, the binaries and data are left in place. This allows
for a quick restart.
Uninstalling and cleaning up
Several command targets can be used to uninstall the Docker image and restore your
system to a clean state. The following examples demonstrate the commands and their
output.
clean-bin
To stop the network and delete the compiled binaries, run the following
command:
~/katzenpost/docker$ make clean-bin
This command leaves in place the cryptographic keys, the state data, and
the logs.
clean-local-dryrun
To display a preview of what clean-local would remove,
without actually deleting anything, run the following command:
~/katzenpost/docker$ make clean-local-dryrun
clean-local
To delete both compiled binaries and data, run the following
command:
~/katzenpost/docker$ make clean-local
[ -e voting_mixnet ] && cd voting_mixnet && DOCKER_HOST=unix:///run/user/1000/podman/podman.sock docker-compose down --remove-orphans; rm -fv running.stamp
Removing voting_mixnet_mix2_1 ... done
Removing voting_mixnet_auth1_1 ... done
Removing voting_mixnet_auth2_1 ... done
Removing voting_mixnet_gateway1_1 ... done
Removing voting_mixnet_mix1_1 ... done
Removing voting_mixnet_auth3_1 ... done
Removing voting_mixnet_mix3_1 ... done
Removing voting_mixnet_servicenode1_1 ... done
Removing voting_mixnet_metrics_1 ... done
removed 'running.stamp'
rm -vf ./voting_mixnet/*.alpine
removed './voting_mixnet/echo_server.alpine'
removed './voting_mixnet/fetch.alpine'
removed './voting_mixnet/memspool.alpine'
removed './voting_mixnet/panda_server.alpine'
removed './voting_mixnet/pigeonhole.alpine'
removed './voting_mixnet/reunion_katzenpost_server.alpine'
removed './voting_mixnet/server.alpine'
removed './voting_mixnet/voting.alpine'
git clean -f -x voting_mixnet
Removing voting_mixnet/
git status .
On branch main
Your branch is up to date with 'origin/main'.
clean
To stop the network and delete the binaries, the data, and the go_deps
image, run the following command as superuser:
~/katzenpost/docker$ sudo make clean
Network topology and components
The Docker image deploys a working mixnet with all components and component groups
needed to perform essential mixnet functions:
message mixing (including packet reordering, timing randomization, injection
of decoy traffic, obfuscation of senders and receivers, and so on)
service provisioning
internal authentication and integrity monitoring
interfacing with external clients
Warning
While suited for client development and testing, the test mixnet omits performance
and security redundancies. Do not use it in production.
The following diagram illustrates the components and their network interactions. The
gray blocks represent nodes, and the arrows represent information transfer.
Figure 1. Test network topology
On the left, the Client transmits a message (shown by
purple arrows) through the Gateway node, across three
mix node layers, to the Service node. The Service node
processes the request and responds with a reply (shown by the green arrows) that
traverses the mix node layers before exiting the mixnet
via the Gateway node and arriving at the Client.
On the right, directory authorities Dirauth 1,
Dirauth 2, and Dirauth
3 provide PKI services. The directory authorities receive mix descriptors from the other nodes, collate these into a
consensus document containing validated network
status and authentication materials, and make that available to the other nodes.
The elements in the topology diagram map to the mixnet's component nodes as shown in
the following table. Note that all nodes share the same IP address (127.0.0.1, i.e.,
localhost), but are accessed through different ports. Each node type links to additional
information in ???.
The following tree
output shows the location, relative to the katzenpost
repository root, of the files created by the Docker build. During testing and use,
you would normally touch only the TOML configuration file associated with each node,
as highlighted in the listing. For help in understanding these files and a complete
list of configuration options, follow the links in Table 2: Test mixnet
hosts.
As an aid to administrators implementing a Katzenpost mixnet, this appendix provides
lightly edited examples of configuration files for each Katzenpost node type. These
files are drawn from a built instance of the Docker test
mixnet. These code listings are meant to be used as a reference alongside the
detailed configuration documentation in ???. You cannot use these
listings as a drop-in solution in your own mixnets for reasons explained in the ??? section of the Docker test mixnet documentation.
This document defines the Katzenpost Mix Network Wire Protocol for
use in all network communications to, from, and within the
Katzenpost Mix Network.
1. Introduction
The Katzenpost Mix Network Wire Protocol (KMNWP) is the custom
wire protocol for all network communications to, from, and within
the Katzenpost Mix Network. This protocol provides mutual
authentication, and an additional layer of cryptographic security
and forward secrecy.
1.1 Conventions Used in This Document
The key words “MUST”, “MUST NOT”,
“REQUIRED”, “SHALL”, “SHALL
NOT”, “SHOULD”, “SHOULD NOT”,
“RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted
as described in RFC2119.
The “C” style Presentation Language as described in
RFC5246 Section 4 is used to
represent data structures, except for cryptographic attributes,
which are specified as opaque byte vectors.
x | y denotes the concatenation of x and y.
1.2 Key Encapsulation Mechanism
This protocol can use any Key Encapsulation Mechanism (KEM). However, it is
recommended that most users select a hybrid post-quantum KEM
such as Xwing XWING.
2. Core Protocol
The protocol is based on Kyber and Trevor Perrin’s Noise Protocol
Framework NOISE, along with the
“Post Quantum Noise” paper
PQNOISE. Previous versions of
our transport were based on
NOISEHFS.
Our transport protocol begins with a prologue, Noise handshake,
followed by a stream of Noise Transport messages in a minimal
framing layer, over a TCP/IP connection.
Our Noise protocol is configurable via the KEM selection in the
TOML configuration files. Here is an example PQ Noise protocol
string:
Noise_pqXX_Xwing_ChaChaPoly_BLAKE2b
The protocol string is a very condensed description of our
protocol. We use the pqXX two way Noise pattern which is described
as follows:
pqXX:
  -> e
  <- ekem, s
  -> skem, s
  <- skem
The next part of the protocol string specifies the KEM,
Xwing, which is a hybrid KEM in which the shared
secret outputs of both X25519 and MLKEM768 are combined.
Finally, the ChaChaPoly_BLAKE2b parts of the
protocol string indicate which AEAD cipher and hash function we
are using.
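The structure of the protocol string can be made concrete by splitting it into its components. The following sketch assumes that the string always has five underscore-separated parts beginning with "Noise"; the protocolName type and parseProtocolName function are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// protocolName holds the components of a PQ Noise protocol string.
type protocolName struct {
	Pattern string // handshake pattern, for example "pqXX"
	KEM     string // KEM name, for example "Xwing"
	Cipher  string // cipher name, for example "ChaChaPoly"
	Hash    string // hash function name, for example "BLAKE2b"
}

// parseProtocolName splits a string of the assumed form
// Noise_<pattern>_<kem>_<cipher>_<hash> into its parts.
func parseProtocolName(s string) (protocolName, error) {
	parts := strings.Split(s, "_")
	if len(parts) != 5 || parts[0] != "Noise" {
		return protocolName{}, fmt.Errorf("malformed protocol string %q", s)
	}
	return protocolName{
		Pattern: parts[1],
		KEM:     parts[2],
		Cipher:  parts[3],
		Hash:    parts[4],
	}, nil
}

func main() {
	p, err := parseProtocolName("Noise_pqXX_Xwing_ChaChaPoly_BLAKE2b")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```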
As a non-standard modification to the Noise protocol, the 65535
byte message length limit is increased to 1300000 bytes. We send
very large messages over our Noise protocol because we use
the Sphincs+ signature scheme, whose signatures are about
49 kilobytes.
It is assumed that all parties using the KMNWP protocol have a
fixed long-lived or short-lived Xwing keypair
XWING, the public component of which
is known to the other party in advance. How such keys are
distributed is beyond the scope of this document.
2.1 Handshake Phase
All sessions start in the Handshake Phase, in which an anonymous
authenticated handshake is conducted.
The handshake is an unmodified Noise handshake, with a fixed
prologue prefacing the initiator's first Noise handshake
message. This prologue is also used as the
prologue input to the Noise HandshakeState
Initialize() operation for both the initiator
and responder.
The prologue is defined to be the following structure:
As all Noise handshake messages are fixed sizes, no additional
framing is required for the handshake.
Implementations MUST preserve the Noise handshake hash
[h] for the purpose of implementing
authentication (Section 2.3).
Implementations MUST reject handshake attempts by terminating
the session immediately upon any Noise protocol handshake
failure and when, as a responder, they receive a Prologue
containing an unknown protocol_version value.
Implementations SHOULD impose reasonable timeouts for the
handshake process, and SHOULD terminate sessions that are taking
too long to handshake.
2.1.1 Handshake Authentication
Mutual authentication is done by exchanging fixed-size
payloads as part of the pqXX handshake
consisting of the following structure:
ad_len - The length of the optional
additional data.
additional_data - Optional additional
data, such as a username, if any.
unix_time - 0 for the initiator, the
approximate number of seconds since 1970-01-01 00:00:00 UTC
for the responder.
The initiator MUST send the
AuthenticateMessage after it has received the
peer's response (so after -> s, se in
Noise parlance).
The contents of the optional additional_data
field are deliberately left up to the implementation; however, it
is RECOMMENDED that implementations pad the field to be a
consistent length regardless of contents to avoid leaking
information about the authenticating identity.
To authenticate the remote peer given an AuthenticateMessage,
the receiving peer must validate the s
component of the Noise handshake (the remote peer's long term
public key) with the known value, along with any of the
information in the additional_data field such
as the user name, if any.
If the validation procedure succeeds, the peer is considered
authenticated. If the validation procedure fails for any reason,
the session MUST be terminated immediately.
Responders MAY add a slight amount (+- 10 seconds) of random
noise to the unix_time value to avoid leaking precise load
information via packet queueing delay.
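The fixed-size payload and the padding recommendation can be sketched as follows. The field widths used here are assumptions made for the example, since they are not stated in the text above: a 1-byte ad_len, additional data zero-padded to 255 bytes, and a 4-byte big-endian unix_time. The function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	maxAdLen   = 255              // assumed maximum additional_data length
	authMsgLen = 1 + maxAdLen + 4 // ad_len + padded additional_data + unix_time
)

// encodeAuth builds a fixed-size payload: a 1-byte ad_len, the additional
// data zero-padded to maxAdLen bytes, and a 4-byte big-endian unix_time.
func encodeAuth(ad []byte, unixTime uint32) ([]byte, error) {
	if len(ad) > maxAdLen {
		return nil, errors.New("additional data too long")
	}
	msg := make([]byte, authMsgLen)
	msg[0] = byte(len(ad))
	copy(msg[1:], ad)
	binary.BigEndian.PutUint32(msg[1+maxAdLen:], unixTime)
	return msg, nil
}

// decodeAuth reverses encodeAuth and strips the padding.
func decodeAuth(msg []byte) (ad []byte, unixTime uint32, err error) {
	if len(msg) != authMsgLen {
		return nil, 0, errors.New("unexpected message length")
	}
	adLen := int(msg[0])
	ad = append([]byte(nil), msg[1:1+adLen]...)
	unixTime = binary.BigEndian.Uint32(msg[1+maxAdLen:])
	return ad, unixTime, nil
}

func main() {
	short, _ := encodeAuth([]byte("alice"), 0)
	long, _ := encodeAuth([]byte("a-much-longer-username"), 0)
	// Both payloads have the same length regardless of the username.
	fmt.Println("same length:", len(short) == len(long))

	ad, ts, err := decodeAuth(short)
	fmt.Println(string(ad), ts, err)
}
```

Because every payload has the same length, an observer cannot infer the length of the username from the size of the handshake message.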
2.2 Data Transfer Phase
Upon successfully concluding the handshake the session enters
the Data Transfer Phase, where the initiator and responder can
exchange KMNWP messages.
A KMNWP message is defined to be the following structure:
The ciphertext_length field includes the
Noise protocol overhead of 16 bytes, for the Noise Transport
message containing the Ciphertext.
All outgoing Message(s) are preceded by a Noise Transport
Message containing a CiphertextHeader,
indicating the size of the Noise Transport Message transporting
the Message Ciphertext. After generating both Noise Transport
Messages, the sender MUST call the Noise CipherState
Rekey() operation.
To receive incoming Ciphertext messages, first the Noise
Transport Message containing the CiphertextHeader is consumed
off the network, authenticated and decrypted, giving the
receiver the length of the Noise Transport Message containing
the actual message itself. The second Noise Transport Message is
consumed off the network, authenticated and decrypted, with the
resulting message being returned to the caller for processing.
After receiving both Noise Transport Messages, the receiver MUST
call the Noise CipherState Rekey() operation.
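The two-step framing described above can be sketched as follows. This is not a Noise implementation: AES-GCM from the Go standard library stands in for the Noise CipherState (it also has a 16-byte tag), the Rekey() step is only marked by comments, and the 4-byte big-endian header, the maxCiphertextLen limit and all function names are assumptions made for this example.

```go
package main

import (
    "bytes"
    "crypto/aes"
    "crypto/cipher"
    "encoding/binary"
    "errors"
    "fmt"
    "io"
)

const (
    tagLen           = 16      // AEAD overhead per transport message
    headerLen        = 4       // assumed: 4-byte big-endian ciphertext_length
    maxCiphertextLen = 1 << 20 // hypothetical sanity limit
)

// cipherState is a stand-in for a Noise CipherState: an AEAD and a nonce counter.
type cipherState struct {
    aead  cipher.AEAD
    nonce uint64
}

func newCipherState(key []byte) (*cipherState, error) {
    block, err := aes.NewCipher(key)
    if err != nil {
        return nil, err
    }
    aead, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }
    return &cipherState{aead: aead}, nil
}

func (cs *cipherState) nextNonce() []byte {
    n := make([]byte, cs.aead.NonceSize())
    binary.BigEndian.PutUint64(n[len(n)-8:], cs.nonce)
    cs.nonce++
    return n
}

// sendMessage writes the encrypted header, then the encrypted message.
func sendMessage(w io.Writer, cs *cipherState, msg []byte) error {
    hdr := make([]byte, headerLen)
    binary.BigEndian.PutUint32(hdr, uint32(len(msg)+tagLen))
    if _, err := w.Write(cs.aead.Seal(nil, cs.nextNonce(), hdr, nil)); err != nil {
        return err
    }
    if _, err := w.Write(cs.aead.Seal(nil, cs.nextNonce(), msg, nil)); err != nil {
        return err
    }
    // A Noise implementation would call Rekey() on the send CipherState here.
    return nil
}

// recvMessage reads the header to learn the length, then reads the message.
// Any error means the caller must terminate the session.
func recvMessage(r io.Reader, cs *cipherState) ([]byte, error) {
    hdrCt := make([]byte, headerLen+tagLen)
    if _, err := io.ReadFull(r, hdrCt); err != nil {
        return nil, err
    }
    hdr, err := cs.aead.Open(nil, cs.nextNonce(), hdrCt, nil)
    if err != nil {
        return nil, errors.New("header decryption failed")
    }
    ctLen := binary.BigEndian.Uint32(hdr)
    if ctLen < tagLen || ctLen > maxCiphertextLen {
        return nil, errors.New("invalid ciphertext_length")
    }
    ct := make([]byte, ctLen)
    if _, err := io.ReadFull(r, ct); err != nil {
        return nil, err
    }
    msg, err := cs.aead.Open(nil, cs.nextNonce(), ct, nil)
    if err != nil {
        return nil, errors.New("message decryption failed")
    }
    // A Noise implementation would call Rekey() on the receive CipherState here.
    return msg, nil
}

// roundTrip sends msg through an in-memory buffer and returns what was
// received together with the number of bytes that went over the "wire".
func roundTrip(key, msg []byte) ([]byte, int, error) {
    sender, err := newCipherState(key)
    if err != nil {
        return nil, 0, err
    }
    receiver, err := newCipherState(key)
    if err != nil {
        return nil, 0, err
    }
    var wire bytes.Buffer
    if err := sendMessage(&wire, sender, msg); err != nil {
        return nil, 0, err
    }
    n := wire.Len()
    got, err := recvMessage(&wire, receiver)
    return got, n, err
}

func main() {
    got, n, err := roundTrip(make([]byte, 32), []byte("hello"))
    fmt.Println(string(got), n, err)
}
```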
Implementations MUST immediately terminate the session if any of
the DecryptWithAd() operations fail.
Implementations MUST immediately terminate the session if an
unknown command is received in a Message, or if the Message is
otherwise malformed in any way.
Implementations MAY impose a reasonable idle timeout, and
terminate the session if it expires.
3. Predefined Commands
3.1 The no_op Command
The no_op command is explicitly a No Operation,
to be used to implement functionality such as keep-alives
and/or application-layer padding.
Implementations MUST NOT send any message payload accompanying
this command, and all received command data MUST be discarded
without interpretation.
3.2 The disconnect Command
The disconnect command is a command that is
used to signal explicit session termination. Upon receiving a
disconnect command, implementations MUST interpret the command
as a signal from the peer that no additional commands will be
sent, and destroy the cryptographic material in the receive
CipherState.
While most implementations will likely wish to terminate the
session upon receiving this command, any additional behavior is
explicitly left up to the implementation and application.
Implementations MUST NOT send any message payload accompanying
this command, and MUST NOT send any further traffic after
sending a disconnect command.
3.3 The send_packet Command
The send_packet command is the command that
is used by the initiator to transmit a Sphinx Packet over the
network. The command’s message is the Sphinx Packet destined for
the responder.
Initiators MUST terminate the session immediately upon reception
of a send_packet command.
4. Command Padding
We use traffic padding to hide from a passive network observer
which command has been sent or received.
We exclude the Consensus command from the set of padded
commands because its contents are a very large payload,
usually many times larger than our Sphinx packets.
Therefore we only pad these commands:
However we split them up into two directions, client to server and
server to client because they differ in size due to the difference
in size between SendPacket and
Message:
The GetConsensus command is a special case
because we only want to pad it when it is sent over the mixnet. We
do not want to pad it when sending to the dirauths. Padding it
in that case would do no harm, but it would needlessly take up
bandwidth without providing any privacy benefit.
5. Anonymity Considerations
Adversaries being able to determine that two parties are
communicating via KMNWP is beyond the threat model of this
protocol. At a minimum, it is trivial to determine that a KMNWP
handshake is being performed, due to the length of each handshake
message, and the fixed positions of the various public keys.
6. Security Considerations
It is imperative that implementations use ephemeral keys for every
handshake as the security properties of the Kyber KEM are totally
lost if keys are ever reused.
Kyber was chosen as the KEM algorithm due to its conservative
parameterization, simplicity of implementation, and high
performance in software. It is hoped that the addition of a
quantum resistant algorithm will provide forward secrecy even in
the event that large scale quantum computers are applied to
historical intercepts.
7. Acknowledgments
I would like to thank Trevor Perrin for providing feedback during
the design of this protocol, and answering questions regarding
Noise.
Appendix A. References
Appendix A.1 Normative References
Appendix A.2 Informative References
Appendix B. Citing This Document
Appendix B.1 Bibtex Entry
Note that the following bibtex entry is in the IEEEtran bibtex
style as described in a document called “How to Use the
IEEEtran BIBTEX Style”.
@online{KatzMixWire,
title = {Katzenpost Mix Network Wire Protocol Specification},
author = {Yawning Angel},
url = {https://github.com/katzenpost/katzenpost/blob/master/docs/specs/wire-protocol.rst},
year = {2017}
}
XWING. Manuel Barbosa, Deirdre Connolly, João Diogo Duarte, Aaron Kaiser, Peter Schwabe,
Karoline Varner, Bas Westerbaan, “X-Wing: The Hybrid KEM You’ve Been Looking
For”. https://eprint.iacr.org/2024/039.pdf
PQNOISE. Yawning Angel, Benjamin Dowling, Andreas Hülsing, Peter Schwabe and Florian Weber,
“Post Quantum Noise”, September 2023. https://eprint.iacr.org/2022/539.pdf
RFC2119. Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”,
BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. https://www.rfc-editor.org/info/rfc2119
RFC5246. Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version
1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008. https://www.rfc-editor.org/info/rfc5246
RFC7748. Langley, A., Hamburg, M., and S. Turner, “Elliptic Curves for Security”,
RFC 7748, DOI 10.17487/RFC7748, January 2016. http://www.rfc-editor.org/info/rfc7748
2.2 - Katzenpost Certificate Specification
Abstract
This document proposes a certificate format that Katzenpost mix
server, directory authority server and clients will use.
1. Introduction
Mixes and Directory Authority servers need to have key agility in the
sense of operational abilities such as key rotation and key revocation.
That is, we wish for mixes and authorities to periodically utilize a
long-term signing key for generating certificates for new short-term
signing keys.
Yet another use case for these certificates is to replace the use of
JOSE RFC7515 in the voting Directory Authority
system KATZMIXPKI for the multi-signature
documents exchanged for voting and consensus.
1.1 Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC2119.
1.2 Terminology
Tbw…
2. Document Format
The CBOR RFC7049 serialization format is used
to serialize certificates:
Signature is a cryptographic signature which has an associated signer
ID.
type Signature struct {
    // Identity is the identity of the signer.
    Identity []byte
    // Signature is the actual signature value.
    Signature []byte
}
Certificate structure for serializing certificates.
type certificate struct {
    // Version is the certificate format version.
    Version uint32
    // Expiration is seconds since Unix epoch.
    Expiration int64
    // KeyType indicates the type of key
    // that is certified by this certificate.
    KeyType string
    // Certified is the data that is certified by
    // this certificate.
    Certified []byte
    // Signatures are the signatures of the certificate.
    Signatures []Signature
}
That is, one or more signatures sign the certificate. However the
Certified field is not the only information that is signed.
The Certified field along with the other non-signature
fields are all concatenated together and signed. Before serialization
the signatures are sorted by their identity so that the output is binary
deterministic.
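The signing and sorting rules above can be sketched in Go as follows. The exact byte encoding of the signed message (big-endian integers followed by the key type and the certified data) is an assumption made for this example, as are the helper names message, sign, verify, signaturesSorted and newSigner and the choice of the Ed25519 public key as the signer identity.

```go
package main

import (
    "bytes"
    "crypto/ed25519"
    "encoding/binary"
    "fmt"
    "sort"
)

type Signature struct {
    Identity  []byte
    Signature []byte
}

type certificate struct {
    Version    uint32
    Expiration int64
    KeyType    string
    Certified  []byte
    Signatures []Signature
}

// message concatenates all non-signature fields; every signature covers it.
// The field encoding used here is hypothetical.
func (c *certificate) message() []byte {
    var buf bytes.Buffer
    var tmp [8]byte
    binary.BigEndian.PutUint32(tmp[:4], c.Version)
    buf.Write(tmp[:4])
    binary.BigEndian.PutUint64(tmp[:], uint64(c.Expiration))
    buf.Write(tmp[:])
    buf.WriteString(c.KeyType)
    buf.Write(c.Certified)
    return buf.Bytes()
}

// sign appends a signature and keeps the list sorted by signer identity,
// so that serialization is deterministic regardless of signing order.
func (c *certificate) sign(identity []byte, priv ed25519.PrivateKey) {
    sig := ed25519.Sign(priv, c.message())
    c.Signatures = append(c.Signatures, Signature{Identity: identity, Signature: sig})
    sort.Slice(c.Signatures, func(i, j int) bool {
        return bytes.Compare(c.Signatures[i].Identity, c.Signatures[j].Identity) < 0
    })
}

// verify checks the signature made by the given identity, if present.
func (c *certificate) verify(identity []byte, pub ed25519.PublicKey) bool {
    for _, s := range c.Signatures {
        if bytes.Equal(s.Identity, identity) {
            return ed25519.Verify(pub, c.message(), s.Signature)
        }
    }
    return false
}

// signaturesSorted reports whether the signatures are ordered by identity.
func signaturesSorted(c *certificate) bool {
    return sort.SliceIsSorted(c.Signatures, func(i, j int) bool {
        return bytes.Compare(c.Signatures[i].Identity, c.Signatures[j].Identity) < 0
    })
}

// newSigner derives a deterministic demo key pair; the identity is the public key.
func newSigner(seed byte) ([]byte, ed25519.PublicKey, ed25519.PrivateKey) {
    priv := ed25519.NewKeyFromSeed(bytes.Repeat([]byte{seed}, ed25519.SeedSize))
    pub := priv.Public().(ed25519.PublicKey)
    return []byte(pub), pub, priv
}

func main() {
    c := &certificate{Version: 1, Expiration: 1700000000, KeyType: "ed25519", Certified: []byte("short-term key")}
    idA, pubA, privA := newSigner(1)
    idB, _, privB := newSigner(2)
    c.sign(idB, privB)
    c.sign(idA, privA)
    fmt.Println(len(c.Signatures), signaturesSorted(c), c.verify(idA, pubA))
}
```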
2.1 Certificate Types
The certificate type field indicates the type of
certificate. So far we have only two types:
identity key certificate
directory authority certificate
Both mixes and directory authority servers have a secret, long-term
identity key. This key is ideally stored encrypted and offline, and is
used to sign key certificate documents. Key certificates contain a
medium-term signing key that is used to sign other documents. In the
case of an “authority signing key”, it is used to sign vote and
consensus documents, whereas the “mix signing key” is used to sign mix
descriptors which are uploaded to the directory authority servers.
2.2. Certificate Key Types
It’s more practical to continue using Ed25519 ED25519 keys but it’s also possible that in the
future we could upgrade to a stateless hash based post quantum
cryptographic signature scheme such as SPHINCS-256 or SPHINCS+. SPHINCS256
Our golang implementation is agnostic to the specific cryptographic
signature scheme which is used. Cert can handle single and multiple
signatures per document and has a variety of helper functions that ease
use for multi signature use cases.
4. Acknowledgments
This specification was inspired by Tor Project’s certificate format
specification document:
Angel, Y., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network Public Key Infrastructure Specification",
December 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/pki.md
RFC2119
Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119
RFC7049
Bormann, C., Hoffman, P.,
"Concise Binary Object Representation (CBOR)",
Internet Engineering Task Force (IETF),
October 2013,
https://tools.ietf.org/html/rfc7049
RFC7515
Jones, M., Bradley, J., Sakimura, N.,
"JSON Web Signature (JWS)",
May 2015,
https://tools.ietf.org/html/rfc7515
RFC7693
Saarinen, M-J., Ed., and J-P. Aumasson,
"The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)",
RFC 7693, DOI 10.17487/RFC7693,
November 2015,
http://www.rfc-editor.org/info/rfc7693
SPHINCS256
Bernstein, D., Hopwood, D., Hülsing, A., Lange, T., Niederhagen, R., Papachristodoulou, L., Schwabe, P., Wilcox-O'Hearn, Z.,
"SPHINCS: practical stateless hash-based signatures",
http://sphincs.cr.yp.to/sphincs-20141001.pdf
2.3 - Katzenpost Client2 Specification
Abstract
This document describes the design of the new Katzenpost mix network
client known as client2. In particular we discuss its multiplexing and
privilege separation design elements as well as the protocol used by the
thin client library.
1. Introduction
A Katzenpost mixnet client has several responsibilities at
minimum:
compose Sphinx packets
decrypt SURB replies
send and receive Noise protocol messages
keep up to date with the latest PKI document
2. Overview
Client2 is essentially a long running daemon process that listens on
an abstract unix domain socket for incoming thin client library
connections. Many client applications can use the same client2 daemon.
Those connections are in a sense being multiplexed into the daemon’s
single connection to the mix network.
Therefore applications will be integrated with Katzenpost using the
thin client library which gives them the capability to talk with the
client2 daemon and in that way interact with the mix network. The reason
we call it a thin client library is because it does not do any mixnet
related cryptography since that is already handled by the client2
daemon. In particular, the PKI document is stripped by the daemon before
it’s passed on to the thin clients. Likewise, thin clients don’t decrypt
SURB replies or compose Sphinx packets; instead, all of the Noise,
Sphinx and PKI related cryptography is handled by the daemon.
3. Thin client and daemon protocol
Note that the thin client daemon protocol uses abstract unix domain
sockets in datagram packet mode. The socket is of type SOCK_SEQPACKET
which is defined as:
SOCK_SEQPACKET (since Linux 2.6.4), is a
connection-oriented socket that preserves message boundaries and
delivers messages in the order that they were sent.
In golang this is referred to by the “unixpacket” network string.
3.1 Client socket naming convention
Thin clients MUST randomize their abstract unix domain socket name;
otherwise the static name will prevent multiplexing because the kernel
requires that the connection be between uniquely named socket pairs.
The Katzenpost reference implementation of the thin client library
selects a socket name with four random bytes, hex encoded, appended to
the end of the name like so:
@katzenpost_golang_thin_client_DEADBEEF
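A randomized name of this shape could be generated as in the following Go sketch. It assumes the suffix is four random bytes rendered as eight uppercase hex digits, as in the example name; the function name randomSocketName is hypothetical and is not claimed to match the reference implementation.

```go
package main

import (
    "crypto/rand"
    "encoding/hex"
    "fmt"
    "strings"
)

// socketPrefix is the fixed part of the example name shown above.
const socketPrefix = "@katzenpost_golang_thin_client_"

// randomSocketName appends four random bytes, hex encoded, to the prefix.
func randomSocketName() (string, error) {
    var suffix [4]byte
    if _, err := rand.Read(suffix[:]); err != nil {
        return "", err
    }
    return socketPrefix + strings.ToUpper(hex.EncodeToString(suffix[:])), nil
}

func main() {
    name, err := randomSocketName()
    if err != nil {
        panic(err)
    }
    fmt.Println(name)
}
```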
3.2 Daemon socket naming convention
The client2 daemon listens on an abstract unix domain socket with the
following name:
@katzenpost
3.3 Protocol messages
Note that there are two protocol message types and they are always
CBOR encoded. We do not make use of any prefix length encoding because
the socket type preserves message boundaries for us. Therefore we simply
send over pure CBOR encoded messages.
The daemon sends the Response message which is defined
in golang as a struct containing an app ID and one of four possible
events:
type Response struct {
    // AppID must be a unique identity for the client application
    // that is receiving this Response.
    AppID *[AppIDLength]byte `cbor:"app_id"`
    ConnectionStatusEvent *ConnectionStatusEvent `cbor:"connection_status_event"`
    NewPKIDocumentEvent *NewPKIDocumentEvent `cbor:"new_pki_document_event"`
    MessageSentEvent *MessageSentEvent `cbor:"message_sent_event"`
    MessageReplyEvent *MessageReplyEvent `cbor:"message_reply_event"`
}
type ConnectionStatusEvent struct {
    IsConnected bool `cbor:"is_connected"`
    Err error `cbor:"err"`
}
type NewPKIDocumentEvent struct {
    Payload []byte `cbor:"payload"`
}
type MessageReplyEvent struct {
    MessageID *[MessageIDLength]byte `cbor:"message_id"`
    SURBID *[sConstants.SURBIDLength]byte `cbor:"surbid"`
    Payload []byte `cbor:"payload"`
    Err error `cbor:"err"`
}
type MessageSentEvent struct {
    MessageID *[MessageIDLength]byte `cbor:"message_id"`
    SURBID *[sConstants.SURBIDLength]byte `cbor:"surbid"`
    SentAt time.Time `cbor:"sent_at"`
    ReplyETA time.Duration `cbor:"reply_eta"`
    Err error `cbor:"err"`
}
The client sends the Request message which is defined in
golang as:
type Request struct {
    // ID is the unique identifier with respect to the Payload.
    // This is only used by the ARQ.
    ID *[MessageIDLength]byte `cbor:"id"`
    // WithSURB indicates if the message should be sent with a SURB
    // in the Sphinx payload.
    WithSURB bool `cbor:"with_surb"`
    // SURBID must be a unique identity for each request.
    // This field should be nil if WithSURB is false.
    SURBID *[sConstants.SURBIDLength]byte `cbor:"surbid"`
    // AppID must be a unique identity for the client application
    // that is sending this Request.
    AppID *[AppIDLength]byte `cbor:"app_id"`
    // DestinationIdHash is the 32-byte hash of the destination Provider's
    // identity public key.
    DestinationIdHash *[32]byte `cbor:"destination_id_hash"`
    // RecipientQueueID is the queue identity which will receive the message.
    RecipientQueueID []byte `cbor:"recipient_queue_id"`
    // Payload is the actual Sphinx packet.
    Payload []byte `cbor:"payload"`
    // IsSendOp is set to true if the intent is to send a message through
    // the mix network.
    IsSendOp bool `cbor:"is_send_op"`
    // IsARQSendOp is set to true if the intent is to send a message through
    // the mix network using the naive ARQ error correction scheme.
    IsARQSendOp bool `cbor:"is_arq_send_op"`
    // IsEchoOp is set to true if the intent is to merely test that the unix
    // socket listener is working properly; the Response payload will
    // contain the Request payload.
    IsEchoOp bool `cbor:"is_echo_op"`
    // IsLoopDecoy is set to true to indicate that this message shall
    // be a loop decoy message.
    IsLoopDecoy bool `cbor:"is_loop_decoy"`
    // IsDropDecoy is set to true to indicate that this message shall
    // be a drop decoy message.
    IsDropDecoy bool `cbor:"is_drop_decoy"`
}
3.4 Protocol description
Upon connecting to the daemon socket the client must wait for two
messages. The first message received must have its
is_status field set to true. If so, then its
is_connected field indicates whether or not the daemon has
a mixnet PQ Noise protocol connection to an entry node.
Next the client awaits the second message, which contains the PKI
document in its payload field. This marks the end of the
initial connection sequence. Note that this PKI document is stripped of
all cryptographic signatures.
In the next protocol phase, the client may send Request
messages to the daemon in order to cause the daemon to encapsulate the
given payload in a Sphinx packet and send it to the entry node. Likewise
the daemon may send the client Response messages at any time
during this protocol phase. These Response messages may
indicate a connection status change, a new PKI document or a message
sent or reply event.
3.5 Request message fields
There are several Request fields that we need to
discuss.
Firstly, each Request message sent by a thin client
needs to have the app_id field set to an ID that is unique
among the applications using thin clients. The app_id is
used by the daemon to route Response messages to the
correct thin client socket.
The rest of the fields we are concerned with are the following:
with_surb is set to true if a Sphinx packet with a
SURB in its payload should be sent.
surbid is used to uniquely identify the response to a
message sent with the with_surb field set to true. It
should NOT be set if using the built-in ARQ for reliability and optional
retransmissions.
is_send_op must be set to true.
payload must be set to the message payload being
sent.
destination_id_hash is the 32-byte hash of the
destination entry node’s identity public key.
recipient_queue_id is the destination queue
identity. This is the destination the message will be delivered
to.
If a one way message should be sent with no SURB then
with_surb should be set to false and surbid
may be nil. If however the thin client wishes to send a reliable message
using the daemon’s ARQ, then the following fields must be set:
id is the message ID which uniquely identifies this
message and its eventual reply.
with_surb set to true
is_arq_send_op set to true
payload set to the message payload, as
usual.
destination_id_hash set to the destination service
node’s identity public key 32 byte hash.
recipient_queue_id is the destination queue
identity. This is the destination the message will be delivered
to.
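The field rules above can be expressed as a small validation routine. The following Go sketch uses a trimmed copy of the Request struct without CBOR tags; the array lengths (AppIDLength, MessageIDLength, SURBIDLength set to 16) and the validate function are assumptions made for this example and are not part of the thin client library. It also treats a surbid set on a one-way message as an error, which is a stricter reading than "may be nil".

```go
package main

import (
    "errors"
    "fmt"
)

// Hypothetical lengths, chosen only so that the example compiles.
const (
    AppIDLength     = 16
    MessageIDLength = 16
    SURBIDLength    = 16
)

// Request is a trimmed copy of the struct shown in section 3.3.
type Request struct {
    ID                *[MessageIDLength]byte
    WithSURB          bool
    SURBID            *[SURBIDLength]byte
    AppID             *[AppIDLength]byte
    DestinationIdHash *[32]byte
    RecipientQueueID  []byte
    Payload           []byte
    IsSendOp          bool
    IsARQSendOp       bool
}

// validate checks a send request against the rules described above.
func validate(r *Request) error {
    if r.AppID == nil {
        return errors.New("app_id must be set")
    }
    if r.DestinationIdHash == nil || len(r.RecipientQueueID) == 0 {
        return errors.New("destination_id_hash and recipient_queue_id must be set")
    }
    switch {
    case r.IsARQSendOp:
        // Reliable send: the daemon's ARQ tracks the message by its id.
        if r.ID == nil || !r.WithSURB {
            return errors.New("ARQ send requires id and with_surb")
        }
        if r.SURBID != nil {
            return errors.New("surbid should not be set when using the ARQ")
        }
    case r.IsSendOp:
        if r.WithSURB && r.SURBID == nil {
            return errors.New("with_surb requires a surbid")
        }
        if !r.WithSURB && r.SURBID != nil {
            return errors.New("surbid set on a one-way message")
        }
    default:
        return errors.New("neither is_send_op nor is_arq_send_op is set")
    }
    return nil
}

func main() {
    oneWay := &Request{
        AppID:             &[AppIDLength]byte{1},
        DestinationIdHash: &[32]byte{2},
        RecipientQueueID:  []byte("echo"),
        Payload:           []byte("hello"),
        IsSendOp:          true,
    }
    fmt.Println(validate(oneWay))
}
```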
3.6 Response message fields
A thin client connection always begins with the daemon sending the
client two messages, a connection status followed by a PKI document.
After this connection sequence phase, the daemon may send the thin
client a connection status or PKI document update at any time.
Thin clients receive four possible events inside of
Response messages:
connection status event
is_connected indicates whether the client is connected
or not.
err may contain an error indicating why connection
status changed.
new PKI document event
payload is the CBOR serialized PKI document, stripped of
all the cryptographic signatures.
message sent event
message_id is a unique message ID
surbid is the SURB ID
sent_at is the time the message was sent
reply_eta is the time we expect a reply
err is the optional error we received when attempting
to send
message reply event
message_id is a unique message ID
surbid is the SURB ID
payload is the reply payload
err is the error, if any.
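A thin client typically needs to work out which of the four events a Response carries. The following Go sketch uses trimmed copies of the event structs without CBOR tags; the eventKind function and its rule that exactly one event pointer must be set are assumptions made for this example.

```go
package main

import (
    "errors"
    "fmt"
)

// Trimmed copies of the event types from section 3.3.
type ConnectionStatusEvent struct {
    IsConnected bool
    Err         error
}

type NewPKIDocumentEvent struct {
    Payload []byte
}

type MessageSentEvent struct {
    Err error
}

type MessageReplyEvent struct {
    Payload []byte
    Err     error
}

type Response struct {
    ConnectionStatusEvent *ConnectionStatusEvent
    NewPKIDocumentEvent   *NewPKIDocumentEvent
    MessageSentEvent      *MessageSentEvent
    MessageReplyEvent     *MessageReplyEvent
}

// eventKind names the single event carried by r, or returns an error if
// the Response carries no event or more than one.
func eventKind(r *Response) (string, error) {
    var kinds []string
    if r.ConnectionStatusEvent != nil {
        kinds = append(kinds, "connection_status_event")
    }
    if r.NewPKIDocumentEvent != nil {
        kinds = append(kinds, "new_pki_document_event")
    }
    if r.MessageSentEvent != nil {
        kinds = append(kinds, "message_sent_event")
    }
    if r.MessageReplyEvent != nil {
        kinds = append(kinds, "message_reply_event")
    }
    if len(kinds) != 1 {
        return "", errors.New("response must carry exactly one event")
    }
    return kinds[0], nil
}

func main() {
    r := &Response{ConnectionStatusEvent: &ConnectionStatusEvent{IsConnected: true}}
    kind, err := eventKind(r)
    fmt.Println(kind, err)
}
```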
2.4 - Katzenpost Kaetzchen Specification
Abstract
1. Introduction
This interface is meant to provide support for various autoresponder
agents “Kaetzchen” that run on Katzenpost provider instances, thus
bypassing the need to run a discrete client instance to provide
functionality. The use-cases for such agents include, but are not
limited to, user identity key lookup, a discard address, and a loop-back
responder for the purpose of cover traffic.
1.1 Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC2119.
1.2. Terminology
SURB - “single use reply block” SURBs are used to
achieve recipient anonymity, that is to say, SURBs function as a
cryptographic delivery token that you can give to another client entity
so that they can send you a message without them knowing your identity
or location on the network. See SPHINXSPEC and
SPHINX.
BlockSphinxPlaintext - The payload structure which is
encapsulated by the Sphinx body. It is described in
KATZMIXE2E, section
“Client and Provider processing of received packets”.
2. Extension Overview
Each Kaetzchen agent will register as a potential recipient on its
Provider. When the Provider receives a forward packet destined for a
Kaetzchen instance, it will hand off the fully unwrapped packet along
with its corresponding SURB to the agent, which will then act on the
packet and optionally reply utilizing the SURB.
3. Agent Requirements
Each agent operation MUST be idempotent.
Each agent operation request and response MUST fit within one Sphinx
packet.
Each agent SHOULD register a recipient address that is prefixed with
(Or another standardized delimiter, agreed to by all participating
providers in a given mixnet).
Each agent SHOULD register a recipient address that consists of
an RFC5322 dot-atom value, and MUST register recipient addresses that
are at most 64 octets in length.
The first byte of the agent's response payload MUST be 0x01 to allow
clients to easily differentiate between SURB-ACKs and agent
responses.
3.1 Mix Message Formats
Messages from clients to Kaetzchen use the following payload format
in the forward Sphinx packet:
struct {
    uint8_t flags;
    uint8_t reserved; /* Set to 0x00. */
    select (flags) {
        case 0:
            opaque padding[sizeof(SphinxSURB)];
        case 1:
            SphinxSURB surb;
    }
    opaque plaintext[];
} KaetzchenMessage;
The plaintext component of a KaetzchenMessage MUST be
padded by appending “0x00” bytes to make the final total size of a
KaetzchenMessage equal to that of a
BlockSphinxPlaintext.
Messages (replies) from the Kaetzchen to the client use the following
payload format in the SURB generated packet:
struct {
    opaque plaintext[];
} KaetzchenReply;
The plaintext component of a KaetzchenReply MUST be
padded by appending “0x00” bytes to make the final total size of a
KaetzchenReply equal to that of a
BlockSphinxPlaintext
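The KaetzchenMessage layout and its zero padding can be sketched in Go as follows. The sizes of SphinxSURB and BlockSphinxPlaintext depend on the Sphinx parameterization, so they are passed in as parameters here; the function name buildKaetzchenMessage and the small sizes used in main are assumptions made for this example.

```go
package main

import (
    "errors"
    "fmt"
)

// buildKaetzchenMessage lays out flags, reserved, the SURB (or padding of
// the same size) and the plaintext, then zero pads to blockLen bytes.
// surbLen and blockLen stand in for sizeof(SphinxSURB) and the size of a
// BlockSphinxPlaintext.
func buildKaetzchenMessage(surb, plaintext []byte, surbLen, blockLen int) ([]byte, error) {
    const headerLen = 2 // flags + reserved
    if surb != nil && len(surb) != surbLen {
        return nil, errors.New("SURB has the wrong size")
    }
    if headerLen+surbLen+len(plaintext) > blockLen {
        return nil, errors.New("plaintext does not fit in one Sphinx packet")
    }
    msg := make([]byte, blockLen) // all padding bytes start out as 0x00
    if surb != nil {
        msg[0] = 1 // flags: a SURB is present
        copy(msg[headerLen:], surb)
    }
    // msg[1] is the reserved byte and stays 0x00.
    copy(msg[headerLen+surbLen:], plaintext)
    return msg, nil
}

func main() {
    msg, err := buildKaetzchenMessage(nil, []byte("ping"), 8, 32)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%d bytes: %x\n", len(msg), msg)
}
```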
4. PKI Extensions
Each provider SHOULD publish the list of publicly accessible
Kaetzchen agent endpoints in its MixDescriptor, along with any other
information required to utilize the agent.
Providers should make this information available in the form of a map
in which each key is the label used to identify a given service, and
each value is a map with arbitrary keys.
Valid service names refer to the services defined in extensions to
this specification. Every service MUST be implemented by one and only
one Kaetzchen agent.
For each service, the provider MUST advertise a field for the
endpoint at which the Kaetzchen agent can be reached, as a key value
pair where the key is endpoint, and the value is the
provider side endpoint identifier.
5. Anonymity Considerations
In the event that the mix keys for the entire return path are
compromised, it is possible for adversaries to unwrap the SURB and
determine the final recipient of the reply.
Depending on what sort of operations a given agent implements, there
may be additional anonymity impact that requires separate
consideration.
Clients MUST NOT have predictable retransmission; otherwise
active confirmation attacks become possible, which could be used to discover
the ingress Provider of the client.
6. Security Considerations
It is possible to use this mechanism to flood a victim with unwanted
traffic by constructing a request with a SURB destined for the
target.
Depending on the operations implemented by each agent, the added
functionality may end up being a vector for Denial of Service attacks in
the form of CPU or network overload.
Unless the agent implements additional encryption, message integrity
and privacy is limited to that which is provided by the base Sphinx
packet format and parameterization.
7. Acknowledgments
The inspiration for this extension comes primarily from a design by
Vincent Breitmoser.
Appendix A. References
Appendix A.1 Normative References
Appendix A.2 Informative References
Appendix B. Citing This Document
Appendix B.1 Bibtex Entry
Note that the following bibtex entry is in the IEEEtran bibtex style
as described in a document called “How to Use the IEEEtran BIBTEX
Style”.
@online{Kaetzchen,
title = {Katzenpost Provider-side Autoresponder Extension},
author = {Yawning Angel and Kali Kaneko and David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/kaetzchen.md},
year = {2018}
}
KATZMIXE2E
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network End-to-end Protocol Specification",
July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/old/end_to_end.md
KATZMIXPKI
Angel, Y., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network Public Key Infrastructure Specification",
December 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/pki.md
RFC2119
Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119
RFC5322
Resnick, P., Ed.,
"Internet Message Format",
RFC 5322, DOI 10.17487/RFC5322,
October 2008,
https://www.rfc-editor.org/info/rfc5322
SPHINX
Danezis, G., Goldberg, I.,
"Sphinx: A Compact and Provably Secure Mix Format",
DOI 10.1109/SP.2009.15,
May 2009,
http://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf
SPHINXSPEC
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Sphinx Mix Network Cryptographic Packet Format Specification"
July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx.md
Here I present a modification of the Sphinx cryptographic packet
format that uses a KEM instead of a NIKE whilst preserving the
properties of bitwise unlinkability, constant packet size and route
length hiding.
1. Introduction
We’ll express our KEM Sphinx header in pseudo code. The Sphinx
body will be exactly the same as in
SPHINXSPEC. Our basic KEM API
has three functions:
PRIV_KEY, PUB_KEY = GEN_KEYPAIR(RNG)
ct, ss = ENCAP(PUB_KEY) - Encapsulate
generates a shared secret, ss, for the public key and
encapsulates it into a ciphertext.
ss = DECAP(PRIV_KEY, ct) - Decapsulate
computes the shared key, ss, encapsulated in the ciphertext,
ct, for the private key.
Additional notation includes:
|| = concatenate two binary blobs together
PRF = pseudo random function, a
cryptographic hash function, e.g. Blake2b.
Therefore we must embed these KEM ciphertexts in the KEMSphinx
header, one KEM ciphertext per mix hop.
2. Post Quantum Hybrid KEM
Special care must be taken in order to correctly compose a hybrid
post quantum KEM that is IND-CCA2 robust.
The hybrid post quantum KEMs found in Cloudflare’s circl library
are suitable to be used with Noise or TLS but not with KEM Sphinx
because they are not IND-CCA2 robust. Noise and TLS achieve
IND-CCA2 security by mixing in the public keys and ciphertexts
into the hash object and therefore do not require an IND-CCA2 KEM.
Firstly, our post quantum KEM is IND-CCA2; however, we must
specifically take care to make our NIKE to KEM adapter have
semantic security. Secondly, we must make a security preserving
KEM combiner.
2.1 NIKE to KEM adapter
We easily achieve our IND-CCA2 security by means of hashing
together the DH shared secret along with both of the public
keys:
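A NIKE to KEM adapter of this kind can be sketched in Go as follows. The sketch assumes X25519 as the NIKE and uses SHA-256 from the standard library in place of the PRF (the text mentions Blake2b as an example); the order in which the two public keys are hashed and the function names genKeypair, encap and decap are assumptions made for this example. It requires Go 1.20 or later for crypto/ecdh.

```go
package main

import (
    "bytes"
    "crypto/ecdh"
    "crypto/rand"
    "crypto/sha256"
    "fmt"
)

// genKeypair creates an X25519 key pair; the public key is priv.PublicKey().
func genKeypair() (*ecdh.PrivateKey, error) {
    return ecdh.X25519().GenerateKey(rand.Reader)
}

// deriveSecret hashes the DH shared secret together with both public keys.
func deriveSecret(dh, ephemeralPub, recipientPub []byte) []byte {
    h := sha256.New()
    h.Write(dh)
    h.Write(ephemeralPub)
    h.Write(recipientPub)
    return h.Sum(nil)
}

// encap generates a fresh ephemeral key pair. The KEM ciphertext is the
// ephemeral public key.
func encap(pub *ecdh.PublicKey) (ct, ss []byte, err error) {
    eph, err := genKeypair()
    if err != nil {
        return nil, nil, err
    }
    dh, err := eph.ECDH(pub)
    if err != nil {
        return nil, nil, err
    }
    ct = eph.PublicKey().Bytes()
    return ct, deriveSecret(dh, ct, pub.Bytes()), nil
}

// decap recomputes the same shared secret from the ciphertext.
func decap(priv *ecdh.PrivateKey, ct []byte) ([]byte, error) {
    ephPub, err := ecdh.X25519().NewPublicKey(ct)
    if err != nil {
        return nil, err
    }
    dh, err := priv.ECDH(ephPub)
    if err != nil {
        return nil, err
    }
    return deriveSecret(dh, ct, priv.PublicKey().Bytes()), nil
}

func main() {
    priv, err := genKeypair()
    if err != nil {
        panic(err)
    }
    ct, ss1, err := encap(priv.PublicKey())
    if err != nil {
        panic(err)
    }
    ss2, err := decap(priv, ct)
    if err != nil {
        panic(err)
    }
    fmt.Println("shared secrets match:", bytes.Equal(ss1, ss2))
}
```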
The KEM Combiners paper KEMCOMB
makes the observation that if a KEM combiner is not security
preserving then the resulting hybrid KEM will not have IND-CCA2
security if one of the composing KEMs does not have IND-CCA2
security. Likewise the paper points out that when using a
security preserving KEM combiner, if only one of the composing
KEMs has IND-CCA2 security then the resulting hybrid KEM will
have IND-CCA2 security.
Our KEM combiner uses the split PRF design from the paper: when
combining two KEM shared secrets together, we use a hash function
to also mix in the values of both KEM ciphertexts. In this
pseudo code example we are hashing together the two shared
secrets from the two underlying KEMs, ss1 and ss2. Additionally
the two ciphertexts from the underlying KEMs, cct1 and cct2, are
also hashed together:
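The hashing step described above can be sketched in Go as follows. SHA-256 stands in for the hash function, and plain concatenation is used on the assumption that both KEMs produce fixed-length outputs; the function name combine and the order of the inputs are assumptions made for this example, and the sketch makes no claim to satisfy the formal split PRF requirements of the paper.

```go
package main

import (
    "crypto/sha256"
    "fmt"
)

// combine hashes both shared secrets together with both ciphertexts.
// Plain concatenation is unambiguous only if each input has a fixed length.
func combine(ss1, ss2, ct1, ct2 []byte) [32]byte {
    h := sha256.New()
    for _, part := range [][]byte{ss1, ss2, ct1, ct2} {
        h.Write(part)
    }
    var out [32]byte
    copy(out[:], h.Sum(nil))
    return out
}

func main() {
    ss := combine([]byte("ss1"), []byte("ss2"), []byte("ct1"), []byte("ct2"))
    fmt.Printf("%x\n", ss)
}
```

Because the ciphertexts are part of the hash input, changing either ciphertext changes the combined shared secret even when the underlying shared secrets are unchanged.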
MAC for this hop (authenticates header fields 1 thru 4)
KEM Sphinx header elements:
Version number (MACed but not encrypted)
One KEM ciphertext for use with the next hop
Encrypted per routing commands AND KEM ciphertexts, one for
each additional hop
MAC for this hop (authenticates header fields 1 thru 4)
We can say that KEMSphinx differs from NIKE Sphinx by replacing
the header’s group element (e.g. an X25519 public key) field with
the KEM ciphertext. Subsequent KEM ciphertexts for each hop are
stored inside the Sphinx header “routing information”
section.
First we must have a data type to express a mix hop, and we can
use lists of these hops to express a route:
type PathHop struct {
    public_key kem.PublicKey
    routing_commands Commands
}
Here’s how we construct a KEMSphinx packet header where path is a
list of PathHop, and indicates the route through the network:
Derive the KEM ciphertexts for each hop.
route_keys := [][]byte{}
route_kems := [][]byte{}
for i := 0; i < num_hops; i++ {
    kem_ct, ss := ENCAP(path[i].public_key)
    route_kems = append(route_kems, kem_ct)
    route_keys = append(route_keys, ss)
}
Derive the routing_information keystream and encrypted padding
for each hop.
Same as in SPHINXSPEC except for
the fact that each routing info slot is now increased by the size
of the KEM ciphertext.
Create the routing_information block.
Here we modify the Sphinx implementation to pack the next KEM
ciphertext into each routing information block. Each of these
blocks is decrypted by the corresponding mix hop, which thereby
obtains the KEM ciphertext for the next hop in the route.
Assemble the completed Sphinx Packet Header and Sphinx Packet
Payload SPRP key vector. Same as in
SPHINXSPEC except the
kem_element field is set to the first KEM
ciphertext, route_kems[0]:
var sphinx_header SphinxHeader
sphinx_header.additional_data = version
sphinx_header.kem_element = route_kems[0]
sphinx_header.routing_info = routing_info
sphinx_header.mac = mac
2. KEMSphinx Unwrap Operation
Most of the design here will be exactly the same as in
SPHINXSPEC. However there are a
few notable differences:
The shared secret is derived from the KEM ciphertext instead
of a DH.
Next hop’s KEM ciphertext stored in the encrypted routing
information.
3. Acknowledgments
I would like to thank Peter Schwabe for the original idea of
simply replacing the Sphinx NIKE with a KEM and for answering
all my questions. I’d also like to thank Bas Westerbaan for
answering questions.
SPHINXSPEC. Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D., "Sphinx Mix
Network Cryptographic Packet Format Specification" July 2017. https://katzenpost.network/docs/specs/sphinx/
2.6 - Mix Decoy Stats Propagation
Abstract
In the context of continuous time mixing strategies such as the
memoryless mix used by Katzenpost, n-1 attacks may use strategic
packet loss. Nodes can also fail for benign reasons. Determining whether
or not it’s an n-1 attack is outside the scope of this work.
This document describes how we will communicate statistics from mix
nodes to mix network directory authorities which tell them about the
packet loss they are observing.
1. Design Overview
Nodes (mixes, gateways, and providers) need to upload packet-loss
statistics to the directory authorities, so that authorities can label
malfunctioning nodes as such in the consensus in the next epoch.
Nodes currently sign and upload a Descriptor in each epoch.
In the future, they would instead upload an “UploadDescStats”
containing:
Descriptor
Stats
Signature
Stats contains:
a map from pairs-of-mixes to the ratio of
count-of-loops-sent vs count-of-loops-received
refer to our non-existent document on Provider originated
decoy loop traffic design discussion
1.3 Terminology
wire protocol - refers to our PQ Noise based
protocol which currently uses TCP but in the near future will optionally
use QUIC. This protocol has messages known as wire protocol
commands, which are used for various mixnet functions such
as sending or retrieving a message, dirauth voting etc. For more
information, please see our design doc: wire
protocol specification
Providers - refers to a set of nodes on the edge of
the network which have two roles: handling incoming client connections and
running mixnet services. Soon we should get rid of Providers
and replace them with two different sets, gateway nodes and
service nodes.
Epoch - The Katzenpost epoch is currently set to a
20 minute duration. Each new epoch there is a new PKI document published
containing public key material that will only be valid for that
epoch.
2. Tracking Packet Loss and Detecting Faulty Mixes
Katzenpost lets different elements in the network track whether other
elements are functioning correctly. A node A will do this by sending
packets in randomly generated loops through the network, and tracking
whether the loop comes back or not. When it comes back, node A will mark
that as evidence that the nodes on the path of that loop are
functioning correctly.
Experimental setup, node A:
Data: each network node A collects a record of emitted
test loops in a certain epoch, their paths and whether they returned or
not. Importantly, each loop is the same length and includes l
steps.
A segment is defined as a possible connection from a device in the
network to another, for example from a node in the layer k
to a node in the layer k+1. Each loop is a sequence of such
segments.
Each node A will create 3 hashmaps,
sent_loops_A, completed_loops_A and
ratios_A. Each of these will use a pair of concatenated
mix node IDs as the key. The IDs are ordered from lesser
topology layer to greater, e.g. the two-tuple (n, n+1), which is
represented here as a 64 byte array:
var sent_loops_A map[[64]byte]int
var completed_loops_A map[[64]byte]int
var ratios_A map[[64]byte]float64
Every time the node A sends out a test loop, for each segment in the
loop path, it will increment the value in
sent_loops_A.
When a test loop returns, for each step in the loop path, it will
increment the value in completed_loops_A.
Generate a new map entry in ratios_A for each
mix-node-pair p: if sent_loops_A[p]==0, set
ratios_A[p]=1; otherwise set
ratios_A[p] = completed_loops_A[p]/sent_loops_A[p]
Plot the resulting distribution, and calculate the standard
deviation to detect anomalies. Have the node report significant
anomalies only after a sufficient time period, so as not to leak information on
the route of individual loops.
Anomalies may have to be discarded if the corresponding
sent_loops_A[p] is small.
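The ratio step above can be sketched in Go as follows. This is a minimal illustration, not the Katzenpost implementation: the names PairKey, makePairKey and computeRatios are hypothetical, and the sketch assumes 32-byte node IDs so that a concatenated pair fills the 64 byte key. Note that the counters are integers, so they must be converted to floating point before dividing.

```go
package main

import "fmt"

// PairKey is the concatenation of two 32-byte mix node IDs (assumed size),
// ordered from the lower topology layer to the higher one.
type PairKey [64]byte

// makePairKey concatenates two node IDs into a single map key.
func makePairKey(lower, higher [32]byte) PairKey {
	var k PairKey
	copy(k[:32], lower[:])
	copy(k[32:], higher[:])
	return k
}

// computeRatios derives completed/sent for every pair present in sent.
// A pair with zero sent loops is assigned the ratio 1.
func computeRatios(sent, completed map[PairKey]int) map[PairKey]float64 {
	ratios := make(map[PairKey]float64, len(sent))
	for p, s := range sent {
		if s == 0 {
			ratios[p] = 1
			continue
		}
		// Convert before dividing: integer division would truncate.
		ratios[p] = float64(completed[p]) / float64(s)
	}
	return ratios
}

func main() {
	var a, b, c [32]byte
	a[0], b[0], c[0] = 1, 2, 3
	sent := map[PairKey]int{makePairKey(a, b): 4, makePairKey(b, c): 0}
	completed := map[PairKey]int{makePairKey(a, b): 3}
	ratios := computeRatios(sent, completed)
	fmt.Println(ratios[makePairKey(a, b)], ratios[makePairKey(b, c)])
}
```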
You would expect the distribution of values in
completed_loops to approximate a binomial distribution. In
the absence of faulty nodes, ratios should be 1. When
some nodes are faulty, the values at those nodes should approach 0 (if
the node doesn’t work at all), and the values at nodes
that can share a loop with faulty nodes should be binomially distributed.
Therefore each mix node generates a statistics report to upload to
the dirauth nodes, of the struct type:
The report is subsequently uploaded to the directory authorities,
which combine the reports of individual nodes into a health status of
the network and arrive at a consensus decision about the topology of the
network.
3. Uploading Stats to Dirauths
Stats reports are uploaded along with the mix descriptor every Epoch.
A cryptographic signature covers both of these fields:
Statistics reports collected during the XXX period of time, that is,
the time between descriptor N+1 upload and descriptor N+2 upload, are
what will affect the topology choices in epoch N+2 if the dirauths
collectively decide to act on the very latest statistics reports in
order to determine for example if a mix node should be removed from the
network:
This document describes the high level architecture and detailed
protocols and behavior required of mix nodes participating in the
Katzenpost Mix Network.
1. Introduction
This specification provides the design of a mix network meant to provide
an anonymous messaging protocol between clients and public mixnet
services.
Various system components such as client software, end to end
messaging protocols, Sphinx cryptographic packet format and wire
protocol are described in their own specification documents.
1.1 Terminology
A KiB is defined as 1024 8 bit octets.
Mixnet - A mixnet also known as a mix network is a
network of mixes that can be used to build various privacy preserving
protocols.
Mix - A cryptographic router that is used to compose
a mixnet. Mixes use a cryptographic operation on messages being routed
which provides bitwise unlinkability with respect to input versus output
messages. Katzenpost is a decryption mixnet that uses the Sphinx
cryptographic packet format.
Node - A Mix. Clients are NOT considered nodes in
the mix network. However, note that network protocols are often layered;
in our design documents we describe "mixnet hidden services" which can
be operated by mixnet clients. Therefore, if you use "node" in
its strict mathematical sense, one could conceivably designate a
client as a node. That having been said, it would not be appropriate to
the discussion of our core mixnet protocol to refer to the clients as
nodes.
Entry mix, Entry node - An entry mix is
a mix that has some additional features:
An entry mix is always the first hop in routes where the message
originates from a client.
An entry mix authenticates clients’ direct connections via the
mixnet’s wire protocol.
An entry mix queues reply messages and allows clients to retrieve
them later.
Service mix - A service mix is a mix that has some
additional features:
A service mix is always the last hop in routes where the message
originates from a client.
A service mix runs mixnet services which use a Sphinx SURB based
protocol.
User - An agent using the Katzenpost
system.
Client - Software run by the User on its local
device to participate in the Mixnet. To reiterate, a
client is not considered a "node in the network" at the level of
analysis where we are discussing the core mixnet protocol in this
document.
Katzenpost - A project to design many improved
decryption mixnet protocols.
Classes of traffic - We distinguish the following classes of
traffic:
SURB Replies (also sometimes referred to as ACKs)
Forward messages
Packet - Also known as a Sphinx packet. A nested
encrypted packet that is routed through the mixnet and
cryptographically transformed at each hop. The length of the packet is
fixed for every class of traffic. Packet payloads encapsulate
messages.
Payload - The payload, also known as packet payload,
is a portion of a Packet containing a message, or part of a message, to
be delivered anonymously.
Message - A variable-length sequence of octets sent
anonymously through the network. Short messages are sent in a single
packet; long messages are fragmented across multiple packets.
MSL - Maximum Segment Lifetime, 120
seconds.
1.2 Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC2119.
2. System Overview
The presented system design is based on LOOPIX.
Below, we present the system overview.
The entry mixes are responsible for authenticating clients, accepting
packets from the client, and forwarding them to the mix network, which
then relays packets to the destination service mix. Our network design
uses a strict topology where forward messages traverse the network from
entry mix to service mix. Service mixes can optionally reply if the
forward message contained a Single Use Reply Block (see SPHINXSPEC).
The PKI system handles the distribution of various network-wide
parameters and of the information each participant requires to participate
in the network, such as the IP address/port combinations at which each node can
be reached and cryptographic public keys. The specification for the
PKI is beyond the scope of this document and is instead covered in KATZMIXPKI.
The mix network provides neither reliable nor in-order delivery
semantics. The described mix network is neither a user facing messaging
system nor is it an application. It is intended to be a low level
protocol which can be composed to form more elaborate mixnet protocols
with stronger, more useful privacy notions.
2.1 Threat Model
Here we cannot present the threat model of the higher-level mixnet
protocols. However, this low-level core mixnet protocol does have its
own threat model, which we attempt to elucidate here.
We assume that the clients only talk to mixnet services. These
services make use of a client provided delivery token known as a SURB
(Single Use Reply Block) to send their replies to the client without
knowing the client’s entry mix. This system guarantees third-party
anonymity, meaning that no parties other than client and the service are
able to learn that the client and service are communicating. Note that
this is in contrast with other designs, such as Mixminion, which provide
sender anonymity towards recipients as well as anonymous replies.
Mixnet clients will randomly select an entry node to use and may
reconnect if disconnected for under a duration threshold. The entry mix
can determine the approximate message volume originating from and
destined to a given client. We assume that the entry mix follows the
protocol but might be an honest-but-curious adversary.
External local network observers cannot determine the number of
Packets traversing their region of the network because of the use of
decoy traffic sent by the clients. Global observers will not be able to
de-anonymize packet paths if there are enough packets traversing the mix
network. Longer term statistical disclosure attacks are likely possible
in order to link senders and receivers.
A malicious mix only has the ability to remember which input packets
correspond to the output packets. To discover the entire path all of the
mixes in the path would have to be malicious. Moreover, the malicious
mixes can drop, inject, modify or delay the packets for more or less
time than specified.
2.2 Network Topology
The Katzenpost Mix Network uses a layered topology consisting of a
fixed number of layers, each containing a set of mixes. At any given
time each Mix MUST only be assigned to one specific layer. Each Mix in a
given layer N is connected to every Mix in the previous and next
layers, and/or to every participating Provider in the case of the mixes in
layer 0 or layer N (the first and last layers).
Note: Multiple distinct connections are collapsed in the figure for
sake of brevity/clarity.
The network topology MUST also maximize the number of security
domains traversed by the packets. This can be achieved by not allowing
mixes from the same security domain to be in different layers.
Requirements for the topology:
Should allow for non-uniform throughput of each mix (Get bandwidth
weights from the PKI).
Should maximize distribution among security domains, in this case
the mix descriptor specified family field would indicate the security
domain or entity operating the mix.
Other legal jurisdictional region awareness for increasing the cost
of compulsion attacks.
3. Packet Format Overview
For the packet format of the transported messages we use the Sphinx
cryptographic packet format. For a detailed description of the packet
format, construction, processing and security / anonymity considerations,
see SPHINXSPEC, “The Sphinx Mix Network
Cryptographic Packet Format Specification”.
As the Sphinx packet format is generic, the Katzenpost Mix Network
must provide a concrete instantiation of the format, as well as
additional Sphinx per-hop routing information commands.
3.1 Sphinx Cryptographic Primitives
For the current version of the Katzenpost Mix Network, let the
following cryptographic primitives be used as described in the Sphinx
specification.
H(M) - As the output of this primitive is only used
locally to a Mix, any suitable primitive may be used.
MAC(K, M) - HMAC-SHA256 RFC6234,
M_KEY_LENGTH of 32 bytes (256 bits), and MAC_LENGTH of 32 bytes (256
bits).
KDF(SALT, IKM) - HKDF-SHA256, HKDF-Expand only, with
SALT used as the info parameter.
S(K, IV) - CTR-AES256 [SP80038A], S_KEY_LENGTH of 32 bytes (256 bits),
and S_IV_LENGTH of 12 bytes (96 bits), using a 32 bit counter.
SPRP_Encrypt(K, M)/SPRP_Decrypt(K, M) - AEZv5 AEZV5, SPRP_KEY_LENGTH of 48 bytes (384 bits). As
there is a disconnect between AEZv5 as specified and the Sphinx usage,
let the following be the AEZv5 parameters:
nonce - 16 bytes, reusing the per-hop Sphinx header IV.
additional_data - Unused.
tau - 0 bytes.
EXP(X, Y) - X25519 RFC7748
scalar multiply, GROUP_ELEMENT_LENGTH of 32 bytes (256 bits), G is the
X25519 base point.
3.2 Sphinx Packet Parameters
The following parameters are used for the Katzenpost Mix Network
instantiation of the Sphinx Packet Format:
AD_SIZE - 2 bytes.
SECURITY_PARAMETER - 32 bytes. (except for our SPRP
which we plan to upgrade)
PER_HOP_RI_SIZE - (XXX/ya: Addition is hard, let's go
shopping.)
NODE_ID_SIZE - 32 bytes, the size of the Ed25519 public
key, used as Node identifiers.
RECIPIENT_ID_SIZE - 64 bytes, the maximum size of
local-part component in an e-mail address.
SURB_ID_SIZE - Single Use Reply Block ID size, 16
bytes.
MAX_HOPS - 5, the ingress provider, a set of three
mixes, and the egress provider.
PAYLOAD_SIZE - (XXX/ya: Subtraction is hard, let's go
shopping.)
KDF_INFO - The byte string
Katzenpost-kdf-v0-hkdf-sha256.
The Sphinx Packet Header additional_data field is
specified as follows:
Double check to ensure that this causes the rest of the packet header
to be 4 byte aligned, when wrapped in the wire protocol command and
framing. This might need to have 3 bytes reserved instead.
All nodes MUST reject Sphinx Packets that have
additional_data that is not as specified in the header.
Design decision.
We can eliminate a trial decryption step per packet around the epoch
transitions by having a command that rewrites the AD on a per-hop basis
and including an epoch identifier.
I am uncertain whether the additional complexity is worth it for a
situation that can happen for a few minutes out of every epoch.
3.3 Sphinx Per-hop Routing Information Extensions
The following extensions are added to the Sphinx Per-Hop Routing
Information commands.
Let the following additional routing commands be defined in the
extension RoutingCommandType range (0x80 -
0xff):
enum {
mix_delay(0x80),
} KatzenpostCommandType;
The mix_delay command structure is as follows:
struct {
uint32_t delay_ms;
} NodeDelayCommand;
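To make the mix_delay command concrete, here is a minimal Go sketch of how it might be serialized. The byte layout is an assumption, not something this section specifies: one command type byte (0x80) followed by delay_ms as a big-endian uint32, following the network byte order convention of the presentation language. The function names encodeNodeDelay and decodeNodeDelay are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const mixDelayCommand = 0x80 // mix_delay in KatzenpostCommandType

// encodeNodeDelay serializes a NodeDelayCommand.
// Assumed layout: 1 command type byte, then delay_ms as big-endian uint32.
func encodeNodeDelay(delayMs uint32) []byte {
	b := make([]byte, 5)
	b[0] = mixDelayCommand
	binary.BigEndian.PutUint32(b[1:], delayMs)
	return b
}

// decodeNodeDelay parses the layout produced by encodeNodeDelay.
func decodeNodeDelay(b []byte) (uint32, error) {
	if len(b) != 5 || b[0] != mixDelayCommand {
		return 0, errors.New("not a mix_delay command")
	}
	return binary.BigEndian.Uint32(b[1:]), nil
}

func main() {
	enc := encodeNodeDelay(1500)
	fmt.Printf("%x\n", enc)
	delay, err := decodeNodeDelay(enc)
	fmt.Println(delay, err)
}
```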
4. Mix Node Operation
All Mixes behave in the following manner:
Accept incoming connections from peers, and open persistent
connections to peers as needed
(Section 4.1).
Periodically interact with the PKI to publish Identity and Sphinx
packet public keys, and to obtain information about the peers it should
be communicating with, along with periodically rotating the Sphinx
packet keys for forward secrecy
(Section 4.2).
Process inbound Sphinx Packets, delay them for the specified time
and forward them to the appropriate Mix and or Provider
(Section 4.3).
All Nodes are identified by their link protocol signing key, for the
purpose of the Sphinx packet source routing hop identifier.
All Nodes participating in the Mix Network MUST share a common view
of time, via NTP or similar time synchronization mechanism.
4.1 Link Layer Connection Management
All communication to and from participants in the Katzenpost Mix
Network is done via the Katzenpost Mix Network Wire Protocol KATZMIXWIRE.
Nodes are responsible for establishing the connection to the next
hop, for example, a mix in layer 0 will accept inbound connections from
all Providers listed in the PKI, and will proactively establish
connections to each mix in layer 1.
Nodes MAY accept inbound connections from unknown Nodes, but MUST NOT
relay any traffic until they become known via listing in the PKI
document, and MUST terminate the connection immediately if
authentication fails for any other reason.
Nodes MUST impose an exponential backoff when reconnecting if a link
layer connection gets terminated, and the minimum retry interval MUST be
no shorter than 5 seconds.
Nodes MAY rate limit inbound connections as required to keep load and
or resource use at a manageable level, but MUST be prepared to handle at
least one persistent long lived connection per potentially eligible peer
at all times.
4.2 Sphinx Mix and Provider Key Rotation
Each Node MUST rotate the key pair used for Sphinx packet processing
periodically for forward secrecy reasons and to keep the list of seen
packet tags short. The Katzenpost Mix Network uses a fixed interval
(epoch), so that key rotations happen simultaneously
throughout the network, at predictable times.
Let each epoch be exactly 10800 seconds (3 hours) in
duration, and the 0th Epoch begin at 2017-06-01 00:00 UTC.
For more details see our “Katzenpost Mix Network Public Key
Infrastructure Specification” document. KATZMIXPKI
4.3 Sphinx Packet Processing
The detailed processing of the Sphinx packet is described in the
Sphinx specification: “The Sphinx Mix Network Cryptographic Packet
Format Specification”. Below, we present an overview of the steps which
the node is performing upon receiving the packet:
Record the time of reception.
Perform a Sphinx_Unwrap operation to authenticate and
decrypt a packet, discarding it immediately if the operation fails.
Apply replay detection to the packet, discarding replayed packets
immediately.
Act on the routing commands.
All packets processed by Mixes MUST contain the following
commands.
NextNodeHopCommand, specifying the next Mix or Provider
that the packet will be forwarded to.
NodeDelayCommand, specifying the delay in milliseconds
to be applied to the packet, prior to forwarding it to the Node
specified by the NextNodeHopCommand, as measured from the time of
reception.
Mixes MUST discard packets that have any commands other than a
NextNodeHopCommand or a NodeDelayCommand. Note
that this does not apply to Providers or Clients, which have additional
commands related to recipient and
SURB (Single Use Reply Block) processing.
Nodes MUST continue to accept the previous epoch’s key for up to 1MSL
past the epoch transition, to tolerate latency and clock skew, and MUST
start accepting the next epoch’s key 1MSL prior to the epoch transition
where it becomes the current active key.
Upon the final expiration of a key (1MSL past the epoch transition),
Nodes MUST securely destroy the private component of the expired Sphinx
packet processing key along with the backing store used to maintain
replay information associated with the expired key.
Nodes MAY discard packets at any time, for example to keep congestion
and or load at a manageable level, however assuming the
Sphinx_Unwrap operation was successful, the packet MUST be
fed into the replay detection mechanism.
Nodes MUST ensure that the time a packet is forwarded to the next
Node is around the time of reception plus the delay specified in
NodeDelayCommand. Since exact millisecond processing is
impractical, implementations MAY tolerate a small window around that
time for packets to be forwarded. That tolerance window SHOULD be kept
minimal.
Nodes MUST discard packets that have been delayed for significantly
more time than specified by the NodeDelayCommand.
5. Anonymity Considerations
5.1 Topology
Layered topology is used because it offers the best level of
anonymity and ease of analysis, while being flexible enough to scale up
traffic. In contrast, most mixnet papers discuss their security properties in
the context of a cascade topology, which does not scale well, or a
free-route network, which quickly becomes intractable to analyze as
the network grows while providing slightly worse anonymity than a
layered topology MIXTOPO10.
Important considerations when assigning mixes to layers, in order of
decreasing importance, are:
Security: do not allow mixes from one security domain to be in
different layers to maximise the number of security domains traversed by
a packet
Performance: arrange mixes in layers to maximise the capacity of the
layer with the lowest capacity (the bottleneck layer)
Security: arrange mixes in layers to maximise the number of
jurisdictions traversed by a packet (this is harder to do really well
than it seems, requires understanding of legal agreements such as
MLATs).
5.2 Mixing strategy
As a mixing technique the Poisson mix strategy (LOOPIX and KESDOGAN98) is
used, which REQUIRES that a packet at each hop in the route is delayed
by some amount of time, randomly selected by the sender from an
exponential distribution. This strategy prevents timing
correlation of the incoming and outgoing traffic at each node.
Additionally, the parameters of the distribution used for generating the
delay can be tuned up and down depending on the amount of traffic in the
network and the application for which the system is deployed.
6. Security Considerations
The source of all authority in the mixnet system comes from the
Directory Authority system which is also known as the mixnet PKI. This
system gives the mixes and clients a consistent view of the network
while allowing human intervention when needed. All public mix key
material and network connection information is distributed by this
Directory Authority system.
Appendix A. References
Appendix A.1 Normative References
Appendix A.2 Informative References
Appendix B. Citing This Document
Appendix B.1 Bibtex Entry
Note that the following bibtex entry is in the IEEEtran bibtex style
as described in a document called “How to Use the IEEEtran BIBTEX
Style”.
@online{KatzMixnet,
title = {Katzenpost Mix Network Specification},
author = {Yawning Angel and George Danezis and Claudia Diaz and Ania Piotrowska and David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/mixnet.rst},
year = {2017}
}
AEZV5
Hoang, V., Krovetz, T., Rogaway, P.,
"AEZ v5: Authenticated Encryption by Enciphering",
March 2017,
http://web.cs.ucdavis.edu/~rogaway/aez/aez.pdf
KATZMIXE2E
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network End-to-end Protocol Specification",
July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/old/end_to_end.md
KATZMIXPKI
Angel, Y., Piotrowska, A., Stainton, D.,
"Katzenpost Mix Network Public Key Infrastructure Specification",
December 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/pki.md
KESDOGAN98
Kesdogan, D., Egner, J., and Büschkes, R.,
"Stop-and-Go-MIXes Providing Probabilistic Anonymity in an Open System",
Information Hiding, 1998,
https://www.freehaven.net/anonbib/cache/stop-and-go.pdf
LOOPIX
Piotrowska, A., Hayes, J., Elahi, T., Meiser, S., Danezis, G.,
"The Loopix Anonymity System",
USENIX, August 2017,
https://arxiv.org/pdf/1703.00536.pdf
MIXTOPO10
Diaz, C., Murdoch, S., Troncoso, C.,
"Impact of Network Topology on Anonymity and Overhead in Low-Latency Anonymity Networks",
PETS, July 2010,
https://www.esat.kuleuven.be/cosic/publications/article-1230.pdf
RFC2119
Bradner, S.,
"Key words for use in RFCs to Indicate Requirement Levels",
BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997,
http://www.rfc-editor.org/info/rfc2119
RFC5246
Dierks, T. and E. Rescorla,
"The Transport Layer Security (TLS) Protocol Version 1.2",
RFC 5246, DOI 10.17487/RFC5246,
August 2008,
https://www.rfc-editor.org/info/rfc5246
RFC6234
Eastlake 3rd, D. and T. Hansen,
"US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)",
RFC 6234, DOI 10.17487/RFC6234,
May 2011,
https://www.rfc-editor.org/info/rfc6234
RFC7748
Langley, A., Hamburg, M., and S. Turner,
"Elliptic Curves for Security",
RFC 7748,
January 2016.
SP80038A
Dworkin, M.,
"Recommendation for Block Cipher Modes of Operation",
SP800-38A, DOI 10.6028/NIST.SP.800-38A,
December 2001,
https://doi.org/10.6028/NIST.SP.800-38A
SPHINXSPEC
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
"Sphinx Mix Network Cryptographic Packet Format Specification",
July 2017,
https://github.com/katzenpost/katzenpost/blob/master/docs/specs/sphinx.md
2.8 - Katzenpost PKI Specification
Abstract
1. Introduction
Mixnets are designed with the assumption that a Public Key
Infrastructure (PKI) exists and it gives each client the same view of
the network. This specification is inspired by the Tor and Mixminion
Directory Authority systems MIXMINIONDIRAUTH, TORDIRAUTH, whose main features are precisely what
we need for our PKI. These are decentralized systems meant to be
collectively operated by multiple entities.
The mix network directory authority system (PKI) is essentially a
cooperative decentralized database and voting system that is used to
produce network consensus documents which mix clients periodically
retrieve and use for their path selection algorithm when creating Sphinx
packets. These network consensus documents are derived from a voting
process between the Directory Authority servers.
This design prevents mix clients from using only a partial view of
the network for their path selection so as to avoid fingerprinting and
bridging attacks FINGERPRINTING, BRIDGING, and LOCALVIEW.
The PKI is also used by Authority operators to specify network-wide
parameters, for example in the Katzenpost Decryption Mix Network KATZMIXNET the Poisson mix strategy is used and,
therefore, all clients must use the same lambda parameter for their
exponential distribution function when choosing hop delays in the path
selection. The Mix Network Directory Authority system, aka PKI, SHALL be
used to distribute such network-wide parameters in the network consensus
document that have an impact on security and performance.
1.1 Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC2119.
The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data
structures for additional cryptographic wire protocol commands. KATZMIXWIRE
1.2 Terminology
PKI - Public Key Infrastructure
Directory Authority system - refers to specific PKI
schemes used by Mixminion and Tor
MSL - maximum segment lifetime
mix descriptor - A database record which describes a
component mix
family - Identifier of security domains or entities
operating one or more mixes in the network. This is used to inform the
path selection algorithm.
nickname - simply a nickname string that is unique in
the consensus document, see “Katzenpost Mix Network Specification”
section “2.2. Network Topology”.
layer - The layer indicates which network topology layer
a particular mix resides in.
Provider - A service operated by a third party that
Clients communicate directly with to communicate with the Mixnet. It is
responsible for Client authentication, forwarding outgoing messages to
the Mixnet, and storing incoming messages for the Client. The Provider
MUST have the ability to perform cryptographic operations on the relayed
messages.
1.3 Security Properties Overview
This Directory Authority system has the following feature goals and
security properties:
All Directory Authority servers must agree with each other on the
set of Directory Authorities.
All Directory Authority servers must agree with each other on the
set of mixes.
This system is intentionally designed to provide identical network
consensus documents to each mix client. This mitigates epistemic attacks
against the client path selection algorithm such as fingerprinting and
bridge attacks FINGERPRINTING, BRIDGING.
This system is NOT byzantine-fault-tolerant, it instead allows for
manual intervention upon consensus fault by the Directory Authority
operators. Further, these operators are responsible for expelling bad
acting operators from the system.
This system enforces the network policies such as mix join policy
wherein intentionally closed mixnets will prevent arbitrary hosts from
joining the network by authenticating all descriptor signatures with a
list of allowed public keys.
The Directory Authority system for a given mix network is
essentially the root of all authority.
1.4 Differences from Tor and Mixminion Directory Authority systems
In this document we specify a Directory Authority system which is
different from that of Tor and Mixminion in a number of ways:
The list of valid mixes is expressed in an allowlist. For the time
being there is no specified “bandwidth authority” system which verifies
the health of mixes (Further research required in this area).
There’s no non-directory channel to inform clients that a node is
down, so a failed node will cause significant packet loss: clients will
continue to include the missing node in their path selection until the keys
published by the node expire and it falls out of the consensus.
The schema of the mix descriptors is different from that used in
Mixminion and Tor, including a change which allows our mix descriptor to
express n Sphinx mix routing public keys in a single mix
descriptor whereas in the Tor and Mixminion Directory Authority systems,
n descriptors are used.
The serialization format of mix descriptors is different from that
used in Mixminion and Tor.
The shared random number computation is performed every voting
round, and is required for a vote to be accepted by each authority. The
shared random number is used to deterministically generate the network
topology.
2. Overview of Mix PKI Interaction
Each Mix MUST rotate the key pair used for Sphinx packet processing
periodically for forward secrecy reasons and to keep the list of seen
packet tags short SPHINX09, SPHINXSPEC. The Katzenpost Mix Network uses a
fixed interval (epoch), so that key rotations happen
simultaneously throughout the network, at predictable times.
Each Directory Authority server MUST use some time synchronization
protocol in order to correctly use this protocol. This Directory
Authority system requires time synchronization to within a few
minutes.
Let each epoch be exactly 1200 seconds (20 minutes) in
duration, and the 0th Epoch begin at
2017-06-01 00:00 UTC.
To facilitate smooth operation of the network and to allow for delays
that span across epoch boundaries, Mixes MUST publish keys to the PKI
for at least 3 epochs in advance, unless the mix will be otherwise
unavailable in the near future due to planned downtime.
At an epoch boundary, messages encrypted to keys from the previous
epoch are accepted for a grace period of 2 minutes.
Thus, at any time, keys for all Mixes for epochs N through N + 2
will be available, allowing for a maximum round trip (forward
message + SURB) delay + transit time of 40 minutes. SURB lifetime is
limited to a single epoch because of the key rotation epoch; however,
this shouldn’t present any usability problems since SURBs are only used
for sending ACK messages from the destination Provider to the sender as
described in KATZMIXE2E.
2.1 PKI Protocol Schedule
There are two main constraints on the Authority schedule:
There MUST be enough key material extending into the future so
that clients are able to construct Sphinx packets with forward and
reply paths.
All participants should have enough time to participate in the
protocol: upload descriptors, vote, generate documents, download
documents, and establish connections for user traffic.
The epoch duration of 20 minutes is more than adequate for these two
constraints.
NOTE: perhaps we should make it shorter? but first let’s do some
scaling and bandwidth calculations to see how bad it gets…
2.1.1 Directory Authority Server Schedule
Directory Authority server interactions are conducted according to
the following schedule, where T is the beginning of the
current epoch, and P is the length of the epoch period.
T - Epoch begins
T + P/2 - Vote exchange
T + (5/8)*P - Reveal exchange
T + (6/8)*P - Tabulation and signature exchange
T + (7/8)*P - Publish consensus
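The offsets in this schedule can be derived mechanically from the epoch period P. The Go sketch below does so for the four authority phases; the type and function names are hypothetical and serve only to illustrate the arithmetic.

```go
package main

import (
	"fmt"
	"time"
)

// AuthoritySchedule holds the offsets from the epoch start T at which each
// Directory Authority phase listed above begins.
type AuthoritySchedule struct {
	VoteExchange   time.Duration // T + P/2
	RevealExchange time.Duration // T + (5/8)*P
	Tabulation     time.Duration // T + (6/8)*P
	Publish        time.Duration // T + (7/8)*P
}

// NewAuthoritySchedule derives the phase offsets from the epoch period p.
func NewAuthoritySchedule(p time.Duration) AuthoritySchedule {
	return AuthoritySchedule{
		VoteExchange:   p / 2,
		RevealExchange: p * 5 / 8,
		Tabulation:     p * 6 / 8,
		Publish:        p * 7 / 8,
	}
}

func main() {
	s := NewAuthoritySchedule(1200 * time.Second)
	fmt.Println("vote exchange at T +", s.VoteExchange)
	fmt.Println("reveal exchange at T +", s.RevealExchange)
	fmt.Println("tabulation at T +", s.Tabulation)
	fmt.Println("publish at T +", s.Publish)
}
```

With the 20 minute epoch, each phase after the vote exchange therefore lasts two and a half minutes.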
2.1.2 Mix Schedule
Mix PKI interactions are conducted according to the following
schedule, where T is the beginning of the current epoch.
T + P/8 - Deadline for publication of all mixes’
documents for the next epoch.
T + (7/8)*P - This marks the beginning of the period
where mixes perform staggered fetches of the PKI consensus document.
T + (8/9)*P - Start establishing connections to the new
set of relevant mixes in advance of the next epoch.
T + P - 1MSL - Start accepting new Sphinx packets
encrypted to the next epoch’s keys.
T + P + 1MSL - Stop accepting new Sphinx packets
encrypted to the previous epoch’s keys, close connections to peers no
longer listed in the PKI documents and erase the list of seen packet
tags.
Mix layer changes are controlled by the Directory Authorities and
therefore a mix can be reassigned to a different layer in our stratified
topology at any new epoch. Mixes will maintain incoming and outgoing
connections to the various nodes until all mix keys have expired, iff
the node is still listed anywhere in the current document.
3. Voting for Consensus Protocol
In our Directory Authority protocol, all the actors conduct their
behavior according to a common schedule as outlined in section "2.1 PKI
Protocol Schedule". The Directory Authority servers exchange messages to
reach consensus about the network. Other tasks they perform include
collecting mix descriptor uploads from each mix for each key rotation
epoch, voting, shared random number generation, signature exchange and
publishing of the network consensus documents.
3.1 Protocol Messages
There are only two document types in this protocol:
mix_descriptor: A mix descriptor describes a mix.
directory: A directory contains a list of descriptors
and other information that describe the mix network.
Mix descriptor and directory documents MUST be properly signed.
3.1.1 Mix Descriptor and Directory Signing
Mixes MUST compose mix descriptors which are signed using their
private identity key, an ed25519 key. Directories are signed by one or
more Directory Authority servers using their authority key, also an
ed25519 key. In all cases, signing is done using JWS RFC7515.
3.2 Vote Exchange
As described in section “2.1 PKI Protocol Schedule”, the Directory
Authority servers begin the voting process 1/8 of an epoch period after
the start of a new epoch. The Authorities exchange vote directory
messages with one another.
Authorities archive votes from other authorities and make them
available for retrieval. Upon receiving a new vote, the authority
examines it for new descriptors and includes any valid descriptors in
its view of the network.
Each Authority includes in its vote a hashed value committing to a
choice of a random number for the vote. See section 4.3 for more
details.
3.2.1 Voting Wire Protocol Commands
The Katzenpost Wire Protocol as described in KATZMIXWIRE
is used by Authorities to exchange votes. We define additional wire
protocol commands for sending votes:
enum {
vote(22),
vote_status(23),
} Command;
The structures of these commands are defined as follows:
3.2.2 The vote Command
The vote command is used to send a PKI document to a peer Authority
during the voting period of the PKI schedule.
The payload field contains the signed and serialized PKI document
representing the sending Authority’s vote. The public_key field contains
the public identity key of the sending Authority which the receiving
Authority can use to verify the signature of the payload. The
epoch_number field is used by the receiving party to quickly check the
epoch for the vote before deserializing the payload.
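To make the role of the three fields concrete, the following Go sketch assumes one possible layout: an 8-byte big-endian epoch_number, a 32-byte Ed25519 public_key, and the payload as the remainder of the command. The field widths, byte order, and all names here are illustrative assumptions, not the normative encoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// publicKeyLen is the size of an Ed25519 public key in bytes.
const publicKeyLen = 32

// Vote mirrors the three fields described above. The layout is an assumption.
type Vote struct {
	EpochNumber uint64
	PublicKey   [publicKeyLen]byte
	Payload     []byte // signed and serialized PKI document
}

// Marshal encodes the command as epoch_number | public_key | payload.
func (v *Vote) Marshal() []byte {
	out := make([]byte, 8+publicKeyLen+len(v.Payload))
	binary.BigEndian.PutUint64(out[:8], v.EpochNumber)
	copy(out[8:], v.PublicKey[:])
	copy(out[8+publicKeyLen:], v.Payload)
	return out
}

// PeekEpoch reads only the epoch number, so that a receiver can reject a
// vote for the wrong epoch before deserializing the payload.
func PeekEpoch(b []byte) (uint64, error) {
	if len(b) < 8+publicKeyLen {
		return 0, errors.New("vote: truncated command")
	}
	return binary.BigEndian.Uint64(b[:8]), nil
}

func main() {
	v := Vote{EpochNumber: 7, Payload: []byte("signed vote document")}
	epoch, err := PeekEpoch(v.Marshal())
	fmt.Println(epoch, err)
}
```

Placing the epoch number ahead of the payload is what allows the cheap check: the receiver never has to parse or verify a document that belongs to a different voting round.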
Each authority MUST include its commit value for the shared random
computation in this phase along with its signed vote. This computation
is derived from the Tor Shared Random Subsystem, TORSRV.
3.2.3 The vote_status Command
The vote_status command is used to reply to a vote command. The
error_code field indicates if there was a failure in the receiving of
the PKI document.
enum {
vote_ok(0), /* None error condition. */
vote_too_early(1), /* The Authority should try again later. */
vote_too_late(2), /* This round of voting was missed. */
}
The epoch_number field of the vote struct is compared with the epoch
that is currently being voted on. vote_too_early and vote_too_late are
replied back to the voter to report that their vote was not
accepted.
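The epoch comparison can be sketched in Go as follows. The numeric values match the error codes listed above; the function name and the mapping of "greater than" to vote_too_early and "less than" to vote_too_late are assumptions made for illustration.

```go
package main

import "fmt"

// VoteStatus carries the error_code of a vote_status reply.
type VoteStatus uint8

const (
	VoteOK       VoteStatus = 0 // vote_ok
	VoteTooEarly VoteStatus = 1 // vote_too_early
	VoteTooLate  VoteStatus = 2 // vote_too_late
)

// CheckVoteEpoch compares the epoch_number of a received vote with the
// epoch currently being voted on and picks the reply status.
func CheckVoteEpoch(voteEpoch, votingEpoch uint64) VoteStatus {
	switch {
	case voteEpoch > votingEpoch:
		return VoteTooEarly // the sender is ahead of this round
	case voteEpoch < votingEpoch:
		return VoteTooLate // the sender missed this round
	default:
		return VoteOK
	}
}

func main() {
	fmt.Println(CheckVoteEpoch(11, 10), CheckVoteEpoch(10, 10), CheckVoteEpoch(9, 10))
}
```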
3.3 Reveal Exchange
As described in section “2.1 PKI Protocol Schedule”, the Directory
Authority servers exchange the reveal values after they have exchanged
votes which contain a commit value. The Authorities exchange reveal
messages with one another.
3.3.1 Reveal Wire Protocol Commands
The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by Authorities to exchange
reveal values previously committed to in their votes. We define
additional wire protocol commands for exchanging reveals:
3.3.2 The reveal Command
The reveal command is used to send a reveal value to a peer authority
during the reveal period of the PKI schedule.
The payload field contains the signed and serialized reveal value.
The public_key field contains the public identity key of the sending
Authority which the receiving Authority can use to verify the signature
of the payload. The epoch_number field is used by the receiving party to
quickly check the epoch for the reveal before deserializing the
payload.
3.3.3 The reveal_status Command
The reveal_status command is used to reply to a reveal command. The
error_code field indicates if there was a failure in the receiving of
the shared random reveal value.
enum {
reveal_ok(8), /* None error condition. */
reveal_too_early(9), /* The Authority should try again later. */
reveal_not_authorized(10), /* The Authority was rejected. */
reveal_already_received(11), /* The Authority has already revealed this round. */
reveal_too_late(12) /* This round of revealing was missed. */
} Errorcodes;
The epoch_number field of the reveal struct is compared with the
epoch that is currently being voted on. reveal_too_early and
reveal_too_late are replied back to the authority to report that its reveal
was not accepted. The status code reveal_not_authorized is used if the
Authority is rejected. The status code reveal_already_received is used to
report that a valid reveal command was already received for this round.
3.4 Cert Exchange
The Cert command is the same as a Vote but contains the set of Reveal
values as seen by the voting peer. In order to ensure that a
misconfigured or malicious Authority operator cannot amplify their
ability to influence the threshold voting process, after Reveal messages
have been exchanged, Authorities vote again, including the Reveals seen
by them. Authorities may not introduce new MixDescriptors at this phase
in the protocol.
Otherwise, a consensus partition can be obtained by withholding Reveal
values from a threshold number of Peers. In the case of an even number
of Authorities, a denial of service by a single Authority was
observed.
3.5 Vote Tabulation for Consensus Computation
The main design constraint of the vote tabulation algorithm is that
it MUST be a deterministic process that produces the same result for
each directory authority server. This result is known as a network
consensus file.
A network consensus file is a well-formed directory struct whose
status field is set to consensus and which contains 0 or more
descriptors. The mix directory is signed by 0 or more
directory authority servers. If signed by the full voting group then
this is called a fully signed consensus.
1. Validate each vote directory:
   - the liveness fields correspond to the following epoch
   - status is vote
   - version number matches ours
2. Compute a consensus directory. Here we include a modified section
   from the Mixminion PKI spec MIXMINIONDIRAUTH:
   a. For each distinct mix identity in any vote directory:
      - If there are multiple nicknames for a given identity, do not
        include any descriptors for that identity.
      - If half or fewer of the votes include the identity, do not
        include any descriptors for the identity. This also guarantees
        that there will be only one identity per nickname.
   b. If we are including the identity, then for each distinct
      descriptor that appears in any vote directory:
      - Do not include the descriptor if it will have expired on the
        date the directory will be published.
      - Do not include the descriptor if it is superseded by other
        descriptors for this identity.
      - Do not include the descriptor if it is not valid in the next
        epoch.
      - Otherwise, include the descriptor.
   c. Sort the list of descriptors by the signature field so that
      creation of the consensus is reproducible.
   d. Set the directory status field to consensus.
3. Compute a shared random number from the values revealed in the
   “Reveal” step. Authorities whose reveal value does not verify their
   commit value MUST be excluded from the consensus round. Authorities
   ensure that their peers participate in Commit-and-Reveal, and MUST
   use correct Reveal values obtained from other Peers as part of the
   “Cert” exchange.
4. Generate or update the network topology using the shared random
   number as a seed to a deterministic random number generator that
   determines the order that new mixes are placed into the
   topology.
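The identity-level rules of the tabulation can be sketched in Go as shown below. This is a simplified illustration, not the actual tabulation code: it covers only the multiple-nickname rule, the strict-majority rule, and the sort by signature, and it omits the expiry, supersession, and validity checks. The Descriptor type and the Tabulate function are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Descriptor is a reduced stand-in for a mix descriptor.
type Descriptor struct {
	Identity  string // stand-in for the mix identity key
	Nickname  string
	Signature []byte
}

// Tabulate returns the distinct descriptors whose identity appears in more
// than half of the votes and has exactly one nickname, sorted by signature
// so that every authority produces the same list.
func Tabulate(votes [][]Descriptor) []Descriptor {
	voteCount := make(map[string]int)             // votes including each identity
	nicknames := make(map[string]map[string]bool) // nicknames seen per identity
	distinct := make(map[string]Descriptor)       // descriptors keyed by signature
	for _, vote := range votes {
		counted := make(map[string]bool)
		for _, d := range vote {
			if !counted[d.Identity] {
				counted[d.Identity] = true
				voteCount[d.Identity]++
			}
			if nicknames[d.Identity] == nil {
				nicknames[d.Identity] = make(map[string]bool)
			}
			nicknames[d.Identity][d.Nickname] = true
			distinct[string(d.Signature)] = d
		}
	}
	var out []Descriptor
	for _, d := range distinct {
		if len(nicknames[d.Identity]) > 1 {
			continue // multiple nicknames for one identity
		}
		if 2*voteCount[d.Identity] <= len(votes) {
			continue // half or fewer of the votes include the identity
		}
		out = append(out, d)
	}
	sort.Slice(out, func(i, j int) bool {
		return bytes.Compare(out[i].Signature, out[j].Signature) < 0
	})
	return out
}

func main() {
	a := Descriptor{Identity: "A", Nickname: "alpha", Signature: []byte{2}}
	b := Descriptor{Identity: "B", Nickname: "beta", Signature: []byte{1}}
	for _, d := range Tabulate([][]Descriptor{{a, b}, {a}, {a}}) {
		fmt.Println(d.Identity, d.Nickname)
	}
}
```

The final sort is what makes the result independent of the order in which votes arrive or maps are iterated, which is the determinism requirement stated at the start of this section.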
3.6 Signature Collection
Each Authority signs its view of the consensus and exchanges detached
Signatures with its peers. Each received Signature is added to
the signatures on the Consensus if it validates the Consensus. The
Authority SHOULD warn the administrator if a network partition is
detected.
If there is disagreement about the consensus directory, each
authority collects signatures from only the servers which it agrees with
about the final consensus.
TODO: consider exchanging peers’ votes amongst authorities (or
hashes thereof) to ensure that an authority has distributed one and
only one unique vote amongst its peers.
3.7 Publication
If the consensus is signed by a majority of members of the voting
group then it's a valid consensus and it is published.
4. PKI Protocol Data Structures
4.1 Mix Descriptor Format
Note that there is no signature field. This is because mix
descriptors are serialized and signed using JWS. The
IdentityKey field is a public ed25519 key. The
MixKeys field is a map from epoch to public X25519 keys
which is what the Sphinx packet format uses.
Note
XXX David: replace the following example with a JWS example:
After the votes are collected from the voting round, and before
signature exchange, the Shared Random Value field of the consensus
document is the output of H over the input string calculated as
follows:
Validated Reveal commands received, including the authority’s own
reveal, are sorted by reveal value in ascending order and appended to the
input in the format IdentityPublicKeyBytes_n | RevealValue_n.
However, instead of the Identity Public Key bytes, the Reveal is encoded
with the BLAKE2b 256-bit hash of the public key bytes.
If a SharedRandomValue for the previous epoch exists, it is appended
to the input string; otherwise 32 NUL (0x00) bytes are used.
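The construction of the input string can be sketched in Go as follows. To stay within the standard library, SHA-256 is swapped in both for H and for the BLAKE2b 256-bit hash of the public key; this substitution, the Reveal type, and the SharedRandom function are assumptions made for illustration only, so the output will not match a real deployment. Any input that precedes the reveals is not covered.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// Reveal pairs an authority's identity public key with its reveal value.
type Reveal struct {
	PublicKey []byte
	Value     []byte
}

// SharedRandom sorts the reveals by value, appends hash(public key) | value
// for each, appends the previous shared random value (or 32 zero bytes if
// there is none), and hashes the result.
// Assumption: SHA-256 stands in for both H and BLAKE2b-256.
func SharedRandom(reveals []Reveal, previous []byte) [32]byte {
	sorted := append([]Reveal(nil), reveals...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i].Value, sorted[j].Value) < 0
	})
	var input []byte
	for _, r := range sorted {
		id := sha256.Sum256(r.PublicKey)
		input = append(input, id[:]...)
		input = append(input, r.Value...)
	}
	if len(previous) == 0 {
		previous = make([]byte, 32) // 32 NUL bytes when no previous value exists
	}
	input = append(input, previous...)
	return sha256.Sum256(input)
}

func main() {
	reveals := []Reveal{
		{PublicKey: []byte("authority-1"), Value: []byte{0x02}},
		{PublicKey: []byte("authority-2"), Value: []byte{0x01}},
	}
	fmt.Printf("%x\n", SharedRandom(reveals, nil))
}
```

Sorting by reveal value before hashing is what lets every authority arrive at the same shared random value regardless of the order in which the reveals were received.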
The Katzenpost Wire Protocol as described in KATZMIXWIRE is used by both clients and by
Directory Authority peers. In the following section we describe
additional wire protocol commands for publishing mix descriptors, voting
and consensus retrieval.
5.1 Mix Descriptor publication
The following commands are used for publishing mix descriptors and
setting mix descriptor status:
The vote command is used to send a PKI document to a
peer Authority during the voting period of the PKI schedule.
The payload field contains the signed and serialized PKI document
representing the sending Authority’s vote. The public_key field contains
the public identity key of the sending Authority which the receiving
Authority can use to verify the signature of the payload. The
epoch_number field is used by the receiving party to quickly check the
epoch for the vote before deserializing the payload.
5.2.2 The vote_status Command
The vote_status command is used to reply to a vote
command. The error_code field indicates if there was a failure in the
receiving of the PKI document.
enum {
vote_ok(0), /* None error condition. */
vote_too_early(1), /* The Authority should try again later. */
vote_too_late(2), /* This round of voting was missed. */
vote_not_authorized(3), /* The voter's key is not authorized. */
vote_not_signed(4), /* The vote signature verification failed */
vote_malformed(5), /* The vote payload was invalid */
vote_already_received(6), /* The vote was already received */
vote_not_found(7), /* The vote was not found */
}
The epoch_number field of the vote struct is compared with the epoch
that is currently being voted on. vote_too_early and vote_too_late are
replied back to the voter to report that their vote was not
accepted.
5.2.3 The get_vote Command
The get_vote command is used to request a PKI document
(vote) from a peer Authority. The epoch field contains the epoch from
which to request the vote, and the public_key field contains the public
identity key of the Authority of the requested vote. A successful query
is responded to with a vote command, and queries that fail are responded
to with a vote_status command with error_code vote_not_found(7).
5.3 Retrieval of Consensus
Providers in the Katzenpost mix network system KATZMIXNET may cache validated network consensus
files and serve them to clients over the mix network's link layer wire
protocol KATZMIXWIRE. We define additional
wire protocol commands for requesting and sending PKI consensus
documents:
enum {
/* Extending the wire protocol Commands. */
get_consensus(18),
consensus(19),
} Command;
The structures of these commands are defined as follows:
5.3.1 The get_consensus Command
The get_consensus command is a command that is used to retrieve a
recent consensus document. If a given get_consensus command contains an
Epoch value that is either too big or too small then a reply consensus
command is sent with an empty payload. Otherwise if the consensus
request is valid then a consensus command containing a recent consensus
document is sent in reply.
Initiators MUST terminate the session immediately upon reception of a
get_consensus command.
5.3.2 The consensus Command
The consensus command is a command that is used to send a recent
consensus document. The error_code field indicates if there was a
failure in retrieval of the PKI consensus document.
enum {
consensus_ok(0), /* None error condition and SHOULD be accompanied with
a valid consensus payload. */
consensus_not_found(1), /* The client should try again later. */
consensus_gone(2), /* The consensus will not be available in the future. */
} ErrorCodes;
5.4.1 The Cert Command
The cert command is used to send a PKI document to a
peer Authority during the voting period of the PKI schedule. It is the
same as the vote command, but must contain the set of
SharedRandomCommit and SharedRandomReveal values as seen by the
Authority during the voting process.
5.4.2 The CertStatus Command
The cert_status command is the response to a
cert command, and is the same as a vote_status
response, other than the command identifier. Responses are CertOK,
CertTooEarly, CertNotAuthorized, CertNotSigned, CertAlreadyReceived,
and CertTooLate.
5.5 Signature Exchange
Signature exchange is the final round of the consensus protocol and
consists of the Sig and SigStatus commands.
5.5.1 The Sig Command
The sig command contains a detached Signature from
PublicKey of Consensus for Epoch.
5.5.2 The SigStatus Command
The sig_status command is the response to a
sig command. Responses are SigOK, SigNotAuthorized,
SigNotSigned, SigTooEarly, SigTooLate, SigAlreadyReceived, and
SigInvalid.
6. Scalability Considerations
TODO: notes on scaling, bandwidth usage etc.
7. Future Work
byzantine fault tolerance
PQ crypto signatures for all PKI documents: mix descriptors and
directories. SPHINCS256 could be used, we
already have a golang implementation:
https://github.com/Yawning/sphincs256/
Make a Bandwidth Authority system to measure health of the network.
Also perform load balancing as described in PEERFLOW?
Implement byzantine attack defenses as described in MIRANDA and MIXRELIABLE
where mix link performance proofs are recorded and used in a reputation
system.
Choose a different serialization/schema language?
Use an append-only Merkle tree instead of this voting protocol.
8. Anonymity Considerations
This system is intentionally designed to provide identical network
consensus documents to each mix client. This mitigates epistemic attacks
against the client path selection algorithm such as fingerprinting and
bridge attacks FINGERPRINTING, BRIDGING.
If consensus has failed and thus there is more than one consensus
file, clients MUST NOT use this compromised consensus and MUST refuse
to run.
We try to avoid randomizing the topology because doing so splits the
anonymity sets on each mix into two. That is, packets belonging to the
previous topology versus the current topology are trivially
distinguishable. On the other hand, if enough mixes fall out of consensus,
eventually the mixnet will need to be rebalanced to avoid
attacker-compromised path selection. One example of this would be the case where
the adversary controls the only mix in one layer of the network
topology.
9. Security Considerations
The Directory Authority / PKI system for a given mix network is
essentially the root of all authority in the system. The PKI controls
the contents of the network consensus documents that mix clients
download and use to inform their path selection. Therefore if the PKI as
a whole becomes compromised then so will the rest of the system in terms
of providing the main security properties described as traffic analysis
resistance. Therefore a decentralized voting protocol is used so that
the system is more resilient when attacked, in accordance with the
principle of least authority SECNOTSEP.
Short epoch durations make it more practical to make corrections
to network state using the PKI voting rounds.
Fewer epoch keys published in advance is a more conservative
security policy because it implies reduced exposure to key compromise
attacks.
A misbehaving Directory Authority that lies on each vote and votes
inconsistently can trivially cause a denial of service for each voting
round.
10. Acknowledgements
We would like to thank Nick Mathewson for answering design questions
and thorough design review.
Appendix A. References
Appendix A.1 Normative References
Appendix A.2 Informative References
Appendix B. Citing This Document
Appendix B.1 Bibtex Entry
Note that the following bibtex entry is in the IEEEtran bibtex style
as described in a document called “How to Use the IEEEtran BIBTEX
Style”.
@online{KatzMixPKI,
title = {Katzenpost Mix Network Public Key Infrastructure Specification},
author = {Yawning Angel and Ania Piotrowska and David Stainton},
url= {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/pki.rst},
year = {2017}
}
BRIDGING
Danezis, G., Syverson, P., “Bridging and Fingerprinting: Epistemic
Attacks on Route Selection”, In the Proceedings of PETS 2008, Leuven,
Belgium, July 2008,
https://www.freehaven.net/anonbib/cache/danezis-pet2008.pdf
FINGERPRINTING
Danezis, G., Clayton, R., “Route Fingerprinting in Anonymous
Communications”, https://www.cl.cam.ac.uk/~rnc1/anonroute.pdf
KATZMIXE2E
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
“Katzenpost Mix Network End-to-end Protocol Specification”, July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/old/end_to_end.md
KATZMIXNET
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
“Katzenpost Mix Network Specification”, June 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/mixnet.md
KATZMIXWIRE
Angel, Y. “Katzenpost Mix Network Wire Protocol Specification”, June
2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/wire-protocol.md
LOCALVIEW
Gogolewski, M., Klonowski, M., Kutylowski, M., “Local View Attack on
Anonymous Communication”,
https://www.freehaven.net/anonbib/cache/esorics05-Klonowski.pdf
MIRANDA
Leibowitz, H., Piotrowska, A., Danezis, G., Herzberg, A., 2017, “No
right to remain silent: Isolating Malicious Mixes”
https://eprint.iacr.org/2017/1000.pdf
MIXMINIONDIRAUTH
Danezis, G., Dingledine, R., Mathewson, N., “Type III (Mixminion) Mix
Directory Specification”, December 2005,
https://www.mixminion.net/dir-spec.txt
MIXRELIABLE
Dingledine, R., Freedman, M., Hopwood, D., Molnar, D., 2001 “A
Reputation System to Increase MIX-Net Reliability”, In Information
Hiding, 4th International Workshop
https://www.freehaven.net/anonbib/cache/mix-acc.pdf
PEERFLOW
Johnson, A., Jansen, R., Segal, A., Syverson, P., “PeerFlow: Secure
Load Balancing in Tor”, Proceedings on Privacy Enhancing Technologies,
July 2017,
https://petsymposium.org/2017/papers/issue2/paper12-2017-2-source.pdf
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997,
https://www.rfc-editor.org/info/rfc2119
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS)
Protocol Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008,
http://www.rfc-editor.org/info/rfc5246
RFC7515
Jones, M., Bradley, J., Sakimura, N., “JSON Web Signature (JWS)”, May
2015, https://tools.ietf.org/html/rfc7515
SECNOTSEP
Miller, M., Tulloh, B., Shapiro, J., “The Structure of Authority: Why
Security Is not a Separable Concern”,
http://www.erights.org/talks/no-sep/secnotsep.pdf
SPHINCS256
Bernstein, D., Hopwood, D., Hulsing, A., Lange, T., Niederhagen, R.,
Papachristodoulou, L., Schwabe, P., Wilcox-O’Hearn, Z., “SPHINCS:
practical stateless hash-based signatures”,
http://sphincs.cr.yp.to/sphincs-20141001.pdf
SPHINX09
Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix
Format”, DOI 10.1109/SP.2009.15, May 2009,
http://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf
SPHINXSPEC
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
“Sphinx Mix Network Cryptographic Packet Format Specification” July
2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx.md
TORDIRAUTH
“Tor directory protocol, version 3”,
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
TORSRV
“Tor Shared Random Subsystem Specification”,
https://gitweb.torproject.org/torspec.git/tree/srv-spec.txt
2.9 - Sphinx Specification
Abstract
This document defines the Sphinx cryptographic packet format for
decryption mix networks, and provides a parameterization based around
generic cryptographic primitives types. This document does not introduce
any new crypto, but is meant to serve as an implementation guide.
1. Introduction
The Sphinx cryptographic packet format is a compact and provably
secure design introduced by George Danezis and Ian Goldberg SPHINX09. It supports a full set of security
features: indistinguishable replies, hiding the path length and relay
position, detection of tagging attacks and replay attacks, as well as
providing unlinkability for each leg of the packet’s journey over the
network.
1.1 Terminology
Message - A variable-length sequence of octets sent
anonymously through the network.
Packet - A fixed-length sequence of octets transmitted
anonymously through the network, containing the encrypted message and
metadata for routing.
Header - The packet header consisting of several
components, which convey the information necessary to verify packet
integrity and correctly process the packet.
Payload - The fixed-length portion of a packet
containing an encrypted message or part of a message, to be delivered
anonymously.
Group - A finite set of elements and a binary operation
that satisfy the properties of closure, associativity, invertibility,
and the presence of an identity element.
Group element - An individual element of the
group.
Group generator - A group element capable of generating
any other element of the group, via repeated applications of the
generator and the group operation.
1.2 Conventions Used in This Document
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC2119.
The “C” style Presentation Language as described in RFC5246 Section 4 is used to represent data
structures, except for cryptographic attributes, which are specified as
opaque byte vectors.
x | y denotes the concatenation of x and y.
x ^ y denotes the bitwise XOR of x and y.
byte denotes an 8-bit octet.
x[a:b] denotes the sub-vector of x where a/b denote the
start/end byte indexes (inclusive-exclusive); a/b may be omitted to
signify the start/end of the vector x respectively.
x[y] denotes the y'th element of list x.
x.len denotes the length of list x.
ZEROBYTES(N) denotes N bytes of 0x00.
RNG(N) denotes N bytes of cryptographic random
data.
LEN(N) denotes the length in bytes of N.
CONSTANT_TIME_CMP(x, y) denotes a constant time
comparison between the byte vectors x and y, returning true iff x and y
are equal.
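Several of these notational helpers map directly onto the Go standard library. The sketch below shows one possible mapping; the Go function names are hypothetical, and only crypto/rand and crypto/subtle are relied upon.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"fmt"
)

// ZeroBytes corresponds to ZEROBYTES(N).
func ZeroBytes(n int) []byte { return make([]byte, n) }

// RNG corresponds to RNG(N): n bytes of cryptographic random data.
func RNG(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

// ConstantTimeCmp corresponds to CONSTANT_TIME_CMP(x, y). Note that
// subtle.ConstantTimeCompare returns early when the lengths differ, so only
// the contents of equal-length vectors are compared in constant time.
func ConstantTimeCmp(x, y []byte) bool {
	return subtle.ConstantTimeCompare(x, y) == 1
}

// Xor corresponds to x ^ y for byte vectors of equal length.
func Xor(x, y []byte) []byte {
	if len(x) != len(y) {
		panic("xor: length mismatch")
	}
	out := make([]byte, len(x))
	for i := range x {
		out[i] = x[i] ^ y[i]
	}
	return out
}

// Concat corresponds to x | y.
func Concat(x, y []byte) []byte {
	out := make([]byte, 0, len(x)+len(y))
	out = append(out, x...)
	return append(out, y...)
}

func main() {
	x, err := RNG(4)
	if err != nil {
		panic(err)
	}
	fmt.Println(ConstantTimeCmp(Xor(x, x), ZeroBytes(4)))
	fmt.Println(len(Concat(x, x)[2:6]))
}
```

The slice expression in the last line follows the same inclusive-exclusive convention as x[a:b] above.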
2. Cryptographic Primitives
This specification uses the following cryptographic primitives as the
foundational building blocks for Sphinx:
H(M) - A cryptographic hash function which takes an
octet array M to produce a digest consisting of a
HASH_LENGTH byte octet array. H(M) MUST be
pre-image and collision resistant.
MAC(K, M) - A cryptographic message authentication
code function which takes a M_KEY_LENGTH byte octet array
key K and arbitrary length octet array message
M to produce an authentication tag consisting of a
MAC_LENGTH byte octet array.
KDF(SALT, IKM) - A key derivation function which
takes an arbitrary length octet array salt SALT and an
arbitrary length octet array initial key IKM, to produce an
octet array of arbitrary length.
S(K, IV) - A pseudo-random generator (stream cipher)
which takes a S_KEY_LENGTH byte octet array key
K and a S_IV_LENGTH byte octet array
initialization vector IV to produce an octet array key
stream of arbitrary length.
SPRP_Encrypt(K, M)/SPRP_Decrypt(K, M) - A strong
pseudo-random permutation (SPRP) which takes a
SPRP_KEY_LENGTH byte octet array key K and
arbitrary length message M, and produces the encrypted
ciphertext or decrypted plaintext respectively.
When used with the default payload authentication mechanism, the SPRP
MUST be "fragile" in that any modification to M
results in a large number of unpredictable changes across the whole
message upon a SPRP_Encrypt() or
SPRP_Decrypt() operation.
EXP(X, Y) - An exponentiation function which takes
the GROUP_ELEMENT_LENGTH byte octet array group elements
X and Y, and returns X ^^ Y as a
GROUP_ELEMENT_LENGTH byte octet array.
Let G denote the generator of the group, and
EXP_KEYGEN() return a GROUP_ELEMENT_LENGTH
byte octet array group element usable as private key.
The group defined by G and EXP(X, Y) MUST
be one in which the Decision Diffie-Hellman problem is hard.
EXP_KEYGEN() - Returns a new "suitable" private key
for EXP().
2.1 Sphinx Key Derivation Function
Sphinx Packet creation and processing uses a common Key Derivation
Function (KDF) to derive the required MAC and symmetric cryptographic
keys from a per-hop shared secret.
The output of the KDF is partitioned according to the following
structure:
The Sphinx Packet Format is parameterized by the implementation based
on the application and security requirements.
AD_LENGTH - The constant amount of per-packet
unencrypted additional data in bytes.
PAYLOAD_TAG_LENGTH - The length of the message payload
authentication tag in bytes. This SHOULD be set to at least 16 bytes
(128 bits).
PER_HOP_RI_LENGTH - The length of the per-hop Routing
Information (Section 4.1.1) in bytes.
NODE_ID_LENGTH - The node identifier length in
bytes.
RECIPIENT_ID_LENGTH - The recipient identifier length
in bytes.
SURB_ID_LENGTH - The Single Use Reply Block
(Section 7) identifier length in bytes.
MAX_HOPS - The maximum number of hops a packet can
traverse.
PAYLOAD_LENGTH - The per-packet message payload length
in bytes, including a PAYLOAD_TAG_LENGTH byte
authentication tag.
KDF_INFO - A constant opaque byte vector used as the
info parameter to the KDF for the purpose of domain separation.
3.2 Sphinx Packet Geometry
The Sphinx Packet Geometry is derived from the Sphinx Parameter
Constants (Section 3.1). These are all derived parameters,
and are primarily of interest to implementors.
ROUTING_INFO_LENGTH - The total length of the "routing
information" Sphinx Packet Header component in bytes:
header - The packet header consists of several
components, which convey the information necessary to verify packet
integrity and correctly process the packet.
payload - The application message data.
4.1 Sphinx Packet Header
The Sphinx Packet Header refers to the block of data immediately
preceding the Sphinx Packet Payload in a Sphinx Packet.
The structure of the Sphinx Packet Header is defined as follows:
additional_data - Unencrypted per-packet Additional
Data (AD) that is visible to every hop. The AD is authenticated on a
per-hop basis.
As the additional_data is sent in the clear and traverses the network
unaltered, implementations MUST take care to ensure that the field
cannot be used to track individual packets.
group_element - An element of the cyclic group, used
to derive the per-hop key material required to authenticate and process
the rest of the SphinxHeader and decrypt a single layer of the Sphinx
Packet Payload encryption.
routing_information - A vector of per-hop routing
information, encrypted and authenticated in a nested manner. Each
element of the vector consists of a series of routing commands,
specifying all of the information required to process the packet.
The precise encoding format is specified in
Section 4.1.1.
MAC - A message authentication code tag covering the
additional_data, group_element, and routing_information.
4.1.1 Per-hop routing information
The routing_information component of the Sphinx Packet Header
contains a vector of per-hop routing information. When processing a
packet, the per hop processing is set up such that the first element in
the vector contains the routing commands for the current hop.
The structure of the routing information is as follows:
While the NullCommand padding field is specified as
opaque, implementations SHOULD zero fill the padding. The choice of
0x00 as the terminal NullCommand is deliberate to ease
implementation, as ZEROBYTES(N) produces a valid
NullCommand RoutingCommand, resulting in “appending zero filled padding”
producing valid output.
Implementations MUST pad the routing_commands vector so that it is
exactly PER_HOP_RI_LENGTH bytes, by appending a terminal
NullCommand if necessary.
Every non-terminal hop’s routing_commands MUST include a
NextNodeHopCommand.
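The padding requirement above can be illustrated with a short Go sketch. It assumes the routing commands have already been serialized into a byte slice and that the per-hop length is passed in as a parameter; the function name is hypothetical. Because the terminal NullCommand is 0x00 followed by zero-filled padding, zero-extending the slice is sufficient.

```go
package main

import (
	"errors"
	"fmt"
)

// PadRoutingCommands zero-extends the serialized routing commands to exactly
// perHopRILength bytes. The appended zero bytes form a terminal NullCommand
// with zero-filled padding.
func PadRoutingCommands(cmds []byte, perHopRILength int) ([]byte, error) {
	if len(cmds) > perHopRILength {
		return nil, errors.New("routing commands exceed PER_HOP_RI_LENGTH")
	}
	out := make([]byte, perHopRILength) // make() zero-fills the slice
	copy(out, cmds)
	return out, nil
}

func main() {
	padded, err := PadRoutingCommands([]byte{0x01, 0xaa, 0xbb}, 8)
	fmt.Println(padded, err)
}
```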
4.2 Sphinx Packet Payload
The Sphinx Packet Payload refers to the block of data immediately
following the Sphinx Packet Header in a Sphinx Packet.
For most purposes the structure of the Sphinx Packet Payload can be
treated as a single contiguous byte vector of opaque data.
Upon packet creation, the payload is repeatedly encrypted (unless it
is a SURB Reply, see Section 7.0) via keys derived from the
Diffie-Hellman key exchange between the packet's
group_element and the public key of each node in the
path.
Authentication of packet integrity is done by prepending a tag set to
a known value to the plaintext prior to the first encrypt operation. By
virtue of the fragile nature of the SPRP function, any alteration to the
encrypted payload as it traverses the network will result in an
irrecoverably corrupted plaintext when the payload is decrypted by the
recipient.
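The tag handling on either side of the SPRP layers can be sketched in Go as follows. The sketch assumes that the known tag value is all zero bytes and that PAYLOAD_TAG_LENGTH is 16; both are illustrative assumptions, the function names are hypothetical, and the SPRP encryption itself is omitted.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// payloadTagLength is an assumed value for PAYLOAD_TAG_LENGTH.
const payloadTagLength = 16

// AddTag prepends the all-zero tag to the message before the first
// SPRP_Encrypt operation.
func AddTag(message []byte) []byte {
	return append(make([]byte, payloadTagLength), message...)
}

// CheckTag verifies the tag of a fully decrypted payload in constant time
// and returns the message that follows it.
func CheckTag(plaintext []byte) ([]byte, bool) {
	if len(plaintext) < payloadTagLength {
		return nil, false
	}
	expected := make([]byte, payloadTagLength)
	if subtle.ConstantTimeCompare(plaintext[:payloadTagLength], expected) != 1 {
		return nil, false
	}
	return plaintext[payloadTagLength:], true
}

func main() {
	payload := AddTag([]byte("hello"))
	msg, ok := CheckTag(payload)
	fmt.Println(string(msg), ok)
}
```

With a fragile SPRP, any change to the ciphertext in transit scrambles the whole plaintext, so the tag check fails except with negligible probability.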
5. Sphinx Packet Creation
For the sake of brevity, the pseudocode for all of the operations
will take a vector of the following PathHop structure as a parameter
named path[] to specify the path a packet will traverse, along with the
per-hop routing commands and per-hop public keys.
struct {
/* There is no need for a node_id here, as
routing_commands[0].next_hop specifies that
information for all non-terminal hops. */
opaque public_key[GROUP_ELEMENT_LENGTH];
RoutingCommand routing_commands<1...2^8-1>;
} PathHop;
It is assumed that each routing_commands vector except for the
terminal entry contains at least a RoutingCommand consisting of a
partially assembled NextNodeHopCommand with the next_hop
element filled in with the identifier of the next hop.
5.1 Create a Sphinx Packet
Header
Both the creation of a Sphinx Packet and the creation of a SURB
require the generation of a Sphinx Packet Header, so it is specified as
a distinct operation.
Inputs: additional_data The Additional Data that is visible to
every node along the path in the header.
path The vector of PathHop structures in hop order,
specifying the node id, public key, and routing commands for each
hop.
Outputs: sphinx_header The resulting Sphinx Packet
Header.
payload_keys The vector of SPRP keys used to encrypt
the Sphinx Packet Payload, in hop order.
The Sphinx_Create_Header operation consists of the
following steps:
Derive the key material for each hop.
num_hops = path.len
route_keys = [ ]
route_group_elements = [ ]
priv_key = EXP_KEYGEN()
/* Calculate the key material for the 0th hop. */
group_element = EXP( G, priv_key )
route_group_elements += group_element
shared_secret = EXP( path[0].public_key, priv_key )
route_keys += Sphinx_KDF( KDF_INFO, shared_secret )
blinding_factor = route_keys[0].blinding_factor
/* Calculate the key material for rest of the hops. */
for i = 1; i < num_hops; ++i:
    shared_secret = EXP( path[i].public_key, priv_key )
    for j = 0; j < i; ++j:
        shared_secret = EXP( shared_secret, route_keys[j].blinding_factor )
    route_keys += Sphinx_KDF( KDF_INFO, shared_secret )
    group_element = EXP( group_element, route_keys[i-1].blinding_factor )
    route_group_elements += group_element
At the conclusion of the derivation process:
route_keys - A vector of per-hop SphinxKeys.
route_group_elements - A vector of per-hop group
elements.
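The blinding logic in this step can be checked with a toy example. The Python sketch below is an illustration under loud assumptions: it replaces the real group with integers modulo a Mersenne prime (not secure), and it replaces the blinding_factor output of Sphinx_KDF with a SHA-256 based stand-in. It only demonstrates that each hop, seeing the blinded group element, recovers the same shared secret the sender derived.

```python
import hashlib
import secrets

# Toy multiplicative group modulo a Mersenne prime. NOT secure; illustration only.
P = 2**127 - 1
G = 3


def exp(base: int, scalar: int) -> int:
    return pow(base, scalar, P)


def blinding_factor(shared_secret: int) -> int:
    """Hypothetical stand-in for the blinding_factor output of Sphinx_KDF."""
    digest = hashlib.sha256(shared_secret.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % (P - 1) or 1


def derive_route(node_public_keys, priv_key):
    """Return (shared_secrets, group_elements) per hop, as the sender computes them."""
    group_element = exp(G, priv_key)
    shared_secrets, group_elements, blinds = [], [], []
    for i, public_key in enumerate(node_public_keys):
        if i > 0:
            # Blind the group element with the previous hop's blinding factor.
            group_element = exp(group_element, blinds[i - 1])
        shared = exp(public_key, priv_key)
        for b in blinds:  # blinds holds exactly i entries at this point
            shared = exp(shared, b)
        shared_secrets.append(shared)
        group_elements.append(group_element)
        blinds.append(blinding_factor(shared))
    return shared_secrets, group_elements


node_privs = [secrets.randbelow(P - 2) + 1 for _ in range(3)]
node_pubs = [exp(G, k) for k in node_privs]
sender_secrets, elements = derive_route(node_pubs, secrets.randbelow(P - 2) + 1)
# Each node raises the group element it receives to its own private key.
recovered = [exp(e, k) for e, k in zip(elements, node_privs)]
```

Here `recovered` equals `sender_secrets`, which is the property that lets every hop derive its keys from the single group element carried in the header.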
Derive the routing_information keystream and encrypted padding for
each hop.
ri_keystream = [ ]
ri_padding = [ ]
for i = 0; i < num_hops; ++i:
    keystream = ZEROBYTES( ROUTING_INFO_LENGTH + PER_HOP_RI_LENGTH ) ^
                  S( route_keys[i].header_encryption,
                     route_keys[i].header_encryption_iv )
    ks_len = LEN( keystream ) - (i + 1) * PER_HOP_RI_LENGTH
    padding = keystream[ks_len:]
    if i > 0:
        prev_pad_len = LEN( ri_padding[i-1] )
        padding = padding[:prev_pad_len] ^ ri_padding[i-1] |
            padding[prev_pad_len:]
    ri_keystream += keystream[:ks_len]
    ri_padding += padding
At the conclusion of the derivation process:
ri_keystream - A vector of per-hop routing_information
encryption keystreams.
ri_padding - The per-hop encrypted routing_information
padding.
Create the routing_information block.
/* Start with the terminal hop, and work backwards. */
i = num_hops - 1
/* Encode the terminal hop's routing commands. As the
terminal hop can never have a NextNodeHopCommand, there
are no per-hop alterations to be made. */
ri_fragment = path[i].routing_commands |
ZEROBYTES( PER_HOP_RI_LENGTH - LEN( path[i].routing_commands ) )
/* Encrypt and MAC. */
ri_fragment ^= ri_keystream[i]
mac = MAC( route_keys[i].header_mac, additional_data |
route_group_elements[i] | ri_fragment |
ri_padding[i-1] )
routing_info = ri_fragment
if num_hops < MAX_HOPS:
    pad_len = (MAX_HOPS - num_hops) * PER_HOP_RI_LENGTH
    routing_info = routing_info | RNG( pad_len )
/* Calculate the routing info for the rest of the hops. */
for i = num_hops - 2; i >= 0; --i:
    cmds_to_encode = [ ]
    /* Find and finalize the NextNodeHopCommand. */
    for j = 0; j < LEN( path[i].routing_commands ); j++:
        cmd = path[i].routing_commands[j]
        if cmd.command == next_node_hop:
            /* Finalize the NextNodeHopCommand. */
            cmd.MAC = mac
        cmds_to_encode = cmds_to_encode + cmd /* Append */
    /* Append a terminal NullCommand. */
    ri_fragment = cmds_to_encode |
        ZEROBYTES( PER_HOP_RI_LENGTH - LEN( cmds_to_encode ) )
    /* Encrypt and MAC */
    routing_info = ri_fragment | routing_info /* Prepend. */
    routing_info ^= ri_keystream[i]
    if i > 0:
        mac = MAC( route_keys[i].header_mac, additional_data |
                   route_group_elements[i] | routing_info |
                   ri_padding[i-1] )
    else:
        mac = MAC( route_keys[i].header_mac, additional_data |
                   route_group_elements[i] | routing_info )
At the conclusion of the derivation process:
routing_info - The completed routing_info block.
mac - The MAC for the 0th hop.
Assemble the completed Sphinx Packet Header and Sphinx Packet
Payload SPRP key vector.
/* Assemble the completed Sphinx Packet Header. */
SphinxHeader sphinx_header
sphinx_header.additional_data = additional_data
sphinx_header.group_element = route_group_elements[0] /* From step 1. */
sphinx_header.routing_info = routing_info /* From step 3. */
sphinx_header.mac = mac /* From step 3. */
/* Preserve the Sphinx Payload SPRP keys, to return to the
caller. */
payload_keys = [ ]
for i = 0; i < num_hops; ++i:
    payload_keys += route_keys[i].payload_encryption
At the conclusion of the assembly process:
sphinx_header - The completed sphinx_header, to be returned.
payload_keys - The vector of SPRP keys, to be returned.
Mix nodes process incoming packets first by performing the
Sphinx_Unwrap operation to authenticate and decrypt the
packet, and if applicable prepare the packet to be forwarded to the next
node.
If Sphinx_Unwrap returns an error for any given packet,
the packet MUST be discarded with no additional processing.
After a packet has been unwrapped successfully, a replay detection
tag is checked to ensure that the packet has not been seen before. If
the packet is a replay, the packet MUST be discarded with no additional
processing.
The routing commands for the current hop are interpreted and
executed, and finally the packet is forwarded to the next mix node over
the network or presented to the application if the current node is the
final recipient.
6.1 Sphinx_Unwrap Operation
The Sphinx_Unwrap operation is the majority of the
per-hop packet processing, handling authentication, decryption, and
modifying the packet prior to forwarding it to the next node.
Inputs:
private_routing_key A group element
GROUP_ELEMENT_LENGTH bytes in length, that serves as the unwrapping
Mix’s private key.
sphinx_packet A Sphinx packet to unwrap.
Outputs:
error Indicating an unsuccessful unwrap operation if
applicable.
sphinx_packet The resulting Sphinx packet.
routing_commands A vector of RoutingCommand, specifying
the post unwrap actions to be taken on the packet.
replay_tag A tag used to detect whether this packet was
processed before.
The Sphinx_Unwrap operation consists of the following
steps:
(Optional) Examine the Sphinx Packet Header’s Additional Data.
If the header’s additional_data element contains
information required to complete the unwrap operation, such as
specifying the packet format version or the cryptographic primitives
used, examine it now.
Implementations MUST NOT treat the information in the
additional_data element as trusted until after the
completion of Step 3 (“Validate the Sphinx Packet Header”).
Calculate the hop's shared secret, and replay_tag.
Derive the various keys required for packet processing.
keys = Sphinx_KDF( KDF_INFO, shared_secret )
Validate the Sphinx Packet Header.
derived_mac = MAC( keys.header_mac, hdr.additional_data |
hdr.group_element |
hdr.routing_information )
if !CONSTANT_TIME_CMP( derived_mac, hdr.MAC):
/* MUST abort processing if the header is invalid. */
return ErrorInvalidHeader
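The header validation step can be illustrated with a short Python sketch. The choice of HMAC-SHA256 as the MAC and the function names are assumptions made for this example; the pseudocode above only requires some MAC and a constant-time comparison.

```python
import hashlib
import hmac


def header_mac(key: bytes, additional_data: bytes, group_element: bytes,
               routing_information: bytes) -> bytes:
    # HMAC-SHA256 is an assumption for this sketch, not a statement about the deployed MAC.
    message = additional_data + group_element + routing_information
    return hmac.new(key, message, hashlib.sha256).digest()


def validate_header(key: bytes, additional_data: bytes, group_element: bytes,
                    routing_information: bytes, mac: bytes) -> bool:
    derived = header_mac(key, additional_data, group_element, routing_information)
    # compare_digest runs in time independent of where the inputs differ.
    return hmac.compare_digest(derived, mac)


example_key = b"\x01" * 32
example_mac = header_mac(example_key, b"ad", b"g" * 32, b"ri" * 10)
```

A caller would discard the packet whenever `validate_header` returns False, before acting on any field of the header.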
Extract the per-hop routing commands for the current hop.
/* Append padding to preserve length-invariance, as the routing
commands for the current hop will be removed. */
padding = ZEROBYTES( PER_HOP_RI_LENGTH )
B = hdr.routing_information | padding
/* Decrypt the entire routing_information block. */
B = B ^ S( keys.header_encryption, keys.header_encryption_iv )
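The length-invariance trick in this step can be sketched in Python as follows. The sketch assumes SHAKE-256 as a stand-in for the stream cipher S, and the constants are hypothetical values chosen for the example. It shows that after appending one hop's worth of zero padding and decrypting, removing the current hop's commands leaves a block of the original length.

```python
import hashlib

PER_HOP_RI_LENGTH = 32  # hypothetical value
MAX_HOPS = 5            # hypothetical value
ROUTING_INFO_LENGTH = PER_HOP_RI_LENGTH * MAX_HOPS


def stream(key: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in keystream generator; SHAKE-256 is an assumption of this sketch."""
    return hashlib.shake_256(key + iv).digest(length)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def peel_routing_info(routing_information: bytes, key: bytes, iv: bytes):
    """Return (cmd_buf, new_routing_information) for the current hop."""
    if len(routing_information) != ROUTING_INFO_LENGTH:
        raise ValueError("unexpected routing_information length")
    padded = routing_information + bytes(PER_HOP_RI_LENGTH)
    decrypted = xor(padded, stream(key, iv, len(padded)))
    return decrypted[:PER_HOP_RI_LENGTH], decrypted[PER_HOP_RI_LENGTH:]


cmd_buf, next_ri = peel_routing_info(bytes(range(ROUTING_INFO_LENGTH)), b"k" * 32, b"i" * 16)
```

The tail of `next_ri` is the keystream applied to zeros, which is the padding the sender accounted for when it computed the MAC for the next hop.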
Parse the per-hop routing commands.
cmd_buf = B[:PER_HOP_RI_LENGTH]
new_routing_information = B[PER_HOP_RI_LENGTH:]
next_mix_command_idx = -1
routing_commands = [ ]
for idx = 0; idx < PER_HOP_RI_LENGTH {
    /* WARNING: Bounds checking omitted for brevity. */
    cmd_type = B[idx]
    cmd = NULL
    switch cmd_type {
    case null: goto done /* No further commands. */
    case next_node_hop:
        cmd = RoutingCommand( B[idx:idx+1+LEN( NextNodeHopCommand )] )
        next_mix_command_idx = idx /* Save for step 7. */
        idx += 1 + LEN( NextNodeHopCommand )
        break
    case recipient:
        cmd = RoutingCommand( B[idx:idx+1+LEN( RecipientCommand )] )
        idx += 1 + LEN( RecipientCommand )
        break
    case surb_reply:
        cmd = RoutingCommand( B[idx:idx+1+LEN( SURBReplyCommand )] )
        idx += 1 + LEN( SURBReplyCommand )
        break
    default:
        /* MUST abort processing on unrecognized commands. */
        return ErrorInvalidCommand
    }
    routing_commands += cmd /* Append cmd to the tail of the list. */
}
done:
At the conclusion of the parsing step:
routing_commands - A vector of RoutingCommand, to
be applied at this hop.
new_routing_information - The routing_information block
to be sent to the next hop if any.
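The parsing step, with the bounds checks that the pseudocode omits, might look like the following Python sketch. Only the 0x00 value for the NullCommand comes from this specification; the other command type values and the body lengths in `BODY_LENGTH` are hypothetical placeholders chosen for the example.

```python
NULL, NEXT_NODE_HOP, RECIPIENT, SURB_REPLY = 0x00, 0x01, 0x02, 0x03
# Hypothetical body lengths in bytes, excluding the one-byte command type.
BODY_LENGTH = {NEXT_NODE_HOP: 32 + 16, RECIPIENT: 64, SURB_REPLY: 16}


class InvalidCommand(Exception):
    """Raised when the per-hop buffer cannot be parsed; the packet must be dropped."""


def parse_routing_commands(cmd_buf: bytes):
    """Return a list of (command_type, body) tuples for the current hop."""
    commands = []
    idx = 0
    while idx < len(cmd_buf):
        cmd_type = cmd_buf[idx]
        if cmd_type == NULL:
            break  # terminal NullCommand: the rest of the buffer is padding
        if cmd_type not in BODY_LENGTH:
            raise InvalidCommand(f"unrecognized command type {cmd_type:#04x}")
        end = idx + 1 + BODY_LENGTH[cmd_type]
        if end > len(cmd_buf):
            raise InvalidCommand("command runs past the per-hop buffer")
        commands.append((cmd_type, cmd_buf[idx + 1:end]))
        idx = end
    return commands


example_buf = bytes([RECIPIENT]) + b"r" * 64 + bytes([SURB_REPLY]) + b"s" * 16 + bytes(10)
parsed = parse_routing_commands(example_buf)
```

Driving the loop from a table of lengths keeps the bounds check in one place instead of repeating it per command type.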
Upon the completion of the Sphinx_Unwrap operation,
implementations MUST take several additional steps. As the exact
behavior is mostly implementation specific, pseudocode will not be
provided for most of the post processing steps.
Apply replay detection to the packet.
The replay_tag value returned by Sphinx_Unwrap MUST be
unique across all packets processed with a given
private_routing_key.
The exact specifics of how to detect replays are left up to the
implementation; however, any replays that are detected MUST be discarded
immediately.
Act on the routing commands, if any.
The exact specifics of how implementations choose to apply routing
commands are deliberately left unspecified; however, in general:
If there is a NextNodeHopCommand, the packet should
be forwarded to the next node based on the next_hop field
upon completion of the post processing.
The lack of a NextNodeHopCommand indicates that the packet is
destined for the current node.
If there is a SURBReplyCommand, the packet should be
treated as a SURBReply destined for the current node, and decrypted
accordingly (see Section 7.2).
If the implementation supports multiple recipients on a single
node, the RecipientCommand command should be used to
determine the correct recipient for the packet, and the payload
delivered as appropriate.
It is possible for both a RecipientCommand and a NextNodeHopCommand
to be present simultaneously in the routing commands for a given hop.
The behavior when this situation occurs is implementation
defined.
Authenticate the packet if required.
If the packet is destined for the current node, the integrity of the
payload MUST be authenticated.
The authentication is done as follows:
derived_tag = sphinx_packet.payload[:PAYLOAD_TAG_LENGTH]
expected_tag = ZEROBYTES( PAYLOAD_TAG_LENGTH )
if !CONSTANT_TIME_CMP( derived_tag, expected_tag ):
/* Discard the packet with no further processing. */
return ErrorInvalidPayload
Remove the authentication tag before presenting the payload to the
application.
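The tag check and removal can be sketched in Python as below. The value of `PAYLOAD_TAG_LENGTH` and the names used here are assumptions for illustration; the payload passed in is taken to be the fully decrypted Sphinx Packet Payload.

```python
import hmac

PAYLOAD_TAG_LENGTH = 16  # hypothetical value for this sketch


class InvalidPayload(Exception):
    """Raised when the payload tag does not match; the packet must be discarded."""


def authenticate_payload(payload: bytes) -> bytes:
    """Verify the all-zero tag in constant time and return the payload without it."""
    derived_tag = payload[:PAYLOAD_TAG_LENGTH]
    expected_tag = bytes(PAYLOAD_TAG_LENGTH)
    if len(payload) < PAYLOAD_TAG_LENGTH or not hmac.compare_digest(derived_tag, expected_tag):
        raise InvalidPayload("payload tag mismatch")
    return payload[PAYLOAD_TAG_LENGTH:]


message = authenticate_payload(bytes(PAYLOAD_TAG_LENGTH) + b"hello")
```

The check is only meaningful because the SPRP spreads any ciphertext modification across the whole plaintext, so a tampered payload is overwhelmingly unlikely to decrypt to an all-zero tag.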
A Single Use Reply Block (SURB) is a delivery token with a short
lifetime, that can be used by the recipient to reply to the initial
sender.
SURBs allow for anonymous replies, when the recipient does not know
the sender of the message. Usage of SURBs guarantees anonymity
properties but also makes the reply messages indistinguishable from
forward messages, both to external adversaries and to the mix
nodes.
When a SURB is created, a matching reply block Decryption Token is
created, which is used to decrypt the reply message that is produced and
delivered via the SURB.
The Sphinx SURB wire encoding is implementation defined, but for the
purposes of illustrating creation and use, the following will be
used:
Structurally a SURB consists of three parts, a pre-generated Sphinx
Packet Header, a node identifier for the first hop to use when using the
SURB to reply, and cryptographic keying material by which to encrypt the
reply’s payload. All elements must be securely transmitted to the
recipient, perhaps as part of a forward Sphinx Packet's Payload, but the
exact specifics of how to accomplish this are left up to the
implementation.
When creating a SURB, the terminal routing_commands vector SHOULD
include a SURBReplyCommand, containing an identifier to ensure that the
payload can be decrypted with the correct set of keys (Decryption
Token). The routing command is left optional, as it is conceivable that
implementations may choose to use trial decryption, and/or limit the
number of outstanding SURBs to solve this problem.
7.2 Decrypt a
Sphinx Reply Originating from a SURB
A Sphinx Reply packet that was generated using a SURB is externally
indistinguishable from a forward Sphinx Packet as it traverses the
network. However, the recipient of the reply has an additional
decryption step: the packet starts off unencrypted and accumulates
layers of Sphinx Packet Payload decryption as it traverses the
network.
Determining which decryption token to use when decrypting the SURB
reply can be done via the SURBReplyCommand’s id field, if one is
included at the time of the SURB’s creation.
Inputs:
decryption_token The vector of keys allowing a client
to decrypt the reply ciphertext payload. This decryption_token is
generated when the SURB is created.
payload The Sphinx Packet ciphertext payload.
Outputs:
error Indicating an unsuccessful unwrap operation if
applicable.
message The plaintext message.
The Sphinx_Decrypt_SURB_Reply operation consists of the following
steps:
Encrypt the message to reverse the decrypt operations the payload
acquired as it traversed the network.
for i = LEN( decryption_token ) - 1; i > 0; --i:
    payload = SPRP_Encrypt( decryption_token[i], payload )
The process for using a SURB to reply anonymously is slightly
different from the standard packet creation process, as the Sphinx
Packet Header is already generated (as part of the SURB), and there is
an additional layer of Sphinx Packet Payload encryption that must be
performed.
Depending on the mix topology, there is no hard requirement that the
per-hop routing info is padded to one fixed constant length.
For example, assuming a layered topology (referred to as stratified
topology in the literature) MIXTOPO10, where
the layer of any given mix node is public information, as long as the
following two invariants are maintained, there is no additional
information available to an adversary:
All packets entering any given mix node in a certain layer are
uniform in length.
All packets leaving any given mix node in a certain layer are
uniform in length.
The only information available to an external or internal observer is
the layer of any given mix node (via the packet length), which is
information they are assumed to have by default in such a design.
9.2 Additional Data Field
Considerations
The Sphinx Packet Construct is crafted such that any given packet is
bitwise unlinkable after a Sphinx_Unwrap operation, provided that the
optional Additional Data (AD) facility is not used. This property
ensures that external passive adversaries are unable to track a packet
based on content as it traverses the network. As the on-the-wire AD
field is static through the lifetime of a packet (ie: left unaltered by
the Sphinx_Unwrap operation), implementations and
applications that wish to use this facility MUST NOT transmit AD that
can be used to distinctly identify individual packets.
9.3 Forward Secrecy
Considerations
Each node acting as a mix MUST regenerate their asymmetric key pair
relatively frequently. Upon key rotation the old private key MUST be
securely destroyed. As each layer of a Sphinx Packet is encrypted via
key material derived from the output of an ephemeral/static
Diffie-Hellman key exchange, without the rotation, the construct does
not provide Perfect Forward Secrecy. Implementations SHOULD implement
defense-in-depth mitigations, for example by using strongly
forward-secure link protocols to convey Sphinx Packets between
nodes.
This frequent mix routing key rotation can limit SURB usage by
directly reducing the lifetime of SURBs. In order to have a strong
Forward Secrecy property while maintaining a higher SURB lifetime,
designs such as forward secure mixes SFMIX03
could be used.
9.4 Compulsion Threat
Considerations
Reply Blocks (SURBs), forward and reply Sphinx packets are all
vulnerable to the compulsion threat, if they are captured by an
adversary. The adversary can request iterative decryptions or keys from
a series of honest mixes in order to perform a deanonymizing trace of
the destination.
While a general solution to this class of attacks is beyond the scope
of this document, applications that seek to mitigate or resist
compulsion threats could implement the defenses proposed in COMPULS05 via a series of routing command
extensions.
9.5
SURB Usage Considerations for Volunteer Operated Mix Networks
Given a hypothetical scenario where Alice and Bob both wish to keep
their location on the mix network hidden from the other, and Alice has
somehow received a SURB from Bob, Alice MUST NOT utilize the SURB
directly, because in a volunteer operated mix network the first hop
specified by the SURB could be operated by Bob for the purpose of
deanonymizing Alice.
This problem could be solved via the incorporation of a “cross-over
point” such as that described in MIXMINION, for
example by having Alice delegate the transmission of a SURB Reply to a
randomly selected crossover point in the mix network, so that if the
first hop in the SURB’s return path is a malicious mix, the only
information gained is the identity of the cross-over point.
10. Security Considerations
10.1 Sphinx Payload
Encryption Considerations
The payload encryption’s use of a fragile (non-malleable) SPRP is
deliberate and implementations SHOULD NOT substitute it with a primitive
that does not provide such a property (such as a stream cipher based
PRF). In particular there is a class of correlation attacks (tagging
attacks) targeting anonymity systems that involve modification to the
ciphertext that are mitigated if alterations to the ciphertext result in
unpredictable corruption of the plaintext (avalanche effect).
Additionally, as the PAYLOAD_TAG_LENGTH based tag-then-encrypt
payload integrity authentication mechanism is predicated on the use of a
non-malleable SPRP, implementations that substitute a different
primitive MUST authenticate the payload using a different mechanism.
Alternatively, extending the MAC contained in the Sphinx Packet
Header to cover the Sphinx Packet Payload will both defend against
tagging attacks and authenticate payload integrity. However, such an
extension does not work with the SURB construct presented in this
specification, unless the SURB is only used to transmit payload that is
known to the creator of the SURB.
Appendix A. References
Appendix A.1 Normative
References
Appendix A.2 Informative
References
Appendix B. Citing This
Document
Appendix B.1 Bibtex Entry
Note that the following bibtex entry is in the IEEEtran bibtex style
as described in a document called “How to Use the IEEEtran BIBTEX
Style”.
@online{SphinxSpec,
title = {Sphinx Mix Network Cryptographic Packet Format Specification},
author = {Yawning Angel and George Danezis and Claudia Diaz and Ania Piotrowska and David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/master/docs/specs/sphinx.rst},
year = {2017}
}
COMPULS05
Danezis, G., Clulow, J., “Compulsion Resistant Anonymous
Communications”, Proceedings of Information Hiding Workshop, June 2005,
https://www.freehaven.net/anonbib/cache/ih05-danezisclulow.pdf
MIXMINION
Danezis, G., Dingledine, R., Mathewson, N., “Mixminion: Design of a
Type III Anonymous Remailer Protocol”,
https://www.mixminion.net/minion-design.pdf
MIXTOPO10
Diaz, C., Murdoch, S., Troncoso, C., “Impact of Network Topology on
Anonymity and Overhead in Low-Latency Anonymity Networks”, PETS, July
2010,
https://www.esat.kuleuven.be/cosic/publications/article-1230.pdf
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997,
http://www.rfc-editor.org/info/rfc2119
RFC5246
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS)
Protocol Version 1.2”, RFC 5246, DOI 10.17487/RFC5246, August 2008,
http://www.rfc-editor.org/info/rfc5246
SFMIX03
Danezis, G., “Forward Secure Mixes”, Proceedings of 7th Nordic
Workshop on Secure IT Systems, 2002,
https://www.freehaven.net/anonbib/cache/Dan:SFMix03.pdf
SPHINX09
Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix
Format”, DOI 10.1109/SP.2009.15, May 2009,
https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf
This document defines the replay detection for any protocol that uses
the Sphinx cryptographic packet format. This document is meant to serve as
an implementation guide and to document the existing replay protection for
deployed mix networks.
1. Introduction
The Sphinx cryptographic packet format is a compact and provably
secure design introduced by George Danezis and Ian Goldberg SPHINX09. Although it supports replay detection,
the exact mechanism of replay detection is described neither in SPHINX09 nor in our SPHINXSPEC. Therefore we describe in detail
how to efficiently detect Sphinx packet replay attacks.
1.1 Terminology
Epoch - A fixed time interval defined in section “4.2
Sphinx Mix and Provider Key Rotation” of KATZMIXNET.
Packet - A fixed-length sequence of bytes transmitted
through the network, containing the encrypted message and metadata for
routing.
Header - The packet header consisting of several
components, which convey the information necessary to verify packet
integrity and correctly process the packet.
Payload - The fixed-length portion of a packet
containing an encrypted message or part of a message, to be
delivered.
Group - A finite set of elements and a binary operation
that satisfy the properties of closure, associativity, invertibility,
and the presence of an identity element.
Group element - An individual element of the
group.
Group generator - A group element capable of generating
any other element of the group, via repeated applications of the
generator and the group operation.
SEDA - Staged Event Driven Architecture (see SEDA). 1. A highly parallelizable computation model. 2. A
computational pipeline composed of multiple stages connected by queues
utilizing active queue management algorithms that can evict items from
the queue based on dwell time or other criteria where each stage is a
thread pool. 3. The only correct way to efficiently implement a software
based router on general purpose computing hardware.
1.2 Conventions Used in This
Document
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC2119.
2. Sphinx Cryptographic
Primitives
This specification borrows the following cryptographic primitives
and constants from our SPHINXSPEC:
H(M) - A cryptographic hash function which takes a
byte array M to produce a digest consisting of a
HASH_LENGTH byte array. H(M) MUST be pre-image
and collision resistant.
EXP(X, Y) - An exponentiation function which takes
the GROUP_ELEMENT_LENGTH byte array group elements
X and Y, and returns X ^^ Y as a
GROUP_ELEMENT_LENGTH byte array.
Let G denote the generator of the group, and
EXP_KEYGEN() return a GROUP_ELEMENT_LENGTH
byte array group element usable as private key.
The Decision Diffie-Hellman problem MUST be hard in the
group defined by G and EXP(X, Y).
2.1 Sphinx Parameter Constants
HASH_LENGTH - 32 bytes. Katzenpost currently uses
SHA-512/256. RFC6234
GROUP_ELEMENT_LENGTH - 32 bytes. Katzenpost currently
uses X25519. RFC7748
3. System Overview
Mixes, as currently deployed, have two modes of operation:
Sphinx routing keys and replay caches are persisted to disk
Sphinx routing keys and replay caches are kept in memory only
These two modes of operation fundamentally represent a tradeoff
between mix server availability and notional compulsion attack
resistance. Ultimately this is the mix operator’s decision to make,
since it affects the security and availability of their mix servers. In
particular, since mix networks are vulnerable to the various types of
compulsion attacks (see SPHINXSPEC section 9.4
Compulsion Threat Considerations), there is some advantage
to NOT persisting the Sphinx routing keys to disk. The mix operator can
simply power off the mix server before seizure rather than physically
destroying the disk in order to prevent capture of the Sphinx routing
keys. An argument can be made for the use of full disk encryption,
however this may not be practical for servers hosted in remote
locations.
On the other hand, persisting Sphinx routing keys and replay caches
to disk is useful because it allows mix operators to shut down their mix
server for maintenance purposes without losing these Sphinx routing
keys and replay caches. This means that as soon as the maintenance
operation is completed the mix server is able to rejoin the network. Our
current PKI system KATZMIXPKI does NOT provide
a mechanism to notify Directory Authorities of such an outage or
maintenance period. Therefore if there is loss of Sphinx routing keys
this results in a mix outage until the next epoch.
The two modes of operation both completely prevent replay attacks
after a system restart. In the case of the disk persistence, replay
attacks are prevented because all packets traversing the mix have their
replay tags persisted to disk cache. This cache is therefore once again
used to prevent replays after a system restart. In the case of memory
persistence replays are prevented upon restart because the Sphinx
routing keys are destroyed and therefore the mix will not participate in
the network until at least the next epoch rotation. However availability
of the mix may require two epoch rotations because in accordance with KATZMIXPKI mixes publish future epoch keys so
that Sphinx packets flowing through the network can seamlessly straddle
the epoch boundaries.
4. Sphinx Packet Replay Cache
4.1 Sphinx Replay Tag
Composition
The following excerpt from our SPHINXSPEC
shows how the replay tag is calculated.
However, this tag is not used in replay detection until the rest
of the Sphinx packet is fully processed and its header MAC verified as
described in SPHINXSPEC.
4.2 Sphinx Replay Tag Caching
It would be sufficient to use a key value store or hashmap to detect
the presence of a duplicate replay tag; however, we additionally employ a
bloom filter to increase performance. Sphinx keys must periodically be
rotated and destroyed to mitigate compulsion attacks and therefore our
replay caches must likewise be rotated. This kind of key erasure scheme
limits the window of time that an adversary can perform a compulsion
attack. See our PKI specification KATZMIXPKI
for more details regarding epoch key rotation and the grace period
before and after the epoch boundary.
We tune our bloom filter for line-speed; that is to say the bloom
filter for a given replay cache is tuned for the maximum number of
Sphinx packets that can be sent on the wire during the epoch duration of
the Sphinx routing key. This of course has to take into account the size
of the Sphinx packets as well as the maximum line speed of the network
interface. This is a conservative tuning heuristic given that there must
be more than this maximum number of Sphinx packets in order for there to
be duplicate packets.
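The line-speed tuning heuristic can be made concrete with a small worked calculation. The Python sketch below uses the standard Bloom filter sizing formulas; the line rate, packet size, epoch duration, and target false-positive rate in the example are hypothetical inputs chosen for illustration, not values taken from a deployment.

```python
import math


def max_packets_per_epoch(line_rate_bps: int, packet_size_bytes: int, epoch_seconds: int) -> int:
    """Upper bound on packets that fit on the wire during one key epoch."""
    return math.ceil(line_rate_bps * epoch_seconds / (packet_size_bytes * 8))


def bloom_parameters(n_items: int, false_positive_rate: float):
    """Standard Bloom filter sizing: bits m and hash count k for n items at rate p."""
    m_bits = math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    k_hashes = max(1, round(m_bits / n_items * math.log(2)))
    return m_bits, k_hashes


# Hypothetical inputs: 1 Gbit/s interface, 3000-byte packets, 20-minute epoch.
n = max_packets_per_epoch(1_000_000_000, 3000, 20 * 60)
m_bits, k_hashes = bloom_parameters(n, 1e-3)
```

Sizing for the wire-speed maximum means the filter cannot be driven past its designed false-positive rate by traffic volume alone, at the cost of reserving memory that a lightly loaded mix never uses.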
Our bloom filter with hashmap replay detection cache looks like
this:
Note that this diagram does NOT express the full complexity of the
replay caching system. In particular it does not describe how entries
are entered into the bloom filter and hashmap. Upon either bloom filter
mismatch or hashmap mismatch both data structures must be locked and the
replay tag inserted into each.
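A bloom filter backed by an exact set can be sketched in Python as follows. This is an illustrative model only: the class name, the default sizes, and the use of salted BLAKE2b to derive bit positions are assumptions of the example. For simplicity the sketch holds one lock around both the lookup and the insertion, rather than locking only on a miss.

```python
import hashlib
import threading


class ReplayCache:
    """Bloom filter in front of an exact set of replay tags."""

    def __init__(self, m_bits: int = 1 << 20, k_hashes: int = 7):
        self._m = m_bits
        self._k = k_hashes
        self._bits = bytearray(m_bits // 8 + 1)
        self._seen = set()
        self._lock = threading.Lock()

    def _positions(self, tag: bytes):
        # Derive k bit positions by hashing the tag with a different salt each time.
        for i in range(self._k):
            digest = hashlib.blake2b(tag, digest_size=8, salt=i.to_bytes(8, "big")).digest()
            yield int.from_bytes(digest, "big") % self._m

    def is_replay(self, tag: bytes) -> bool:
        """Return True if tag was seen before; otherwise record it and return False."""
        positions = list(self._positions(tag))
        with self._lock:
            maybe_seen = all(self._bits[p >> 3] & (1 << (p & 7)) for p in positions)
            if maybe_seen and tag in self._seen:
                return True
            # Bloom filter miss, or a false positive ruled out by the exact set.
            for p in positions:
                self._bits[p >> 3] |= 1 << (p & 7)
            self._seen.add(tag)
            return False


cache = ReplayCache()
first = cache.is_replay(b"a" * 32)
second = cache.is_replay(b"a" * 32)
```

The bloom filter answers the common case (a tag never seen before) without touching the exact set, which matters most when the exact set lives in a slower key value store.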
For the disk persistence mode of operation the hashmap can simply be
replaced with an efficient key value store. Persistent stores may use a
write back cache and other techniques for efficiency.
4.3 Epoch Boundaries
Since mixes publish future epoch keys (see KATZMIXPKI) so that Sphinx packets flowing
through the network can seamlessly straddle the epoch boundaries, our
replay detection forms a special kind of double bloom filter system.
During the epoch grace period mixes perform trial decryption of Sphinx
packets. The replay cache used will be the one that is associated with
the Sphinx routing key which was successfully used to decrypt (unwrap
transform) the Sphinx packet. This is not a double bloom filter in the
normal sense of this term, since each bloom filter used is distinct and
associated with its own cache; furthermore, replay tags are only ever
inserted into one cache and one bloom filter.
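The selection of a replay cache during the grace period can be sketched in Python. Everything in this sketch is a simplified model: `try_unwrap` is a hypothetical stand-in for the Sphinx unwrap operation (here just an HMAC check that yields a tag), and a plain set stands in for each key's replay cache.

```python
import hashlib
import hmac


class Candidate:
    """A routing key for one epoch together with its own replay cache."""

    def __init__(self, epoch: int, routing_key: bytes):
        self.epoch = epoch
        self.routing_key = routing_key
        self.replay_tags = set()  # stands in for the per-key replay cache


def try_unwrap(routing_key: bytes, packet: bytes):
    """Hypothetical stand-in for Sphinx_Unwrap: return a replay tag, or None on failure."""
    body, mac = packet[:-32], packet[-32:]
    expected = hmac.new(routing_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, mac):
        return None
    return hashlib.sha256(routing_key + body).digest()


def process(candidates, packet: bytes) -> str:
    """Trial-decrypt with each candidate key; check only the matching key's cache."""
    for candidate in candidates:
        tag = try_unwrap(candidate.routing_key, packet)
        if tag is None:
            continue
        if tag in candidate.replay_tags:
            return "replay"
        candidate.replay_tags.add(tag)
        return f"accepted in epoch {candidate.epoch}"
    return "invalid"


keys = [Candidate(1, b"a" * 32), Candidate(2, b"b" * 32)]
body = b"sphinx packet"
packet = body + hmac.new(keys[1].routing_key, body, hashlib.sha256).digest()
```

Calling `process(keys, packet)` twice returns an acceptance for epoch 2 and then "replay", while the cache belonging to the epoch 1 key is never touched.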
4.4 Cost Of Checking Replays
The cost of checking a replay tag from a single replay cache is the
sum of the following operations:
Sphinx packet unwrap operation
A bloom filter lookup
A hashmap or cache lookup
Therefore these operations are roughly O(1) in complexity. However,
Sphinx packets processed near epoch boundaries will not be constant time
due to trial decryption with two Sphinx routing keys, as mentioned above
in section “4.3 Epoch Boundaries”.
5.
Concurrent Processing of Sphinx Packet Replay Tags
The best way to implement a software based router is with a SEDA computational pipeline. We therefore need a
mechanism to allow multiple threads to reference our rotating Sphinx
keys and associated replay caches. Here we shall describe a shadow
memory system which the mix server uses such that the individual worker
threads shall always have a reference to the current set of candidate
mix keys and associated replay caches.
5.1 PKI Updates
The mix server periodically updates its knowledge of the network by
downloading a new consensus document as described in KATZMIXPKI. The individual threads in the
“cryptoworker” thread pool which process Sphinx packets make use of a
MixKey data structure which consists of:
Sphinx routing key material (public and private X25519 keys)
Replay Cache
Reference Counter
Each thread in the “cryptoworker” thread pool has its own hashmap
associating epochs to a reference to the MixKey. The mix
server PKI thread maintains a single hashmap which associates the epochs
with the corresponding MixKey. We shall refer to this
hashmap as MixKeys. After a new MixKey is
added to MixKeys, a “reshadow” operation is performed for
each “cryptoworker” thread. The “reshadow” operation performs two
tasks:
Removes entries from each “cryptoworker” thread's hashmap that are
no longer present in MixKeys and decrements the
MixKey reference counter.
Adds entries that are present in MixKeys but not present in
the thread’s hashmap and increments the MixKey reference
counter.
Once a given MixKey reference counter is decremented to
zero, the MixKey and its associated on-disk data are
purged. Note that we do not discuss synchronization primitives, however
it should be obvious that updating the replay cache should likely make
use of a mutex or similar primitive to avoid data races between
“cryptoworker” threads.
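The "reshadow" operation and its reference counting can be modelled with a short Python sketch. The class and function names are hypothetical, synchronization is omitted as in the text above, and in this simplified model only the worker hashmaps hold references, so a key is purged once the last worker drops it.

```python
class MixKey:
    """Model of a MixKey: epoch, reference counter, and a purge marker."""

    def __init__(self, epoch: int):
        self.epoch = epoch
        self.ref_count = 0
        self.purged = False

    def ref(self) -> None:
        self.ref_count += 1

    def deref(self) -> None:
        self.ref_count -= 1
        if self.ref_count == 0:
            # A real implementation would wipe key material and on-disk state here.
            self.purged = True


def reshadow(mix_keys: dict, worker_keys: dict) -> None:
    """Synchronize one worker's epoch-to-MixKey map with the shared MixKeys map."""
    for epoch in list(worker_keys):
        if epoch not in mix_keys:
            worker_keys.pop(epoch).deref()
    for epoch, key in mix_keys.items():
        if epoch not in worker_keys:
            key.ref()
            worker_keys[epoch] = key


k1, k2 = MixKey(1), MixKey(2)
mix_keys = {1: k1, 2: k2}
worker_a, worker_b = {}, {}
reshadow(mix_keys, worker_a)
reshadow(mix_keys, worker_b)
```

Removing epoch 1 from `mix_keys` and reshadowing both workers drops `k1.ref_count` to zero and marks it purged, while `k2` stays referenced by both workers.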
Appendix A. References
Appendix A.1 Normative
References
Appendix A.2 Informative
References
Appendix B. Citing This
Document
Appendix B.1 Bibtex Entry
Note that the following bibtex entry is in the IEEEtran bibtex style
as described in a document called “How to Use the IEEEtran BIBTEX
Style”.
@online{SphinxReplay,
title = {Sphinx Packet Replay Detection Specification},
author = {David Stainton},
url = {https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx_replay_detection.rst},
year = {2019}
}
COMPULS05
Danezis, G., Clulow, J., “Compulsion Resistant Anonymous
Communications”, Proceedings of Information Hiding Workshop, June 2005,
https://www.freehaven.net/anonbib/cache/ih05-danezisclulow.pdf
KATZMIXNET
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
“Katzenpost Mix Network Specification”, June 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/mixnet.md
KATZMIXPKI
Angel, Y., Piotrowska, A., Stainton, D., “Katzenpost Mix Network
Public Key Infrastructure Specification”, December 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/pki.md
RFC2119
Bradner, S., “Key words for use in RFCs to Indicate Requirement
Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997,
http://www.rfc-editor.org/info/rfc2119
RFC6234
Eastlake 3rd, D. and T. Hansen, “US Secure Hash Algorithms (SHA and
SHA-based HMAC and HKDF)”, RFC 6234, DOI 10.17487/RFC6234, May 2011,
https://www.rfc-editor.org/info/rfc6234
RFC7748
Langley, A., Hamburg, M., and S. Turner, “Elliptic Curves for
Security”, RFC 7748, January 2016.
SEDA
Welsh, M., Culler, D., Brewer, E., “SEDA: An Architecture for
Well-Conditioned, Scalable Internet Services”, ACM Symposium on
Operating Systems Principles, 2001,
http://www.sosp.org/2001/papers/welsh.pdf
SPHINX09
Danezis, G., Goldberg, I., “Sphinx: A Compact and Provably Secure Mix
Format”, DOI 10.1109/SP.2009.15, May 2009,
https://cypherpunks.ca/~iang/pubs/Sphinx_Oakland09.pdf
SPHINXSPEC
Angel, Y., Danezis, G., Diaz, C., Piotrowska, A., Stainton, D.,
“Sphinx Mix Network Cryptographic Packet Format Specification”, July 2017,
https://github.com/katzenpost/katzenpost/blob/main/docs/specs/sphinx.md
3 - Audio Engineering Considerations for a Modern Mixnet
This work was supported by a grant from the Wau Holland
Foundation.
Privacy-enhancing technologies have always faced a challenge in
balancing security guarantees with user experience that would bring
people to the service. In the contemporary communication landscape, most
users expect their messengers to allow for some form of audio
communication. We would therefore like to meet that demand without
compromising anonymity.
The most ambitious private messengers being built today are those
based on modern Mixnet designs. They introduce padding, in addition to a
network of relays, and have to contend with latency. They distinguish
themselves by considering realistic powerful adversaries and endeavoring
to protect both the content of the communication and the metadata. This
turns out to be crucial in the case of audio communication, since in
most common implementations the metadata leaks content, so you can’t
have one without the other. We will therefore discuss implementation
recommendations from an audio engineering point of view in adding audio
communication to a Mixnet. We will look at how we can balance efficiency
and user experience in this unique setting while upholding security
guarantees.
A modern Mixnet typically uses padding in order to mitigate traffic
analysis. This means
that a constant amount of traffic is being sent by a user at all times.
The amount of traffic it allows for is also an upper limit to how much
data a user can send. It is not realistic to set this limit too high,
since it would quickly add up to a significant drain on a user’s
resources. Therefore we should expect any audio communication to either
have to be compressed to a low bitrate, or take a long time to travel
through the network, which can make real-time audio communication
impossible. Therefore we will consider non-synchronous push-to-talk
messaging as an important mode of audio communication in this
setting.
Content leaks in common encrypted VoIP implementations
It has been demonstrated that encrypted real-time VoIP communications
can produce devastating data leaks by not accounting for the fact that
different phonemes are connected to different bitrates. Already in 2011,
researchers were able to reconstruct sections of conversations from
encrypted connections based on packet traffic patterns alone. Today, the threat is even
more dire due to increased prevalence of Machine Learning and therefore
the automation of powerful statistical analysis. And yet popular communication software
does not address this.
One can typically mitigate this problem by either using constant
bit-rate encoding, therefore increasing the overall file size, or opting
for push-to-talk messaging instead of real-time communication, where you
process the entire recording. This goes back to the Anonymity Trilemma:
to maintain
security guarantees, you have to compromise either on the overhead or on
latency. But a Mixnet with padding has already made these compromises
and so it faces these challenges by design.
Recommendations for audio encoding and decoding
Encoding and decoding audio in a Mixnet has a unique set of
challenges. We are particularly interested in efficiency, as there may
be a strict bandwidth limit, but at the same time we have access to the
computing power of a modern device and modern audio encoding and
decoding tricks. We can reasonably expect to deal with some latency, and
depending on the transport-layer protocol used we may have to consider
some packet loss. We can also expect a modern Mixnet to prioritize
security and to require its components to use licensing that supports
freedom.
We will assume that padding and/or non-synchronicity allow for an
implementation of VBR encoding without leaking content. We should still
keep in mind security risks that come from haphazard implementations of
VBR decoding, as sometimes they might allow for injection of malicious
code. We should make sure the decoder we’re using has been audited. An
example of a decoder written with security in mind is the Rust-based
Symphonia.
The following table is a comparison of file sizes in kB generated by
VBR encoding in various codecs, starting with a lossless WAV file. The
cells are clickable, so the reader can verify the sound quality. For the
HTML version of this table, visit https://brettpreston.github.io/mixnet-samples.
The key takeaways from this table can be summarized as follows. Opus
delivers the best quality of the three codecs, and is the only one that
can handle more than speech. With Opus we observe diminishing returns
in quality above 12 kbps. Frequency bandwidth has a small impact on
the file size. Algorithm complexity has negligible impact on the file
size; it primarily impacts local processing requirements. Codec2
delivers a very small file size but can’t handle noise or music at
all. We go into detail on all of these points below.
Audio codecs
We have selected audio codecs that can be candidates for use in this
setting. They deliver either impressive sound quality at low bitrates,
or good sound quality and impressively low bitrates, and each has
different strengths. Opus and Speex are under a BSD license, and Codec2
under LGPL.
Opus and Speex
By far the most popular codec today, Opus is used in most modern VoIP
systems. It is a versatile audio codec that is known for its
high-quality, crisp sound reproduction, suitable for a wide range of
applications, including voice communication and music streaming. It has
ready implementations in many programming languages, including Go and Rust. It is also well
documented, simplifying its potential integration into various projects.
Another fine codec for speech compression is Speex, which was popular
in VoIP systems before the rise of Opus. It delivers clear and bright
speech, but is not meant to be used for other sound. There are extensive
resources which compare Opus to Speex, and it is clear that Opus is both more
efficient and more versatile.
Codec2
For this special use case of Mixnets, we may also consider Codec2
because it is capable of extremely efficient encoding. Its efficiency
relies heavily on sinusoidal coding and a narrow frequency band, which
means that we quickly lose clarity and some distinguishing features of
the original speech recording. The simplified harmonic content encodes
less information compared to the popular codecs.
However, in this particular context both the extremely small file
size and the voice masking provided by the loss of
distinguishing features may be desirable. It appears that Codec2 was
originally intended for radio broadcast, in which case some of its
shortcomings would be somewhat mitigated by post processing typically
used for radio broadcast. If we were to implement Codec2 but still
wanted to improve clarity, we could consider the following
adjustments:
Pre-processing: implementing a noise filter before encoding. A
demonstration of the effect denoising can have on Codec2 can be found in
section 3 of this analysis.
If trade-offs in the codec itself are acceptable, a wider
frequency band on the higher end and improved handling of noise, both in
noise reduction and encoding of consonants, would go a long way to
improve the sound quality. Out-of-the-box Codec2 uses a very narrow
subset of human voice frequencies and doesn’t handle consonants well,
which means it could have trouble with some consonant-heavy languages.
These adjustments would come at the cost of compression
efficiency.
Post-processing: as it stands now, we can mitigate the losses
after decoding by boosting what little of the higher frequency range
survived. An equalizer is the most resource-efficient way to address
this problem.
Implementing a neural network-based decoder, akin to WaveNet, for
frequency "reconstruction." This is very resource intensive, and so may
not be feasible on most personal devices. One should also keep in mind
that these tools don’t reconstruct the original voice, they create a
clean simulacrum of a human voice which may not sound like the original
speaker.
It should be emphasized that Codec2 is unlikely to provide user
experience on par with Opus and so it is only applicable in a limited
set of use cases.
Bitrate
Naturally, we would like to find an optimal point where the bitrate
is low and the sound quality is good. As can be heard in the samples
above, as well as quantified in , we experience diminishing returns above
12kbps with Opus when it comes to speech compression. As long as
minimizing the file size is a priority, either 12 or 16kbps appears to
be a fine choice. While we are encoding speech, it also doesn’t make
sense to encode multiple channels.
The priorities change if we are encoding music - then higher bitrates
make a big difference, as can be heard in the provided samples. If we
were planning for music streaming we would also be hoping to allow for
(at least) stereo. This is unlikely to come into play in our use case
however, and so we will settle on 12kbps, mono encoding. The following
figure comes from .
If we choose Codec2 there is little reason to go below 3200bps, as
the quality at lower bitrates is not competitive in the modern VoIP
landscape, and recordings with background noise or music result in a
jumbled mess.
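To put the bitrates discussed above in perspective, the size of an encoded message can be estimated as bitrate times duration divided by eight. The sketch below is only a back-of-the-envelope calculation: it treats the VBR bitrate as a nominal average, ignores container overhead, and the 2000-byte packet payload is a hypothetical figure chosen for illustration, not a Katzenpost parameter.

```python
import math


def encoded_size_bytes(bitrate_bps: int, seconds: float) -> float:
    """Approximate encoded size, ignoring container and framing overhead."""
    return bitrate_bps * seconds / 8


def packets_needed(size_bytes: float, payload_bytes: int) -> int:
    """Number of fixed-size packet payloads required to carry the message."""
    return math.ceil(size_bytes / payload_bytes)


if __name__ == "__main__":
    HYPOTHETICAL_PAYLOAD = 2000  # bytes per packet, assumed for illustration
    for name, bps in [("Opus 12kbps", 12_000), ("Opus 16kbps", 16_000), ("Codec2 3200bps", 3_200)]:
        size = encoded_size_bytes(bps, 30)
        print(name, int(size), "bytes,", packets_needed(size, HYPOTHETICAL_PAYLOAD), "packets")
```

Under these assumptions a 30 second voice message is about 45000 bytes at 12kbps and about 12000 bytes at 3200bps.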
Bandwidth
We should use wide band compression. In most settings it has little
impact on the file size, but is a huge gain in audio quality and
clarity.
Frame size
When encoding with Opus at low bitrates, frame lengths under 20ms produce
audible distortions, as do frame sizes over 80ms. This
demonstration uses 6kbps VBR wide band in order to accentuate the
distortion: opus-frames.
In practice, this is a lower bitrate than we would use and so the result
would sound better.
Algorithm complexity
The algorithm complexity impacts the processing power required on a
device more than it does the file size. In our use case, where we expect
the system to be used on modern devices there is no reason not to opt
for higher complexity, since it’s an easy gain in quality without
compromising on the resources that are scarce. The recommendation is
complexity 10, unless we expect to work on older mobile devices with a
real-time audio stream, in which case we may want to choose 5.
Signal processing, noise reduction and equalizing
Background noise tends to be an issue with voice recordings. We could
implement a noise filter with a small processing footprint, as well as a
dynamic range compressor/limiter, before encoding the message. This will
improve the likelihood of a clear audio message with appropriately
balanced loudness.
Noise reduction is recommended pre-encoding for maximum clarity.
State-of-the-art noise suppressors tend to be based on neural networks,
such as Xiph's RNNoise. This is somewhat resource
intensive, but not out of the question on mobile devices. There is an
analysis of RNNoise's efficacy and efficiency at https://jmvalin.ca/demo/rnnoise/.
A demonstration of this process can be found here.
A potential alternative is an automated Fast Fourier Transform noise
suppressor, however, that is likely to involve extensive customization
of available tools.
When it comes to Codec2 at 1200bit/s to 3200bit/s, an EQ boost of
several decibels at 3kHz is a simple and effective way to improve
consonant clarity. Noise suppression and equalizing can make a big
difference when making Codec2 viable as demonstrated here.
Compare with the original sample here.
Without these adjustments, Codec2 may not meet the quality expectations
of today’s users.
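The EQ boost mentioned above can be sketched as a single peaking biquad filter. The coefficient formulas below follow the widely used Audio EQ Cookbook peaking filter. The 8kHz sample rate, the 6dB gain and the Q of 1.0 are assumptions chosen for illustration rather than tuned recommendations, and the function names are hypothetical.

```python
import cmath
import math


def peaking_eq_coeffs(fs: float, f0: float, gain_db: float, q: float = 1.0):
    """Return normalized (b, a) coefficients of a peaking EQ biquad."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    a0 = a[0]
    return [x / a0 for x in b], [x / a0 for x in a]


def apply_biquad(samples, b, a):
    """Filter a sequence of samples with a direct form I biquad."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def gain_db_at(freq: float, fs: float, b, a) -> float:
    """Magnitude response of the filter at one frequency, in decibels."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20 * math.log10(abs(h))


if __name__ == "__main__":
    b, a = peaking_eq_coeffs(fs=8000, f0=3000, gain_db=6.0)
    print(round(gain_db_at(3000, 8000, b, a), 3), round(gain_db_at(300, 8000, b, a), 3))
```

The filter leaves low frequencies essentially untouched and applies the full boost at the center frequency, which is why it is cheap enough to run after decoding on a constrained device.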
Recommendations summary
Before encoding: noise suppression with XIPH or a similar plugin,
or a heavily customized Fast Fourier Transform process.
In most use cases, encoding with Opus, 12kbps VBR, mono, wide
band, complexity 10, frame size 20ms.
If processing power is scarce, complexity 5.
For extreme file compression, Codec2 3200bps, "Natural" setting. The
quality in Codec2 could be improved by widening the band on the higher
end. Noise suppression is crucial.
A security-conscious decoder such as Symphonia.
After decoding: EQ boost of several decibels at 3kHz, especially
with Codec2.
Acknowledgments
Special thanks to EJ Infeld for help with formatting and editing this
analysis, as well as providing valuable context about Mixnets, and to
the Wau Holland Foundation for funding this work.
4 - Threat Model
The purpose and structure of this document
This threat model document is unique in the privacy technology
landscape for its detailed treatment of realistic adversary
capabilities. It is not a description of a superficial, theoretical
system, but rather of complex, real-life software that is being
interrogated and constantly re-designed to provide the best possible
security. We examine it from the points of view of theoretical
design, networking choices and practical pitfalls.
And still, it is not and will likely never be comprehensive. Various
attacks and countermeasure strategies will be added to this document in
the future, as it keeps evolving. However, we feel that it already
provides a valuable, systematic view of the challenges faced by mixnet
technology.
There exists a rich body of academic work analyzing how one might
disrupt the functioning of a Mixnet or circumvent its security and
privacy guarantees. We have endeavored to compile these decades of research
and summarize these attacks in the table of theoretical security
concerns in a Mixnet. The table of networking security concerns
focuses on threats that are specific to Katzenpost
protocol choices.
We then delve into the countermeasures employed by Katzenpost and
discuss their efficacy. Special care is taken to discuss the details
of post-quantum cryptographic primitives that we have introduced in
several places of the design.
Introducing the adversary
It is no longer controversial to say that in the modern world, we
face incredibly powerful surveillance adversaries. These could be state,
corporate or criminal actors, vying for our information to use as means
of making profit, manipulating us and others, gaining leverage,
strengthening their authority, or as means of persecution. In many
contexts, we have little hope for non-technical solutions due to lack of
sufficiently powerful pressure in favor of privacy.
And so in a quest for technical solutions, we need equally powerful
tools. In the case of communication tools, the Internet’s bread and
butter, we would like to allow users to interact and exchange
information with reasonable expectation of both the content and metadata
of their communication, and personal information such as a user’s social
graph, being protected from such adversaries. Therefore, we consider an
adversary capable of the following:
The adversary can see the connections of the entire global
internet and is capable of intricate statistical analysis of gathered
data.
The adversary can disable parts of the network.
The adversary can plant or take over some devices in the network
to inject malicious code and manipulate the functioning of the network
or to gain access to the information available to them. The takeover
could happen by technical means or by exercising force outside of the
network.
The adversary has very large (but not infinite) computational
resources, and is capable of cryptanalysis on par with frontier
research.
The adversary has access to a quantum computer, or will have
access to a quantum computer in the near future.
The adversary can supplement collected data with rich context of
already gathered data on all users from other sources.
If we hope for our work to be relevant in the modern world, we can no
longer settle for weak threat models. That is the bar we set for
ourselves at Katzenpost.
Katzenpost mixnet threat model summary
Firstly, assumptions about the user:
The user acts reasonably and in good faith.
The user obtains an authentic copy of the Katzenpost client and
the mixnet client configuration file.
Secondly, assumptions about the user’s computer:
The computer operates correctly and is not compromised by
malware.
Thirdly, assumptions about the mixnet:
The mixnet only provides internal services and does not have any
"exit nodes" or anything that resembles a proxy service or VPN.
All mixnet protocols are protocols which do not force
interaction.
All mixnet protocols are low bandwidth and latency
tolerant.
Finally, assumptions about the world:
The three core protocols of Katzenpost are configured to use modern
cryptographic primitives which are valid and considered impossible to
break, for example:
PKI Signature Scheme using Ed25519-Sphincs+
NIKE Sphinx using X25519
PQ Noise with pqXX pattern using Xwing
What the user’s Gateway can achieve, keeping in mind that typically a
fair-sized mixnet will have more than one gateway node:
A Gateway node learns when a given client is online.
A Gateway node learns the client’s IP address.
A Gateway node learns how many messages the client sends and
receives.
A Gateway node does NOT learn the sent message destinations or
the received message origins.
A Gateway node does NOT learn if a given sent or received message
is a decoy or not.
A Gateway node can drop or corrupt any sent or received
message.
A Gateway node can spam a user with invalid messages.
A Gateway node can duplicate old messages. However, duplicate
outbound messages will be dropped by the first hop as per the Sphinx packet
deduplication cache.
What a sufficiently global, passive adversary can achieve:
A GPA can learn who is using the mixnet and where their Gateway
nodes are located.
What a local network attacker can achieve:
A local network can observe when a user is using
Katzenpost.
A local network can block Katzenpost.
What a compromise of the user’s computer can achieve:
After an endpoint device is compromised, an attacker can impersonate
that user, receiving and sending messages. The attacker does NOT learn
the communication correspondent network locations.
What a Service Node can achieve:
A Service Node on the mix network does not know where its service
request message came from. Therefore in general, absent some clever attack,
the Service Nodes learn nothing about the clients that interact with
them.
What a contact can achieve:
A contact can spam a user with messages.
A contact can, to some extent, prove to a third party that a
message came from a user.
A contact can retain messages from a user, forever.
What a random person on the Internet can achieve:
A random person can attempt to DoS the mix network or a specific service
on the mixnet.
A summary of theoretical security concerns in a Mixnet
Intersection, Statistical Disclosure Attacks
Attack description: Over time, the adversary can glean statistical
information that makes the probability distribution of who Alice is
communicating with non-uniform. The Law of Large Numbers implies that
the anonymity set tends, in the long run, to the set of clients whose
probability is identical to that of the actual recipient.
Necessary adversary capabilities: The adversary must typically be able
to see messages entering and leaving the network. This is customarily
treated as a GPA, despite only requiring a view of the network’s
perimeter. The adversary must be able to distinguish messages from dummy
traffic, or observe when users are active.
n − 1 Attack
Attack description: The adversary causes the mix to contain only
messages sent by the adversary, except one. In the context of
continuous time mixing such as with the Poisson mix, this means that the
adversary drops or delays other messages until the mix is empty before
the target message enters the mix. The adversary sees the target message
exit the mix to its next destination.
Necessary adversary capabilities: The adversary must compromise routers
which are upstream from a target mix node so as to be able to block
incoming messages, send messages, as well as be able to tell when a
target message passes through them.
Epistemic Attack
Attack description: The fact that a client is issued only a subset of
the mix nodes’ directory and encryption keys can leak information to the
adversary.
Necessary adversary capabilities: The adversary has knowledge of the
target client’s view of the network which distinguishes them among
clients. This could happen via a zero day or a design flaw such as not
implementing PIR for discovery.
Denial of Service Attack
Attack description: The adversary is able to disrupt the functioning of
the service, often by overwhelming its resources.
Necessary adversary capabilities: The adversary has sufficient network
and computational resources to overwhelm the network.
Sybil Attack
Attack description: The adversary plants a large number of malicious
nodes, and is therefore able to glean partial or complete information to
follow a message through the mix and disrupt the network.
Necessary adversary capabilities: The adversary has sufficient resources
to take over the network, and the network’s design allows for the
creation of a large number of malicious nodes.
Compulsion Attack
Attack description: The adversary compels enough honest node operators
to disclose information to follow a message through the mix network.
Necessary adversary capabilities: The adversary has the necessary force
to compel a sufficient number of honest actors to do the adversary’s
bidding.
Timing Attack
Attack description: An active adversary manipulates the timing of the
packets passing through compromised routers, or a passive adversary
exploits timing information that is leaked despite padding.
Necessary adversary capabilities: The passive attack could happen via a
zero day or design flaw. The efficacy of the active attack needs to be
analyzed with respect to the specific design.
Cryptographic Attacks
Attack description: The adversary is able to forge a signature, generate
a second hash preimage, decrypt ciphertext or do other damage assumed to
be prevented by the use of cryptography.
Necessary adversary capabilities: The adversary can break the security
of one or more cryptographic primitives through a cryptographic zero day
or sufficient computational resources, or exploit a flaw in their
implementation.
Endpoint Security Attacks
Attack description: The adversary breaches the security of a user’s
device via an attack not directly related to the mixnet.
Necessary adversary capabilities: The adversary is able to exploit a
technical flaw in the user’s device or compel the user to grant them
access.
Predecessor Attack
Attack description: The adversary compromises at least one mix node in
each routing topology layer. Eventually a client will randomly select a
bad route where every mix node in the route is compromised.
Necessary adversary capabilities: The adversary must have the capability
to operate or compromise mix nodes, at least one in each routing
topology layer. See countermeasure section for more details.
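The Predecessor Attack entry above can be made concrete with a short calculation. Assuming, for illustration only, that a client picks one node per layer uniformly and independently, the chance that a single route is fully compromised is the product of the compromised fractions of each layer. The three layers of three nodes with one compromised node each are example figures, and the function names are hypothetical.

```python
from fractions import Fraction


def bad_route_probability(compromised_per_layer, nodes_per_layer) -> Fraction:
    """Probability that every hop of one uniformly chosen route is compromised."""
    p = Fraction(1)
    for compromised, total in zip(compromised_per_layer, nodes_per_layer):
        p *= Fraction(compromised, total)
    return p


def prob_at_least_one_bad_route(p: Fraction, routes: int) -> Fraction:
    """Probability that at least one of `routes` independent routes is fully compromised."""
    return 1 - (1 - p) ** routes


if __name__ == "__main__":
    p = bad_route_probability([1, 1, 1], [3, 3, 3])
    print(p)  # 1/27 per route
    print(float(prob_at_least_one_bad_route(p, 100)))
```

Even a small per-route probability adds up over many routes, which is why the attack description says a bad route is selected eventually.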
Networking security concerns in Katzenpost
Tagging Attack
Attack description: The adversary exploits some kind of cryptographic
malleability property of the Sphinx packet format in order to violate
the privacy notions of the mix network.
Necessary adversary capabilities: The adversary must be able to witness
the Sphinx payload decryption to determine if it was tagged or not. This
means compromising a Provider for forward packets and compromising a
client’s endpoint device for SURB replies.
Replay Confirmation Attack
Attack description: If a Sphinx packet is able to be replayed then the
adversary may send the packet many times concurrently in order to
observe the traffic burst in another part of the network.
Necessary adversary capabilities: The mix nodes maintain Sphinx replay
caches in order to prevent replays; the attack is therefore only
possible if there is a replay cache malfunction.
SURB Confirmation Attack
Attack description: If a client sends many SURBs (single use reply
blocks) to another entity on the network, that entity may choose to send
out ALL the SURBs at once in order to observe the traffic burst in
another part of the network.
Necessary adversary capabilities: The adversary is a global passive
observer of the network and a participant in the network; additionally
the adversary must be in possession of multiple SURBs created by another
entity on the network.
ARQ Confirmation Attack
Attack description: The adversary’s goal is to find a specific ARQ
(automatic repeat request) client who is currently interacting on the
network by causing targeted outages of entry Providers after the target
service receives a protocol message. To start, half of the entry
Providers are allowed to receive messages. If the adversary observes a
retransmission then it confirms the client is in the group of entry
Providers to which messages were blocked. The adversary continues the
binary search and finds the client’s entry Provider in log(n) time.
Necessary adversary capabilities: The adversary must have access to a
target mixnet service so as to distinguish a message transmission from a
retransmission. The adversary must also be able to block messages from
going to specific mixnet nodes, in this example, entry Providers.
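The binary search in the ARQ Confirmation Attack can be sketched as follows. This is a toy model for illustration: the `observes_retransmission` callback stands in for the adversary's view at the target service, and it is assumed to report a retransmission exactly when the client's entry Provider is in the blocked group. The names are hypothetical.

```python
def find_entry_provider(providers, observes_retransmission):
    """Halve the candidate set each round until one entry Provider remains."""
    candidates = list(providers)
    rounds = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        blocked, allowed = candidates[:half], candidates[half:]
        rounds += 1
        # A retransmission means the client's reply path was in the blocked half.
        if observes_retransmission(set(blocked)):
            candidates = blocked
        else:
            candidates = allowed
    return candidates[0], rounds


if __name__ == "__main__":
    providers = [f"provider-{i}" for i in range(16)]
    target = "provider-11"
    found, rounds = find_entry_provider(providers, lambda blocked: target in blocked)
    print(found, rounds)
```

With 16 entry Providers the search needs 4 rounds, matching the log(n) bound stated in the table.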
Attack Countermeasures
Here we describe the attack countermeasures currently used by the
Katzenpost mix network software design.
Intersection Attacks
Attack description:
Intersection attacks, also known as long-term statistical disclosure
attacks, have two basic categories:
The Adversary learns to whom Alice sends messages.
The Adversary learns who sends Alice messages.
Statistical disclosure attacks work to some extent on all anonymous
communication networks. The Katzenpost client and Katzen messaging
protocol are designed to provide partial defense against long-term
intersection attacks as well as sufficient defense against short-term
timing correlation attacks.
The simplest form of this attack assumes a global passive adversary
who watches Alice’s interactions with the mix network. Whenever Alice
sends a message, a set of potential recipients are noted by observing
which clients receive a message shortly after Alice sends her message.
After many hours, days or weeks of noting these sets of potential
recipients, an intersection among these sets may reveal the set of
recipients Alice sends messages to.
The classical mix network literature has described intersection
attacks in terms of a mix network where a passive network observer can
watch individual clients receive messages. This assumption can be
otherwise stated that the adversary observes all the inputs and outputs
of the mix network and thus receives a high granularity of statistical
information.
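The simplest form of the attack described above amounts to intersecting the sets of clients seen receiving a message shortly after each of Alice's sends. The sketch below simulates this under strong simplifying assumptions: the adversary sees every receiver, there is no decoy traffic, and Alice has a single contact. The client numbering, the amount of cover traffic and the function names are hypothetical.

```python
import random


def intersect_observations(observations):
    """Intersect the candidate recipient sets noted after each of Alice's sends."""
    candidates = None
    for receivers in observations:
        receivers = set(receivers)
        candidates = receivers if candidates is None else candidates & receivers
    return candidates if candidates is not None else set()


def simulate_rounds(contact, clients, cover, rounds, seed=0):
    """Each round the real contact receives, plus `cover` unrelated clients."""
    rng = random.Random(seed)
    others = [c for c in clients if c != contact]
    return [{contact, *rng.sample(others, cover)} for _ in range(rounds)]


if __name__ == "__main__":
    observations = simulate_rounds(contact=7, clients=range(50), cover=10, rounds=30)
    print(intersect_observations(observations))
```

In this idealized setting the candidate set shrinks to Alice's contact after a few dozen observations, which illustrates why the fine-grained view of receivers matters so much to the attack.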
countermeasure:
Katzenpost and the Katzen messaging protocol are designed to provide
partial defense against intersection attacks. Complete defense is not
practical because user behavior is often repetitive and users cannot stay
connected to the mixnet forever. Attack success depends largely on the
adversary’s ability to predict user behavior. If a user’s behavior is
overly repetitive this may lead to the success of such attacks.
Although the Katzenpost continuous time mixing strategy provides
defense against short term timing correlation attacks, additional
defense mechanisms are required to defend against longer term
attacks:
async message queueing and retrieval at the network edge
traffic padded message retrieval
loop decoy traffic
uniform traffic patterns (all sent messages result in a SURB
reply)
The Katzenpost chat protocol, known as Katzen, uses an additional
network route to provide another indirection to protect the network
location of clients. In other words, while Katzen clients connect to the
mixnet using a randomly selected entry Provider, they retrieve messages
from a different Provider mix node on the network. Message retrieval is
done by means of a Sphinx SURB (single use reply block), which is sent to
the messaging queue service so that a reply containing a message payload
can be sent back to the client anonymously. All sent messages result in
a SURB reply being sent back to the client.
Katzenpost clients periodically send loop decoy messages; these
Sphinx packets are sent to a randomly selected Provider whose echo
service sends the client’s packet payload back to the client via the
attached SURB. Loop decoy messages are distinguishable
from normal messages only to the client that receives them; passive network
observers will not be able to tell the difference. These decoy loops are
uniformly distributed among all of the Providers (AKA service/exit mix
nodes).
Whenever clients retrieve messages from their locally connected entry
Provider, they do so using a traffic padded protocol that either sends
them 0 or 1 message where both outcomes are indistinguishable from the
perspective of a passive network observer.
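The idea that "0 or 1 message" look the same on the wire can be illustrated with fixed-length framing. The sketch below is not the Katzenpost wire format: the 64-byte payload size, the three-byte header and the function names are assumptions made for the example, and in a real deployment the frame would additionally be encrypted by the transport so that the flag and padding are hidden from observers.

```python
import struct
from typing import Optional

PAYLOAD_LEN = 64  # hypothetical fixed payload size for this sketch


def build_response(message: Optional[bytes]) -> bytes:
    """Build a frame of constant length whether or not a message is present."""
    body = message or b""
    if len(body) > PAYLOAD_LEN:
        raise ValueError("message does not fit in one frame")
    header = struct.pack(">BH", 1 if message is not None else 0, len(body))
    return header + body + bytes(PAYLOAD_LEN - len(body))


def parse_response(frame: bytes) -> Optional[bytes]:
    """Recover the message, or None when the frame carried no message."""
    has_message, length = struct.unpack(">BH", frame[:3])
    return frame[3:3 + length] if has_message else None


if __name__ == "__main__":
    empty = build_response(None)
    full = build_response(b"hello")
    print(len(empty), len(full))
```

Because both outcomes produce frames of identical length, an observer who sees only encrypted frame sizes and timing cannot tell whether a message was delivered.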
n − 1 Attacks
attack description:
An n − 1 attack is a multi-stage
attack where the adversary observes a target message enter the
mixnet and must perform the attack in order to follow the message to the
next hop. The n − 1 attack is
performed repeatedly for each hop in the route in order to discover the
final destination.
Although the adversary could simply compromise each mix node in the
route starting with the first hop, that falls under the compulsion attack
category, which is distinct from the n − 1 attack category. The n − 1
attack is performed by the
adversary compromising upstream routers so that they have the capability
of watching messages enter the target mix, blocking any of those
messages if they choose to, and sending messages of their own into the
target mix node. By using these capabilities the adversary is able to
manipulate mix nodes so that they only contain the target message and
messages sent by the adversary.
For a good introduction to n − 1 attacks, please see . In the
context of continuous time mixing strategies like "Stop and Go" and
Poisson, the n − 1 attack is
performed by the adversary blocking or delaying (although delaying
obviously wouldn’t work for Stop and Go) incoming messages ahead of time
so that they are reasonably certain the mix is empty before the target
message enters the mix.
When the target message enters the empty mix, it is artificially
delayed by the mixing strategy and then routed to the next hop. The
adversary gets to observe where the message is going for its next hop
because they are reasonably sure that the message exiting the mix must
be the same message, even though it is bitwise unlinkable because of the
cryptographic transformation.
countermeasure:
Katzenpost currently does not have any countermeasures in place for
n − 1 attacks. See Future
Countermeasures section below.
Epistemic Attack: route fingerprinting
attack description:
A route fingerprinting attack is when the adversary is able to
identify a client by the specific route being used.
countermeasure:
Katzenpost doesn’t allow clients to have a partial view of the
network. The directory authority system publishes the full network view
to be cached by the edge nodes, Providers, so that clients can retrieve
them.
Denial of Service
attack description:
Sending many packets into the mix network can cause the mix nodes to
become overwhelmed and begin dropping packets. The logical conclusion to
this scenario is that there is effectively a network outage until the
adversary stops sending so much traffic.
countermeasure:
Rate limiting individual clients is the current countermeasure.
However, this only stops a DoS attack conducted by a single
client entity. The adversary could still DoS the network by
using many clients to send packets.
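As an illustration of per-client rate limiting, here is a minimal token bucket sketch. It is not the Katzenpost implementation; the type names, the rate and the burst size are all hypothetical. It also shows the limitation described above: each client identity gets its own bucket, so an adversary with many identities is not constrained.

```go
package main

import "fmt"

// bucket is the token bucket of one client (hypothetical type).
type bucket struct {
	tokens float64
	last   float64 // time of the last update, in seconds
}

// Limiter rate limits packets per client identity.
type Limiter struct {
	rate    float64 // tokens added per second
	burst   float64 // maximum number of stored tokens
	buckets map[string]*bucket
}

func NewLimiter(rate, burst float64) *Limiter {
	return &Limiter{rate: rate, burst: burst, buckets: make(map[string]*bucket)}
}

// Allow reports whether clientID may send one packet at time now (seconds).
func (l *Limiter) Allow(clientID string, now float64) bool {
	b, ok := l.buckets[clientID]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[clientID] = b
	}
	// Refill in proportion to the elapsed time, capped at the burst size.
	b.tokens += (now - b.last) * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	l := NewLimiter(1, 2) // hypothetical: 1 packet per second, burst of 2
	fmt.Println(l.Allow("alice", 0), l.Allow("alice", 0), l.Allow("alice", 0))
	fmt.Println(l.Allow("bob", 0)) // a second identity has its own bucket
}
```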
Sybil Attack
attack description:
The adversary plants a large number of malicious nodes, and is
therefore able to glean partial or complete information to follow a
message through the mix network.
countermeasure:
We mitigate Sybil attacks by preventing mix nodes from automatically
joining the network. A prerequisite for joining the network is to have
all the directory authorities add the new mix node’s connection
information and public cryptographic key material to their
configuration. Please see the Future Countermeasures section below for a
discussion of additional directory authority features including a
reputation system.
Compulsion Attack
attack description:
The adversary compels enough honest node operators to disclose
information to follow a message through the mix network.
countermeasure:
Our current countermeasure for the compulsion attack is frequent mix
key rotation, every 20 minutes. See the Future Countermeasures section
below.
Timing Attacks
attack description:
An active adversary manipulates the timing of the packets passing
through compromised routers, or a passive adversary exploits timing
information that is leaked despite padding.
Currently, there are no known timing attacks against any Katzenpost
protocols. Timing correlation attacks are already covered in the
intersection attack category. And although all mix network protocols
leak statistical information no matter what countermeasures are used, we
posit that this leaked statistical information isn’t really the same
thing as traditional timing attacks against a cryptographic system. In
fact, the mix network actively prevents timing attacks by injecting
latency into the system.
countermeasure:
No known timing attacks and therefore no countermeasure.
Cryptographic Attacks
attack description:
There are no known cryptographic attacks against Katzenpost core
protocols (sphinx, noise, dirauth). However we explore theoretical
cryptographic attacks in the Cryptographic Protocols section below.
countermeasure:
All core Katzenpost protocols make use of hybrid post quantum
cryptographic constructions which in theory protect against active
quantum adversaries.
Endpoint Security Attacks
attack description:
The adversary breaches the security of a user’s device via an attack
not directly related to the mixnet.
countermeasure:
There are no countermeasures provided by Katzenpost for endpoint
security because it’s considered an orthogonal concern.
Tagging Attack
attack description:
The Sphinx cryptographic packet format allows for a one bit tagging
attack under certain circumstances. The reason for allowing the design
to have this security defect is to allow for the Single Use Reply Block.
The Sphinx header is MAC’ed but the packet body is not. Instead, the
body is encrypted with a wide-block cipher (an SPRP). This ensures that
an expected verification block in the beginning of the plaintext can be
used to verify the plaintext in the final decryption. If a bit in the
payload ciphertext gets flipped then the payload decryption will yield
garbled results and the expected verification block will not be present.
Therefore in order to make use of this to perform a tagging attack, the
adversary must have access to the result of the payload decryption as
well as the ability to tag the packet some number of hops earlier in the
route. We call this a one bit tagging attack because it yields one bit of
information: either the verification block was destroyed or not.
In Katzenpost there are two ways to use Sphinx to send a payload:
forward packets and SURB reply packets. Both of these Sphinx packet
types are susceptible to a one bit tagging attack:
tagging attack against forward Sphinx packets:
Clients send forward Sphinx packets to mixnet services which reply
via a SURB in the payload. Let’s say an adversary "tags" a forward
Sphinx packet sent by Alice. The adversary would have to compromise or
collude with the service Providers on the mixnet in order to witness the
forward packet payload decryption failure which indicates the tag.
tagging attack against SURB reply Sphinx packets:
If an adversary "tags" a SURB reply which a mixnet service sends to a
client, then only the client will be able to witness the packet payload
decryption failure. The adversary would have to compromise the client’s
endpoint device to witness this event (or to compromise the key
materials allowing them to compute the failed payload decryption
themselves).
countermeasure:
In the context of a forward Sphinx tagging attack on Katzenpost, the
adversary must compromise or collude with the destination service
Provider. If that is the case, then the attack allows the adversary to
learn which Provider node and service the packet was destined for. This
is valuable information in the context of the current Katzen protocol.
See the Future Countermeasures section below for a discussion of how we
plan to mitigate intersection attacks in the future, because that
mitigation also carries over to a much greater defense against this
forward payload tagging attack.
countermeasure:
We could encode the last hop’s Sphinx routing command inside the
Sphinx payload instead of the header. This would provide short term
plausible deniability in the sense that an adversary conducting a
tagging attack would be destroying the routing information so that they
cannot know if the packet was a decoy or not.
Replay Confirmation Attack
attack description:
If a Sphinx packet were allowed to be replayed then the adversary may
send the packet many times concurrently in order to observe the traffic
burst in another section of the network.
countermeasure:
Katzenpost mix nodes maintain a replay cache which prevents Sphinx
packets from being replayed. This cache doesn’t grow forever since it’s
only kept until the end of the epoch, and epochs currently last only 20
minutes.
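The idea can be sketched as a set of packet tags that is discarded whenever the epoch changes. This is a simplified illustration, not the Katzenpost data structure: the type and function names are hypothetical, and a SHA-256 hash of the packet stands in for the real per-packet replay tag.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// ReplayCache remembers the packet tags seen during one epoch (hypothetical type).
type ReplayCache struct {
	epoch uint64
	seen  map[[32]byte]struct{}
}

func NewReplayCache(epoch uint64) *ReplayCache {
	return &ReplayCache{epoch: epoch, seen: make(map[[32]byte]struct{})}
}

// IsReplay records tag and reports whether it was already seen in this epoch.
func (c *ReplayCache) IsReplay(epoch uint64, tag [32]byte) bool {
	if epoch != c.epoch {
		// A new epoch means new mix keys, so the old tags can be dropped.
		c.epoch = epoch
		c.seen = make(map[[32]byte]struct{})
	}
	if _, dup := c.seen[tag]; dup {
		return true
	}
	c.seen[tag] = struct{}{}
	return false
}

// tagOf is a stand-in for the real replay tag derivation (assumption).
func tagOf(packet string) [32]byte {
	return sha256.Sum256([]byte(packet))
}

func main() {
	c := NewReplayCache(1)
	fmt.Println(c.IsReplay(1, tagOf("packet A"))) // first sighting
	fmt.Println(c.IsReplay(1, tagOf("packet A"))) // replayed within the epoch
}
```

The memory used by such a cache is bounded by the number of packets a mix can process in one epoch.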
SURB Confirmation Attack
attack description:
If a client sends many SURBs to another entity on the network, that
entity may choose to send out ALL the SURBs at once in order to observe
the traffic burst in another part of the network. This works as an entry
node discovery attack.
Although currently, all Katzenpost protocols only send one SURB at a
time, this attack still applies if the adversary accumulates enough
SURBs to form a visible traffic burst within the mix network.
countermeasure:
No countermeasure. See the Future Countermeasures section below for
a discussion of how to counter this attack.
ARQ Confirmation Attack
attack description:
See above table entry for ARQ confirmation attack description.
countermeasure:
Currently, no countermeasure.
Predecessor Attack
attack description:
A bad route is defined as a route in which every node is compromised.
The goal of such an attack is to link a given client with a specific
destination or service on the destination node. This attack is also
known as the Predecessor Attack and is detailed in with many variations
for all the different types of anonymous communication networks. In the
context of the Katzenpost mixnet, the Predecessor Attack is performed by
the adversary compromising at least one node in each routing topology
layer. Clients using the mixnet will eventually select a bad route.
countermeasure:
Fundamentally, we have two choices, either we have clients select a
new route for each message sent or they select one route and use that
for some time duration. In the former, every time a message is sent, the
probability of selecting a bad route is increased. In the latter,
if a client selects a bad route they use it many times, but the
probability of selecting a bad route is reduced.
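The trade-off can be made concrete with a small calculation. The sketch below assumes that each hop is chosen uniformly and independently within its layer, which is a modelling assumption and not a statement about Katzenpost's path selection; the layer sizes and message count are made-up example values.

```go
package main

import (
	"fmt"
	"math"
)

// badRouteProb returns the probability that a single randomly chosen route
// is bad, given compromised[i] of total[i] nodes compromised in layer i.
func badRouteProb(compromised, total []int) float64 {
	p := 1.0
	for i := range total {
		p *= float64(compromised[i]) / float64(total[i])
	}
	return p
}

// atLeastOneBad returns the probability that at least one of k
// independently chosen routes is bad.
func atLeastOneBad(p float64, k int) float64 {
	return 1 - math.Pow(1-p, float64(k))
}

func main() {
	// Example: three layers of three nodes, one compromised node per layer.
	p := badRouteProb([]int{1, 1, 1}, []int{3, 3, 3})
	fmt.Printf("single route: %.4f\n", p)
	fmt.Printf("fresh route for each of 100 messages: %.4f\n", atLeastOneBad(p, 100))
}
```

With a fresh route per message the chance of hitting a bad route at least once grows with the number of messages, while a single long-lived route keeps the chance at the single-route value but exposes every message if that route is bad.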
Yet another countermeasure is to design the mixnet protocols such
that they use a new destination for each message using some kind of
private deterministic permutation achieving a uniform distribution of
messages amongst the destination mixnet nodes and their message slots. We
have chosen this last countermeasure for Katzenpost and it will be
detailed elsewhere in our literature.
Future Countermeasures
Intersection Attacks
The new Katzen protocol is sometimes referred to as scatter queue.
Two communicating parties each exchange shared secrets which they use to
determine a new "mailbox" for each message. To be clear, this new
protocol is an improved revision of the previous Katzen protocol where
each party chooses their own "mailbox" (queue Provider + queue ID); the
difference here is that instead of the two parties exchanging mailbox
locations they exchange seeds which are used to deterministically generate
mailbox locations for each message.
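A minimal sketch of deterministic mailbox derivation follows. It only illustrates the idea of both parties computing the same location from a shared seed and a message counter. The use of HMAC-SHA256, the Mailbox type and the field sizes are assumptions made for this example, not the actual scatter queue construction.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Mailbox is a (queue Provider, queue ID) pair (hypothetical type).
type Mailbox struct {
	Provider int      // index into the list of Providers
	QueueID  [16]byte // identifier of the queue on that Provider
}

// mailboxFor derives the mailbox of message number n from the shared seed.
// numProviders must be greater than zero.
func mailboxFor(seed []byte, n uint64, numProviders int) Mailbox {
	var ctr [8]byte
	binary.BigEndian.PutUint64(ctr[:], n)
	mac := hmac.New(sha256.New, seed)
	mac.Write(ctr[:])
	sum := mac.Sum(nil) // 32 bytes of pseudorandom output

	var m Mailbox
	// The modulo reduction has a negligible bias, acceptable for a sketch.
	m.Provider = int(binary.BigEndian.Uint64(sum[:8]) % uint64(numProviders))
	copy(m.QueueID[:], sum[8:24])
	return m
}

func main() {
	seed := []byte("seed exchanged by Alice and Bob")
	for n := uint64(0); n < 3; n++ {
		m := mailboxFor(seed, n, 4)
		fmt.Printf("message %d -> provider %d, queue %x\n", n, m.Provider, m.QueueID)
	}
}
```

Because both parties hold the same seed, they compute the same sequence of locations without exchanging any further mailbox information.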
This new protocol still uses all four previously mentioned mechanisms
to counter intersection attacks. However, the new
"scatter queue" design drastically reduces the amount of metadata which
can be collected by the operators of the mailbox Provider mix nodes. We
think this is a huge improvement to the threat model, but it would be
great if we could quantify the improvement using various anonymity
metrics. Shannon entropy seems applicable here because we can
make statements like "compared to the old protocol, scatter queue
increases the entropy on Providers where malicious adversaries are
trying to correlate communicating party sets with messages arriving at
specific mailboxes"; the new protocol makes this correlation infeasible.
Therefore we can say that the new Katzen messaging protocol mitigates
or partially mitigates intersection attacks by means of five
mechanisms:
async message queueing and retrieval at the network edge
traffic padded message retrieval
loop decoy traffic
uniform traffic patterns (all sent messages result in a SURB
reply)
scatter queue
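The Shannon entropy comparison mentioned above can be illustrated with a toy calculation. The model is an assumption made for this example only: the adversary's uncertainty is expressed as a probability distribution over mailboxes for a conversation's next message, with a fixed mailbox giving a point distribution and scatter queue idealized as a uniform distribution over 1024 mailboxes.

```go
package main

import (
	"fmt"
	"math"
)

// entropyBits returns the Shannon entropy, in bits, of the distribution p.
func entropyBits(p []float64) float64 {
	h := 0.0
	for _, x := range p {
		if x > 0 {
			h -= x * math.Log2(x)
		}
	}
	return h
}

// uniform returns the uniform distribution over n outcomes.
func uniform(n int) []float64 {
	p := make([]float64, n)
	for i := range p {
		p[i] = 1 / float64(n)
	}
	return p
}

func main() {
	fixed := []float64{1} // every message goes to one known mailbox
	fmt.Printf("fixed mailbox: %.2f bits\n", entropyBits(fixed))
	fmt.Printf("uniform over 1024 mailboxes: %.2f bits\n", entropyBits(uniform(1024)))
}
```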
n − 1 Attacks
Here we will attempt to describe a partial countermeasure wherein
clients receive statistical information from the network which is
cryptographically signed by its authors. Clients use this data to decide
if there’s an ongoing n − 1 attack. If there is, they disconnect from
the network and try again later.
There are two sources of information about n − 1 attacks:
mix loops
client loops
Mix loops vs client loops
In theory mix loops can detect n − 1 attacks in the context of a
continuous time mix. Such an attack means the adversary is dropping or
delaying messages before they enter the mix. Therefore the loop decoy
message originated by the mix can function as a sort of heartbeat
protocol that allows the mix to detect n − 1 attacks. Obviously this mix
loop decoy message might get dropped by the network for various reasons
that have nothing to do with an n − 1 attack. The red-green-black
heartbeat mixes paper (by George Danezis and Len Sassaman) suggests the
countermeasure of the individual mixes halting their routing of messages
temporarily to thwart n − 1 attacks. This would
work but it would also probably create unnecessary outages. Instead we
want a system that lets the client software decide whether or not there
is an ongoing n − 1 attack.
Clients can also detect such attacks with their own end to end loop
decoy messages. However we want the mixes to publish a signed
certificate containing their mix loop statistics. Clients will then
download these mix loop statistics from the providers and they will use
those statistics along with their own client loop statistics to make
decisions with regards to n − 1 network status.
TODO: add detailed description of client heuristics for
deciding if there’s an n-1 attack
Core Cryptographic Protocols
Katzenpost consists of three cryptographic protocols:
PKI/Dirauth
PQ Noise
Sphinx
Katzenpost is an overlay network, meaning that we aren’t trying to
replace IP (Internet Protocol). Overlay means we build protocol layers
that sit on top of existing Internet protocols. Currently Katzenpost
works over TCP/IP; however, in the future we plan to support QUIC/IP as an
optional transport that can be selected.
Katzenpost uses a PQ Noise based protocol known as the Katzenpost
wire protocol, which provides point to point transport security and
authorization. The wire protocol enforces the mix network’s topology
whereby the clients are only allowed to connect to gateway nodes,
gateway nodes are only allowed to send packets to layer 1 mixes, and
layer 1 mixes are only allowed to send packets to layer 2 mixes etc.
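The connection rules listed above can be written as a small predicate. This is only a sketch of the rules stated in this paragraph, with hypothetical type names; the rules for the last mix layer and for service nodes are not spelled out here, so the sketch deliberately rejects everything else.

```go
package main

import "fmt"

// Role identifies the kind of peer (hypothetical type).
type Role int

const (
	Client Role = iota
	Gateway
	Mix
)

// Peer describes one end of a connection. Layer is 1-based and only
// meaningful for mixes.
type Peer struct {
	Role  Role
	Layer int
}

// maySend reports whether src is allowed to send packets to dst under the
// rules stated in the text. Anything not stated there is rejected.
func maySend(src, dst Peer) bool {
	switch src.Role {
	case Client:
		return dst.Role == Gateway
	case Gateway:
		return dst.Role == Mix && dst.Layer == 1
	case Mix:
		return dst.Role == Mix && dst.Layer == src.Layer+1
	}
	return false
}

func main() {
	fmt.Println(maySend(Peer{Role: Client}, Peer{Role: Gateway}))
	fmt.Println(maySend(Peer{Role: Client}, Peer{Role: Mix, Layer: 1}))
	fmt.Println(maySend(Peer{Role: Mix, Layer: 1}, Peer{Role: Mix, Layer: 2}))
}
```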
Clients use the wire protocol to talk to gateway nodes, to which they
send Sphinx packets. These Sphinx packets are encapsulated within the
encrypted PQ Noise messages and are therefore never exposed to passive
network observers. If they were exposed, there wouldn’t in principle be
any problem, because Sphinx packets are themselves encrypted. This
redundancy in security is often referred to as "defense in depth".
Besides within the mixnet itself, the wire protocol is also used to
directly communicate with the directory authorities. Gateway nodes
retrieve the latest PKI document from the directory authorities and
cache the document for the epoch duration so that clients can download
the cached copy. This is a notably different use case: within the
mixnet we should have the goal of padding all the wire protocol commands
to be the same size, whereas when gateway nodes download the consensus
they are likely receiving PKI documents which are perhaps many times
bigger than our Sphinx packet size.
The PKI/Directory authority protocol stands apart from the rest
because it’s the root of all authority within the mix network. The PKI
provides the network participants with all the connection information
and key materials they need to use the other two protocols, PQ Noise and
Sphinx. It does so by publishing a PKI document every epoch (currently
20 minutes). This is necessary because the mixes destroy their old mix
keys and create new mix keys for each new epoch thereby reducing the
window for compulsion attacks to the epoch duration.
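As a rough illustration of epoch arithmetic, the sketch below maps a Unix timestamp to an epoch number and to the time left before the next rotation. Only the 20 minute period comes from the text. The origin of epoch zero is a hypothetical placeholder, and timestamps are assumed not to precede it.

```go
package main

import (
	"fmt"
	"time"
)

// periodSeconds is the epoch duration stated in the text (20 minutes).
const periodSeconds = 20 * 60

// originUnix is a hypothetical "epoch zero"; a real network defines its own.
const originUnix int64 = 0

// epochAt returns the epoch containing the given Unix time and the number
// of seconds remaining until the next epoch begins.
func epochAt(unix int64) (epoch int64, remaining int64) {
	elapsed := unix - originUnix
	return elapsed / periodSeconds, periodSeconds - elapsed%periodSeconds
}

func main() {
	e, rem := epochAt(time.Now().Unix())
	fmt.Printf("epoch %d, next PKI document and mix keys in %d s\n", e, rem)
}
```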
Both the PQ Noise based wire protocol and our Sphinx protocol are
considered to be transport protocols. The dirauth, as the third
cryptographic protocol, refers to two aspects:
The client and mixnet interactions with the dirauth system. That
is, the PKI document itself is signed by a majority of the dirauth nodes
and contains the mix descriptor for each mix node in
the network. The document also specifies the topology. Mix nodes and
clients verify these cryptographic signatures.
The dirauth’s crash fault tolerant consensus protocol for
publishing new PKI documents every epoch.
Katzenpost PKI / Directory Authority
The public key infrastructure (PKI) protocol for Katzenpost, also
known as the Directory Authority or dirauth, is a decentralized system
of nodes which vote for each epoch’s consensus document. The voting
protocol is crash fault tolerant rather than Byzantine fault tolerant
(BFT). If we used a BFT protocol instead, the dirauth system would fail
once a third or more of the nodes failed. Our crash fault tolerant
system is more robust in this respect because it only fails once half or
more of the nodes fail.
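To make the comparison concrete, the sketch below computes how many failed nodes each kind of protocol can tolerate. It assumes the standard textbook bounds, n ≥ 3f + 1 for BFT protocols and n ≥ 2f + 1 for crash fault tolerant protocols, rather than anything specific to the dirauth implementation; the function names are hypothetical.

```go
package main

import "fmt"

// bftTolerated returns the largest f with n >= 3f + 1.
func bftTolerated(n int) int { return (n - 1) / 3 }

// cftTolerated returns the largest f with n >= 2f + 1.
func cftTolerated(n int) int { return (n - 1) / 2 }

func main() {
	for _, n := range []int{3, 5, 9} {
		fmt.Printf("n=%d: BFT tolerates %d failures, crash fault tolerant tolerates %d\n",
			n, bftTolerated(n), cftTolerated(n))
	}
}
```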
The Katzenpost PKI is the security root of the entire system because
all clients and network nodes depend on the PKI to sign the
consensus document for each epoch. Currently the epoch duration is 20
minutes. The consensus document is essentially a view of the network: it
contains all the connection information and all the public cryptographic
key materials and signatures. Each mix node signs its descriptor and
uploads it to the dirauth nodes. Each dirauth node signs the consensus.
When clients or nodes download the consensus document they are able to
verify the dirauth node signatures on the document.
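The verification step can be sketched as counting valid signatures from the known dirauth keys and requiring a majority. This is a simplification under stated assumptions: it uses plain Ed25519 from the Go standard library instead of the hybrid scheme, and the document encoding, function names and majority rule as written here are illustrative, not the Katzenpost certificate format.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// verifyConsensus reports whether a majority of the known authorities
// produced a valid signature over doc. sigs[i] is the signature claimed
// for authorities[i]; a nil entry means that authority did not sign.
func verifyConsensus(doc []byte, authorities []ed25519.PublicKey, sigs [][]byte) bool {
	valid := 0
	for i, pk := range authorities {
		if i < len(sigs) && sigs[i] != nil && ed25519.Verify(pk, doc, sigs[i]) {
			valid++
		}
	}
	return valid >= len(authorities)/2+1
}

// demo creates numAuth authorities, lets the first numSigners of them sign
// a placeholder document, and verifies the result.
func demo(numAuth, numSigners int) bool {
	doc := []byte("placeholder consensus document")
	authorities := make([]ed25519.PublicKey, numAuth)
	sigs := make([][]byte, numAuth)
	for i := 0; i < numAuth; i++ {
		pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
		if err != nil {
			panic(err)
		}
		authorities[i] = pub
		if i < numSigners {
			sigs[i] = ed25519.Sign(priv, doc)
		}
	}
	return verifyConsensus(doc, authorities, sigs)
}

func main() {
	fmt.Println("2 of 3 signed:", demo(3, 2))
	fmt.Println("1 of 3 signed:", demo(3, 1))
}
```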
Currently we use a hybrid signature scheme consisting of the
classical Ed25519 and the post quantum stateless hash based signature
scheme known as Sphincs+, with the parameter set sphincs-shake-256f.
The Katzenpost Noise Protocol Layer
Early versions of Katzenpost used the Noise cryptographic protocol
framework with an HFS (hybrid forward secret) variation of the XX
handshake that used a post quantum KEM. However, it could not resist
active quantum adversaries since the initial keys exchanged were
classical ECDH public keys. Such constructions protect against
current classical adversaries that record ciphertext transcripts in
hopes of breaking them in the future with a cryptographically relevant
quantum computer.
More recently, Katzenpost was made to use PQ Noise from the paper
entitled Post Quantum Noise. The paper shows that we can
algebraically transform existing classical Noise handshake patterns into
post quantum handshake patterns by replacing all usages of ECDH with
KEM. Some of these transformations imply additional network
interactions.
Our current hybrid KEM uses our security preserving KEM combiner and
the NIKE to KEM adapter (an ad hoc hashed ElGamal construction). Our Noise
protocol string is:
Noise_pqXX_Kyber768X25519_ChaChaPoly_BLAKE2b
Which means that our PQ Noise protocol uses the following
cryptographic primitives:
hybrid KEM: Kyber768 combined with X25519
AEAD cipher: ChaCha20-Poly1305
hash function: BLAKE2b
We use the PQ Noise handshake pattern known as pqXX
which is expressed in the PQ Noise pattern language like so:
-> e
<- ekem, s
-> skem, s
<- skem
Expressed as a sequence diagram, pqXX looks like this:
Client sends their ephemeral public key (e).
Server sends a KEM ciphertext (ekem) encapsulated to the client’s
ephemeral public key, followed by its encrypted static public key (s).
Client sends a KEM ciphertext (skem) encapsulated to the server’s
static public key, followed by their encrypted static public key (s).
Server sends a KEM ciphertext (skem) encapsulated to the
client’s static public key.
future improvement, option 1:
Remove the "retrieve message" command which clients use to poll for
new messages. Instead the client - server Noise protocol should be
designed such that clients periodically receive messages from the server
without requesting or polling for them. If no message is present in the
message queue on the server then the server will send the client a decoy
message.
future improvement, option 2:
Replace the "retrieve message" command with a "send and retrieve"
command whereby every time the client sends a message they also receive a
message. As usual, some of the messages sent and received may be
decoy messages.
Classical Sphinx and Post Quantum Sphinx
The original Sphinx paper introduces the Sphinx nested encrypted
packet format using a NIKE (non-interactive key exchange). NIKE Sphinx
can be a hybrid post quantum construction simply by using a hybrid NIKE.
Our Sphinx implementation can also optionally use a KEM (key
encapsulation mechanism) instead of a NIKE; however, the trade-off is
that the packet’s header incurs substantial overhead because it must
store a KEM ciphertext for each hop. Katzenpost has a completely
configurable Sphinx geometry which allows for any KEM or NIKE to be used.
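The header overhead can be estimated with simple arithmetic. The sketch below compares only the key material carried in the header, not the full header size. It assumes X25519 (32 byte public key) as the NIKE, Kyber768 (1088 byte ciphertext) as the KEM, and a hypothetical path of 5 hops; the actual geometry of a given network may differ.

```go
package main

import "fmt"

const (
	x25519PublicKeySize    = 32   // bytes
	kyber768CiphertextSize = 1088 // bytes
)

// nikeKeyBytes returns the key material in a NIKE Sphinx header: a single
// group element that is reused, after blinding, at every hop.
func nikeKeyBytes(hops int) int {
	return x25519PublicKeySize
}

// kemKeyBytes returns the key material in a KEM Sphinx header: one KEM
// ciphertext per hop.
func kemKeyBytes(hops, ciphertextSize int) int {
	return hops * ciphertextSize
}

func main() {
	hops := 5 // hypothetical path length
	fmt.Printf("NIKE Sphinx key material: %d bytes\n", nikeKeyBytes(hops))
	fmt.Printf("KEM Sphinx key material:  %d bytes\n", kemKeyBytes(hops, kyber768CiphertextSize))
}
```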
The Sphinx cryptographic packet format also uses these additional
cryptographic primitives, the current Katzenpost selection is:
stream cipher: CTR-AES256
MAC: HMAC-SHA256
KDF: HKDF-SHA256
SPRP: AEZv5
In Katzenpost the dirauths select the Sphinx geometry, and each dirauth
must agree with the other dirauths. They publish the hash of the Sphinx
geometry in the PKI document so that the rest of the network entities
can validate their Sphinx geometry. At the time of writing the namenlos
network still uses classical Sphinx with the following geometry:
In the Katzenpost implementation of Sphinx, we MAC an unencrypted two
byte region at the beginning of the Sphinx packet. This additional data
region is used to match Sphinx version numbers.
Mixnet Attack Trees
The above attack tree consists of all OR nodes because each of the
leaves is an alternative way to achieve the sub-goal expressed by its
branch. In turn, each branch (e.g. physical access, compromise
human operator, compromise software) is an alternative way to achieve
the overall goal of compromising the mix node.