Onboarding

Registration

  1. Contact info@dnstapir.se (?)
  2. Provide User/Organisation, contact person, contact information (email/phone)
  3. Provide PGP key, Signal user handle or other trusted out-of-band channel.

Contract

  • In the test phase of DNS TAPIR, all users need a contract with the TAPIR Core operations partner (The Swedish Internet Foundation, IIS).
  • Consumer-only users might not need a contract in the future.

Enrollment Key

  • After the contract is signed, the organisation/user will receive enrollment credentials from a DNS TAPIR representative on a trusted out-of-band channel.

Consumer only

  • Not available in first phases

Offboarding

We are sad to see you leave, but when no other option is available, this is how offboarding works:

  1. Terminate the contract with the operations provider
  2. Keys will be deactivated and prevented from renewal, effective from the termination date of the contract.
  3. Uninstall TAPIR EDM, TAPIR POP, TAPIR-CLI
  4. Data will no longer be sent to TAPIR Core

Getting the Packages

Debian-based

Prerequisites

Make sure you have all the build requirements installed:

apt install git make golang build-essential

Building the Package

We currently do not provide any official Debian builds. However, it is possible to build Debian packages from a cloned repo. To build Debian packages for an Edge installation, start by cloning these repositories:

git clone https://github.com/dnstapir/pop
git clone https://github.com/dnstapir/cli
git clone https://github.com/dnstapir/edm

For each cloned repo, build the corresponding package by issuing:

make deb

Then install it with:

dpkg -i path/to/package

RPM-based

DNS TAPIR provides three RPM packages for an Edge installation: dnstapir-pop, dnstapir-edm and dnstapir-cli. They are built using Fedora's public Copr instance with the @dnstapir group. Currently, packages are being built for EPEL 10, EPEL 9, Fedora 42 and 43, and openSUSE Leap 15.6.

Packages in the @dnstapir/edge-testing repo are signed with this PGP key:

07FC 9787 0134 6ED4 522A 17E7 2C4D 4FAC 02CF 0AC2

Packaging code lives side-by-side with the source code in the respective repos.

Enable the repositories in your package manager:

dnf

dnf copr enable @dnstapir/edge-testing

zypper

zypper ar https://copr.fedorainfracloud.org/coprs/g/dnstapir/edge-testing/repo/opensuse-leap-15.6/group_dnstapir-edge-testing-opensuse-leap-15.6.repo

And install them:

dnf

dnf install dnstapir-pop dnstapir-cli dnstapir-edm

zypper

zypper in dnstapir-pop dnstapir-cli dnstapir-edm

Building an RPM Package Locally

It is also possible to build an RPM package locally. This requires make, golang, rpmbuild and git. To build it, start by cloning the desired repo. This example uses pop:

git clone https://github.com/dnstapir/pop

Then, change into the newly cloned repo and issue:

make rpm

This will build an RPM package that can then be installed with your preferred package manager.

Managing permissions

Three system users, dnstapir-pop, dnstapir-edm and dnstapir-renew, and a group, dnstapir, will have been created. Add your administrator user to this group for easier bootstrapping and maintenance:

sudo usermod -a -G dnstapir <USERNAME>

Log out and back in and make sure the new group membership is in effect before proceeding.

Enrolling with the DNS TAPIR Node Manager

To connect with DNS TAPIR Core, an Edge node needs to be enrolled. You should have received enrollment credentials from a DNS TAPIR representative on a trusted out-of-band channel. They will look something like the following:

{
    "name":"enroll-example.test.dnstapir.se",
    "key":{
        "kty":"OKP",
        "kid":"123456789012345678901234",
        "alg":"EdDSA",
        "crv":"Ed25519",
        "x":"ABCD_EFGHIJKLMNO_PQRSTUVW_XYZ123456789_12345",
        "d":"abcdefghijklmno_pqrstuvwxyz_123456789012345"
    },
    "nodeman_url":"https://nodeman.test.dnstapir.se/"
}

Store the credentials in a file on the node that is to be enrolled. Then run:

sudo -g dnstapir dnstapir-cli --standalone enroll --enroll-credentials <PATH TO ENROLL CREDS>

The reason for running with sudo -g dnstapir is that, apart from exchanging cryptographic material with DNS TAPIR Core, the above enrollment command also generates a number of config files under /etc/dnstapir (by default). They need to have dnstapir as the group owner so that they can be used by the three system users mentioned before.
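Before running the enrollment command, it can be useful to confirm that the credentials file survived the out-of-band transfer intact. The following is a minimal sketch, not part of dnstapir-cli: the field names are taken from the sample above, while the helper checkCredentials and the exact set of checks are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
)

// sampleCredentials is the placeholder example from this guide, not a real key.
const sampleCredentials = `{
    "name":"enroll-example.test.dnstapir.se",
    "key":{
        "kty":"OKP",
        "kid":"123456789012345678901234",
        "alg":"EdDSA",
        "crv":"Ed25519",
        "x":"ABCD_EFGHIJKLMNO_PQRSTUVW_XYZ123456789_12345",
        "d":"abcdefghijklmno_pqrstuvwxyz_123456789012345"
    },
    "nodeman_url":"https://nodeman.test.dnstapir.se/"
}`

// enrollKey mirrors the "key" object in the sample credentials.
type enrollKey struct {
	Kty string `json:"kty"`
	Kid string `json:"kid"`
	Alg string `json:"alg"`
	Crv string `json:"crv"`
	X   string `json:"x"`
	D   string `json:"d"`
}

// enrollCredentials mirrors the top-level object in the sample credentials.
type enrollCredentials struct {
	Name       string    `json:"name"`
	Key        enrollKey `json:"key"`
	NodemanURL string    `json:"nodeman_url"`
}

// checkCredentials is a hypothetical helper. It only checks that the file
// is well-formed JSON with the expected fields present; it does not verify
// the key material itself.
func checkCredentials(raw []byte) (*enrollCredentials, error) {
	var c enrollCredentials
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, fmt.Errorf("not valid JSON: %w", err)
	}
	if c.Name == "" || c.Key.Kid == "" || c.Key.X == "" || c.Key.D == "" {
		return nil, errors.New("missing name or key material")
	}
	u, err := url.Parse(c.NodemanURL)
	if err != nil || u.Scheme != "https" || u.Host == "" {
		return nil, errors.New("nodeman_url must be an https URL")
	}
	return &c, nil
}

func main() {
	c, err := checkCredentials([]byte(sampleCredentials))
	if err != nil {
		fmt.Println("credentials look malformed:", err)
		return
	}
	fmt.Printf("credentials for %s (kid %s) look well-formed\n", c.Name, c.Key.Kid)
}
```

A check like this catches truncated or re-wrapped files early, before any cryptographic material is exchanged with Core.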

Generating Configuration Without Enrolling

Sometimes it might be desirable to re-generate the configuration boilerplate without actually enrolling. This can be done by issuing:

sudo -g dnstapir dnstapir-cli --standalone enroll -L path/to/response -c path/to/credentials

This will use information in a local copy of a response from a previous enrollment to generate the configuration boilerplate.

Edits to the Configuration Boilerplate

The configuration generated in the enrollment step contains sensible defaults for most deployments. However, some final touches need to be made before it can be properly integrated with a recursive resolver and with DNS TAPIR Core. By default, these files will be generated under /etc/dnstapir. This guide uses the default.

pop-outputs.yaml

Edit this file where annotated (1 location) with the destination to which POP will be sending DNS NOTIFY messages about changes to the RPZ zone it has generated based on the observations from Core and on the local policies.

dnstapir-pop.yaml

Edit this file where annotated (1 location) with the interface on which POP will listen for incoming zone transfer requests for the RPZ zone it has generated.

dnstapir-edm.toml

Edit this file where annotated (2 locations) with a strong secret/password and a DNSTAP interface. The secret is used when pseudonymizing the recursive traffic with Crypto-PAn. The DNSTAP interface is the IP + port where EDM listens for DNSTAP traffic from the recursive resolver.

Edge services

In addition to a recursive resolver, an Edge installation consists of three services.

dnstapir-pop

The dnstapir-pop service is provided by the dnstapir-pop package. It consists of a daemon process that communicates with a DNS TAPIR Core instance over MQTT and with a recursive resolver using zone transfers. It receives observations about domain names over MQTT and, based on a locally configured policy, produces an RPZ zone which it transfers to a recursive resolver.

dnstapir-renew

The dnstapir-renew service is installed by the dnstapir-cli package. It automates the process of renewing mTLS certificates (used to secure the MQTT connection) by issuing dnstapir-cli commands on a systemd timer.

dnstapir-edm

The dnstapir-edm service is installed by the dnstapir-edm package. It consists of a daemon process that communicates with a recursive resolver using DNSTAP and with a DNS TAPIR Core instance over MQTT and HTTPS. It receives DNSTAP data from the resolver, which it anonymizes and sends to the Core instance in aggregates using HTTPS. Certain events, such as domain names being encountered for the first time, are sent over MQTT to the same Core instance.

Installation footprint

A typical DNS TAPIR Edge installation will have the following footprint.

Users and Groups

Users:

  • dnstapir-pop, systemd service user
  • dnstapir-edm, systemd service user
  • dnstapir-renew, systemd service user

Groups:

  • dnstapir, common group for service users and sysadmin account

Files and Folders

Files:

  • /usr/bin/dnstapir-pop, executable for DNS TAPIR POP
  • /usr/bin/dnstapir-edm, executable for DNS TAPIR EDM
  • /usr/bin/dnstapir-cli, executable for POP management and certificate renewal service
  • /usr/lib/systemd/system/dnstapir-pop.service, service unit for DNS TAPIR POP
  • /usr/lib/systemd/system/dnstapir-edm.service, service unit for DNS TAPIR EDM
  • /usr/lib/systemd/system/dnstapir-renew.service, service unit for certificate renewal
  • /usr/lib/systemd/system/dnstapir-renew.timer, timer unit for certificate renewal

Folders:

  • /etc/dnstapir, for configuration
  • /var/log/dnstapir, for DNS TAPIR POP logging
  • /var/lib/dnstapir/edm, for DNS TAPIR EDM runtime state

Start the services

sudo systemctl start dnstapir-pop
sudo systemctl start dnstapir-edm
sudo systemctl start dnstapir-renew

Enable the services

sudo systemctl enable dnstapir-pop
sudo systemctl enable dnstapir-edm
sudo systemctl enable dnstapir-renew

Sample Config for Resolvers

Unbound

Note that for unbound to support DNSTAP, the flag --enable-dnstap must be passed at compile time. Make sure your unbound package is doing this. After doing so, edit unbound.conf to contain the following:

rpz:
  name:         dnstapir
  primary:      <IP>@<PORT> # Must match config in dnstapir-pop.yaml
  zonefile:     "/var/run/unbound/dnstapir.zone"
  rpz-log:      yes
  rpz-log-name: dnstapir

dnstap:
  dnstap-enable: yes
  dnstap-ip: <IP>@<PORT> # Must match config in dnstapir-edm.toml
  dnstap-tls: no
  dnstap-log-client-query-messages: yes
  dnstap-log-client-response-messages: yes
  dnstap-send-identity: yes
  dnstap-send-version: yes

server:
    module-config: "respip validator iterator" # "respip" module needed for RPZ

Post-installation

Check out the post-installation docs for info on how to do basic connectivity checks and verify that your Edge is running as intended.

Checking connectivity

The DNS TAPIR Looptest Domain

As part of our test deployment of DNS TAPIR Core, we have set up a special domain, looptest.dnstapir.se., for testing connectivity. It currently has two uses.

The "Ticker" Test

To ensure that a POP receives observations from Core, Core will periodically send out observations with observation encoding flag 1024 set. By issuing dnstapir-cli filterlists on a running system, you should be able to see the following:

operator@edge $ dnstapir-cli filterlists
Domain                                                                     |Source              |Src Fmt             |Filter    |Flags     
---------------------------------------------------------------------------------------------------------------------------------------
# ...snip...
epoch-1761739157.ticker.looptest.dnstapir.se.                              |dns-tapir           |tapir-msg-v1        |doubt     |1024      
epoch-1761737417.ticker.looptest.dnstapir.se.                              |dns-tapir           |tapir-msg-v1        |doubt     |1024      
epoch-1761738377.ticker.looptest.dnstapir.se.                              |dns-tapir           |tapir-msg-v1        |doubt     |1024      
epoch-1761738677.ticker.looptest.dnstapir.se.                              |dns-tapir           |tapir-msg-v1        |doubt     |1024
# ...snip...

The presence of the ticker.looptest.dnstapir.se. observations indicates that connectivity from Core to your POP is working.
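To judge how fresh the most recent ticker observation is, the number in the first label can be converted to a time. The sketch below assumes that the digits after "epoch-" are a Unix timestamp in seconds, which the label name suggests but this guide does not state. The function tickerTime is a hypothetical helper, not part of dnstapir-cli.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

const tickerSuffix = ".ticker.looptest.dnstapir.se."

// tickerTime extracts the timestamp from a ticker domain name such as
// "epoch-1761739157.ticker.looptest.dnstapir.se.". The second return value
// is false when the name does not follow that pattern.
func tickerTime(domain string) (time.Time, bool) {
	if !strings.HasSuffix(domain, tickerSuffix) {
		return time.Time{}, false
	}
	label := strings.TrimSuffix(domain, tickerSuffix)
	if !strings.HasPrefix(label, "epoch-") {
		return time.Time{}, false
	}
	// Assumption: the digits are seconds since the Unix epoch.
	secs, err := strconv.ParseInt(strings.TrimPrefix(label, "epoch-"), 10, 64)
	if err != nil {
		return time.Time{}, false
	}
	return time.Unix(secs, 0).UTC(), true
}

func main() {
	sent, ok := tickerTime("epoch-1761739157.ticker.looptest.dnstapir.se.")
	if !ok {
		fmt.Println("not a ticker name")
		return
	}
	fmt.Println("ticker sent at:", sent.Format(time.RFC3339))
	fmt.Println("age:", time.Since(sent).Round(time.Second))
}
```

If the newest ticker entry is much older than expected, that would hint at a connectivity problem between Core and the POP, even though older ticker entries are still listed.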

The "From-edge" Test

To ensure that your resolver can connect to your EDM, that your EDM can connect to Core and that Core can connect to your POP, queries that follow a specific pattern will cause a corresponding observation to be sent out by Core, again using the 1024 flag.

From a machine that can connect with the resolver on your Edge, run:

dig @<your resolver> <unique label>.from-edge.looptest.dnstapir.se

If the qname is something your EDM sees for the first time, it will send an event to Core. Core will recognize the domain as our looptest domain, flag it and then send out an observation to all Edges.

You should be able to see that the "loop is closed" by issuing dnstapir-cli filterlists on your Edge system:

operator@edge $ dnstapir-cli filterlists
Domain                                                                     |Source              |Src Fmt             |Filter    |Flags     
---------------------------------------------------------------------------------------------------------------------------------------
# ...snip...
<unique-label>.from-edge.looptest.dnstapir.se.                             |dns-tapir           |tapir-msg-v1        |doubt     |1024
# ...snip...

It might be helpful to grep for your query since there may be a lot of other domains listed in the output. Seeing your query in the output indicates that your resolver is communicating with your EDM, your EDM is communicating with Core and Core is communicating with your POP.
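As an alternative to grep, the check can be scripted. The following sketch parses captured filterlists output, assuming the pipe-separated five-column layout shown above, and looks for the from-edge name with the 1024 flag. The label "mytest123" and the function loopClosed are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleOutput is adapted from the listing above; "mytest123" is a made-up label.
const sampleOutput = `Domain |Source |Src Fmt |Filter |Flags
----------------------------------------------------------
# ...snip...
epoch-1761739157.ticker.looptest.dnstapir.se. |dns-tapir |tapir-msg-v1 |doubt |1024
mytest123.from-edge.looptest.dnstapir.se. |dns-tapir |tapir-msg-v1 |doubt |1024
# ...snip...
`

// loopClosed reports whether the from-edge looptest name for label appears
// in the filterlists output with the flag value 1024.
func loopClosed(output, label string) bool {
	want := label + ".from-edge.looptest.dnstapir.se."
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Split(sc.Text(), "|")
		if len(fields) != 5 {
			continue // separator rule, snip markers, blank lines
		}
		domain := strings.TrimSpace(fields[0])
		flags := strings.TrimSpace(fields[4])
		if domain == want && flags == "1024" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("loop closed:", loopClosed(sampleOutput, "mytest123"))
}
```

In practice the output string would come from running dnstapir-cli filterlists on the Edge system shortly after issuing the dig query.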

Note that other DNS TAPIR users will be able to see your looptests since observations are being sent out to all enrolled POPs. No profanity!

Verify that TAPIR Core receives histograms from TAPIR EDM: ....

TODO

Verify that TAPIR POP receives observations from TAPIR Core: ....

TODO

TAPIR-POP: DNS TAPIR Policy Processor

The DNS TAPIR Policy Processor, TAPIR-POP, is the component that processes the intelligence data from TAPIR Core (and possibly other sources) and applies local policy to reach a filtering decision.

It is the connection between TAPIR Core and the Edge platform. It manages local configurations and gets updates from TAPIR Core with alerts and config changes.

TAPIR-POP is responsible for the task of integrating all intelligence sources into a single Response Policy Zone (RPZ) that is as compact as possible. The RPZ file is used by the DNS resolver to implement blocklists and other policy-related functions.

A unified single RPZ zone instead of multiple sources

TAPIR-POP presents a single output with all conflicts resolved, rather than feeding the resolver multiple sources of data from which to look for policy guidance, where sources can even be conflicting (e.g. a domain name may be flagged by one source but allowlisted by another).

The result is smaller, as no allowlisting information is needed for the resolver.

TAPIR-POP supports a local policy configuration

TAPIR-POP is able to apply further policy to the intelligence data, based on a local policy configuration. To enable the resolver operator to design a suitable threat policy TAPIR-POP uses a number of concepts:

  • lists: there are three types of lists of domain names:

    • allowlists (names that must not be blocked)
    • denylists (names that must be blocked)
    • doubtlists (names that should perhaps be blocked)
  • observations: these are attributes of a suspicious domain name. In reality, whether a particular domain name should be blocked or not is not an absolute; it is a question of probabilities. Therefore, rather than a binary directive, "this name must be blocked", some intelligence sources, including DNS TAPIR, present the resolver operator with observed attributes of the name. Examples include:

    • the name has only been observed on the Internet for a short time
    • the name draws huge query traffic
    • the name resolves to an IP address known to host bad things, etc.
  • sources: TAPIR-POP supports the following types of sources for intelligence data:

    • RPZ: imported via AXFR or IXFR. TAPIR-POP understands DNS NOTIFY.
    • MQTT: DNS TAPIR Core Analyser sends out rapid updates for small numbers of names via an MQTT message bus infrastructure.
    • DAWG: Directed Acyclic Word Graphs are extremely compact data structures. TAPIR-POP is able to mmap very large lists in DAWG format which is used for large allowlists.
    • CSV Files: Text files on local disk, either with just domain names, or in CSV format are supported.
    • HTTPS: To bootstrap an intelligence feed that only distributes deltas (like DNS TAPIR, over MQTT), TAPIR-POP can bootstrap the current state of the complete feed via HTTPS.
  • outputs: TAPIR-POP outputs RPZ zones to one or several recipients. Both AXFR and IXFR are supported.

Overview of the TAPIR-POP policy

The resulting policy has the following structure (in order of precedence):

  • no allowlisted name is ever included.
  • denylisted names are always included, together with a configurable RPZ action.
  • doubtlisted names that have particular tags that the resolver operator chooses are included, together with a configurable RPZ action.
  • a doubtlisted name that appears in N distinct intelligence feeds is included, where N is configurable, as is the RPZ action.
  • a doubtlisted name that has M or more tags is included, where both M and the action are configurable.
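The order of precedence above can be sketched as a single decision function. This is an illustration only and not TAPIR-POP's actual implementation: the types, field names and tag names are hypothetical, "appears in N feeds" is read as "at least N", and the configurable RPZ actions are left out.

```go
package main

import "fmt"

// nameInfo collects what the lists and feeds say about one domain name.
// All types and fields here are hypothetical.
type nameInfo struct {
	Allowlisted bool
	Denylisted  bool
	DoubtFeeds  int      // number of distinct feeds that doubtlist the name
	Tags        []string // observations (tags) attached to the name
}

// policy holds the operator's local choices.
type policy struct {
	BlockTags map[string]bool // tags that always cause inclusion
	MinFeeds  int             // N; 0 disables the rule
	MinTags   int             // M; 0 disables the rule
}

// include reports whether a name ends up in the RPZ output, and why.
// The cases are evaluated in the order of precedence listed above.
func include(p policy, n nameInfo) (bool, string) {
	switch {
	case n.Allowlisted:
		return false, "allowlisted"
	case n.Denylisted:
		return true, "denylisted"
	case n.DoubtFeeds == 0:
		return false, "not listed"
	}
	for _, tag := range n.Tags {
		if p.BlockTags[tag] {
			return true, "has operator-selected tag " + tag
		}
	}
	if p.MinFeeds > 0 && n.DoubtFeeds >= p.MinFeeds {
		return true, "doubtlisted by N or more feeds"
	}
	if p.MinTags > 0 && len(n.Tags) >= p.MinTags {
		return true, "has M or more tags"
	}
	return false, "below thresholds"
}

func main() {
	p := policy{BlockTags: map[string]bool{"example-tag": true}, MinFeeds: 2, MinTags: 3}
	ok, why := include(p, nameInfo{DoubtFeeds: 1, Tags: []string{"example-tag"}})
	fmt.Println(ok, why)
}
```

Checking the allowlist first is what lets the generated RPZ zone omit allowlisting information entirely: an allowlisted name never reaches the later rules.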

Logging

Logging is done by writing either to stdout or stderr and letting systemd handle it. Logging will consist of four verbosity levels: Debug, Info, Warning and Error. Each component is responsible for its own logging.

Telemetry

Telemetry is done by publishing on a dedicated MQTT topic. Each component has its own topic and is responsible for publishing its own data. Most of the data will be aggregated statistics such as packet counters.

"Exceptional" Events

Some events are of special interest, both to the operator of a particular Edge deployment and to the Core operator. An example of such an event would be failure to renew a TLS certificate. As such, those events are written BOTH to the syslog and published over the MQTT telemetry topic by the component that observes the event. The component also assigns an identifier that is visible both in the log and in the telemetry packet so that the two can be associated.
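A minimal sketch of how a component might tie the two records together is shown below. The event structure, its field names and the identifier format are assumptions, since this document does not define them, and the actual MQTT publish step is left out.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"log"
)

// exceptionalEvent is a hypothetical telemetry payload.
type exceptionalEvent struct {
	ID        string `json:"id"`
	Component string `json:"component"`
	Message   string `json:"message"`
}

// newEventID returns a random 16-character hex identifier.
func newEventID() (string, error) {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// report logs the event and returns the event together with the JSON
// payload that would be published on the telemetry topic. Both records
// carry the same identifier so that they can be associated later.
func report(component, msg string) (exceptionalEvent, []byte, error) {
	id, err := newEventID()
	if err != nil {
		return exceptionalEvent{}, nil, err
	}
	ev := exceptionalEvent{ID: id, Component: component, Message: msg}
	// The log package writes to stderr by default, which systemd captures.
	log.Printf("level=error event_id=%s component=%s msg=%q", ev.ID, ev.Component, ev.Message)
	payload, err := json.Marshal(ev)
	return ev, payload, err
}

func main() {
	_, payload, err := report("dnstapir-renew", "failed to renew TLS certificate")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("telemetry payload: %s\n", payload)
}
```

Generating the identifier once and passing it to both sinks avoids any need to correlate the log line and the telemetry packet by timestamp.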

Versioning Scheme

DNS TAPIR Edge software uses semantic versioning for releases. The Edge components are released independently, but should all adhere to the same scheme, that is X.Y.Z. Development builds, nightly builds and other unofficial builds should have a version of 0.0.0.

Versioning Scheme for Debian-packaged Edge Components

Debian-packaged Edge software should adhere to the same scheme as the upstream component. That is, X.Y.Z for releases and 0.0.0 for unofficial packages. Additionally, for unofficial packages, a snapshot string will be used to identify when the build was made and what revision of the upstream code was used. For Edge components built from non-release revisions of the code, the versioning scheme will be 0.0.0+local20251118.<SHORT SHA>. Officially released packages will not have the snapshot part, i.e. it will just be X.Y.Z.

Versioning Scheme for RPM-packaged Edge Components

RPM-packaged Edge software follows rules similar to those for Debian-packaged software. The version format will be slightly different for unofficial builds: 0.0.0^20251118.<SHORT SHA>-1.
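The two snapshot formats differ only in their separators and the trailing RPM release number. As a sketch, assuming the date part is the build date in YYYYMMDD form (taken here in UTC, which this document does not specify), the version strings for unofficial builds could be assembled as follows. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// debSnapshotVersion builds the Debian version string for an unofficial
// build, e.g. 0.0.0+local20251118.<SHORT SHA>.
func debSnapshotVersion(date, shortSHA string) string {
	return fmt.Sprintf("0.0.0+local%s.%s", date, shortSHA)
}

// rpmSnapshotVersion builds the RPM version-release string for an
// unofficial build, e.g. 0.0.0^20251118.<SHORT SHA>-1.
func rpmSnapshotVersion(date, shortSHA string) string {
	return fmt.Sprintf("0.0.0^%s.%s-1", date, shortSHA)
}

func main() {
	// Assumption: the snapshot date is the build date in UTC.
	date := time.Now().UTC().Format("20060102")
	sha := "abc1234" // placeholder short SHA
	fmt.Println(debSnapshotVersion(date, sha))
	fmt.Println(rpmSnapshotVersion(date, sha))
}
```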

Embedding Versioning Information in an Executable Binary

All Edge binaries should have version and source code revision information stored in them. This can be done, for example, by using the -ldflags option at build time.

For example:

go build -ldflags "-X 'main.version=0.0.0' -X 'main.commit=12345678'"

This sets two string variables, version and commit, in package main. Ideally, they should be printed early on when the binary is invoked. To make it clear when a binary has been built in a bad way, they can be declared like this in main.go:

// in main.go

var version = "BAD-VERSION"
var commit = "BAD-COMMIT"

When printed, it will be clear from the logs that the binary has not been built in the intended way and should be replaced.
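Put together, a complete main.go following this pattern might look like the sketch below. The program name "example", the helper functions and the wording of the warning are illustrative assumptions; only the two variable declarations come from the text above.

```go
package main

import (
	"fmt"
	"os"
)

// Overridden at build time with:
//   go build -ldflags "-X 'main.version=...' -X 'main.commit=...'"
var version = "BAD-VERSION"
var commit = "BAD-COMMIT"

// versionLine formats the embedded version information for logging.
func versionLine(name string) string {
	return fmt.Sprintf("%s version %s (commit %s)", name, version, commit)
}

// badBuild reports whether either placeholder value is still in place,
// which means the binary was built without the intended -ldflags.
func badBuild() bool {
	return version == "BAD-VERSION" || commit == "BAD-COMMIT"
}

func main() {
	// Print version information early, before doing any real work.
	fmt.Fprintln(os.Stderr, versionLine("example"))
	if badBuild() {
		fmt.Fprintln(os.Stderr, "warning: built without version information, consider replacing this binary")
	}
}
```

Built with a plain go build, this prints the BAD- placeholders and the warning; built with the -ldflags shown earlier, it prints the injected values instead.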

Building In-repo vs. from Extracted Tarball

During development, one will typically build the binary by issuing make while standing in the root of a checked-out git repo. However, sometimes the binary will be built from an extracted tarball. In either case, versioning information should be embedded in the built binary. This means that the git sha and any tag information that is relevant must be included when making the tarball. For instance, it could be put into two files, VERSION and COMMIT, that are included in the tarball as it is created.

Example of multi-new observation

sequenceDiagram
    participant NATS Server
    participant microservice
    participant NATS KV-Store

    microservice->>+NATS Server: Subscribe(EVENT_NEW_QNAME)
    note over microservice,NATS Server: ...some time passes...
    NATS Server->>microservice: Publish(EVENT_NEW_QNAME, "new.example.com.")
    microservice->>+NATS KV-Store: Request("new.example.com.")
    note over microservice,NATS KV-Store: ...gets collected data about the domain...
    microservice->>microservice: Has "new.example.com." been observed as new by other resolvers recently?
    microservice->>NATS Server: Publish(OBSERVATION_MULTI_NEW, "new.example.com.")
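The check in the middle of the diagram, "has the name been observed as new by other resolvers recently?", could be sketched as follows. This is not the Core implementation: an in-memory map stands in for the NATS KV store, and the time window, the threshold and all names are assumptions chosen for illustration.

```go
package main

import "fmt"

// sighting records that one resolver reported a name as new.
type sighting struct {
	resolver string
	seenAt   int64 // Unix seconds
}

// multiNewTracker is a hypothetical stand-in for the KV-store lookup.
type multiNewTracker struct {
	window    int64 // how long a sighting counts as "recent", in seconds
	threshold int   // distinct resolvers needed for a multi-new observation
	seen      map[string][]sighting
}

func newTracker(window int64, threshold int) *multiNewTracker {
	return &multiNewTracker{window: window, threshold: threshold, seen: map[string][]sighting{}}
}

// observe records that resolver saw qname as new at time now. It reports
// whether at least threshold distinct resolvers have done so within the window.
func (m *multiNewTracker) observe(qname, resolver string, now int64) bool {
	var recent []sighting
	for _, s := range m.seen[qname] {
		// Drop stale sightings and any older sighting from the same resolver,
		// so that each resolver is counted at most once.
		if now-s.seenAt <= m.window && s.resolver != resolver {
			recent = append(recent, s)
		}
	}
	recent = append(recent, sighting{resolver: resolver, seenAt: now})
	m.seen[qname] = recent
	return len(recent) >= m.threshold
}

func main() {
	tr := newTracker(600, 2)
	fmt.Println(tr.observe("new.example.com.", "resolver-a", 1000))
	fmt.Println(tr.observe("new.example.com.", "resolver-b", 1200))
}
```

When observe returns true, the microservice in the diagram would publish OBSERVATION_MULTI_NEW for the name.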

Example of ramp observation

sequenceDiagram
    participant NATS Server
    participant Data Loader
    participant S3
    participant microservice

    NATS Server->>Data Loader: Publish(EVENT_NEW_AGGREGATE)
    Data Loader->>S3: Get(NEW_AGGREGATE)
    Data Loader->>Data Loader: Create histogram
    Data Loader->>S3: Post(Histogram)
    S3-->>microservice: Publish(EVENT_NEW_HIST)
    microservice->>microservice: For domain in hist, hasRamp?
    microservice->>NATS Server: Publish(OBSERVATION_RAMP, "evil.hula.se")
    note over microservice: ...Publish ramping domains...
    microservice->>NATS Server: Publish(OBSERVATION_RAMP, "z5.nu")

Communication pattern for "NOT well-known" domains

flowchart
    EDM-->|2, EVENT_NEW_QNAME, MQTT, mTLS, RFC 7515|bridge
    bridge-->|7, OBSERVATION_MULTI_NEW, MQTT, mTLS, RFC 7515|POP

    subgraph CORE
    bridge-->|3, EVENT_NEW_QNAME|NATS_Server
    NATS_Server-->|4, EVENT_NEW_QNAME|MULTINEW_microservice
    MULTINEW_microservice-->|5, OBSERVATION_MULTI_NEW|NATS_Server
    NATS_Server-->|6, OBSERVATION_MULTI_NEW|bridge
    end

    subgraph EDGE
    POP-->|8, RPZ XFR|RecResolver
    RecResolver-->|1, DNSTAP|EDM
    end

Communication pattern for "well-known" domains

There is some discrepancy between this image and the sequence diagram for the ramp observation, as the data loader component has not been taken into account here.

flowchart
    EDM-->|2, one-minute aggregate, HTTPS, mTLS, RFC 7515|aggrec
    bridge-->|7, OBSERVATION_RAMP, MQTT, mTLS, RFC 7515|POP

    subgraph CORE
    aggrec-->|3, Publish EVENT_NEW_AGGREGATE|NATS_Server
    NATS_Server-->|4, Publish EVENT_NEW_AGGREGATE|RAMP_microservice
    RAMP_microservice-->|5, Publish OBSERVATION_RAMP|NATS_Server
    NATS_Server-->|6, OBSERVATION_RAMP|bridge
    end

    subgraph EDGE
    POP-->|8, RPZ XFR|RecResolver
    RecResolver-->|1, DNSTAP|EDM
    end

Observation Encodings

Observations sent out from DNS TAPIR Core are packed in a 32-bit word with the following interpretations:

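Encoding |Observation    |Interpretation
---------------------------------------------------------------
1        |TODO           |TODO
2        |TODO           |TODO
4        |TODO           |TODO
8        |TODO           |TODO
16       |TODO           |TODO
32       |TODO           |TODO
64       |TODO           |TODO
128      |TODO           |TODO
256      |TODO           |TODO
512      |TODO           |TODO
1024     |FLAG_LOOPTEST  |Observation for testing use only.
others   |TODO           |TODO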
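Since the encodings are powers of two, they can presumably be combined in one word and tested individually with a bit mask. The sketch below rests on that assumption. Only FLAG_LOOPTEST is named in the table, so it is the only constant defined here; the Go identifiers are hypothetical.

```go
package main

import "fmt"

// FlagLooptest is the only encoding currently defined in the table (1024 = 1 << 10).
const FlagLooptest uint32 = 1024

// hasFlag reports whether the given encoding bit is set in the observation word.
func hasFlag(word, flag uint32) bool {
	return word&flag != 0
}

// setEncodings lists the individual encodings (powers of two) set in word.
func setEncodings(word uint32) []uint32 {
	var out []uint32
	for i := 0; i < 32; i++ {
		if bit := uint32(1) << i; word&bit != 0 {
			out = append(out, bit)
		}
	}
	return out
}

func main() {
	var word uint32 = 1024 // value shown in the Flags column of the looptest listings
	fmt.Println("looptest:", hasFlag(word, FlagLooptest))
	fmt.Println("encodings set:", setEncodings(word))
}
```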