Commit 0f1a48e7 authored by Tomas Krizek

doc: update README and documentation

Related: #19
parent 08a89c6a
Diffrepro
=========
Usage
-----
.. code-block:: console

   $ ./diffrepro.py "${DIR}"  # basic example
   $ ./diffrepro.py --help    # for more info

Description
-----------
Use of this tool is optional. It can be used to filter differences that either have an *unstable upstream* or are *not reproducible* (both terms are explained in the Diffsum section).
The tool can run queries in parallel (like orchestrator), or sequentially (slower,
but more predictable).
If it's used to test local resolvers, they should be restarted (and their cache
cleared) between the queries. This can be achieved by providing the path to a
restart script in the ``restart_script`` key of each server's section in the
config file. This script will be executed after each batch (parallel mode) or
query (sequential mode) for every server.
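
As a rough illustration, the per-server ``restart_script`` keys could be collected like this. The config text below is hypothetical: the server names, the script paths and the comma-separated ``names`` list are assumptions made for the example, so check ``respdiff.cfg`` for the real layout.

```python
import configparser

# Hypothetical config: server names and script paths are invented for this sketch.
EXAMPLE_CFG = """
[servers]
names = kresd, bind

[kresd]
restart_script = ./restart-kresd.sh

[bind]
restart_script = ./restart-bind.sh
"""


def restart_scripts(cfg_text):
    """Map each server name to its restart_script (None if the key is missing)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Assumption: server names are listed comma-separated in [servers] names.
    names = [name.strip() for name in parser["servers"]["names"].split(",")]
    return {name: parser[name].get("restart_script") for name in names}


scripts = restart_scripts(EXAMPLE_CFG)
print(scripts)
```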
The output is written to the JSON report and other tools automatically use this
data if present.
Notes
-----
* If you want the most reliable reproducibility, use the ``-s`` argument
  to force sequential, one-by-one processing of the differences. Combined with
  scripts that ensure the resolver is restarted (and its cache cleaned up), this
  finds the differences that can be reproduced most reliably. Beware that this
  option can be very slow for a large number of differences.

Diffsum
=======
Usage
-----
.. code-block:: console

   $ ./diffsum.py "${DIR}"  # basic example
   $ ./diffsum.py --help    # for more info

Description
-----------
Differences computed by ``msgdiff.py`` can be translated into a text report using
the ``diffsum.py`` tool, which computes a summary based on the results of that
comparison.
The report uses the following terms:

- *upstream unstable* represents queries where the servers other than
  ``target`` haven't received the same answer, thus the source (upstream) of
  these queries is considered unstable. These queries aren't counted further
  towards the other statistics.
- *not reproducible* appears in cases where the ``diffrepro.py`` tool was used
  to attempt to reproduce the measured differences. If the difference
  doesn't exactly match the one before, the query is ignored in further
  processing.
- *target disagreements* refers to cases where there's a difference
  between the answer from the ``target`` server and the other servers, and the
  other servers agree on the answer (there is no difference between them).
  These are the most interesting cases that are analysed further.

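
The first and last of these terms can be sketched as a small classification function. This is only an illustration: answers are treated as opaque values compared as a whole, whereas the real toolchain compares individual ``criteria``, and the *not reproducible* case (which needs ``diffrepro.py`` data) is not modelled.

```python
def classify(target_answer, other_answers):
    """Classify one query from the target's answer and the other servers' answers."""
    reference = other_answers[0]
    # The other servers disagree among themselves: upstream is unstable.
    if any(answer != reference for answer in other_answers[1:]):
        return "upstream unstable"
    # The others agree with each other, but the target differs from them.
    if target_answer != reference:
        return "target disagreement"
    return "no difference"


print(classify("NOERROR", ["NOERROR", "SERVFAIL"]))
print(classify("SERVFAIL", ["NOERROR", "NOERROR"]))
```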
The summary evaluates how many *target disagreements* there were in particular
*fields* (or ``criteria``), and what these mismatches looked like. It produces
both textual output and a machine-readable file (``*.json``).
Notes
-----
* If you adjust the ``field_weights``, just re-order the fields. Don't remove
  them, otherwise there'll be issues if such a field is ever encountered when
  producing the summary.
* In case you update respdiff and ``diffsum.py`` doesn't work, check the
  changelog. If a new field was added, adjust your config accordingly.
* Redirect *stdout* of the command to a text file in case you want to keep the
  textual report for future reference.
* If you want a comprehensive list of mismatched queries in the text report,
  use the ``-l 0`` argument.

Histogram
=========
Usage
-----
.. code-block:: console

   $ ./histogram.py "${DIR}"  # basic example
   $ ./histogram.py --help    # for more info

Description
-----------
``histogram.py`` uses the latency data of all answers from each server to plot
a graph that can be used to analyze the performance of the servers.
The type of graph this tool generates is the
`logarithmic percentile histogram <https://blog.powerdns.com/2017/11/02/dns-performance-metrics-the-logarithmic-percentile-histogram/>`_.
See the link for full explanation of this graph and the reasoning why it's
suitable to use for benchmarking.
Reading the graph
-----------------
.. image:: example_histogram.png
   :alt: (histogram plot: see example_histogram.png)

On the horizontal axis, you can read what percentage of queries were answered
*slower* than the corresponding response time on the vertical axis.
In other words, the 1.0 slowest percentile means that 99 % of queries were
answered faster than the response time of this percentile. Please note that
both axes are logarithmic.

Keep in mind you need a large sample of queries to get any meaningful
data for the lower slowest percentiles. The curve also typically flattens out
for the slowest queries due to a configured *timeout*.
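
The relation between the two axes can be illustrated with a few lines of Python. This is a minimal sketch on made-up latencies, not the code ``histogram.py`` uses; the function names are hypothetical.

```python
def percent_slower(latencies, threshold):
    """Percentage of queries answered slower than `threshold`."""
    slower = sum(1 for latency in latencies if latency > threshold)
    return 100.0 * slower / len(latencies)


def slowest_percentile_time(latencies, percent):
    """Response time exceeded by roughly `percent` % of the queries."""
    ordered = sorted(latencies, reverse=True)
    index = min(int(len(ordered) * percent / 100.0), len(ordered) - 1)
    return ordered[index]


# Hypothetical sample: 100 answers taking 1 ms, 2 ms, ..., 100 ms.
latencies_ms = list(range(1, 101))
print(slowest_percentile_time(latencies_ms, 1.0))
print(percent_slower(latencies_ms, 99))
```

With only 100 samples, every percentile below 1.0 is decided by a single query, which is why a large sample is needed for the left part of the graph.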
Notes
-----
* You can specify various image file extensions in the ``--output`` argument to
  generate different image formats.

LMDB Binary Format
==================
If the data was gathered using tools other than ``orchestrator.py``, e.g.
`dnsjit <https://github.com/DNS-OARC/dnsjit>`__, the following LMDB database
environment can be used to achieve compatibility with the rest of the respdiff
toolchain.
All numbers represented in binary format defined below use the **little endian** byte order.
Database ``queries``
--------------------
The ``queries`` database stores the wire format of queries that were sent
to the servers. Each query has a unique integer identifier, ``<QID>``.

+-----------+-----------------+-----------------------------+------------------+
| Key       | Key Type        | Value Description           | Value Type       |
+===========+=================+=============================+==================+
| ``<QID>`` | 4B unsigned int | DNS query sent to server(s) | DNS wire format  |
+-----------+-----------------+-----------------------------+------------------+

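
A ``<QID>`` key can be produced with the standard ``struct`` module. The helper names below are hypothetical, and writing the key into LMDB is left out of this sketch.

```python
import struct


def qid_key(qid):
    """Encode a query identifier as a 4-byte little-endian unsigned int."""
    return struct.pack("<I", qid)


def qid_from_key(key):
    """Decode a 4-byte little-endian key back into the query identifier."""
    (qid,) = struct.unpack("<I", key)
    return qid


print(qid_key(1))
```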
Database ``answers``
--------------------
The ``answers`` database stores the binary responses from the queried servers.
If there are multiple servers, their responses are stored under a single
``<QID>`` key. Multiple responses are stored within the value by simply
concatenating them in the binary format of ``response`` described below. Please
note the order of responses is significant and must match the order of the
server definitions in the ``meta`` database.

+-----------+-----------------+--------------------------------+--------------------------------------+
| Key       | Key Type        | Value Description              | Value Type                           |
+===========+=================+================================+======================================+
| ``<QID>`` | 4B unsigned int | DNS response(s) from server(s) | One or more ``response`` (see below) |
+-----------+-----------------+--------------------------------+--------------------------------------+

``response``
~~~~~~~~~~~~
``response`` represents a single DNS response from a server and has the
following binary format::

    .  0    1    2    3    4    5    6  ...
    +----+----+----+----+----+----+----\\----+
    |       time        | length  |   wire   |
    +----+----+----+----+----+----+----\\----+

+------------+--------------------------+---------------------------------------------------------+
| Label      | Type                     | Description                                             |
+============+==========================+=========================================================+
| ``time``   | 4B unsigned int          | time to receive the answer in microseconds;             |
|            |                          | ``4294967295`` (``FF FF FF FF``) means *timeout*        |
+------------+--------------------------+---------------------------------------------------------+
| ``length`` | 2B unsigned int          | byte-length of the DNS ``wire`` format message that may |
|            |                          | follow; (``length`` is always present, even in case of  |
|            |                          | *timeout*)                                              |
+------------+--------------------------+---------------------------------------------------------+
| ``wire``   | ``length`` B binary blob | DNS wire format of the message received from server     |
|            |                          | (``wire`` is present only if ``length`` isn't zero)     |
+------------+--------------------------+---------------------------------------------------------+

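
The ``response`` layout maps directly onto Python's ``struct`` module. The following is a minimal sketch of packing a single response and of splitting a concatenated ``answers`` value back into responses; the function names are hypothetical and no LMDB access is shown.

```python
import struct

TIMEOUT = 0xFFFFFFFF  # 4294967295 in the ``time`` field marks a timeout
HEADER = struct.Struct("<IH")  # time (4B) + length (2B), little endian


def pack_response(time_us, wire=b""):
    """Serialize one response; an empty ``wire`` gives length 0."""
    return HEADER.pack(time_us, len(wire)) + wire


def unpack_responses(value):
    """Split a concatenated ``answers`` value into (time, wire) tuples."""
    responses = []
    offset = 0
    while offset < len(value):
        time_us, length = HEADER.unpack_from(value, offset)
        offset += HEADER.size
        responses.append((time_us, value[offset:offset + length]))
        offset += length
    return responses


# Two servers: the first answered in 1500 us, the second timed out.
value = pack_response(1500, b"ab") + pack_response(TIMEOUT)
print(unpack_responses(value))
```

The responses come back in the order in which they were concatenated, so the caller is expected to match them to ``name0``, ``name1``, ... from the ``meta`` database.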
Database ``meta``
-----------------
``meta`` database stores additional information used for further processing of the data.

+----------------+----------+-------------------------------------------------------------------+-----------------+
| Key            | Key Type | Value Description                                                 | Value Type      |
+================+==========+===================================================================+=================+
| ``version``    | ASCII    | respdiff binary format version (current: ``2018-05-21``)          | ASCII           |
+----------------+----------+-------------------------------------------------------------------+-----------------+
| ``servers``    | ASCII    | number of servers responses are collected from                    | 4B unsigned int |
+----------------+----------+-------------------------------------------------------------------+-----------------+
| ``name0``      | ASCII    | name identifier of the first server (same as in ``respdiff.cfg``) | ASCII           |
+----------------+----------+-------------------------------------------------------------------+-----------------+
| ``name1``      | ASCII    | name identifier of the second server                              | ASCII           |
+----------------+----------+-------------------------------------------------------------------+-----------------+
| ``name<N>``    | ASCII    | name identifier of the ``N+1``-th server                          | ASCII           |
+----------------+----------+-------------------------------------------------------------------+-----------------+
| ``start_time`` | ASCII    | (*optional*) unix timestamp of the start of data collection       | 4B unsigned int |
+----------------+----------+-------------------------------------------------------------------+-----------------+
| ``end_time``   | ASCII    | (*optional*) unix timestamp of the end of data collection         | 4B unsigned int |
+----------------+----------+-------------------------------------------------------------------+-----------------+

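
A sketch of how the ``meta`` key/value pairs could be assembled before they are written to LMDB. The function name and the server names are hypothetical; the encodings follow the table above (ASCII keys, little-endian 4B unsigned integers).

```python
import struct


def meta_entries(server_names, version="2018-05-21", start_time=None, end_time=None):
    """Build the ``meta`` key/value pairs as bytes, ready to be stored."""
    entries = {
        b"version": version.encode("ascii"),
        b"servers": struct.pack("<I", len(server_names)),
    }
    # name0, name1, ... follow the order in which responses are concatenated.
    for index, name in enumerate(server_names):
        entries["name{}".format(index).encode("ascii")] = name.encode("ascii")
    # Both timestamps are optional and omitted when not provided.
    if start_time is not None:
        entries[b"start_time"] = struct.pack("<I", start_time)
    if end_time is not None:
        entries[b"end_time"] = struct.pack("<I", end_time)
    return entries


meta = meta_entries(["kresd", "bind"], start_time=1526860800)
print(sorted(meta))
```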
Msgdiff
=======
Usage
-----
.. code-block:: console

   $ ./msgdiff.py "${DIR}"  # basic example
   $ ./msgdiff.py --help    # for more info

Description
-----------
Gathered answers can be compared using the ``msgdiff.py`` tool,
which reads its configuration from the ``[diff]`` section of the config file.
The tool refers to one server as ``target`` (configured in the ``[diff]``
section) and to the remaining servers as ``others``. Msgdiff compares the
specified ``criteria`` and stores the results in the LMDB and the JSON datafile.
The created JSON datafile contains the information about the mismatches. This
datafile is necessary for other tools in the respdiff toolchain. The format of
this file is subject to change and backwards compatibility is not guaranteed.
Notes
-----
- Performance of ``msgdiff.py`` can be slightly boosted by compiling
  ``dnspython`` with Cython.
- If you change the ``criteria``, you can re-run ``msgdiff.py`` and the rest of
  the toolchain on the same LMDB without gathering the answers again.

Orchestrator
============
Usage
-----
.. code-block:: console

   $ ./orchestrator.py "${DIR}"  # basic example
   $ ./orchestrator.py --help    # for more info

Description
-----------
``orchestrator.py`` reads the query wire format from LMDB, sends it to the
configured DNS servers and stores the answer received from each server in LMDB.
Names of servers are specified in the ``names`` key in the ``[servers]`` section
of the config file. The IP address, port and protocol used for each server are
also read from the config file (see ``respdiff.cfg`` for an example).
Multiple queries might be sent in parallel, see ``jobs`` option in
``[sendrecv]`` section of the config file.
By default, each job (process/thread) sends another query as soon as the answer
to the previous one is received and processed. It is possible to add a random
or fixed delay between sending the queries by customizing the
``time_delay_min`` and ``time_delay_max`` options in ``[sendrecv]`` section of
the config file.
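
One way to picture the two delay options is the sketch below. It assumes the delay is drawn uniformly between the two bounds and that the values are in seconds; neither detail is stated above, and ``pick_delay`` is a hypothetical helper.

```python
import random


def pick_delay(time_delay_min, time_delay_max):
    """Delay before sending the next query (assumed uniform, in seconds)."""
    return random.uniform(time_delay_min, time_delay_max)


# Equal bounds give a fixed delay, different bounds a random one.
print(pick_delay(0.5, 0.5))
print(pick_delay(0.0, 0.1))
```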
The tool automatically aborts if it receives ``max_timeouts`` consecutive
timeouts from a single server. To suppress this behaviour, use the
``--ignore-timeout`` argument.
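
The abort condition amounts to a per-server counter that is reset by any successful answer. A minimal sketch, with a hypothetical ``TimeoutTracker`` class that is not part of respdiff:

```python
class TimeoutTracker:
    """Count consecutive timeouts per server."""

    def __init__(self, max_timeouts):
        self.max_timeouts = max_timeouts
        self.streaks = {}

    def record(self, server, timed_out):
        """Record one result; return True once the server reaches the limit."""
        if timed_out:
            self.streaks[server] = self.streaks.get(server, 0) + 1
        else:
            # Any answer breaks the streak of consecutive timeouts.
            self.streaks[server] = 0
        return self.streaks[server] >= self.max_timeouts


tracker = TimeoutTracker(max_timeouts=3)
outcomes = [True, True, False, True, True, True]
print([tracker.record("kresd", timed_out) for timed_out in outcomes])
```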
Qprep
=====
Usage
-----
.. code-block:: console

   $ ./qprep.py "${DIR}" < list_of_queries_in_text_format  # basic example
   $ ./qprep.py --help                                     # for more info

Description
-----------
The ``qprep.py`` tool reads a list of queries and stores their wire format in a
new LMDB environment specified on the command line.
Two input formats are accepted: text and PCAP.
The text format is a list of queries in the form ``<name> <RR type>``, read
from standard input, one query per line.
When generating the wire format from text, the tool hardcodes an EDNS buffer
size of 4096 B and sets the DO flag. Future versions might allow some query
customization.
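
To show what such a query looks like on the wire, here is a hand-rolled sketch using only the standard library. It is not how ``qprep.py`` is implemented; the message ID of zero, the RD flag and the small RR type table are assumptions made for the example.

```python
import struct

# Small subset of RR type codes, enough for the example.
RR_TYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "MX": 15, "TXT": 16, "AAAA": 28}


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    wire = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def build_query(line, msgid=0):
    """Turn one ``<name> <RR type>`` line into DNS wire format with EDNS."""
    name, rrtype = line.split()
    # Header: ID, flags (RD set, an assumption), QDCOUNT=1, AN=0, NS=0, ARCOUNT=1.
    header = struct.pack("!HHHHHH", msgid, 0x0100, 1, 0, 0, 1)
    # Question: name, type, class IN (1).
    question = encode_name(name) + struct.pack("!HH", RR_TYPES[rrtype.upper()], 1)
    # OPT pseudo-RR: root name, TYPE 41, CLASS carries the 4096 B buffer size,
    # TTL carries the flags (DO bit = 0x8000), RDLENGTH 0.
    opt = b"\x00" + struct.pack("!HHIH", 41, 4096, 0x00008000, 0)
    return header + question + opt


wire = build_query("example.com A")
print(wire.hex())
```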
The second accepted format is a PCAP file. The tool copies the wire format from
Ethernet frames containing IPv4/IPv6 packets with a UDP/TCP transport layer on
port 53 if the QR bit in the DNS header is not set. Packets on port 53 that
cannot be parsed as DNS packets are copied verbatim into the database.
Notes
-----
* There is no need to re-generate the LMDB generated by ``qprep.py`` unless the
  dataset changes. The generated LMDB environment directory can be copied and
  re-used (before executing ``orchestrator.py`` on it).

Testing data
------------
Feel free to use the following text query datasets:
* Top 10k DNS domains (A queries): https://gitlab.labs.nic.cz/knot/respdiff/snippets/238/raw
* 100k unique DNS queries: https://gitlab.labs.nic.cz/knot/respdiff/snippets/237/raw
Sumcmp
======
Usage
-----
.. code-block:: console

   $ ./sumcmp.py old.json new.json "${DIR}"  # basic example
   $ ./sumcmp.py --help                      # for more info

Description
-----------
The ``sumcmp.py`` tool compares two reports generated by ``diffsum.py``. Every
field in the new report is compared to the one in the old report and the
difference is calculated and displayed. The output of the ``sumcmp.py`` tool is
similar to ``diffsum.py``.
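
The comparison itself boils down to a per-field difference. The sketch below uses a hypothetical flat ``{field: count}`` mapping; the real JSON report has its own layout, which is not described here and may change between versions.

```python
def compare_counts(old_counts, new_counts):
    """Return {field: (old, new, change)} for every field in the new report."""
    changes = {}
    for field, new_value in new_counts.items():
        # Assumption: a field missing from the old report counts as zero.
        old_value = old_counts.get(field, 0)
        changes[field] = (old_value, new_value, new_value - old_value)
    return changes


# Hypothetical mismatch counts per field.
old = {"rcode": 12, "answer": 40}
new = {"rcode": 9, "answer": 45, "flags": 2}
for field, (before, after, change) in compare_counts(old, new).items():
    print("{}: {} -> {} ({:+d})".format(field, before, after, change))
```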