Commit 527d6f91 authored by Petr Špaček's avatar Petr Špaček

Respdiff second generation: rearchitecture, support for parallel processing

The original monolithic Respdif (one f in the name) by Jan Holusa
was rearchitected and split into separate tools which (when chained
together) do a very similar job, but much faster and more flexibly.

The second generation is conceptually a chain of independent tools:
1. generate queries in wire format
2. send pre-generated wire format to resolvers and gather answers
3. analyze answers

This split allows us to repeat steps on the same data as necessary,
e.g. to run the analysis with different parameters without re-querying
the resolvers.

This first version uses the filesystem to store queries and answers.

The tool "makedirs.py" reads a list of queries in the text format <name> <RR type>
and creates a directory structure with a subdirectory for each query. The file
"q.dns" in each subdirectory contains the query in DNS wire format.

The tool "orchestrator.py" then reads the stored wire format, sends it to
the resolvers, and stores the answer from each resolver in a separate file.

Directory structure for one query is:
00001/            - subdirectory name == query ID
00001/q.dns       - query in wire format
00001/bind.dns    - answer from BIND in wire format
00001/kresd.dns   - answer from kresd in wire format
00001/unbound.dns - answer from Unbound in wire format
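The same layout can be sketched with a few lines of standard-library Python; `store_query` is an illustrative helper (not part of the tools) and placeholder bytes stand in for real DNS wire data:

```python
import os
import tempfile

def store_query(base, qid, qwire, answers):
    """answers: dict of resolver name -> answer in wire format (bytes)"""
    qdir = os.path.join(base, '%05d' % qid)
    os.makedirs(qdir, exist_ok=True)
    # the query itself goes into q.dns ...
    with open(os.path.join(qdir, 'q.dns'), 'wb') as f:
        f.write(qwire)
    # ... and each resolver's answer into <name>.dns
    for name, wire in answers.items():
        with open(os.path.join(qdir, '%s.dns' % name), 'wb') as f:
            f.write(wire)
    return qdir

base = tempfile.mkdtemp()
qdir = store_query(base, 1, b'\x00query',
                   {'bind': b'...', 'kresd': b'...', 'unbound': b'...'})
print(sorted(os.listdir(qdir)))  # ['bind.dns', 'kresd.dns', 'q.dns', 'unbound.dns']
```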

The resulting files can be analyzed using the tool "msgdiff.py".
The tool refers to one resolver as the "target" and to the remaining
resolvers as the "others". Msgdiff compares specified fields in the
answers and computes statistics based on the comparison results.

Answers where the "others" do not agree with each other are simply counted
but not processed further. Answers where the "others" agree but the "target"
returned a different answer than all the "others" are counted separately,
with higher granularity, producing stats for each field in the DNS message
(rcode, flags, answer section, ...).
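A minimal sketch of this classification, with plain dicts standing in for parsed DNS messages (`diff` and `classify` are illustrative names, not the actual msgdiff API):

```python
# Stand-in for msgdiff's field comparison: return the set of fields
# where two answers disagree.
def diff(a, b, fields=('rcode', 'flags', 'answer')):
    return {f for f in fields if a.get(f) != b.get(f)}

def classify(answers, target):
    others = [v for k, v in answers.items() if k != target]
    # if the "others" disagree among themselves, only count the case
    if any(diff(a, b) for a in others for b in others):
        return 'others-disagree', None
    # "others" agree: report per-field diffs of the target against them
    fields = diff(answers[target], others[0])
    if fields:
        return 'target-differs', fields
    return 'match', None

answers = {
    'bind':    {'rcode': 'NOERROR', 'answer': 'A 1.2.3.4'},
    'unbound': {'rcode': 'NOERROR', 'answer': 'A 1.2.3.4'},
    'kresd':   {'rcode': 'SERVFAIL', 'answer': None},
}
status, fields = classify(answers, 'kresd')
print(status, sorted(fields))  # target-differs ['answer', 'rcode']
```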

This very first version lacks a proper user interface and values are
hardcoded in the Python scripts; see orchestrator.py.
# Overview
This set of benchmarks is used for measuring
[knot-resolver](https://www.knot-resolver.cz/). Information about the individual benchmarks
can be found in the corresponding folders.
## Benchmarks
* [Cache usage benchmark](https://gitlab.labs.nic.cz/knot/resolver-benchmarking/tree/master/cache_usage_benchmark) - Comparison of resolving speed depending on the cache usage.
* [Response differences](https://gitlab.labs.nic.cz/knot/resolver-benchmarking/tree/master/response_differences) - Comparison of different responses between Knot-resolver, Bind
and Unbound.
## Other
* [Scripts](https://gitlab.labs.nic.cz/knot/resolver-benchmarking/tree/master/scripts) - Folder with some other useful scripts. More information in the folder.
# Results
Results will be available on the Knot-resolver [web](https://www.knot-resolver.cz).
# HowTos
Examples of how to run the benchmarks are available on the Knot-resolver [wiki](https://gitlab.labs.nic.cz/knot/resolver/wikis/home).
/bind.keys
/named.conf.default-zones
/named.conf.etc
/named.conf.local
/rndc.key
// This is the primary configuration file for the BIND DNS server named.
//
// Please read /usr/share/doc/bind9/README.Debian.gz for information on the
// structure of BIND configuration files in Debian, *BEFORE* you customize
// this configuration file.
//
// If you are just adding zones, please do that in /etc/bind/named.conf.local
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";
include "/etc/bind/rndc.key";
controls {
inet * port 953
allow {127.0.0.1; 172.20.20.162; 172.20.6.125; } keys { "rndc-key"; };
};
acl resperf {
172.20.20.0/24;
172.20.6.0/24;
localhost;
localnets;
};
options {
directory "/var/cache/bind";
// If there is a firewall between you and nameservers you want
// to talk to, you may need to fix the firewall to allow multiple
// ports to talk. See http://www.kb.cert.org/vuls/id/800113
// If your ISP provided one or more IP addresses for stable
// nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing
// the all-0's placeholder.
// forwarders {
// 0.0.0.0;
// };
//========================================================================
// If BIND logs error messages about the root key being expired,
// you will need to update your keys. See https://www.isc.org/bind-keys
//========================================================================
dnssec-enable no;
dnssec-validation no;
auth-nxdomain no; # conform to RFC1035
//listen-on-v6 { any; };
recursion yes;
// allow queries from these IPs
allow-query { resperf; };
// allow queries from cache for these hosts
allow-query-cache { resperf; };
acache-enable yes;
max-acache-size 900M;
max-cache-size 900M;
//max-cache-ttl 30000; // limit the TTL of cached records (seconds)
statistics-file "/var/cache/bind/named.stats";
dump-file "/var/cache/bind/cache_dump.db";
listen-on port 50002 { 172.20.20.160; };
// listen-on port 53 { 127.0.0.1; };
};
cache.size = 900*MB
net.listen('172.20.20.160', 50003)
/unbound_control.key
/unbound_control.pem
/unbound_server.key
/unbound_server.pem
server:
# The following line will configure unbound to perform cryptographic
# DNSSEC validation using the root trust anchor.
auto-trust-anchor-file: "/var/lib/unbound/root.key"
ip-address: 172.20.20.160
port: 50001
msg-cache-size: 500m
msg-cache-slabs: 2
neg-cache-size: 500m
rrset-cache-size: 500m
rrset-cache-slabs: 2
key-cache-size: 500m
key-cache-slabs: 2
num-threads: 1
access-control: 127.0.0.0/8 allow
access-control: 172.20.20.160/32 allow
access-control: 172.20.20.162/32 allow
access-control: 172.20.6.125/32 allow
remote-control:
control-enable: yes
control-interface: 172.20.20.160
# unbound-control key file
control-key-file: "/etc/unbound/unbound_control.key"
# unbound-control cert file
control-cert-file: "/etc/unbound/unbound_control.pem"
# unbound server certificate file.
server-cert-file: "/etc/unbound/unbound_server.pem"
# unbound server key file.
server-key-file: "/etc/unbound/unbound_server.key"
# Ignore .pyc
*.pyc
#!/usr/bin/python3
import sys

import dns.message

# decode a DNS message in wire format and print its text representation
with open(sys.argv[1], 'rb') as msgfile:
    m = dns.message.from_wire(msgfile.read())
print(m)
import errno
import os
import sys

import makeq

i = 1
with open(sys.argv[1]) as qlist:
    for line in qlist:
        line = line.strip()
        if i == 8000000:
            break
        dirname = '%07d' % i
        qfilename = '%s/q.dns' % dirname
        try:
            os.mkdir(dirname)
        except OSError as ex:
            if ex.errno != errno.EEXIST:
                raise
        qry = makeq.qfromtext(line.split())
        if makeq.is_blacklisted(qry):
            continue
        with open(qfilename, 'wb') as qfile:
            qfile.write(qry.to_wire())
        i += 1
import argparse
import sys

import dns.message
import dns.name
import dns.rdataclass
import dns.rdatatype


def int_or_fromtext(value, fromtext):
    try:
        return int(value)
    except ValueError:
        return fromtext(value)


qparser = argparse.ArgumentParser(description='Generate DNS message with query')
qparser.add_argument('qname', type=lambda x: int_or_fromtext(x, dns.name.from_text))
qparser.add_argument('qclass', type=lambda x: int_or_fromtext(x, dns.rdataclass.from_text), nargs='?', default='IN')
qparser.add_argument('qtype', type=lambda x: int_or_fromtext(x, dns.rdatatype.from_text))


def qfromtext(*args):
    arglist = ['--'] + args[0]
    args = qparser.parse_args(arglist)
    return dns.message.make_query(args.qname, args.qtype, args.qclass, want_dnssec=True)


def qsfrompcap(pcapname):
    pass  # TODO: not implemented yet


def is_blacklisted(msg):
    if len(msg.question) >= 1:
        if msg.question[0].rdtype == dns.rdatatype.ANY:
            return True
    return False


def main():
    qry = qfromtext(sys.argv[1:])
    if is_blacklisted(qry):
        sys.exit('query blacklisted')
    # wire format is binary, so write to the underlying byte stream
    sys.stdout.buffer.write(qry.to_wire())


if __name__ == "__main__":
    main()
import collections
import cProfile
import json
from pprint import pprint
import sys

import dns.message

#m1 = dns.message.from_wire(open(sys.argv[1], 'rb').read())
#print('--- m1 ---')
#print(m1)
#print('--- m1 EOM ---')
#m2 = dns.message.from_wire(open(sys.argv[2], 'rb').read())
#print('--- m2 ---')
#print(m2)
#print('--- m2 EOM ---')


class DataMismatch(Exception):
    def __init__(self, exp_val, got_val):
        self.exp_val = exp_val
        self.got_val = got_val

    def __str__(self):
        return 'expected "{0.exp_val}" got "{0.got_val}"'.format(self)


def compare_val(exp_val, got_val):
    """ Compare values, throw exception if different. """
    if exp_val != got_val:
        raise DataMismatch(exp_val, got_val)
    return True


def compare_rrs(expected, got):
    """ Compare lists of RR sets, throw exception if different. """
    for rr in expected:
        if rr not in got:
            raise DataMismatch(expected, got)
    for rr in got:
        if rr not in expected:
            raise DataMismatch(expected, got)
    if len(expected) != len(got):
        raise DataMismatch(expected, got)
        #raise Exception("expected %s records but got %s records "
        #                "(a duplicate RR somewhere?)"
        #                % (len(expected), len(got)))
    return True


def match_part(exp_msg, got_msg, code):
    """ Compare scripted reply to given message using single criteria. """
    if code == 'opcode':
        return compare_val(exp_msg.opcode(), got_msg.opcode())
    elif code == 'qtype':
        if len(exp_msg.question) == 0:
            return True
        return compare_val(exp_msg.question[0].rdtype, got_msg.question[0].rdtype)
    elif code == 'qname':
        if len(exp_msg.question) == 0:
            return True
        qname = dns.name.from_text(got_msg.question[0].name.to_text().lower())
        return compare_val(exp_msg.question[0].name, qname)
    elif code == 'qcase':
        return compare_val(got_msg.question[0].name.labels, exp_msg.question[0].name.labels)
    #elif code == 'subdomain':
    #    if len(exp_msg.question) == 0:
    #        return True
    #    qname = dns.name.from_text(got_msg.question[0].name.to_text().lower())
    #    return compare_sub(exp_msg.question[0].name, qname)
    elif code == 'flags':
        return compare_val(dns.flags.to_text(exp_msg.flags), dns.flags.to_text(got_msg.flags))
    elif code == 'rcode':
        return compare_val(dns.rcode.to_text(exp_msg.rcode()), dns.rcode.to_text(got_msg.rcode()))
    elif code == 'question':
        return compare_rrs(exp_msg.question, got_msg.question)
    elif code == 'answer' or code == 'ttl':
        return compare_rrs(exp_msg.answer, got_msg.answer)
    elif code == 'authority':
        return compare_rrs(exp_msg.authority, got_msg.authority)
    elif code == 'additional':
        return compare_rrs(exp_msg.additional, got_msg.additional)
    elif code == 'edns':
        if got_msg.edns != exp_msg.edns:
            raise DataMismatch(exp_msg.edns, got_msg.edns)
        if got_msg.payload != exp_msg.payload:
            raise DataMismatch(exp_msg.payload, got_msg.payload)
    elif code == 'nsid':
        nsid_opt = None
        for opt in exp_msg.options:
            if opt.otype == dns.edns.NSID:
                nsid_opt = opt
                break
        # find matching NSID
        for opt in got_msg.options:
            if opt.otype == dns.edns.NSID:
                if not nsid_opt:
                    raise DataMismatch(None, opt.data)
                if opt == nsid_opt:
                    return True
                else:
                    raise DataMismatch(nsid_opt.data, opt.data)
        if nsid_opt:
            raise DataMismatch(nsid_opt.data, None)
    else:
        raise NotImplementedError('unknown match request "%s"' % code)


def match(expected, got, match_fields):
    """ Compare scripted reply to given message based on match criteria. """
    for code in match_fields:
        try:
            res = match_part(expected, got, code)
        except DataMismatch as ex:
            yield (code, ex)
import itertools
import multiprocessing
import multiprocessing.pool as pool
import os


def find_querydirs(workdir):
    #i = 0
    for root, dirs, files in os.walk(workdir):
        dirs.sort()
        if 'q.dns' not in files:
            continue
        #i += 1
        #if i == 10000:
        #    return
        #print('yield %s' % root)
        yield root


def read_answers(workdir):
    answers = {}
    for filename in os.listdir(workdir):
        if filename == 'q.dns':
            continue
        #if filename == 'bind.dns':
        #    continue
        if not filename.endswith('.dns'):
            continue
        name = filename[:-4]
        filename = os.path.join(workdir, filename)
        with open(filename, 'rb') as msgfile:
            msg = dns.message.from_wire(msgfile.read())
        answers[name] = msg
    return answers


def diff_pair(answers, criteria, name1, name2):
    """
    Returns: sequence of (field, DataMismatch())
    """
    yield from match(answers[name1], answers[name2], criteria)


def diff_pairs(answers, criteria, pairs):
    """
    Returns: dict(pair: diff as {'field': DataMismatch()})
    """
    #print('diff_pairs: %s %s %s' % (answers, pairs, criteria))
    result = {}
    for pair in pairs:
        diff = dict(diff_pair(answers, criteria, *pair))
        if diff:
            result[pair] = diff
    return result


def compare(target, workdir, criteria):
    #print('compare: %s %s %s' % (target, workdir, criteria))
    answers = read_answers(workdir)
    names = list(answers.keys())
    names.remove(target)
    names.append(target)  # must be last
    all_pairs = list(itertools.combinations(names, 2))
    target_pairs = list(itertools.filterfalse(lambda x: target not in x, all_pairs))
    # are there at least two other resolvers?
    other_pairs = list(itertools.filterfalse(lambda x: target in x, all_pairs))
    assert other_pairs  # TODO
    # do others agree on the answer?
    #from IPython.core.debugger import Tracer
    #Tracer()()
    others_agree = all(map(
        lambda names: not any(diff_pair(answers, criteria, *names)),
        other_pairs))
    if not others_agree:
        return (workdir, False, None)
    assert target_pairs  # TODO
    target_diffs = diff_pairs(answers, criteria, target_pairs)
    return (workdir, others_agree, target_diffs)
    #target_agree = not any(target_diffs.values())
    #if not target_agree:
    #    print('target:')
    #    pprint(target_diffs)
    #if not all([target_agree, others_agree]):
    #    write_txt(workdir, answers)
    #print('target agree %s, others agree %s' % (target_agree, others_agree))
    #for a, b in other_pairs:
    #    diff = match(answers[a], answers[b], criteria)
    #    print('diff %s ? %s: %s' % (a, b, diff))


def write_txt(workdir, answers):
    # target name goes first
    for name, answer in answers.items():
        path = os.path.join(workdir, '%s.txt' % name)
        with open(path, 'w') as txtfile:
            txtfile.write(str(answer))


def worker_init(criteria_arg, target_arg):
    global criteria
    global target
    global prof
    global i
    i = 0
    #prof = cProfile.Profile()
    #prof.enable()
    criteria = criteria_arg
    target = target_arg
    #print('criteria: %s target: %s' % (criteria, target))


def compare_wrapper(workdir):
    global criteria
    global target
    #global result
    global i
    #global prof
    result = compare(target, workdir, criteria)
    #i += 1
    #if i == 10000:
    #    prof.disable()
    #    prof.dump_stats('prof%s.prof' % multiprocessing.current_process().name)
    #prof.runctx('global result; result = compare(target, workdir, criteria)', globals(), locals(), 'prof%s.prof' % multiprocessing.current_process().name)
    return result


def process_results(diff_generator):
    stats = {
        'diff_n': 0,
        'target_only_diff_n': 0,
        'diff_field_c': collections.Counter()
    }
    for qid, others_agree, target_diff in diff_generator:
        #print(qid, others_agree, target_diff)
        if not others_agree:
            stats['diff_n'] += 1
            continue
        if target_diff:
            stats['target_only_diff_n'] += 1
            print('"%s": ' % qid)
            pprint(target_diff)
            print(',')
            diff_fields = list(target_diff.values()).pop().keys()
            stats['diff_field_c'].update(diff_fields)
    print('}')
    stats['diff_field_c'] = dict(stats['diff_field_c'])
    print('stats = ')
    pprint(stats)


target = 'kresd'
ccriteria = ['opcode', 'rcode', 'flags', 'question', 'qname', 'qtype', 'answer']  #'authority', 'additional', 'edns']
#ccriteria = ['opcode', 'rcode', 'flags', 'question', 'qname', 'qtype', 'answer', 'authority', 'additional', 'edns', 'nsid']

if False:
    dir_names = itertools.tee(find_querydirs(sys.argv[1]), 2)
    for d in dir_names:
        print(d)

workdirs = itertools.islice(find_querydirs(sys.argv[1]), 100000)

print('diffs = {')
serial = False
if serial:
    worker_init(ccriteria, target)
    process_results(map(compare_wrapper, workdirs))
else:
    with pool.Pool(
            processes=4,
            initializer=worker_init,
            initargs=(ccriteria, target)
            ) as p:
        process_results(p.imap_unordered(compare_wrapper, workdirs, chunksize=100))
import multiprocessing.pool as pool
import os

import sendrecv

timeout = 5
resolvers = [
    ('kresd', '127.0.0.1', 5353),
    ('unbound', '127.0.0.1', 53535),
    ('bind', '127.0.0.1', 53533)
]


# find query files
def find_querydirs(workdir):
    for root, dirs, files in os.walk(workdir):  # was hardcoded to '.', parameter was unused
        dirs.sort()
        if 'q.dns' not in files:
            continue
        #print('yield %s' % root)
        yield root


#selector.close()  # TODO

with pool.Pool(
        processes=8,
        initializer=sendrecv.worker_init,
        initargs=[resolvers, timeout]) as p:
    p.map(sendrecv.query_resolvers, find_querydirs('.'))
import os
import selectors
import socket

import dns.inet
import dns.message


def sock_init(resolvers):
    """
    resolvers: [(name, ipaddr, port)]
    returns (selector, [(name, socket, sendtoarg)])
    """
    sockets = []
    selector = selectors.DefaultSelector()
    for name, ipaddr, port in resolvers:
        af = dns.inet.af_for_address(ipaddr)
        if af == dns.inet.AF_INET:
            destination = (ipaddr, port)
        elif af == dns.inet.AF_INET6:
            destination = (ipaddr, port, 0, 0)
        else:
            raise NotImplementedError('AF')
        sock = socket.socket(af, socket.SOCK_DGRAM, 0)
        sock.setblocking(False)
        sockets.append((name, sock, destination))
        selector.register(sock, selectors.EVENT_READ, name)
    #print(sockets)
    return selector, sockets


def send_recv_parallel(what, selector, sockets, timeout):
    replies = []
    for _, sock, destination in sockets:
        sock.sendto(what, destination)
    # receive replies
    while len(replies) != len(sockets):
        events = selector.select()  #timeout=timeout) # BLEH! timeout shortening
        for key, _ in events:
            name = key.data
            sock = key.fileobj
            (wire, from_address) = sock.recvfrom(65535)
            assert len(wire) > 14
            replies.append((name, wire))
        # TIMEOUT !!!!
    return replies


def worker_init(resolvers, init_timeout):
    global selector
    global sockets
    global timeout
    timeout = init_timeout
    selector, sockets = sock_init(resolvers)


def query_resolvers(workdir):
    global selector
    global sockets
    global timeout
    qfilename = os.path.join(workdir, 'q.dns')
    #print(qfilename)
    with open(qfilename, 'rb') as qfile:
        qwire = qfile.read()
    replies = send_recv_parallel(qwire, selector, sockets, timeout)
    for answer in replies:
        afilename = os.path.join(workdir, "%s.dns" % answer[0])
        with open(afilename, 'wb') as afile:
            afile.write(answer[1])
    #print('%s DONE' % qfilename)
/.project
/.pydevproject
/.settings
# Ignore SWP files
*.swp
# Ignore .log files
*.log
# Result directory
/results/**
# Keep default config
!/config/respdif.cfg
# Overview and usage
## What it roughly does
This Python testing tool starts resolvers on servers according to the config files.
After that, it starts sending queries to the resolvers and comparing the responses.
Differing responses are stored in the result folder.
## What do you need
At the test machine:
* Python2.7
* Python package [dns](http://www.dnspython.org/)
* User named `kresdbench`
At the machine with resolvers:
* Bind, Unbound, PowerDNS and Knot-resolver libraries for compiling the resolvers from sources
* User named `kresdbench` in `sudo` group
## How to run tests
* The test can be run with the command `$python2.7 respdif -c config/config.cfg -i data/dataset`
It accepts these optional flags:
* `-s, --case_sensitive` - makes the comparison case-sensitive.
* `-d, --debug` - switches the log output level from info to debug.
* `--json` - creates JSON output.
* `-o, --compare_others` - also compare the other resolvers to each other, not just kresd to the others.
* `-b, --branch` - which branch of Knot Resolver should be used. This option has higher priority than the one from the config file.
## Input file
The input file can be in one of two formats:
* The first contains, on each line, a number and a server to be queried, separated
by a comma. For example `666,nic.cz`. An example input file is `top-1m.csv`.
* The second contains, on each line, a server to be queried and a query type, separated
by a tab. For example `nic.cz AAAA`.
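Both formats can be normalized with a short sketch like the following (the function name and the default query type `A` for the first format are assumptions, not part of the tool):

```python
def parse_line(line):
    """Normalize one input line to a (qname, qtype) pair."""
    line = line.strip()
    if ',' in line:                  # "666,nic.cz" - rank,domain
        _rank, qname = line.split(',', 1)
        return qname, 'A'            # assumed default type
    qname, qtype = line.split('\t')  # "nic.cz<TAB>AAAA"
    return qname, qtype

print(parse_line('666,nic.cz'))    # ('nic.cz', 'A')
print(parse_line('nic.cz\tAAAA'))  # ('nic.cz', 'AAAA')
```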
## Configuration files
### Test configuration
Contains a `general` section and one section for each resolver. In the `general` section
you can configure these parameters:
* `rdatatype` - list of types to be tested (MX, AAAA, A, NS, ...). Each query will be tested with each
type in the list.
* `rdataclass` - list of classes to be tested (IN, CH, ...). Each query will be tested with each
class in the list.
* `rdata_rdatatype_ignore` - for which rdatatypes the rdata section should be ignored in the comparison.
* `querries` - how many queries to send from the input file.
* `querry_timeout` - timeout for each query.
* `ttl_range` - indicates how lenient the TTL comparison will be.
* `compare_sections` - list of sections to compare (opcode, rcode, flags, answer, ...).
* `local_interface` - local interface name. Default is em0. It is necessary for preparing the resolvers on the server side.
* `run_under_docker` - run the program under Docker. If this parameter is set to yes, a different style of saving results is used.
Possible values are yes and no. Default value is no.
* `result` - results folder in which 'date' folders with results will be created.
* `email` - email address to send the result summary to. In case of differences, the readable output is also attached.
In the other sections, named knot, bind, unbound and pdns,
you can use these parameters:
* `port` - port on which the particular resolver is running.
* `ip` - IP address on which the particular resolver is running.
* `start_remotely` - flag which indicates that the server should be started on the given IP and port.
Possible values are yes and no. Default is no.
* `branch` - name of the Knot branch. This parameter can be used only in the `knot` section.
Example of configuration file: `config/respdif.cfg`
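A hypothetical fragment illustrating these parameters (section and key names follow this README; the exact syntax of the shipped `config/respdif.cfg` may differ):

```ini
[general]
rdatatype = A, AAAA, MX
rdataclass = IN
querries = 1000
querry_timeout = 5
compare_sections = opcode, rcode, flags, answer
result = results/
email = user@example.com

[knot]
ip = 127.0.0.1
port = 5353
start_remotely = no
branch = master
```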
### Resolver config
Configuration scripts of resolvers are located in the `resolvers_setup` folder.
## Outputs
Output from each test is stored in the `result` folder, in a per-test timestamp folder.
Each timestamp folder contains a log file and an output file. It is possible to change the names of these
files in the file `local_constants.py`. The log file shows just the result of each sent query
(OK [just debug mode] - servers return the same response, or NOK - response was not t