respdiff issues
https://gitlab.nic.cz/knot/respdiff/-/issues
2022-03-24T11:20:08+01:00

orchestrator has low performance
https://gitlab.nic.cz/knot/respdiff/-/issues/4
2022-03-24T11:20:08+01:00
Petr Špaček

Our tool `orchestrator` to send and receive packets is just a quick hack and does not scale well. We might consider replacing it with [Drool](https://www.dns-oarc.net/tools/drool) which is made specifically for performance.
Open questions:
- Can Drool be extended so it sends a single query to multiple endpoints at once?
- How can we store answers received by Drool?
- How do we map (1 query)->(n replies)?
- Can we extract answer processing into a separate tool which could read PCAP? That would allow us to use different tools for gathering answers as needed.

add --ignore-dnsviz-warnings to diffsum
https://gitlab.nic.cz/knot/respdiff/-/issues/29
2021-07-09T16:05:46+02:00
Petr Špaček

job_manager/resperf: issue SIGTERM and check for ASAN errors
https://gitlab.nic.cz/knot/respdiff/-/issues/28
2021-07-09T16:03:29+02:00
Tomas Krizek

Some ASAN issues only occur during shutdown of kresd. These cases should be tested for as well.

ability to compare authority section only for NODATA and NXDOMAIN answers
https://gitlab.nic.cz/knot/respdiff/-/issues/26
2021-07-09T16:01:01+02:00
Petr Špaček

I envision a new check which will compare the authority section only if the message is a NODATA or NXDOMAIN answer. These could be two separate checks sharing the code to compare the authority section.
Beware: the answer section can contain CNAMEs and DNAMEs plus their signatures, and the answer as a whole can still be classified as NODATA or NXDOMAIN.

Assignee: Štěpán Balážik

optimization using memoryview
https://gitlab.nic.cz/knot/respdiff/-/issues/23
2021-07-09T15:57:50+02:00
Petr Špaček

Maybe there is a chance to get a speedup in msgdiff by using Python `memoryview`?
(It might require changes in `dns.message.from_wire()` so this might be a long-term effort.)

cluster analysis of mismatches
https://gitlab.nic.cz/knot/respdiff/-/issues/13
2021-07-09T15:46:47+02:00
Petr Špaček

Keeping in mind that analysis is not going to be 100 % automated, we still need a tool to help with detecting clusters of related failures.
It would be valuable to have a tool which walks through the DNS tree for multiple queries and clusters queries according to features observed in the DNS tree.
Example
=======
Multiple sites do not reply correctly to query `site.example. DNSKEY` but at the same time these sites do not have a `site.example. DS` record in the parent zone. For these sites, even though they are not behaving correctly, the resolution algorithm can be modified not to ask for `DNSKEY` if `DS` does not exist. This would be an effective workaround which is still standard-compliant.
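The example above could be turned into a clustering key roughly as follows. This is only a sketch: the `Mismatch` record and its feature names (`dnskey_answered`, `ds_in_parent`) are hypothetical and not part of respdiff; a real tool would have to fill them in by walking the DNS tree.

```python
from collections import Counter
from typing import NamedTuple


class Mismatch(NamedTuple):
    """Hypothetical record describing one mismatching query."""
    qname: str
    dnskey_answered: bool  # did the site answer the DNSKEY query?
    ds_in_parent: bool     # does the parent zone contain a DS record?


def cluster_label(m):
    """Map the observed features to a human-readable cluster name."""
    dnskey = "replies for DNSKEY" if m.dnskey_answered else "does not reply for DNSKEY"
    ds = "has DS in parent" if m.ds_in_parent else "does not have DS in parent"
    return "{}, {}".format(dnskey, ds)


def cluster_sizes(mismatches):
    """Count mismatches per cluster, largest cluster first."""
    return Counter(cluster_label(m) for m in mismatches).most_common()
```

Sorting by cluster size gives a first rough ordering for prioritization.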
There are multiple issues like this. To prioritize work it would be incredibly valuable to cluster detected mismatches, e.g. into cluster "does not reply for DNSKEY, does not have DS in parent". Then we can compare sizes of clusters and decide what to do first, what can be postponed, and what should be ignored completely because it is totally protocol non-compliant.

support blacklisting in results
https://gitlab.nic.cz/knot/respdiff/-/issues/10
2021-07-09T15:34:57+02:00
Petr Špaček

Some queries/domains might be broken on the remote end, which should be detected when resolver-benchmarking#19 is implemented.
Once we have information that a particular domain is broken on the remote end, we should generate a blacklist so the domain does not show up again in subsequent analysis.
Example:
```
== Global statistics
queries              100000
answers              100000    100.00 % of queries
blacklisted answers     100      0.10 % of answers (analyzing 99.90 % of answers)
others agree          99000     99.10 % of analyzed answers (ignoring 0.90 %)
target disagrees        500      0.51 % of matching answers from others
```
Detailed reports should not include blacklisted domains.
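A minimal sketch of how name-based filtering might look, using only the standard library. The function names and the `subtree` switch are assumptions made for illustration; respdiff does not define them. It covers matching by exact domain name and by domain sub-tree, not by name servers.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return tuple(name.rstrip(".").lower().split("."))


def is_blacklisted(qname, blacklist, subtree=True):
    """Return True if qname equals a blacklisted name or, when subtree=True, lies below one."""
    q = labels(qname)
    for entry in blacklist:
        b = labels(entry)
        if q == b:
            return True
        # sub-tree match: the blacklisted name is a proper suffix on label boundaries
        if subtree and len(q) > len(b) and q[-len(b):] == b:
            return True
    return False


def filter_answers(qnames, blacklist, subtree=True):
    """Drop blacklisted query names before detailed reports are generated."""
    return [q for q in qnames if not is_blacklisted(q, blacklist, subtree)]
```

Comparing whole labels rather than raw string suffixes avoids accidentally matching `notbroken.example.` against `broken.example.`.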
Open question is how we should do the blacklisting:
- [ ] by domain name
- [ ] by domain sub-tree
- [ ] by name servers (this might require additional DNS queries just to find out NS set)

sumcmp: UI changes
https://gitlab.nic.cz/knot/respdiff/-/issues/1
2020-11-02T15:27:08+01:00
Tomas Krizek

- https://gitlab.labs.nic.cz/knot/resolver-benchmarking/merge_requests/43#note_77023
- https://gitlab.labs.nic.cz/knot/resolver-benchmarking/merge_requests/43#note_76914
Discussion started in resolver-benchmarking!43, prototype in resolver-benchmarking!50.

msgdiff seems to fail when running lmdb 0.9.26
https://gitlab.nic.cz/knot/respdiff/-/issues/36
2020-08-27T14:29:27+02:00
Simon Vikström

msgdiff fails with the following error:
```
Traceback (most recent call last):
  File "./msgdiff.py", line 137, in <module>
    main()
  File "./msgdiff.py", line 118, in main
    report = prepare_report(lmdb_, servers)
  File "./msgdiff.py", line 81, in prepare_report
    qdb = lmdb_.open_db(LMDB.QUERIES)
  File "/home/simon/src/git/one.com/respdiff/respdiff/database.py", line 94, in open_db
    db = self.env.open_db(key=dbname, create=create, **LMDB.DB_OPEN_DEFAULTS)
lmdb.InvalidParameterError: mdb_txn_begin: Invalid argument
```
Downgrading lmdb seems to solve the issue.

ci: histogram.py should be parallelized/optimized
https://gitlab.nic.cz/knot/respdiff/-/issues/34
2019-11-06T16:36:04+01:00
Tomas Krizek

Currently, our respdiff jobs in Knot Resolver's CI spend around 20-40% of total execution time generating histograms. The script runs for a few minutes, utilizing just a single thread.

histogram: graph per RCODE
https://gitlab.nic.cz/knot/respdiff/-/issues/32
2019-07-09T17:11:56+02:00
Petr Špaček

- [x] Histogram tool should be extended with the ability to generate a graph per RCODE.
For example, it might not be very useful to compare latency of SERVFAIL or REFUSED answers, but it is very relevant for NOERROR and NXDOMAIN answers.
- [ ] Also, it might be necessary to limit graphing to answers which matched (to ensure RCODE value is reliable).

Assignee: Ivana Krumlova

analyze mismatch causes
https://gitlab.nic.cz/knot/respdiff/-/issues/9
2018-12-18T18:06:36+01:00
Petr Špaček

Even reproducible mismatches can have multiple causes:
- a bug in code under test
- non-compliant behavior on remote end (e.g. authoritative servers)
- a network problem (like dumb firewall) on the way between test machine and remote end
Categorizing mismatches into these categories takes a lot of time and is tedious. We should use some automation for basic categorization. E.g. queries which are answered correctly according to [DNSViz](http://dnsviz.net/) should be reported separately.
We should store results from DNSViz and other tools for further analysis, like cluster analysis by detected problem. (Please note that DNSViz is just an example.)
This approach will certainly have multiple issues:
- [ ] Not all queries are suitable for analysis by DNSViz. This includes queries which test local resolver behavior but should not cause further communication with authoritative servers. E.g. queries with header bit `RD=0`.
- [ ] It does not discover firewall/network problems.
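
For the first point, queries with `RD=0` can be recognized directly from the wire format before anything is handed to an external analyzer. A sketch using only the standard library; the function name is made up for this example, and it assumes the query is available as raw DNS wire data:

```python
import struct

# Recursion Desired bit within the 16-bit flags word of the DNS header
# (RFC 1035, section 4.1.1)
RD_MASK = 0x0100


def recursion_desired(wire):
    """Return True if the RD bit is set in a DNS message given as bytes."""
    if len(wire) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    # the header starts with the message ID followed by the flags word
    _msgid, flags = struct.unpack("!HH", wire[:4])
    return bool(flags & RD_MASK)
```

Queries for which this returns `False` could then be skipped or reported separately instead of being sent for external analysis.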
The design needs to take these into account.

Assignee: Tomas Krizek
Due date: 2018-12-31

sanity check configuration
https://gitlab.nic.cz/knot/respdiff/-/issues/7
2018-10-30T17:38:16+01:00
Petr Špaček

- [x] Option `[diff] target =` must contain a name defined in the `[servers]` section. It is an error if `target` points to a non-existing server.
- [x] Option `[servers]/[diff] ignore = ` must not be equal to `[diff] target`

Assignee: Tomas Krizek

respdiff optimization: test recursive and forwarding modes at once
https://gitlab.nic.cz/knot/respdiff/-/issues/2
2018-10-30T17:36:37+01:00
Petr Špaček

It would be useful to have the ability to run `orchestrator` from respdiff once against servers in different modes and then analyze the data twice using different settings.
Example with resolvers:
- Unbound
- BIND
- kresd in recursive mode
- kresd in forwarding mode (e.g. to Unbound)
The orchestrator should make queries to all 4 resolvers simultaneously and store results in DB. Then we need to do diffs for tuples (Unbound, BIND, kresd in recursive mode) and (Unbound, BIND, kresd in forwarding mode) and print respective results separately.
The slowest operation is `orchestrator` so it would be good to find a way to do a single orchestrator run and then analyze results separately.
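
The idea of gathering once and analyzing several times can be sketched as follows. The data layout (a dict mapping query ID to per-resolver answers), the resolver names and the sample values are all invented for this example and are not the respdiff database format; the comparison is reduced to plain equality.

```python
def summarize(answers, others, target):
    """Count queries where `others` agree and where `target` disagrees with them."""
    others_agree = 0
    target_disagrees = 0
    for per_resolver in answers.values():
        reference = per_resolver[others[0]]
        if all(per_resolver[name] == reference for name in others):
            others_agree += 1
            if per_resolver[target] != reference:
                target_disagrees += 1
    return {"others_agree": others_agree, "target_disagrees": target_disagrees}


# One data set, as if gathered by a single run against all four resolvers
# (made-up sample data)
answers = {
    1: {"unbound": "NOERROR", "bind": "NOERROR", "kresd-rec": "NOERROR", "kresd-fwd": "SERVFAIL"},
    2: {"unbound": "NXDOMAIN", "bind": "NXDOMAIN", "kresd-rec": "NXDOMAIN", "kresd-fwd": "NXDOMAIN"},
    3: {"unbound": "NOERROR", "bind": "SERVFAIL", "kresd-rec": "NOERROR", "kresd-fwd": "NOERROR"},
}

# Two analyses over the same data, differing only in the chosen target
recursive = summarize(answers, others=["unbound", "bind"], target="kresd-rec")
forwarding = summarize(answers, others=["unbound", "bind"], target="kresd-fwd")
```

The expensive gathering step happens once; only the cheap selection of `others` and `target` changes between the two analyses.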
Beware, copying LMDB between images and so on might require special steps to preserve sparseness of the database files.

tool to determine measurement error
https://gitlab.nic.cz/knot/respdiff/-/issues/15
2018-10-30T17:29:16+01:00
Petr Špaček

Related to: resolver-benchmarking#22
Values in various diffsum report fields have different "stability" and it is hard to see with the naked eye if two consecutive reports indicate a statistically significant change or not.
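
One possible way to quantify this "stability" is the spread of each field across repeated runs of the same setup. A sketch, assuming the reports are already available as plain dicts of field name to count (a hypothetical format) and picking the sample standard deviation as the measure, which is just one of several reasonable choices:

```python
import statistics


def field_stability(reports):
    """Return {field: (mean, sample standard deviation)} across repeated reports."""
    if len(reports) < 2:
        raise ValueError("at least two reports are needed to estimate spread")
    fields = set()
    for report in reports:
        fields.update(report)
    result = {}
    for field in sorted(fields):
        # a field missing from a report is treated as zero mismatches
        values = [report.get(field, 0) for report in reports]
        result[field] = (statistics.mean(values), statistics.stdev(values))
    return result
```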
We need a tool which will determine "stability" for each report field, and these numbers can be then used as input for resolver-benchmarking#22.

Assignee: Tomas Krizek

sanity checks during data gathering
https://gitlab.nic.cz/knot/respdiff/-/issues/8
2018-10-30T17:27:00+01:00
Petr Špaček

Right now it is relatively easy to make a mistake in respdiff/firewall/daemon configuration and find out about the mistake only later during analysis, which is too late because the time and resources were already wasted.
Some lightweight sanity checks should be done during data gathering so misconfigurations can be detected early and users do not waste time and computing power unnecessarily.
Ideas:
- [x] check that X (or X %) last answers in row are not timeouts
- [ ] check that X (or X %) of randomly selected answers exhibit the following properties:
  - other resolvers agree in > 90 % of cases
  - target agrees in > 90 % of cases
- [ ] compare results from random sampling and last X answers - there should not be significant difference, especially not a drop in match rate

Assignee: Tomas Krizek

automatic mismatch verification/reproducibility testing
https://gitlab.nic.cz/knot/respdiff/-/issues/6
2018-10-24T18:24:24+02:00
Petr Špaček

Some mismatches might be transient (e.g. a random packet loss) and others might be caused by a bug. A tool which will attempt to verify reproducibility of a particular mismatch would be very useful. Proof-of-concept is called `diffrepro` but it needs a lot of work to make it easy to use.
Rough ideas:
- [x] Provide hooks allowing user to put in custom scripts to flush cache or restart daemon. This should allow us to determine if the mismatch was caused by some bad state in cache or if it is reproducible with empty cache.
- [x] Dump state of cache before flush for further inspection
- [x] Output reproducible and irreproducible failures for further processing, e.g. for statistical evaluation that domain example.com. is suffering long-term instability and thus can be excluded from comparison using live data.
- [x] Capture verbose logs from reproduction attempts so reproducible issues can be investigated right away.

Assignee: Tomas Krizek
Due date: 2018-09-30

diff for reports with support for thresholds
https://gitlab.nic.cz/knot/respdiff/-/issues/12
2018-08-29T13:48:59+02:00
Petr Špaček

Once we have machine-readable output (resolver-benchmarking#12), the next step is to write a semantic diff for reports (output from `diffsum`).
It should have the ability to display the diff and also to evaluate the diff according to configured thresholds/conditions. Values should support specification as % of (answers agreed by other resolvers)/(of mismatches)/absolute value (including zero).
Thresholds should include:
- [ ] number of mismatches for new field: e.g. number of mismatches in field `opcode` which was not present at all in the old data is allowed to have at most value X
- [ ] number of mismatches for new pair of mismatching values: e.g. `rcode` field in the new report contains mismatch pair `(REFUSED,NOTIMP)` which was previously not present:
- [ ] increase in previously recorded mismatches can be at most X
- [ ] maximum number of new domain names referenced in the report is X
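
A rough sketch of how the first and third thresholds might be evaluated, assuming the machine-readable reports boil down to a mapping of field name to mismatch count. The function, its parameters and the returned messages are hypothetical; the counts in the usage lines are the per-field numbers from the old and new example reports in this issue.

```python
def check_thresholds(old, new, max_new_field, max_increase):
    """Return a list of human-readable threshold violations."""
    violations = []
    for field, count in sorted(new.items()):
        if field not in old:
            # field was not present at all in the old report
            if count > max_new_field:
                violations.append(
                    "new field {}: {} > {}".format(field, count, max_new_field))
        else:
            increase = count - old[field]
            if increase > max_increase:
                violations.append(
                    "field {}: +{} > +{}".format(field, increase, max_increase))
    return violations


old = {"answerrrsigs": 1611, "rcode": 607, "answertypes": 160, "flags": 31}
new = {"answerrrsigs": 1611, "rcode": 607, "answertypes": 160, "opcode": 90, "flags": 32}
violations = check_thresholds(old, new, max_new_field=0, max_increase=5)
```

With these made-up limits only the new `opcode` field is reported; the increase of one mismatch in `flags` stays under the limit.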
Example
========
Old report
----------
```
== Global statistics
queries           907607
answers           907607    100.00 % of queries
others agree      901240     99.30 % of answers (ignoring 0.70 % of answers)
target disagrees    2409      0.27 % of matching answers from others
== Field           count    % of mismatches
answerrrsigs        1611     67 %
rcode                607     25 %
answertypes          160      7 %
flags                 31      1 %
[...]
== Field "flags" mismatch ('QR RD RA AD', 'QR RD RA') query details
nofreezingmac.click. DS    1 mismatches
[...]
```
New report
----------
```
== Global statistics
queries           907607
answers           907607    100.00 % of queries
others agree      901240     99.30 % of answers (ignoring 0.70 % of answers)
target disagrees    2500      0.28 % of matching answers from others
== Field           count    % of mismatches
answerrrsigs        1611     67 %
rcode                607     25 %
answertypes          160      7 %
opcode                90      4 %
flags                 32      1 %
[...]
== Field "flags" mismatch ('QR RD RA AD', 'QR RD RA') query details
nofreezingmac.click. DS    2 mismatches
[...]
```
Diff
----
```
== Global statistics
queries           907607
answers           907607    100.00 % of queries
others agree      901240     99.30 % of answers (ignoring 0.70 % of answers)
target disagrees    2500      0.28 % of matching answers from others    +91, +0.01 %
== Field           count    % of mismatches
answerrrsigs        1611     67 %
rcode                607     25 %
answertypes          160      7 %
opcode                90      4 %    +90, +4 %
flags                 32      1 %    +1, +0 %
[...]
== Field "flags" mismatch ('QR RD RA AD', 'QR RD RA') query details
nofreezingmac.click. DS    2 mismatches    +1
[...]
== New domains with mismatching answers (1 total)
now.invalid.opcode.example.    +90
```

Assignee: Tomas Krizek

connection refused should terminate query collection
https://gitlab.nic.cz/knot/respdiff/-/issues/22
2018-07-27T17:06:52+02:00
Petr Špaček

If the resolver refuses a TCP connection, orchestrator should exit.
```
ConnectionRefusedError: [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 103, in worker
    initializer(*initargs)
  File "/home/pspacek/respdiff/respdiff/sendrecv.py", line 81, in worker_init
    worker_reinit()
  File "/home/pspacek/respdiff/respdiff/sendrecv.py", line 85, in worker_reinit
    selector, sockets = sock_init() # type: Tuple[Selector, ResolverSockets]
  File "/home/pspacek/respdiff/respdiff/sendrecv.py", line 186, in sock_init
    sock.connect(destination)
  File "/usr/lib/python3.5/ssl.py", line 1019, in connect
    self._real_connect(addr, False)
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/ssl.py", line 1006, in _real_connect
    socket.connect(self, addr)
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 103, in worker
    initializer(*initargs)
  File "/home/pspacek/respdiff/respdiff/sendrecv.py", line 81, in worker_init
    worker_reinit()
```

Assignee: Tomas Krizek

require minimal Python version
https://gitlab.nic.cz/knot/respdiff/-/issues/24
2018-07-24T17:54:09+02:00
Petr Špaček

It seems that Python 3.5 is now the minimal required version.
Please add this to requirements.txt if it is possible - I could not find appropriate documentation.
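
As far as I can tell, `requirements.txt` only lists packages and has no field for the interpreter version itself; with setuptools the usual place is `python_requires=">=3.5"` in `setup.py`. Independently of packaging, a script can check the version at startup. A minimal sketch (the function name is made up, and the `version_info` parameter exists only to make the check testable):

```python
import sys

MIN_PYTHON = (3, 5)


def check_python_version(version_info=None, minimum=MIN_PYTHON):
    """Raise RuntimeError if the interpreter is older than `minimum`."""
    current = tuple((version_info or sys.version_info)[:2])
    if current < minimum:
        raise RuntimeError(
            "Python {}.{} or newer is required, found {}.{}".format(
                *(minimum + current)))


# call early in the script, before anything version-specific is imported
check_python_version()
```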