# Related projects and goals

Semi-random notes and ideas

## Related projects

### Old DSC

* Data collected into predefined XML-stored tables ("1D" and "2D" data)
* Old-gen web frontend

### Hedgehog

* By Sinodun for ICANN, visualising DSC data
* https://github.com/dns-stats/hedgehog

### DscNg

* Beda and CZ.NIC
* http://www.dscng.cz/
* DSC data in SQL database
* Interactive web frontend
* Not developed since 2013

### PacketQ

* Packet query language (SQL-like)
* Web server with basic result visualisation
* Source: https://github.com/dotse/PacketQ
* No longer developed (2014)

### Wireshark (tshark) - PDML

* PDML = XML with Wireshark-like details
* `tshark -r ../data-akuma/akuma.20150106.145000.018146 -T pdml` takes 20 s for 74k packets.
* With larger inputs: about 3500 packets/s.

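
Since PDML output is bulky XML, a streaming parser is one way to pull out a few fields without holding the whole document in memory. The following is a minimal sketch, assuming the usual PDML layout (`<packet>` elements containing `<proto>` and nested `<field name="..." show="...">` elements) and the Wireshark field name `dns.qry.name`. The function name `count_query_names` and the sample document are hypothetical and hand-written for illustration, not real tshark output.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_query_names(pdml_stream):
    """Count DNS query names in a PDML stream, one packet at a time."""
    counts = Counter()
    # iterparse yields each element once its closing tag has been read.
    for _event, elem in ET.iterparse(pdml_stream, events=("end",)):
        if elem.tag != "packet":
            continue
        # Fields may be nested, so search all descendants of the packet.
        for field in elem.iter("field"):
            if field.get("name") == "dns.qry.name":
                counts[field.get("show")] += 1
        elem.clear()  # free the packet's children once it is processed
    return counts


# Tiny hand-written sample in the assumed layout (not real tshark output).
SAMPLE_PDML = b"""<?xml version="1.0"?>
<pdml>
  <packet>
    <proto name="dns">
      <field name="dns.qry.name" show="example.cz"/>
    </proto>
  </packet>
  <packet>
    <proto name="dns">
      <field name="dns.qry.name" show="example.cz"/>
    </proto>
  </packet>
  <packet>
    <proto name="dns">
      <field name="dns.qry.name" show="nic.cz"/>
    </proto>
  </packet>
</pdml>
"""

if __name__ == "__main__":
    print(count_query_names(io.BytesIO(SAMPLE_PDML)).most_common())
```

The same function should also accept `sys.stdin.buffer`, so tshark output could be piped straight into it. Note that responses repeat the question section, so both queries and responses would be counted unless filtered.
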
## Targets

### ICANN collection and storage

ICANN requirements on storage and resources

### DSC replacement

E.g. CZ.NIC admins, maybe CZERT? (What do they want?)

### ENTRADA-like storage and visualisation

PacketQ / Wireshark / Hedgehog / DscNg

### Statistical predictions and features

Alexandra, more research - what would be useful?

## Classification ideas

### DNS clients (and subnets)

By behaviour: recursive server, private recursive server, spammer, DNS cache, botnet member, misconfigured, ...

### Queries (packet pairs)

By manually defined type (by flags): successful signed response, unsigned MX response, ...

By learned typical classes: class 1 (most short unsigned messages), class 2 (signed med-long messages), ...

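
The manual, flag-based variant could be sketched as a short list of rules where the first match wins. This is only an illustration: the `QueryPair` record, its field names, and the exact rule conditions are assumptions made for the example, not an agreed schema.

```python
from dataclasses import dataclass


@dataclass
class QueryPair:
    """Hypothetical summary of one query/response pair."""
    qtype: str       # query type, e.g. "A" or "MX"
    rcode: str       # response code, e.g. "NOERROR" or "NXDOMAIN"
    has_rrsig: bool  # True if the response carries at least one RRSIG


def classify_pair(pair):
    """Return a hand-written class label; the first matching rule wins."""
    if pair.rcode == "NOERROR" and pair.has_rrsig:
        return "successful signed response"
    if pair.qtype == "MX" and not pair.has_rrsig:
        return "unsigned MX response"
    return "other"


if __name__ == "__main__":
    print(classify_pair(QueryPair("A", "NOERROR", True)))
    print(classify_pair(QueryPair("MX", "NOERROR", False)))
```

Because the rules are ordered, overlapping classes need an explicit priority. The learned classes would replace this function with a model trained on features such as message size and signing.
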
### Domains (and subdomains)

By popularity, access patterns, letter-frequency typicality (pronounceability), resolved address: stable popular, stable unknown, new typical, scam domains, technical/utility domains, ...

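
One possible way to approximate letter-frequency typicality is a character bigram model trained on known names, scoring a new name by its mean log-probability per bigram. The sketch below makes several assumptions: add-one smoothing, an arbitrary `alphabet_size` of 40, `^`/`$` as boundary markers, and a tiny made-up training list. A real model would need a large list of typical names.

```python
import math
from collections import Counter


def train_bigrams(names):
    """Count character bigrams, using ^ and $ as boundary markers."""
    pair_counts = Counter()
    first_counts = Counter()
    for name in names:
        padded = "^" + name.lower() + "$"
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts


def typicality(name, model, alphabet_size=40):
    """Mean log-probability per bigram (add-one smoothed); higher is more typical."""
    pair_counts, first_counts = model
    padded = "^" + name.lower() + "$"
    pairs = list(zip(padded, padded[1:]))
    total = 0.0
    for a, b in pairs:
        p = (pair_counts[(a, b)] + 1) / (first_counts[a] + alphabet_size)
        total += math.log(p)
    return total / len(pairs)


if __name__ == "__main__":
    # Made-up training list, far too small for real use.
    model = train_bigrams(
        ["seznam", "google", "idnes", "novinky", "wikipedia", "facebook"]
    )
    for candidate in ["novinka", "xkqzjvwp"]:
        print(candidate, round(typicality(candidate, model), 3))
```

Dividing by the number of bigrams keeps scores comparable across names of different lengths. The score alone says nothing about popularity or access patterns, which would be separate features.
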
### Time windows

By behaviour of various statistics (individual and together): normal, high, low, attack, weird, downtime, ...
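
For a single statistic, a first approximation of the normal / high / low / downtime labels could be a z-score against the series itself. This is a rough sketch under stated assumptions: the function name, the threshold of 2.0, and the rule that a zero count means downtime are all illustrative choices. Labels such as "attack" or "weird" would need several statistics combined.

```python
import statistics


def label_windows(counts, threshold=2.0):
    """Label each window by how far its count lies from the series mean."""
    mean = statistics.mean(counts)
    stdev = statistics.pstdev(counts)
    labels = []
    for value in counts:
        if value == 0:
            labels.append("downtime")  # assumption: no traffic at all
        elif stdev == 0:
            labels.append("normal")    # constant series, nothing stands out
        else:
            z = (value - mean) / stdev
            if z > threshold:
                labels.append("high")
            elif z < -threshold:
                labels.append("low")
            else:
                labels.append("normal")
    return labels


if __name__ == "__main__":
    per_window = [100, 102, 98, 101, 99, 100, 103, 97, 100, 400]
    print(label_windows(per_window))
```

Outliers inflate both the mean and the standard deviation, so a robust baseline (for example median and median absolute deviation over a trailing period) would likely work better in practice.
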