DNS Crawler Operation

The CZ.NIC association, as the operator of the .cz top-level domain, regularly checks all registered second-level domains using the DNS crawler tool. The aims are the following:

  1. Improve the quality and validity of DNS data by detecting problems in zone contents and configuration, such as expired DNSSEC keys, weak cryptographic algorithms or circular CNAME records.
  2. Discover malicious activities and security problems, such as abusive e-shops or domains used for the operation of botnets.
  3. Automatically classify the domains according to the configuration and contents of DNS zones, implementations and versions of their DNS, mail and web servers, as well as the general character of the domain's main web page (if it exists).

The DNS crawler is designed to collect data efficiently, without requesting the same information repeatedly. The extra load it places on the Internet infrastructure should be negligible compared to regular traffic.

CZ.NIC has decided to be fully open about DNS crawler operation. In particular:

The only aspect of DNS crawler operation that CZ.NIC does not publish is the list of second-level domains in the .cz zone.

Machines running the DNS crawler

  1. crawler-1.labs.nic.cz (IPv4: 217.31.192.34, IPv6: 2001:1488:ac15:ff40::34)
  2. crawler-2.labs.nic.cz (IPv4: 217.31.192.35, IPv6: 2001:1488:ac15:ff40::35) 
  3. crawler-3.labs.nic.cz (IPv4: 217.31.192.36, IPv6: 2001:1488:ac15:ff40::36)
  4. crawler-4.labs.nic.cz (IPv4: 217.31.192.37, IPv6: 2001:1488:ac15:ff40::37)
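
Server operators who want to recognize crawler traffic in their logs could compare client addresses against the list above. The following is a minimal sketch of such a check, not part of the DNS crawler itself. The addresses are copied from the list, while the names CRAWLER_ADDRESSES and is_crawler_address are hypothetical and chosen only for illustration.

```python
import ipaddress

# Addresses copied from the list of crawler machines above.
CRAWLER_ADDRESSES = frozenset(
    ipaddress.ip_address(address)
    for address in (
        "217.31.192.34", "2001:1488:ac15:ff40::34",
        "217.31.192.35", "2001:1488:ac15:ff40::35",
        "217.31.192.36", "2001:1488:ac15:ff40::36",
        "217.31.192.37", "2001:1488:ac15:ff40::37",
    )
)


def is_crawler_address(text: str) -> bool:
    """Return True if text parses as one of the listed crawler addresses."""
    try:
        # Parsing normalizes the notation, so an expanded IPv6 form
        # matches its compressed equivalent.
        return ipaddress.ip_address(text) in CRAWLER_ADDRESSES
    except ValueError:
        # Not a valid IPv4 or IPv6 address at all.
        return False
```

Comparing parsed address objects rather than raw strings avoids false negatives when a log writes the same IPv6 address in a different notation.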

Collected data and operational schedule

DNS crawler collects the following data for all second-level domains under .cz:

In normal operation, the DNS crawler runs regularly with two different periods, weekly and monthly, depending on the type of data. However, newly created domains are scanned every day during the first two weeks of their existence, so that malicious activities and configuration problems are discovered as early as possible. The data collection schedule and retention periods for each category of data are shown in the following table.

Data collection schedule and retention
Type of data      New domains    Other domains    Max. retention period
DNS               once a day     once a week      1 year
SMTP              once a day     once a week      1 year
Web – metadata    once a day     once a week      1 year
Web – contents    once a day     once a month     1 month
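
The schedule and retention rules in the table above can be expressed as a small lookup. This is only an illustrative sketch and does not describe how the DNS crawler is implemented. It assumes that "new domain" means a domain younger than 14 days, that a month is 30 days and a year is 365 days. The keys and the function names scan_interval and is_past_retention are hypothetical.

```python
from datetime import datetime, timedelta

# Assumptions made for this sketch, not stated in the table itself.
NEW_DOMAIN_PERIOD = timedelta(days=14)  # "first two weeks of existence"
DAY = timedelta(days=1)
WEEK = timedelta(weeks=1)
MONTH = timedelta(days=30)   # approximation of one month
YEAR = timedelta(days=365)   # approximation of one year

# data type -> (interval for new domains, interval for other domains,
#               maximum retention period)
SCHEDULE = {
    "dns": (DAY, WEEK, YEAR),
    "smtp": (DAY, WEEK, YEAR),
    "web-metadata": (DAY, WEEK, YEAR),
    "web-contents": (DAY, MONTH, MONTH),
}


def scan_interval(data_type: str, domain_age: timedelta) -> timedelta:
    """Return how often the given data type is collected for a domain."""
    new_interval, other_interval, _ = SCHEDULE[data_type]
    if domain_age < NEW_DOMAIN_PERIOD:
        return new_interval
    return other_interval


def is_past_retention(data_type: str, collected_at: datetime,
                      now: datetime) -> bool:
    """Return True if an item is older than its maximum retention period."""
    max_retention = SCHEDULE[data_type][2]
    return now - collected_at > max_retention
```

For example, scan_interval("web-contents", timedelta(days=20)) yields the monthly interval, because a 20-day-old domain is no longer treated as new.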

Data use policy

CZ.NIC promises to adhere to the following rules regarding the data obtained from DNS crawler operation:

  1. Original data collected by the crawler, as well as processed data and information about specific domains, shall neither be published nor passed to third parties, except in the following cases:
    • The CZ.NIC association is required by legal regulations to reveal the data.
    • The know-how or services of third parties are used for the purposes stated above, e.g. in joint projects. In this case, data shall be provided under a non-disclosure agreement.
  2. Problems of all kinds that need to be addressed by domain owners or administrators shall be communicated privately to appropriate domain contacts obtained from the .cz domain registry.
  3. Discovered security incidents shall be handled using the standard procedures of the CSIRT.CZ security team.
  4. CZ.NIC association shall use the classification of domains and their contents for operational, planning, research and educational purposes.
  5. General statistics obtained from the collected data shall be publicly available, both in a graphical form and as open data.
  6. Each data item collected by the DNS crawler shall be retained for no longer than the maximum period shown in the table above, and usually for a much shorter time.

Contact information