This repository contains example notebooks and tutorials for working with the xReverseLabs Domain Dataset, which is openly available through the AWS Open Data Sponsorship Program.
The xReverseLabs Domain Dataset provides multiple collections of domain and DNS-related data:
- Daily Domain Dump Dataset
  - Plain text files (YYYY-MM-DD_domains.txt). Each line is a domain; updated daily.
- Domain By Date Full Data (Historical)
  - Daily new/unique domains (all_new_domains_YYYY-MM-DD.txt)
- Domain By Extension Dataset
  - Domains grouped by TLD/extension (com.txt.gz, org.txt.gz, etc.)
- Forward DNS (FDNS) Dataset
  - JSON records of DNS resolution results (status, timestamp, IP addresses)
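The file formats described above can be handled with the Python standard library alone. The following is a minimal sketch rather than the notebook's actual code: the helper names are hypothetical, and the FDNS field names (`domain`, `status`, `timestamp`, `ips`) are assumptions made for illustration, so check them against the dataset documentation before relying on them.

```python
import json
from collections import Counter


def parse_domain_lines(lines):
    """Yield normalized domains from an iterable of text lines (one domain per line)."""
    for line in lines:
        domain = line.strip().lower()
        if domain:  # skip blank lines
            yield domain


def count_by_tld(domains):
    """Count domains by their last dot-separated label (a rough stand-in for the TLD)."""
    return Counter(domain.rsplit(".", 1)[-1] for domain in domains)


def parse_fdns_record(raw):
    """Parse one FDNS JSON record; the field names used here are assumed, not documented."""
    record = json.loads(raw)
    return {
        "domain": record.get("domain"),
        "status": record.get("status"),
        "timestamp": record.get("timestamp"),
        "ips": record.get("ips", []),
    }


# Made-up sample input standing in for a daily dump file.
sample_lines = ["example.com\n", "Example.org\n", "\n", "test.com\n"]
domains = list(parse_domain_lines(sample_lines))
tld_counts = count_by_tld(domains)

# Made-up sample record; 192.0.2.1 is a reserved documentation address.
sample_record = parse_fdns_record(
    '{"domain": "example.com", "status": "NOERROR", '
    '"timestamp": 1700000000, "ips": ["192.0.2.1"]}'
)
```

Because `parse_domain_lines` accepts any iterable of lines, the same helper works for an open local file, a decompressed TLD file, or lines streamed from S3.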
Full dataset documentation: https://opendata.xreverselabs.org/about.php
Notebook: xreverselabs/get-to-know-a-dataset.ipynb
This notebook demonstrates how to:
- Access the dataset from AWS S3
- Load daily domain dump files
- Analyze new domains by date
- Explore TLD-specific data
- Parse Forward DNS (FDNS) JSON records
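As a rough illustration of the first two steps, the sketch below streams a daily dump file from S3 with boto3. It rests on several assumptions: the bucket name is a placeholder (this README does not state the real one), the key prefix is hypothetical, and anonymous (unsigned) access is assumed to be enabled, as is common for open data buckets.

```python
import datetime

# Hypothetical placeholder: the real bucket name is not given in this README.
BUCKET = "example-bucket-name"


def daily_dump_key(day, prefix=""):
    """Build an object key following the YYYY-MM-DD_domains.txt naming pattern."""
    return f"{prefix}{day.isoformat()}_domains.txt"


def iter_daily_domains(bucket, key):
    """Stream domains from one daily dump object, assuming anonymous access is allowed."""
    # Imported here so the key helper above can be used without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    body = s3.get_object(Bucket=bucket, Key=key)["Body"]
    for raw in body.iter_lines():
        domain = raw.decode("utf-8", errors="replace").strip()
        if domain:
            yield domain


key = daily_dump_key(datetime.date(2024, 1, 5))
```

After replacing `BUCKET` with the real bucket name, `for domain in iter_daily_domains(BUCKET, key): ...` processes the file line by line without loading it fully into memory.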
Install dependencies with:
pip install boto3 pandas
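Optional (for FDNS exploration and compressed TLD files): no extra packages are required, because gzip and json are part of the Python standard library and must not be installed with pip.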
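Since `gzip` ships with Python, a compressed TLD file can be read without any additional installation. The sketch below is illustrative only: it builds a small in-memory stand-in for a file such as com.txt.gz, and the helper name `read_gzipped_domains` is hypothetical.

```python
import gzip
import io


def read_gzipped_domains(fileobj):
    """Return the domains in a gzip-compressed, one-domain-per-line file object."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Build a tiny in-memory stand-in for a file such as com.txt.gz.
buffer = io.BytesIO()
with gzip.open(buffer, mode="wt", encoding="utf-8") as out:
    out.write("example.com\nexample-two.com\n")
buffer.seek(0)

tld_domains = read_gzipped_domains(buffer)
```

`gzip.open` also accepts a file path, so the same helper can be pointed at a downloaded file by passing its name instead of a file object.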
License: This repository and its example notebooks are provided under the MIT License. The dataset itself is distributed under the CC-BY 4.0 license.
Contact: For questions about the dataset or tutorials:
Website: https://opendata.xreverselabs.org
Email: opendata@xreverselabs.org