This repository implements a Graph Attention Network (GAT), the architecture used in TacticAI, as a network-aware reinforcement learning policy for cyber defence. Our work extends the Cyber Operations Research Gym (CybORG) to represent network states as directed graphs with low-level features, in order to explore more realistic autonomous defence strategies.
- Topology-Aware Defence: Processes the complete network graph structure instead of simplified flat state observations
- Runtime Adaptability: Handles dynamic changes in network topology as new connections appear
- Cross-Network Generalisation: Trained policies can be deployed to networks of different sizes
- Enhanced Interpretability: Defence actions can be explained through tangible network properties
- Custom CybORG environment with graph-based network state representation
- GAT architecture modified for compatibility with policy gradient methods (see the sketch after this list)
- Empirical evaluation for assessing policy generalisation vs. specialised training across varying network sizes
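As a rough illustration of the idea, the sketch below passes a toy network-state graph through a two-layer graph-attention policy using PyTorch Geometric. The module names, feature dimension, and per-host action count are illustrative assumptions, not the repository's actual interfaces.

```python
# Minimal sketch of a graph-attention policy over a network-state graph.
# Assumes PyTorch Geometric; names and dimensions are illustrative only.
import torch
from torch import nn
from torch_geometric.nn import GATConv
from torch_geometric.data import Data


class GraphPolicy(nn.Module):
    def __init__(self, node_features: int, hidden: int = 32, actions_per_host: int = 5):
        super().__init__()
        self.gat1 = GATConv(node_features, hidden, heads=4, concat=True)
        self.gat2 = GATConv(hidden * 4, hidden, heads=1, concat=False)
        self.head = nn.Linear(hidden, actions_per_host)  # per-node action logits

    def forward(self, obs: Data) -> torch.Tensor:
        h = torch.relu(self.gat1(obs.x, obs.edge_index))
        h = torch.relu(self.gat2(h, obs.edge_index))
        logits = self.head(h)       # shape: [num_hosts, actions_per_host]
        return logits.flatten()     # one categorical over (host, action) pairs


# Toy network: 3 hosts, directed connections, 8 low-level features per host.
obs = Data(
    x=torch.randn(3, 8),
    edge_index=torch.tensor([[0, 1, 1], [1, 0, 2]]),
)
policy = GraphPolicy(node_features=8)
dist = torch.distributions.Categorical(logits=policy(obs))
action = dist.sample()  # index of a (host, action) pair
```

Because the attention layers and the per-node action head share weights across nodes, the same parameters can be applied to graphs with a different number of hosts, which is what enables cross-network generalisation.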
> [!NOTE]
> This is a research project that serves as a proof-of-concept towards more realistic network environments in cyber defence. Our implementation uses the low-level structure of the CybORG v2.1 simulator as a practical context, but the technique itself can be adapted to other simulators with comparable complexity.
We used, and recommend, pixi to set up a reproducible project with predefined tasks.
Clone this repository recursively to pull in the CybORG v2.1 simulator and the CAGE 2 reference submissions as submodules.
```bash
git clone https://github.com/IlyaOrson/CyberDreamcatcher.git --recurse-submodules -j4
```
Install the dependencies of the project in a local environment.
```bash
cd CyberDreamcatcher
pixi install  # setup from pixi.toml file
```
Then install the submodules as local packages, avoiding pip's dependency resolution.
```bash
# install environments from git submodules as local packages
pixi run install-cyborg  # CybORG 2.1 + update to gymnasium API
# OR a debugged version from The Alan Turing Institute (https://github.com/alan-turing-institute/CybORG_plus_plus)
pixi run install-cyborg-debugged

# install troublesome dependencies without using pip to track their requirements
pixi run install-sb3  # stable baselines 3
```
Voila! An activated shell within this environment will have all dependencies working together.
```bash
pixi shell  # activate shell
python -m cyberdreamcatcher  # try out a single environment simulation
```
> [!TIP]
> If you would like to use another project management tool, the list of dependencies and installation tasks is available in pixi.toml. Untested environment files are provided for uv/pip (pyproject.toml) and for conda/mamba (conda_env.yml). Make sure to manually ignore the dependencies pinned by CybORG/SB3 when installing them locally.
We include predefined tasks that can be run to make sure everything is working:
```bash
pixi task list  # displays available tasks
pixi run test-cyborg  # run gymnasium-based cyborg tests
pixi run eval-cardiff  # CAGE 2 winner policy inference (simplified and flattened observation space)
```
> [!TIP]
> Hydra is used to handle the inputs and outputs of every script. The available parameters for each task are accessible with the `--help` flag. The content generated per execution is stored in the `outputs/` directory, with a subdirectory per execution timestamp. The hyperparameters used in each run are registered in a hidden subfolder `.hydra/` within the generated output folder.
TensorBoard is used to track interesting metrics; just point it at the relevant Hydra output folder as the logdir: `tensorboard --logdir=outputs/...`
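For instance, the settings of a past run can be reloaded from that folder. The snippet below assumes the standard Hydra output layout; the path is a placeholder for one of the timestamped output folders.

```python
# Reload the hyperparameters Hydra logged for a finished run.
# The path is a placeholder for a timestamped folder under outputs/.
from omegaconf import OmegaConf

cfg = OmegaConf.load("outputs/<date>/<time>/.hydra/config.yaml")
print(OmegaConf.to_yaml(cfg))  # the exact settings used for that run
```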
Quickly visualise the graph layout defined in the CAGE 2 challenge scenario file, as well as the graph observations received by a random GAT policy.
```bash
pixi run plot-network scenario=Scenario2  # see --help for hyperparameters
```
> [!WARNING]
> This is the layout we expect from the simulator configuration and the actions available to the meander agent, but CybORG does not enforce this connection layout at runtime. Connections from other subnets to User0 appear sporadically (unexpectedly), possibly as a hackish way of flagging the interaction of the meander agent with deployed decoys.
We include an implementation of the REINFORCE algorithm with a normalised rewards-to-go baseline. This is a bit slow since it samples a lot of episodes with a fixed policy to estimate the gradient before taking an optimisation step.
```bash
pixi run train-gnn-reinforce  # see --help for hyperparameters
```
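As a rough sketch of what this update looks like (the `policy`/`env` interfaces and the simplified `step` return are assumptions for illustration, not the repository's actual API):

```python
# Sketch of one REINFORCE update with a normalised rewards-to-go baseline.
# The policy/env interfaces are placeholders, not the repository's API.
import torch


def rewards_to_go(rewards, gamma=0.99):
    """Discounted cumulative reward from each timestep to the end of the episode."""
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return list(reversed(out))


def reinforce_step(policy, env, optimizer, num_episodes=32, gamma=0.99):
    """Sample many episodes with the fixed policy, then take one gradient step."""
    log_probs, returns = [], []
    for _ in range(num_episodes):
        obs, done = env.reset(), False
        ep_rewards = []
        while not done:
            dist = policy.distribution(obs)   # e.g. Categorical over (host, action)
            action = dist.sample()
            log_probs.append(dist.log_prob(action))
            obs, reward, done = env.step(action)  # simplified step interface
            ep_rewards.append(reward)
        returns.extend(rewards_to_go(ep_rewards, gamma))

    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # normalised baseline
    loss = -(torch.stack(log_probs) * returns).mean()              # policy-gradient surrogate

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```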
This script trains an MLP policy with PPO using Stable Baselines 3. The observation space used is the original CAGE 2 observation space, a flattened high-level representation of the network.
```bash
pixi run train-flat-sb3-ppo  # see --help for hyperparameters
```
> [!IMPORTANT]
> This SB3 MLP serves as a performance reference, but it cannot extrapolate to different network dimensions. A major caveat for any performance comparison with this or the CAGE 2 submissions is that the observation spaces are fundamentally different: the flattened version is a higher-level representation designed for the CAGE 2 Challenge, whereas our custom graph observation uses low-level information from the CybORG simulator. See below for a performance comparison with CAGE 2 Challenge submissions.
It is possible (❗) to evaluate how a trained GAT policy extrapolates to network layouts different from the one it was trained on.
Specify a scenario to sample episodes from and optionally the weights of a pretrained policy (potentially trained on a different scenario).
```bash
# The default behaviour is to use a random policy on "Scenario2".
pixi run plot-performance

# This will compare the performance of a trained policy
# with a random policy on the scenario used for training
pixi run plot-performance policy_weights="path/to/trained_params.pt"
```
The objective is to quantify the optimality gap between an extrapolated policy and policies trained from scratch in each scenario. Specify the path to the trained policy to be tested and an array of paths to the specialised policies to compare against; the corresponding scenarios are loaded from the logged configurations.
```bash
# add --help to see the available options
pixi run plot-generalisation policy_weights=path/to/trained_params.pt local_policies=[path/to/0/trained_params.pt,path/to/1/trained_params.pt,path/to/3/trained_params.pt, ...]
```
For a detailed description of the CAGE 2 Challenge, see this preprint.
For a complete list of CAGE 2 submission standings, see here.
| Scenario 2 (Red Agent: Meander, Steps: 30) | Penalty mean | Observation Space | Structural Generalization |
|---|---|---|---|
| CAGE2 Winner | ~ 6 | High-level Flat | No |
| Stable Baselines 3 MLP + PPO | ~ 12 | High-level Flat | No |
| CyberDreamcatcher REINFORCE | ~ 18 | Low-level Graph | Reasonable |
| CAGE2 CSS Random | ~ 33 | High-level Flat | N/A |
| CAGE2 CSS Sleeper | ~ 39 | High-level Flat | N/A |