Run bench on demo #1552

Merged
merged 115 commits into from
Aug 29, 2024
115 commits
fe73fd1
Define basic demo options
ffakenz Aug 6, 2024
86201c0
Extract genDatasetConstantUTxO
ffakenz Aug 6, 2024
ffaf84c
Extract bench scenario
ffakenz Aug 6, 2024
74ad553
Extract findRunningCardanoNode'
ffakenz Aug 6, 2024
8d1eb68
Add network id as part of the options
ffakenz Aug 6, 2024
85667fd
Draft base bench demo script
ffakenz Aug 6, 2024
22a0dbd
Fix seeding devnet on bench-demo given scripts were already published
ffakenz Aug 8, 2024
1a16408
Fix tx building for funding transaction
ffakenz Aug 8, 2024
03da04b
Add hydra signing keys as option args to bench-demo
ffakenz Aug 8, 2024
47f105c
Extend generator of constant utxo property
ffakenz Aug 12, 2024
6b5b52e
Trying to work out why large numbers of clients fail
noonio Aug 12, 2024
85c6f75
Limit the nbr of client keys on arbitrary for dataset
ffakenz Aug 12, 2024
f338f27
Return funds to faucet when it finishes
ffakenz Aug 13, 2024
8cb5813
Avoid passing network id to options and faucet keys on generators
ffakenz Aug 13, 2024
74d6896
Make bench-demo sim to run on top of previous run
ffakenz Aug 14, 2024
b6d53d8
Replace hydra-keys options by hydra-client hosts instead
ffakenz Aug 14, 2024
d1716b7
Start a github action for network testing
ch1bo Aug 14, 2024
a449194
Spin up things individually
ch1bo Aug 14, 2024
9e0041c
Acquire and upload logs
ch1bo Aug 14, 2024
1de6a0a
Make the hnode fallback in demo/seed-devnet.sh non-interactive tty
ch1bo Aug 14, 2024
0cc7f8a
Add invocation to produce traffic
ch1bo Aug 14, 2024
d1a7094
check socket permissions
noonio Aug 14, 2024
04f7724
ls not la
noonio Aug 14, 2024
8800ea3
tmate
noonio Aug 14, 2024
59d27da
tmate; socket file permissions
noonio Aug 14, 2024
6c8cc69
split setup and run benchmarks
noonio Aug 14, 2024
7039d7f
Run pumba and the benchmarks
noonio Aug 14, 2024
781200d
Line buffering; 100% loss on pumba
noonio Aug 14, 2024
7a5750e
tmate
noonio Aug 14, 2024
cca50cc
Rebase and make use new hydra-client options during CI
ffakenz Aug 14, 2024
a8c420f
Use docker version of pumba
noonio Aug 15, 2024
7d77a2f
Does it work with 20% loss?
noonio Aug 15, 2024
f79dbb4
See if it succeeds with 0% loss
noonio Aug 15, 2024
69ced33
What about 5%
noonio Aug 15, 2024
c1d15e7
Optional tmate'ing
noonio Aug 16, 2024
26998e8
Disable condition as it doesn't work
noonio Aug 16, 2024
4fc35a2
Try nix again; leave very long running time
noonio Aug 16, 2024
61b1e89
Try a sleep
noonio Aug 16, 2024
9703451
Horrible hack
noonio Aug 16, 2024
9e00319
Note about checking for pumba working
noonio Aug 16, 2024
336ae3f
Target node 1 for loss and limit target ips for node-2, and node-3
noonio Aug 16, 2024
9ad4b41
Try /24
noonio Aug 16, 2024
9530f1e
Multiple targets is right
noonio Aug 16, 2024
a0e18bc
Let's try 2%
noonio Aug 16, 2024
9c49594
Add now 5%
noonio Aug 16, 2024
b66a128
3%
noonio Aug 16, 2024
e6e0e2a
4%
noonio Aug 16, 2024
314ec58
Try to wait for alice peer disconnected from bob and carol nodes
ffakenz Aug 16, 2024
8dcb837
Executable; compatible with NixOS
noonio Aug 16, 2024
a9ff65b
Add persitency-dir options to hydra-node
ffakenz Aug 16, 2024
88e5bce
Use a special image built for netem; no more tcimage
noonio Aug 19, 2024
080d915
Add base network testing readme docs
ffakenz Aug 19, 2024
6290615
Use local functions to seed the network
ffakenz Aug 19, 2024
de10549
Do not seed neetwork before runnin bench-demo
ffakenz Aug 20, 2024
1c086e9
Remove FIXME as its not reproduceable
ffakenz Aug 20, 2024
58eab02
Minor refactor on generateOneTransfer
ffakenz Aug 20, 2024
088d661
Use random/self transfer generator functions
noonio Aug 20, 2024
e48b67a
Update demo-bech to take datasets from args
ffakenz Aug 20, 2024
4d1b2cb
Gen demo datasets
ffakenz Aug 20, 2024
eaef6ec
Refactor HydraClient to keep peer information
ffakenz Aug 21, 2024
b29df8f
dbg
ffakenz Aug 21, 2024
7be87ce
Try to use the pre-seeded client funds instead of faucet's
ffakenz Aug 21, 2024
6c517b7
Generate each client dataset from a different client funding tx
ffakenz Aug 21, 2024
5a61308
Try not to specify any output
ffakenz Aug 21, 2024
2a69816
Try using mkTxOutAutoBalance instead
ffakenz Aug 21, 2024
3a67f0c
Fix client utxo
ffakenz Aug 21, 2024
6eec715
Try not to specify any output ~ v2
ffakenz Aug 21, 2024
b8bfb3e
Go back to unbalanced outputs ~ v2
ffakenz Aug 21, 2024
99035a4
Make funding tx optional for datasets
ffakenz Aug 21, 2024
374147a
Try not to specify any output ~ v3
ffakenz Aug 21, 2024
0250340
Try using mkTxOutAutoBalance instead ~ v2
ffakenz Aug 21, 2024
5617254
WIP; Revert faucet; use buildRawTransaction
noonio Aug 21, 2024
ef399ad
Create workdir if it doesn't exist
noonio Aug 22, 2024
04648a9
WIP on generateDemoUTxO function; just missing witnesses.
noonio Aug 22, 2024
0d7c62c
Small refactors around types
noonio Aug 22, 2024
c17ecfb
WIP back to generating the dataset on the fly
noonio Aug 22, 2024
b98ad0a
Do not pre-seed devnet
ffakenz Aug 22, 2024
42d4cdf
Use other script for exporting pparams
noonio Aug 23, 2024
b761578
Short contestation period so that it closes fast
noonio Aug 23, 2024
79dfee8
add failure handling on tx processing while head is open
ffakenz Aug 23, 2024
0a15354
Compute pct of transactions confirmed
noonio Aug 23, 2024
e205443
4% should work
noonio Aug 23, 2024
b00a43c
Enhance workflow
ffakenz Aug 26, 2024
0872fa1
Some nice refactoring
ffakenz Aug 27, 2024
8e2c14d
Make watch for logs optional
ffakenz Aug 27, 2024
f77ef9b
Reframe wait for peer disconnected variable
noonio Aug 27, 2024
a855021
Experiment with a matrix
noonio Aug 27, 2024
faee8b8
Name in artifact
noonio Aug 27, 2024
0efbdc4
Expect-failure matrix configuration
noonio Aug 27, 2024
7335112
Add failure mode we expect; refine matrix
noonio Aug 27, 2024
c80281f
Fix order of detecting error conditions
noonio Aug 27, 2024
a1e0de4
Add scaling factor to matrix config
noonio Aug 27, 2024
d14576b
Replace yq by docker inspect
ffakenz Aug 27, 2024
b90ee87
Upload the results as part of the artifacts
ffakenz Aug 27, 2024
46746c6
Make sure results.csv is written to the outputDirectory not the tmp d…
ffakenz Aug 27, 2024
8324d53
Write the summary out even when it failed
ffakenz Aug 27, 2024
0ccb72d
Make peers part of the matrix
noonio Aug 28, 2024
dd2051f
Allow 5% to fail
noonio Aug 28, 2024
b4c3376
Remove explicit failure tracking in the includes/excludes
noonio Aug 28, 2024
d7b18ab
Add extra scaling factor
noonio Aug 28, 2024
fc26227
Matrix information in the name
noonio Aug 28, 2024
4f31cfe
Rename hydra clients peers to api hosts
ffakenz Aug 28, 2024
cc59644
Update README
ffakenz Aug 28, 2024
116976f
Enhance user error during processTransactions
ffakenz Aug 28, 2024
d1d96fd
Update bench demo option description
ffakenz Aug 28, 2024
a5185a5
Make the scenario to seed the network
ffakenz Aug 28, 2024
02d460d
Rever changes over the benchmarks slug used by the website
ffakenz Aug 28, 2024
36ac132
Do not trace signing keys during benchmark
ffakenz Aug 28, 2024
cdf5ff6
Remove watch logs step and simplify workflow args
ffakenz Aug 28, 2024
5e69dd7
Remove unnecessary scripts
ffakenz Aug 28, 2024
a555aec
Enhance failure message for HUnitFailure errors
ffakenz Aug 28, 2024
3aafd9c
Remove hack to check every node sees head is initialized for demo
ffakenz Aug 29, 2024
710d65f
Refactor hacky boolean blindness to manage query params on withConnec…
ffakenz Aug 29, 2024
6654d08
Fix not seeding during scenario but before it gets executed
ffakenz Aug 29, 2024
c5d1208
Don't call --random on pumba; note about time
noonio Aug 29, 2024
132 changes: 132 additions & 0 deletions .github/workflows/network-test.yaml
@@ -0,0 +1,132 @@
name: "Network fault tolerance"

on:
  pull_request:
  workflow_dispatch:
    inputs:
      debug_enabled:
        type: boolean
        description: 'Run the build with tmate debugging enabled (https://github.com/marketplace/actions/debugging-with-tmate)'
        required: false
        default: false

jobs:
  network-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Note: At present we can only run for 3 peers; to configure this for
        # more we need to make the docker-compose spin-up dynamic across
        # however many we would like to configure here.
        # Currently this is just a label and does not have any functional impact.
        peers: [3]
        scaling_factor: [10, 50]
        netem_loss: [0, 1, 2, 3, 4, 5, 10, 20]
    name: "Peers: ${{ matrix.peers }}, scaling: ${{ matrix.scaling_factor }}, loss: ${{ matrix.netem_loss }}"
    steps:
    - uses: actions/checkout@v4
      with:
        submodules: true

    - name: ❄ Prepare nix
      uses: cachix/install-nix-action@V27
      with:
        extra_nix_config: |
          accept-flake-config = true
          log-lines = 1000

    - name: ❄ Cachix cache of nix derivations
      uses: cachix/cachix-action@v15
      with:
        name: cardano-scaling
        authToken: '${{ secrets.CACHIX_CARDANO_SCALING_AUTH_TOKEN }}'

    - name: Build docker images for netem specifically
      run: |
        nix build .#docker-hydra-node-for-netem
        ./result | docker load

    - name: Setup containers for network testing
      run: |
        cd demo
        ./prepare-devnet.sh
        docker compose up -d cardano-node
        sleep 5
        # :tear: socket permissions.
        sudo chown runner:docker devnet/node.socket
        ./export-tx-id-and-pparams.sh
        # Specify two docker compose yamls; the second one overrides the
        # images to use the netem ones specifically
        docker compose -f docker-compose.yaml -f docker-compose-netem.yaml up -d hydra-node-{1,2,3}
        sleep 3
        docker ps

    - name: Build required nix and docker derivations
      run: |
        nix build .#legacyPackages.x86_64-linux.hydra-cluster.components.benchmarks.bench-e2e
        nix build github:noonio/pumba/noon/add-flake

    # Use tmate to get a shell onto the runner to do some temporary hacking
    #
    # <https://github.com/mxschmitt/action-tmate>
    #
    - name: Setup tmate session
      uses: mxschmitt/action-tmate@v3
      if: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.debug_enabled }}
      with:
        limit-access-to-actor: true

    - name: Run pumba and the benchmarks
      # Note: We're going to allow everything to fail. In the job on GitHub,
      # we will be able to see which ones _did_, in fact, fail. Originally,
      # we were keeping track of our expectations with 'include' and
      # 'exclude' directives here, but I think it's best to leave those out,
      # as some of the tests (say 5%) fail, and overall the conditions of
      # failure depend on the scaling factor, the peers, etc, and it becomes
      # too complicated to track here.
      continue-on-error: true
      run: |
        # Extract inputs with defaults for non-workflow_dispatch events
        percent="${{ matrix.netem_loss }}"
        scaling_factor="${{ matrix.scaling_factor }}"
        target_peer="hydra-node-1"
        other_peers="172.16.238.20 172.16.238.30"

        .github/workflows/network/run_pumba.sh $target_peer $percent $other_peers

        # Run benchmark on demo
        mkdir benchmarks
        touch benchmarks/test.log

        nix run .#legacyPackages.x86_64-linux.hydra-cluster.components.benchmarks.bench-e2e -- \
          demo \
          --output-directory=benchmarks \
          --scaling-factor="$scaling_factor" \
          --timeout=1000s \
          --testnet-magic 42 \
          --node-socket=demo/devnet/node.socket \
          --hydra-client=localhost:4001 \
          --hydra-client=localhost:4002 \
          --hydra-client=localhost:4003

    - name: Acquire logs
      if: always()
      run: |
        cd demo
        docker compose logs > docker-logs

    - name: 💾 Upload logs
      if: always()
      uses: actions/upload-artifact@v4
      with:
        name: "docker-logs-netem-loss=${{ matrix.netem_loss }},scaling_factor=${{ matrix.scaling_factor }},peers=${{ matrix.peers }}"
        path: demo/docker-logs
        if-no-files-found: ignore

    - name: 💾 Upload build & test artifacts
      if: always()
      uses: actions/upload-artifact@v4
      with:
        name: "benchmarks-netem-loss=${{ matrix.netem_loss }},scaling_factor=${{ matrix.scaling_factor }},peers=${{ matrix.peers }}"
        path: benchmarks
        if-no-files-found: ignore
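
For orientation, the matrix in this workflow expands to 1 × 2 × 8 = 16 jobs. A small bash sketch that enumerates the job names the `name:` template would produce is given here; the function name `list_matrix_jobs` is made up for illustration, and the order in which GitHub actually schedules the jobs may differ:

```shell
#!/usr/bin/env bash
# Illustrative only: list_matrix_jobs is a hypothetical helper, not part of
# this PR. It mirrors the matrix values from network-test.yaml.
list_matrix_jobs() {
  local peers scaling loss
  for peers in 3; do
    for scaling in 10 50; do
      for loss in 0 1 2 3 4 5 10 20; do
        # Same shape as the workflow's `name:` template.
        echo "Peers: ${peers}, scaling: ${scaling}, loss: ${loss}"
      done
    done
  done
}

list_matrix_jobs
```

Each of these jobs uploads its own `docker-logs-…` and `benchmarks-…` artifact, which is why the matrix values also appear in the artifact names.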
23 changes: 23 additions & 0 deletions .github/workflows/network/run_pumba.sh
@@ -0,0 +1,23 @@
#!/usr/bin/env bash

target_node_name=$1

percent=$2

rest_node_names=$3

# Build Pumba netem command
# Note: We leave it for 20 minutes; but really it's effectively unlimited. We don't
# expect any of our tests to run longer than that.
nix_command="nix run github:noonio/pumba/noon/add-flake -- -l debug netem --duration 20m"

while IFS= read -r network; do
  nix_command+=" --target $network"
done <<< "$rest_node_names"

nix_command+=" loss --percent \"$percent\" \"re2:$target_node_name\" &"

echo "$nix_command"

# Run Pumba netem command
eval "$nix_command"
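
Reading the diff, the workflow passes `$other_peers` unquoted, so the two IPs arrive as `$3` and `$4`, while this script only reads `$3`; it looks as if only the first peer ends up as a `--target`. One possible alternative, sketched under that assumption, collects the arguments in a bash array and treats every remaining positional parameter as a peer, which also avoids `eval`. `build_netem_args` is a hypothetical helper, not part of this PR; the pumba flags are the ones already used in the script:

```shell
#!/usr/bin/env bash
# Sketch only: build_netem_args is a hypothetical helper, not part of the PR.
# Usage: build_netem_args <target_node_name> <percent> <peer_ip>...
build_netem_args() {
  local target_node_name=$1
  local percent=$2
  shift 2

  # Same pumba arguments as run_pumba.sh, kept as separate array elements.
  local args=(-l debug netem --duration 20m)
  local ip
  for ip in "$@"; do
    args+=(--target "$ip")
  done
  args+=(loss --percent "$percent" "re2:$target_node_name")

  # Print one argument per line so the result is easy to inspect.
  printf '%s\n' "${args[@]}"
}

build_netem_args hydra-node-1 5 172.16.238.20 172.16.238.30
```

With this shape the array could be handed to `nix run github:noonio/pumba/noon/add-flake --` directly and backgrounded with `&`, without building a command string first.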
2 changes: 2 additions & 0 deletions demo/.gitignore
@@ -0,0 +1,2 @@
/benchmarks
/datasets
9 changes: 9 additions & 0 deletions demo/docker-compose-netem.yaml
@@ -0,0 +1,9 @@
services:
  hydra-node-1:
    image: hydra-node-for-netem

  hydra-node-2:
    image: hydra-node-for-netem

  hydra-node-3:
    image: hydra-node-for-netem
7 changes: 6 additions & 1 deletion demo/docker-compose.yaml
@@ -48,6 +48,8 @@ services:
, "--ledger-protocol-parameters", "/devnet/protocol-parameters.json"
, "--testnet-magic", "42"
, "--node-socket", "/devnet/node.socket"
, "--persistence-dir", "/devnet/persistence/alice"
, "--contestation-period", "3"
]
networks:
hydra_net:
@@ -83,6 +85,8 @@ services:
, "--ledger-protocol-parameters", "/devnet/protocol-parameters.json"
, "--testnet-magic", "42"
, "--node-socket", "/devnet/node.socket"
, "--persistence-dir", "/devnet/persistence/bob"
, "--contestation-period", "3"
]
networks:
hydra_net:
@@ -118,6 +122,8 @@ services:
, "--ledger-protocol-parameters", "/devnet/protocol-parameters.json"
, "--testnet-magic", "42"
, "--node-socket", "/devnet/node.socket"
, "--persistence-dir", "/devnet/persistence/carol"
, "--contestation-period", "3"
]
networks:
hydra_net:
@@ -188,7 +194,6 @@ services:
hydra_net:
ipv4_address: 172.16.238.5


networks:
hydra_net:
driver: bridge
71 changes: 71 additions & 0 deletions demo/export-tx-id-and-pparams.sh
@@ -0,0 +1,71 @@
#!/usr/bin/env bash

set -eo pipefail

SCRIPT_DIR=${SCRIPT_DIR:-$(realpath $(dirname $(realpath $0)))}
NETWORK_ID=42

CCLI_CMD=
DEVNET_DIR=/devnet
if [[ -n ${1} ]]; then
  echo >&2 "Using provided cardano-cli command: ${1}"
  $(${1} version > /dev/null)
  CCLI_CMD=${1}
  DEVNET_DIR=${SCRIPT_DIR}/devnet
fi

HYDRA_NODE_CMD=
if [[ -n ${2} ]]; then
  echo >&2 "Using provided hydra-node command: ${2}"
  ${2} --version > /dev/null
  HYDRA_NODE_CMD=${2}
fi

# Invoke hydra-node in a container or via provided executable
function hnode() {
  if [[ -n ${HYDRA_NODE_CMD} ]]; then
    ${HYDRA_NODE_CMD} ${@}
  else
    docker run --rm \
      --pull always \
      -v ${SCRIPT_DIR}/devnet:/devnet \
      ghcr.io/cardano-scaling/hydra-node:0.18.1 -- ${@}
  fi
}

function publishReferenceScripts() {
  echo >&2 "Publishing reference scripts..."
  hnode publish-scripts \
    --testnet-magic ${NETWORK_ID} \
    --node-socket ${DEVNET_DIR}/node.socket \
    --cardano-signing-key devnet/credentials/faucet.sk
}

# Invoke cardano-cli in running cardano-node container or via provided cardano-cli
function ccli() {
  ccli_ ${@} --testnet-magic ${NETWORK_ID}
}
function ccli_() {
  if [[ -x ${CCLI_CMD} ]]; then
    ${CCLI_CMD} ${@}
  else
    ${DOCKER_COMPOSE_CMD} exec cardano-node cardano-cli ${@}
  fi
}

function queryPParams() {
  echo >&2 "Query Protocol parameters"
  if [[ -x ${CCLI_CMD} ]]; then
    ccli query protocol-parameters --socket-path ${DEVNET_DIR}/node.socket --out-file /dev/stdout \
      | jq ".txFeeFixed = 0 | .txFeePerByte = 0 | .executionUnitPrices.priceMemory = 0 | .executionUnitPrices.priceSteps = 0" > devnet/protocol-parameters.json
  else
    docker exec demo-cardano-node-1 cardano-cli query protocol-parameters --testnet-magic ${NETWORK_ID} --socket-path ${DEVNET_DIR}/node.socket --out-file /dev/stdout \
      | jq ".txFeeFixed = 0 | .txFeePerByte = 0 | .executionUnitPrices.priceMemory = 0 | .executionUnitPrices.priceSteps = 0" > devnet/protocol-parameters.json
  fi
  echo >&2 "Saved in protocol-parameters.json"
}

queryPParams
echo "HYDRA_SCRIPTS_TX_ID=$(publishReferenceScripts)" > .env
echo >&2 "Environment variable stored in '.env'"
echo >&2 -e "\n\t$(cat .env)\n"
38 changes: 2 additions & 36 deletions demo/seed-devnet.sh
@@ -43,18 +43,6 @@ function ccli_() {
  fi
}

# Invoke hydra-node in a container or via provided executable
function hnode() {
  if [[ -n ${HYDRA_NODE_CMD} ]]; then
    ${HYDRA_NODE_CMD} ${@}
  else
    docker run --rm -it \
      --pull always \
      -v ${SCRIPT_DIR}/devnet:/devnet \
      ghcr.io/cardano-scaling/hydra-node:0.18.1 -- ${@}
  fi
}

# Retrieve some lovelace from faucet
function seedFaucet() {
ACTOR=${1}
@@ -89,26 +77,6 @@ function seedFaucet() {
echo >&2 "Done"
}

function publishReferenceScripts() {
  echo >&2 "Publishing reference scripts..."
  hnode publish-scripts \
    --testnet-magic ${NETWORK_ID} \
    --node-socket ${DEVNET_DIR}/node.socket \
    --cardano-signing-key devnet/credentials/faucet.sk
}

function queryPParams() {
  echo >&2 "Query Protocol parameters"
  if [[ -x ${CCLI_CMD} ]]; then
    ccli query protocol-parameters --socket-path ${DEVNET_DIR}/node.socket --out-file /dev/stdout \
      | jq ".txFeeFixed = 0 | .txFeePerByte = 0 | .executionUnitPrices.priceMemory = 0 | .executionUnitPrices.priceSteps = 0" > devnet/protocol-parameters.json
  else
    docker exec demo-cardano-node-1 cardano-cli query protocol-parameters --testnet-magic ${NETWORK_ID} --socket-path ${DEVNET_DIR}/node.socket --out-file /dev/stdout \
      | jq ".txFeeFixed = 0 | .txFeePerByte = 0 | .executionUnitPrices.priceMemory = 0 | .executionUnitPrices.priceSteps = 0" > devnet/protocol-parameters.json
  fi
  echo >&2 "Saved in protocol-parameters.json"
}

echo >&2 "Fueling up hydra nodes of alice, bob and carol..."
seedFaucet "alice" 30000000 # 30 Ada to the node
seedFaucet "bob" 30000000 # 30 Ada to the node
@@ -117,7 +85,5 @@ echo >&2 "Distributing funds to alice, bob and carol..."
seedFaucet "alice-funds" 100000000 # 100 Ada to commit
seedFaucet "bob-funds" 50000000 # 50 Ada to commit
seedFaucet "carol-funds" 25000000 # 25 Ada to commit
queryPParams
echo "HYDRA_SCRIPTS_TX_ID=$(publishReferenceScripts)" > .env
echo >&2 "Environment variable stored in '.env'"
echo >&2 -e "\n\t$(cat .env)\n"

./export-tx-id-and-pparams.sh
11 changes: 11 additions & 0 deletions hydra-cluster/README.md
@@ -140,3 +140,14 @@ The benchmark can be run in two modes corresponding to two different commands:
* `datasets`: Runs one or more preexisting _datasets_ in sequence and collect their results in a single markdown formatted file. This is useful to track the evolution of hydra-node's performance over some well-known datasets over time and produce a human-readable summary.

Check out `cabal bench --benchmark-options --help` for more details.

# Network Testing

The benchmark can also be run against a running `demo` hydra-cluster using `cabal bench`; it produces a
`results.csv` file in a work directory. As with the other benchmark results, you can use the `bench/plot.sh` script to plot the transaction confirmation times.

To run the benchmark in this mode, the command is:
* `demo`: Runs a single freshly generated _dataset_ and collects its results in a markdown formatted file. The purpose of this setup is to facilitate a variety of network-resilience scenarios, such as packet loss or node failures. This is useful to demonstrate the robustness and performance of the hydra-node's network over time and produce a human-readable summary.

For instance, we make use of this in our [CI](https://github.com/cardano-scaling/hydra/blob/master/.github/workflows/network-test.yaml) to keep track of the scenarios that we care about.
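
As an illustration, the CI workflow linked above invokes the `demo` command roughly as follows. The sketch only prints the command so it can be inspected before running it; `bench_demo_cmd` is a hypothetical wrapper that does not exist in the repository, and the socket path, API ports and testnet magic are the values used by the demo setup in that workflow, so they may need adjusting for your environment:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repository): prints the bench-e2e
# invocation for a given scaling factor instead of running it.
bench_demo_cmd() {
  local scaling_factor=${1:-10}
  local cmd=(
    nix run .#legacyPackages.x86_64-linux.hydra-cluster.components.benchmarks.bench-e2e --
    demo
    --output-directory=benchmarks
    "--scaling-factor=${scaling_factor}"
    --timeout=1000s
    --testnet-magic 42
    --node-socket=demo/devnet/node.socket
    --hydra-client=localhost:4001
    --hydra-client=localhost:4002
    --hydra-client=localhost:4003
  )
  # Join the arguments with spaces and print them on one line.
  printf '%s\n' "${cmd[*]}"
}

bench_demo_cmd 10
```

Each `--hydra-client` option points at the API host of one running hydra-node, so the three entries correspond to the three nodes of the demo cluster.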
