Feature/multi loader logs collection #598
Looks nice. I left some comments.
@leokondrashov as discussed, added the log consolidation logic in 0dc0950.
Looks great, thanks. Please fix a couple of minor comments.
Signed-off-by: Lenson <nosnelmil@gmail.com>
- add node discovery validators
- add collect TOP metric functions
- add multi-loader metric_manager
- add autoscaler log collection
- add activator log collection
- add prometh log collection
- refactor metric manager contants
- minor fix for node discovery
- fix node discovery
- minor fix
- minor fix
- add logs for prometh
- add pause between prometh collection
- update wait time
- update condition for node discovery
- update logging

- update kind ssh update script
- fix setup kind ssh

- update setup metrics script
Signed-off-by: Lenson <nosnelmil@gmail.com> fix log collection test commit a05990d Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 15:39:39 2025 +0800 update test trigger Signed-off-by: Lenson <nosnelmil@gmail.com> commit 3edb3b4 Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 15:33:06 2025 +0800 update test Signed-off-by: Lenson <nosnelmil@gmail.com> commit 56a0f7d Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 15:18:40 2025 +0800 fix Signed-off-by: Lenson <nosnelmil@gmail.com> commit 67c520d Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 15:06:20 2025 +0800 fix Signed-off-by: Lenson <nosnelmil@gmail.com> commit 48ff845 Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 14:46:29 2025 +0800 test' Signed-off-by: Lenson <nosnelmil@gmail.com> commit 295c761 Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 14:45:35 2025 +0800 add adv log collection tests Signed-off-by: Lenson <nosnelmil@gmail.com> commit 8469bdb Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 14:45:05 2025 +0800 update logging Signed-off-by: Lenson <nosnelmil@gmail.com> commit 10e295a Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 14:44:42 2025 +0800 update kind ssh update script Signed-off-by: Lenson <nosnelmil@gmail.com> commit c56a9d8 Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 13:19:27 2025 +0800 add KinD ssh setup script Signed-off-by: Lenson <nosnelmil@gmail.com> commit bf9a804 Author: Lenson <nosnelmil@gmail.com> Date: Mon Feb 3 10:31:55 2025 +0800 update condition for node discovery Signed-off-by: Lenson <nosnelmil@gmail.com> commit b3f078b Author: Lenson <nosnelmil@gmail.com> Date: Fri Jan 31 18:35:03 2025 +0800 add multi loader log collection Signed-off-by: Lenson <nosnelmil@gmail.com> add node discovery validators Signed-off-by: Lenson <nosnelmil@gmail.com> add collect TOP metric functions Signed-off-by: Lenson <nosnelmil@gmail.com> add multi-loader metric_manager Signed-off-by: Lenson <nosnelmil@gmail.com> add autoscaler log collection 
Signed-off-by: Lenson <nosnelmil@gmail.com> add activator log collection Signed-off-by: Lenson <nosnelmil@gmail.com> add prometh log collection Signed-off-by: Lenson <nosnelmil@gmail.com> refactor metric manager contants Signed-off-by: Lenson <nosnelmil@gmail.com> minor fix for node discovery Signed-off-by: Lenson <nosnelmil@gmail.com> fix node discovery Signed-off-by: Lenson <nosnelmil@gmail.com> minor fix Signed-off-by: Lenson <nosnelmil@gmail.com> minor fix Signed-off-by: Lenson <nosnelmil@gmail.com> add logs for prometh Signed-off-by: Lenson <nosnelmil@gmail.com> add pause between prometh collection Signed-off-by: Lenson <nosnelmil@gmail.com> update wait time Signed-off-by: Lenson <nosnelmil@gmail.com> commit 9bac3c4 Author: Lenson <nosnelmil@gmail.com> Date: Tue Jan 21 13:00:50 2025 +0800 update multi loader docs Signed-off-by: Lenson <nosnelmil@gmail.com> update multi-loader docs Signed-off-by: Lenson <nosnelmil@gmail.com> commit bfd17be Author: Lenson <nosnelmil@gmail.com> Date: Mon Jan 20 16:30:13 2025 +0800 minor multi loader fix Signed-off-by: Lenson <nosnelmil@gmail.com> fix incorrect retry logging Signed-off-by: Lenson <nosnelmil@gmail.com> remove iat and generated cli args Signed-off-by: Lenson <nosnelmil@gmail.com> remove make clean from clean up Signed-off-by: Lenson <nosnelmil@gmail.com> commit 91042aa Author: Lenson <nosnelmil@gmail.com> Date: Thu Jan 16 15:53:19 2025 +0800 update tests Signed-off-by: Lenson <nosnelmil@gmail.com> update multi loader e2e tests Signed-off-by: Lenson <nosnelmil@gmail.com> revert setup.cfg Signed-off-by: Lenson <nosnelmil@gmail.com> chmod script Signed-off-by: Lenson <nosnelmil@gmail.com> update unit tests Signed-off-by: Lenson <nosnelmil@gmail.com> fix e2e test Signed-off-by: Lenson <nosnelmil@gmail.com> update tests Signed-off-by: Lenson <nosnelmil@gmail.com> commit 69c3c3a Author: Lenson <nosnelmil@gmail.com> Date: Tue Dec 31 11:49:55 2024 +0800 add failfast flag Signed-off-by: Lenson <nosnelmil@gmail.com> update 
failfast flag description Signed-off-by: Lenson <nosnelmil@gmail.com> update comments Signed-off-by: Lenson <nosnelmil@gmail.com> update wordlist with multiloader specific words Signed-off-by: Lenson <nosnelmil@gmail.com> simplify run experiment logic Signed-off-by: Lenson <nosnelmil@gmail.com> refactor partial experiment naming Signed-off-by: Lenson <nosnelmil@gmail.com> fix wrong indexing Signed-off-by: Lenson <nosnelmil@gmail.com> add progress in logging Signed-off-by: Lenson <nosnelmil@gmail.com> commit fc3ad98 Author: Lenson <nosnelmil@gmail.com> Date: Sun Nov 17 14:07:35 2024 +0800 refactor multi loader Signed-off-by: Lenson <nosnelmil@gmail.com> add multi-loader tests Signed-off-by: Lenson <nosnelmil@gmail.com> update test Signed-off-by: Lenson <nosnelmil@gmail.com> refactor multi-loader tests Signed-off-by: Lenson <nosnelmil@gmail.com> add loader experiment Signed-off-by: Lenson <nosnelmil@gmail.com> update logs Signed-off-by: Lenson <nosnelmil@gmail.com> update log verbosity Signed-off-by: Lenson <nosnelmil@gmail.com> update logs Signed-off-by: Lenson <nosnelmil@gmail.com> update logs Signed-off-by: Lenson <nosnelmil@gmail.com> rename multiloader driver to runner Signed-off-by: Lenson <nosnelmil@gmail.com> refactor common files to multiloader folder Signed-off-by: Lenson <nosnelmil@gmail.com> refactor multiloader functions Signed-off-by: Lenson <nosnelmil@gmail.com> rename createNewStudy function name Signed-off-by: Lenson <nosnelmil@gmail.com> fix formatting Signed-off-by: Lenson <nosnelmil@gmail.com> remove extra features Signed-off-by: Lenson <nosnelmil@gmail.com> remove extra features Signed-off-by: Lenson <nosnelmil@gmail.com> add validation for platform Signed-off-by: Lenson <nosnelmil@gmail.com> commit ca5e2ad Author: Lenson <nosnelmil@gmail.com> Date: Sat Nov 16 18:49:35 2024 +0800 add multi loader documentation Signed-off-by: Lenson <nosnelmil@gmail.com> update docs Signed-off-by: Lenson <nosnelmil@gmail.com> fix docs Signed-off-by: Lenson 
<nosnelmil@gmail.com> update documentation Signed-off-by: Lenson <nosnelmil@gmail.com> commit 3c7e6b5 Author: Lenson <nosnelmil@gmail.com> Date: Sat Nov 16 12:36:43 2024 +0800 add multi-loader Signed-off-by: Lenson <nosnelmil@gmail.com> add multi-loader config reader Signed-off-by: Lenson <nosnelmil@gmail.com> add multi loader base Signed-off-by: Lenson <nosnelmil@gmail.com> add multi loader base Signed-off-by: Lenson <nosnelmil@gmail.com> add node group struct Signed-off-by: Lenson <nosnelmil@gmail.com> add multi loader runner Signed-off-by: Lenson <nosnelmil@gmail.com> refactor multi loader config Signed-off-by: Lenson <nosnelmil@gmail.com> add multi loader config validators Signed-off-by: Lenson <nosnelmil@gmail.com> add knative specific config enricher Signed-off-by: Lenson <nosnelmil@gmail.com> add additional knative platform type Signed-off-by: Lenson <nosnelmil@gmail.com> add base runner entry point Signed-off-by: Lenson <nosnelmil@gmail.com> refactor multi loader config Signed-off-by: Lenson <nosnelmil@gmail.com> update multi loader config struct Signed-off-by: Lenson <nosnelmil@gmail.com> update unpack study doc Signed-off-by: Lenson <nosnelmil@gmail.com> add unpack study Signed-off-by: Lenson <nosnelmil@gmail.com> add prepare experiment Signed-off-by: Lenson <nosnelmil@gmail.com> update experiment config temp path Signed-off-by: Lenson <nosnelmil@gmail.com> add run loader function Signed-off-by: Lenson <nosnelmil@gmail.com> update log parser Signed-off-by: Lenson <nosnelmil@gmail.com> update log parser Signed-off-by: Lenson <nosnelmil@gmail.com> update log parser Signed-off-by: Lenson <nosnelmil@gmail.com> add clean up function Signed-off-by: Lenson <nosnelmil@gmail.com> add logs to indicate run status Signed-off-by: Lenson <nosnelmil@gmail.com> expose entry points for multi loader runner Signed-off-by: Lenson <nosnelmil@gmail.com> add multi loader runner execution Signed-off-by: Lenson <nosnelmil@gmail.com> update default multi loader config path 
Signed-off-by: Lenson <nosnelmil@gmail.com> add cpu limit validator Signed-off-by: Lenson <nosnelmil@gmail.com> remove extra knative feature Signed-off-by: Lenson <nosnelmil@gmail.com> remove knative extra features Signed-off-by: Lenson <nosnelmil@gmail.com> add multi loader tests Signed-off-by: Lenson <nosnelmil@gmail.com> add basic config Signed-off-by: Lenson <nosnelmil@gmail.com> update basic config Signed-off-by: Lenson <nosnelmil@gmail.com> update basic config Signed-off-by: Lenson <nosnelmil@gmail.com> add basic configs Signed-off-by: Lenson <nosnelmil@gmail.com> update base config Signed-off-by: Lenson <nosnelmil@gmail.com> Signed-off-by: Lenson <nosnelmil@gmail.com> update e2e test Signed-off-by: Lenson <nosnelmil@gmail.com>
Signed-off-by: Lenson <nosnelmil@gmail.com> update metrics description in docs Signed-off-by: Lenson <nosnelmil@gmail.com>
Signed-off-by: Lenson <nosnelmil@gmail.com> add interval for prometh snapshot collection Signed-off-by: Lenson <nosnelmil@gmail.com>
Hi @cvetkovic, this PR extends the previously added multi-loader tool with enhanced log collection. The new feature lets users gather logs from the Activator and Autoscaler nodes, retrieve TOP metrics from all cluster nodes, and capture Prometheus snapshots. Users can also specify exactly which metrics to collect using the newly introduced Metrics field in the multi-loader configuration. I would appreciate your review, and if everything looks good, I will tidy up the commits and prepare for merging into main. Thank you!
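As a rough illustration of the metric selection described above, the following sketch validates a requested list against the four values this PR names for the field (`top`, `prometheus`, `activator`, `autoscaler`). The names `ALLOWED_METRICS` and `validate_metrics` are hypothetical and are not taken from the multi-loader code; how the real tool treats duplicates or unknown values is an assumption here.

```python
# Hypothetical sketch: ALLOWED_METRICS and validate_metrics are illustrative
# names, not identifiers from the multi-loader code base.
ALLOWED_METRICS = {"top", "prometheus", "activator", "autoscaler"}


def validate_metrics(metrics):
    """Return the requested metrics in order without duplicates.

    Raises ValueError on any value outside ALLOWED_METRICS. Rejecting unknown
    values (rather than ignoring them) is an assumption of this sketch.
    """
    selected = []
    for name in metrics:
        if name not in ALLOWED_METRICS:
            raise ValueError(f"unsupported metric: {name!r}")
        if name not in selected:
            selected.append(name)
    return selected


if __name__ == "__main__":
    print(validate_metrics(["top", "activator", "top"]))  # ['top', 'activator']
```

Failing early on an unknown value keeps a typo in the config from silently producing an experiment with no logs collected.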
Summary
Extends multi-loader by collecting key logs from nodes in the cluster for the Knative platform. Users can optionally collect the following logs: Activator logs, Autoscaler logs, TOP metrics from all cluster nodes, and Prometheus snapshots.
Implementation Notes ⚒️
- Adds a `Metrics` field to the multi-loader config, accepting an array with any of the following values: `top`, `prometheus`, `activator`, `autoscaler`.
- Adds `MasterNode`, `ActivatorNode`, `AutoscalerNode`, and `WorkerNodes` to allow users to manually specify IPs instead of relying on multi-loader to determine them (mostly unnecessary in typical scenarios).
- Uses `kubectl` to automatically determine node IPs and classify them based on their roles.
- Collects Activator and Autoscaler logs from:
  - `/var/log/pods/knative-serving_activator-*/activator/*`
  - `/var/log/pods/knative-serving_autoscaler-*/autoscaler/*`
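The notes above do not say which `kubectl` command is used or how roles are decided, so the following is only a sketch under stated assumptions: it takes the JSON that `kubectl get pods -n knative-serving -o json` would print and picks the host IP of the pod that runs a container named `activator` or `autoscaler`. The function names `locate_knative_nodes` and `log_glob` are hypothetical; the fields `status.hostIP` and `spec.containers[].name` are standard Kubernetes Pod fields, and the glob mirrors the two log paths listed above.

```python
import json

# Roles this sketch knows how to locate (assumption: one node per role).
ROLES = ("activator", "autoscaler")


def locate_knative_nodes(pods_json):
    """Map each role to the host IP of the first pod running that container.

    pods_json is the text of a Kubernetes pod list in JSON form. Matching on
    the container name (not the pod name) avoids confusing the `autoscaler`
    pod with other pods whose names merely start with the same prefix.
    """
    found = {}
    for pod in json.loads(pods_json).get("items", []):
        host_ip = pod.get("status", {}).get("hostIP")
        containers = pod.get("spec", {}).get("containers", [])
        names = {c.get("name") for c in containers}
        for role in ROLES:
            if role in names and host_ip and role not in found:
                found[role] = host_ip
    return found


def log_glob(role):
    """Build the log path pattern for a role, following the paths in the notes."""
    return f"/var/log/pods/knative-serving_{role}-*/{role}/*"


if __name__ == "__main__":
    # Hand-written sample data; IPs and pod names are made up for illustration.
    sample = json.dumps({"items": [
        {"metadata": {"name": "activator-abc"},
         "spec": {"containers": [{"name": "activator"}]},
         "status": {"hostIP": "10.0.1.2"}},
        {"metadata": {"name": "autoscaler-hpa-def"},
         "spec": {"containers": [{"name": "autoscaler-hpa"}]},
         "status": {"hostIP": "10.0.1.4"}},
        {"metadata": {"name": "autoscaler-ghi"},
         "spec": {"containers": [{"name": "autoscaler"}]},
         "status": {"hostIP": "10.0.1.3"}},
    ]})
    print(locate_knative_nodes(sample))
    print(log_glob("activator"))
```

Keeping the parsing separate from the `kubectl` invocation makes the classification testable without a cluster; the manual `ActivatorNode` and `AutoscalerNode` overrides would simply bypass this step.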
External Dependencies 🍀
Breaking API Changes ⚠️