-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: lgen operator #204
feat: lgen operator #204
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly questions, am happy to approve if needed but was leaving the review to the folks more familiar with the k8s requirements.
}, | ||
EnvVar { | ||
name: "RUST_LOG".to_owned(), | ||
value: Some("info,keramik_runner=trace".to_owned()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really want to default to trace here? I don't think we actually have tracing yet, but that will probably cause a lot of logs/memory consumption 💸
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just for future iterations. Cause we need tracing one less thing to do I guess when we enable it. We just need to figure out what's causing high memory consumption in tracing.
For now this wont cause a memory consumption increase
} | ||
|
||
/// Start a controller for the LoadGenerator CRD. | ||
pub async fn run() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you see as the future of the simulation crd? will that likely go away? seems like there is a lot of duplication here, which is fine for now but we may want to think about how to share some things if we're going to keep both around (and/or add other variations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see load generation completely separate from simulations.
Simulations will be short lived, more flexible, produce burst traffic and offer way more flexibility and coverage in terms of use cases it covers.
Load generation is for steady non bursty load on the network.
I believe both can be used together as well. Have a steady load over the network, on top of that load we can run simulations to test out specific use-cases.
operator/src/lgen/job.rs
Outdated
metadata: Some(ObjectMeta { | ||
labels: Some(BTreeMap::from_iter(vec![( | ||
"name".to_owned(), | ||
"load-gen-job".to_owned(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we use the LOAD_GENERATOR_JOB_NAME
const for name in this file as well?
..Default::default() | ||
}), | ||
spec: Some(PodSpec { | ||
hostname: Some("job".to_owned()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with the hostname/subdomain settings here.. do we want something more specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is hard. We can probably be more specific here, load-gen-job sound better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubernetes uses the hostname and subdomain fields to create a fully qualified domain name (FQDN) for the pod. The FQDN will be of the form hostname.subdomain.namespace.svc.cluster.local
I feel given the subdomain, this should be fine. Any thoughts?
k8s/operator/manifests/operator.yaml
Outdated
@@ -87,7 +87,7 @@ apiVersion: apps/v1 | |||
kind: Deployment | |||
metadata: | |||
name: keramik-operator | |||
namespace: default | |||
namespace: keramik |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was intentionally default. IIUC, the operator doesn't itself run in the keramik
namespace, though @nathanielc might have a better idea if it matters much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor comments from my side. LGTM!
job::{job_spec, JobConfig, JobImageConfig}, | ||
spec::{LoadGenerator, LoadGeneratorState}, | ||
}, | ||
simulation::controller::monitoring_ready, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be moved somewhere common? Maybe it's ok but feels odd to pull in simulation-related code in lgen.
pub mod job; | ||
/// Spec module for creating load generator specs. | ||
#[cfg(feature = "controller")] | ||
pub mod spec; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want to also add these couple of lines at the end? Very minor, just for consistency with the other mods:
#[cfg(feature = "controller")]
pub use controller::run;
operator/src/main.rs
Outdated
@@ -38,7 +38,8 @@ async fn main() -> Result<()> { | |||
Command::Daemon => { | |||
tokio::join!( | |||
keramik_operator::network::run(), | |||
keramik_operator::simulation::run() | |||
keramik_operator::simulation::run(), | |||
keramik_operator::lgen::controller::run(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if you also export run
, you'll be able to use the same syntax as the other modes.
Features in this pr :
Tests :
Test 1 : Create a load generation and apply it.
Result : Success - Was able to generate load successfully against a ceramic node. Model got indexed on the ceramic peer.