-
Notifications
You must be signed in to change notification settings - Fork 144
Guest tracing improvements to use tracing
crate
#844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- This feature is not used separate from the mem_profile - All the unwind logic is now gated by mem_profile Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- The guest side does not use this type of OutBAction - The stack unwinding is done either way when the mem_profile feature is enabled Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- This helps with keeping code separate and easily gating it out Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- This steps cleans up codebase for the new way of tracing guests - The current method involves custom macros and logic that are not the best for maintainability Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- Define a separate struct that holds the functionality related to memory profiling of the guest
…ents - This helps us have control over the events fields and store what we care Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- The `heapless` definitions for spans and events are useful for memory footprint control of the guest Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- Parse the spans and events coming from the guest and create corresponding spans and events from the host that mimics a single call from host Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- conditionally handle logs either through tracing or the dedicated VM exit based on whether tracing is initialized on the guest Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
tracing
crate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a high level review, and what I've seen looks good :-)
@@ -1094,15 +1095,9 @@ impl Hypervisor for HypervWindowsDriver { | |||
} | |||
|
|||
#[cfg(feature = "trace_guest")] | |||
fn read_trace_reg(&self, reg: TraceRegister) -> Result<u64> { | |||
fn read_trace_reg(&self) -> Result<X86_64Regs> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be read_regs(&self)
?
} | ||
#[cfg(feature = "trace_guest")] | ||
fn trace_info_as_mut(&mut self) -> &mut TraceInfo { | ||
fn trace_info_mut(&mut self) -> &mut TraceInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<rant>
Ah, I never understand this pattern, why isn't trace_info directly a public member?
But we have that all over the codebase, so... it's ok here
</rant>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in this case, I need to retrieve the TraceInfo
through the &dyn Hypervisor
which doesn't know about the KvmDriver
's member field.
I don't like it either, to be honest, I think this will look a lot better when we unify the Hypervisors into a single struct. A lot of duplicated code will go also
@@ -11,7 +11,6 @@ members = [ | |||
"src/hyperlight_host", | |||
"src/hyperlight_guest_capi", | |||
"src/hyperlight_guest_tracing", | |||
"src/hyperlight_guest_tracing_macro", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep it around, but as deprecated, shimming the new API? so that we don't break anyone who was already using this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think this is the way to go. But I don't think we'll ever use it again
src/hyperlight_common/src/outb.rs
Outdated
#[cfg(feature = "trace_guest")] | ||
104 => Ok(OutBAction::TraceRecord), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't though about this before, but, does this mean that a guest compiled with an older hyperlight_guest doesn't run in a newer hyperlight_host?
We should think about this more carefully, not particularly in this PR, but I think in general we haven't been too diligent on paying attention to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created #845 to track that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and in this case, Ideally we should keep that 104
value reserved
HyperlightExit::Halt() | ||
| HyperlightExit::IoOut(_, _, _, _) | ||
| HyperlightExit::Mmio(_) => { | ||
// If the result is not a halt, io out, mmio or debug exit, we need to process the trace batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment wrong? the logic wrong? or my understanding wrong?
// If the result is not a halt, io out, mmio or debug exit, we need to process the trace batch | |
// If the result is a halt, io out, mmio or debug exit, we need to process the trace batch |
HyperlightExit::Halt() | ||
| HyperlightExit::IoOut(_, _, _, _) | ||
| HyperlightExit::Mmio(_) => { | ||
// If the result is not a halt, io out, mmio or debug exit, we need to process the trace batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
HyperlightExit::Halt() | ||
| HyperlightExit::IoOut(_, _, _, _) | ||
| HyperlightExit::Mmio(_) => { | ||
// If the result is not a halt, io out, mmio or debug exit, we need to process the trace batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
Description
This PR closes #723, #704 and partially addresses #318.
These changes modify the way we perform guest tracing to use the
tracing
crate and its macros (instrument
, trace).How it works
Guest
What makes this possible is the implementation of the
Subscriber
trait in thehyperlight-guest-tracing
crate. By implementing it, we can now handle the capturing of spans and events and choose how to store them and when to export them to the host.The
GuestSubscriber
type that implementsSubscriber
keeps an internalTraceState
that holds all the needed information.Whenever a new span is created, entered or exited, a callback on the subscriber is called so that we can handle the functionality. The same happens with the events also.
Each time a new span or event is added to the internal state, we check whether the buffer got full and send them to the host to process.
Host
When the host detects a VM exit from the guest, it checks whether it contains tracing information in the
OutB
instruction.When tracing information is found, the host starts going through it and check against the local storage of spans.
The spans parents are set based on the information got from the host.
TODO
These do not correctly set the parents of the spans created in the host to the last one created in the guest before doing the VM exit
I need to find a way to propagate the context into the guest and back whenever it is needed. But using the Opentelemetry propagators needs
std
support which we do not have in the guest.Jaeger picture of a Guest call that calls back into the host