-
Notifications
You must be signed in to change notification settings - Fork 1.6k
enhance: browser_som_screenshot #3059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughReplaces manual SVG overlay with a labeler-driven SOM injection path, adds hierarchy-based clickable filtering and a fullVisualMode sentinel, centralizes snapshot retrieval via getSnapshotForAction, wires fullVisualMode through Python↔TS, improves TS server readiness/logging, and tightens popup/DOM-stability and click/new-tab handling. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User
participant PyToolkit as PythonToolkit
participant TSServer
participant Session
participant Page
rect rgba(230,240,255,0.35)
note over User,Page: Action flow (fullVisualMode-aware)
User->>PyToolkit: browser_click / visit_page / ...
PyToolkit->>TSServer: send WS command (includes fullVisualMode)
TSServer->>Session: performAction(...)
Session->>Session: getSnapshotForAction(viewportLimit)
alt fullVisualMode = true
Session-->>TSServer: snapshot = "full visual mode" sentinel
else
Session->>Page: collect page state / compute snapshot
Session-->>TSServer: snapshot (real)
end
TSServer-->>PyToolkit: { result, snapshot, tabs... }
PyToolkit-->>User: structured response
end
sequenceDiagram
autonumber
participant Session
participant Page
participant Parser as HierarchyParser
participant Overlay as SomScreenshotInjected
rect rgba(235,255,235,0.35)
note over Session,Overlay: SOM overlay generation path
Session->>Page: fetch page + AI snapshot
Session->>Parser: filterClickableByHierarchy(snapshotText, clickables)
Parser-->>Session: filteredClickableSet
Session->>Overlay: captureOptimized(page, snapshotResult, filteredClickableSet, undefined)
Overlay-->>Session: { imageDataURL, timing, markedElements }
end
sequenceDiagram
autonumber
participant Session
participant Page
participant NewTab as PopupPage
rect rgba(255,245,230,0.35)
note over Session,NewTab: Click with popup/new-tab handling and DOM stability wait
Session->>Page: performClick(target)
alt popup/new-tab detected
Session->>NewTab: wait for DOMContentLoaded
par
Session->>NewTab: waitForDOMStability(~100ms quiet)
Session->>NewTab: wait for network idle
and
Session->>NewTab: detect content/interactive elements
end
else
Session->>Page: waitForDOMStability / stability checks
end
Session-->>Session: continue workflow
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (12)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (6)
4-11
: Remove or utilizeisClickable
to avoid dead fields.
isClickable
exists onElementWithCoordinates
but is not used anywhere in this module. Either remove it to avoid drift or use it to vary styling (e.g., color/opacity) for future non-clickable overlays.interface ElementWithCoordinates { ref: string; x: number; y: number; width: number; height: number; - isClickable: boolean; }
85-127
: Label bounds use a fixed width; long refs will overflow and occupancy checks become inaccurate.Both the rendered background rect (lines 429–433) and the collision checks (lines 106–121) assume
LABEL_MIN_WIDTH
. For multi-character refs (common), text will overflow and overlap logic underestimates occupied space.Apply dynamic label width based on an approximate text width. This keeps occupancy and rendering consistent:
+ private readonly LABEL_MAX_WIDTH = 160; + private estimateTextWidth(text: string): number { + // Approximate 11px Arial at ~6px per char plus padding + const approx = this.LABEL_PADDING * 2 + Math.ceil(text.length * 6); + return Math.max(this.LABEL_MIN_WIDTH, Math.min(this.LABEL_MAX_WIDTH, approx)); + }- sortedElements.forEach(element => { + sortedElements.forEach(element => { + const labelWidth = this.estimateTextWidth(element.ref); // Try different positions within the element const positions = [ { x: element.x + this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT }, // Top-left - { x: element.x + element.width - this.LABEL_MIN_WIDTH - this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT }, // Top-right + { x: element.x + element.width - labelWidth - this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT }, // Top-right { x: element.x + this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-left - { x: element.x + element.width - this.LABEL_MIN_WIDTH - this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-right - { x: element.x + element.width / 2 - this.LABEL_MIN_WIDTH / 2, y: element.y + element.height / 2 } // Center + { x: element.x + element.width - labelWidth - this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-right + { x: element.x + element.width / 2 - labelWidth / 2, y: element.y + element.height / 2 } // Center ]; // Find first non-overlapping position for (const pos of positions) { const labelBounds = { x: pos.x, y: pos.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, + width: labelWidth, height: this.LABEL_HEIGHT };
132-145
: Carry dynamic widths into occupancy for external labels.When seeding
occupiedPositions
from existing labels, derive each label's actual width or you’ll underestimate collisions.- const occupiedPositions = existingLabels.map(label => ({ - x: label.x, - y: label.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - })); + const occupiedPositions = existingLabels.map(label => { + const w = this.estimateTextWidth(label.ref); + return { + x: label.x, + y: label.y - this.LABEL_HEIGHT, + width: w, + height: this.LABEL_HEIGHT + }; + });
205-231
: Naive O(n^2) grouping; also not transitive.Current grouping checks proximity only to the seed element. Elements A—B—C where A
B and BC, but not A~C, end up split. For dense clusters, this increases label clutter.Refactor to union-find/BFS over proximity graph:
- Build adjacency by threshold.
- BFS/DFS to collect connected components for groups.
I can provide a patch if you want to switch to a transitive grouping.
325-354
: Use dynamic width in external position search.
findBestExternalPosition
should use the measured width; otherwise left/right placements will be off-screen or overlapping.- const positions = [ - { x: element.x + element.width + 10, y: element.y + element.height / 2 }, // Right - { x: element.x - this.LABEL_MIN_WIDTH - 10, y: element.y + element.height / 2 }, // Left - { x: element.x + element.width / 2 - this.LABEL_MIN_WIDTH / 2, y: element.y - this.LABEL_HEIGHT - 10 }, // Top - { x: element.x + element.width / 2 - this.LABEL_MIN_WIDTH / 2, y: element.y + element.height + 10 + this.LABEL_HEIGHT } // Bottom - ]; + const w = this.estimateTextWidth(element.ref); + const positions = [ + { x: element.x + element.width + 10, y: element.y + element.height / 2 }, // Right + { x: element.x - w - 10, y: element.y + element.height / 2 }, // Left + { x: element.x + element.width / 2 - w / 2, y: element.y - this.LABEL_HEIGHT - 10 }, // Top + { x: element.x + element.width / 2 - w / 2, y: element.y + element.height + 10 + this.LABEL_HEIGHT } // Bottom + ]; for (const pos of positions) { const labelBounds = { x: pos.x, y: pos.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, + width: w, height: this.LABEL_HEIGHT };
383-389
: Consider a small collision margin to avoid labels abutting each other.Edges touching are currently considered “no overlap”. Adding a small margin improves legibility with dense layouts.
- private overlapsWithExisting( + private overlapsWithExisting( label: {x: number, y: number, width: number, height: number}, existing: Array<{x: number, y: number, width: number, height: number}> ): boolean { - return existing.some(existing => - !(label.x + label.width < existing.x || - existing.x + existing.width < label.x || - label.y + label.height < existing.y || - existing.y + existing.height < label.y) - ); + const m = 2; // margin in px + return existing.some(e => + !((label.x + label.width) <= (e.x - m) || + (e.x + e.width) <= (label.x - m) || + (label.y + label.height) <= (e.y - m) || + (e.y + e.height) <= (label.y - m)) + ); }camel/toolkits/hybrid_browser_toolkit/ts/src/hybrid-browser-toolkit.ts (6)
247-251
: Clickable filtering depends on[cursor=pointer]
heuristic; may miss truly interactive elements.Many clickable elements (e.g., anchors, buttons) don’t always emit
[cursor=pointer]
in the snapshot. You may under-mark elements.
- Consider alternative signals from the snapshot (role=button/link, tabindex, event listeners, tag name) to build the clickable set.
- Fall back to visibility + size thresholds when in doubt.
Would you like me to prototype an enriched parser?
256-266
: Clamp and skip degenerate rectangles after scaling.After scaling and clamping to the screenshot, width/height can drop to 0–1 px near edges. These produce invisible labels and can skew occupancy.
- const elements = interactiveElements.map(([ref, element]) => { + const elements = interactiveElements.map(([ref, element]) => { const coords = element.coordinates!; - return { + const x = Math.max(0, coords.x * scaleX); + const y = Math.max(0, coords.y * scaleY); + const width = Math.min(coords.width * scaleX, screenshotWidth - x); + const height = Math.min(coords.height * scaleY, screenshotHeight - y); + if (width < 1 || height < 1) { + return null; + } + return { ref, - x: Math.max(0, coords.x * scaleX), - y: Math.max(0, coords.y * scaleY), - width: Math.min(coords.width * scaleX, screenshotWidth - Math.max(0, coords.x * scaleX)), - height: Math.min(coords.height * scaleY, screenshotHeight - Math.max(0, coords.y * scaleY)), + x, + y, + width, + height, isClickable: true // All filtered elements are clickable }; - }); + }).filter(Boolean) as any;
268-274
: WrapsvgContent
in a grouping element for easier future styling/debug toggles.Not required, but grouping facilitates toggling overlay visibility or styles via a single
class
/id
.- const svgOverlay = ` - <svg width="${screenshotWidth}" height="${screenshotHeight}" xmlns="http://www.w3.org/2000/svg"> - ${svgContent} - </svg> - `; + const svgOverlay = ` + <svg width="${screenshotWidth}" height="${screenshotHeight}" xmlns="http://www.w3.org/2000/svg"> + <g id="camel-overlay" data-version="v2"> + ${svgContent} + </g> + </svg> + `;
153-158
: “marked visually” count can be inaccurate; align it with clickable filtering.You report all elements-with-coordinates, but later filter to clickable ones. The message can overstate how many were actually labeled.
- // Count elements with coordinates - const elementsWithCoords = Object.values(snapshotResult.elements).filter(el => el.coordinates).length; + // Count eligible elements (with coordinates and clickable hint) + const clickableSet = this.parseClickableElements(snapshotResult.snapshot); + const elementsWithCoords = Object.entries(snapshotResult.elements) + .filter(([ref, el]) => el.coordinates && clickableSet.has(ref)).length;Optionally, return the exact count from
addVisualMarksOptimized
to include overlap-removal effects.Do you want me to thread the exact labeled count back via a small return-structure change?
319-345
: Topmost detection relies on iteration order ofelements
; not DOM/stacking-aware.
removeOverlappedElements
assumes “later in array = on top”. IfsnapshotResult.elements
isn’t in DOM or stacking order, results will be flaky.
- If snapshot provides z-index / paint order, sort by it before filtering.
- Otherwise, consider retaining all overlapping clickable elements but de-emphasize ones that are fully covered (e.g., dashed outline) to avoid false negatives.
Can you confirm the ordering guarantees of
snapshotResult.elements
?
290-293
: Silent fallback on errors loses diagnostics; log minimal context in debug mode.Returning the raw screenshot is correct, but a guarded debug log helps triage in CI.
} catch (error) { // Error adding visual marks, falling back to original screenshot // Return original screenshot if marking fails + if (process.env.HYBRID_BROWSER_DEBUG === '1') { + console.warn('[VisualMarks] Failed to render overlay:', error); + } return screenshotBuffer; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/hybrid-browser-toolkit.ts
(2 hunks)camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: pytest_apps
- GitHub Check: pytest_examples
- GitHub Check: pytest_package_fast_test
- GitHub Check: pytest_package_llm_test
- GitHub Check: Minimal dependency check
- GitHub Check: pre-commit
🔇 Additional comments (3)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (2)
25-32
: Good encapsulation and clear, tunable constants.The class-based design and centralized constants make this easy to evolve and test. Solid foundation for the labeling system.
1-2
: Overall: strong improvement over inline mark generation.The centralization solves overlap and readability pain points we’ve seen. With the small fixes above (escaping, dynamic widths, group occupancy), this should be robust in the wild.
If you have
noUnusedLocals
enabled, confirm this compiles after removinggroupCenter
and adding the helper methods. I can add unit tests covering:
- Large vs small elements
- Group labels clamping and collision avoidance
- SVG escaping for tricky refs (e.g.,
<a>&"'</a>
)Would you like me to draft those tests?
camel/toolkits/hybrid_browser_toolkit/ts/src/hybrid-browser-toolkit.ts (1)
5-5
: Good modularization by delegating toScreenshotLabeler
.Importing the labeler keeps this file focused on orchestration and reduces duplication of SVG logic. Nice separation of concerns.
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
Outdated
Show resolved
Hide resolved
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
Outdated
Show resolved
Hide resolved
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
Outdated
Show resolved
Hide resolved
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (2)
185-209
: Good catch: group labels now clamp vertically and avoid collisionsThe added clamping and overlap checks for group labels address the earlier review’s concerns about off-screen rendering and collision in dense regions. Nice improvement.
Also applies to: 307-364
487-498
: Security: XML/SVG escaping is implemented correctly and background now scales with textEscaping refs and using dynamic background width closes SVG/XML injection vectors and improves legibility. Good job.
Also applies to: 500-538
🧹 Nitpick comments (3)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (3)
220-246
: Grouping heuristic is O(n^2) with a fixed 50px threshold; consider adaptive threshold or spatial indexingFor pages with many small elements, consider:
- Use a grid/spatial hash (bucket by, e.g., 64px) to reduce candidate pairs.
- Make “threshold” relative to LABEL_HEIGHT or viewport size to adapt across DPIs.
487-498
: Optional: Use isClickable to style/filter, or drop the property to avoid confusionYou define isClickable on ElementWithCoordinates but don’t use it. Either:
- Style non-clickables differently (stroke/label color), or
- Filter them out before labeling and remove the field.
435-441
: Nit: Consider integer rounding for crisper renderingRounding x/y/width/height to integers often produces sharper lines/text in browsers.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: pre-commit
- GitHub Check: pytest_examples
- GitHub Check: pytest_apps
- GitHub Check: pytest_package_fast_test
- GitHub Check: pytest_package_llm_test
🔇 Additional comments (1)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (1)
481-486
: Confirm consumers expect an SVG fragment (not a full element)generateSVG returns a fragment of SVG nodes. Verify the caller wraps it in an with proper width/height/viewBox. If the consumer expects a full SVG, export a container here.
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
Outdated
Show resolved
Hide resolved
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (1)
25-35
: Unify label geometry (dynamic width/height) and fix “Top” placement bug to prevent collisions and off-screen renderingPlacement uses fixed LABEL_MIN_WIDTH/LABEL_HEIGHT while rendering uses dynamic width (estimateTextWidth + padding). This mismatch causes missed overlap detection and labels rendering out of bounds. Also, findBestExternalPosition subtracts LABEL_HEIGHT for the “Top” baseline, pushing labels too far up. Centralize font/padding, use a single getLabelWidth helper everywhere, and clamp consistently.
Apply the following diffs:
@@ export class ScreenshotLabeler { private readonly MIN_ELEMENT_WIDTH = 40; // Minimum width for direct labeling private readonly MIN_ELEMENT_HEIGHT = 70; // Minimum height for direct labeling private readonly LABEL_PADDING = 4; private readonly LABEL_HEIGHT = 16; private readonly LABEL_MIN_WIDTH = 24; private readonly LINE_WIDTH = 3; // Increased line width private readonly LINE_COLOR = '#FF0066'; // Bright pink/red for high visibility private readonly ARROW_SIZE = 8; // Arrow head size private readonly MIN_LABEL_SPACING = 20; + // Centralized font settings to keep placement and rendering in sync + private readonly FONT_SIZE = 11; + private readonly FONT_FAMILY = 'Arial, sans-serif';@@ private createInternalLabels(elements: ElementWithCoordinates[]): LabelPosition[] { @@ - sortedElements.forEach(element => { - // Try different positions within the element + sortedElements.forEach(element => { + const w = this.getLabelWidth(element.ref); + // Try different positions within the element (baseline y for text) const positions = [ - { x: element.x + this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT }, // Top-left - { x: element.x + element.width - this.LABEL_MIN_WIDTH - this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT }, // Top-right - { x: element.x + this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-left - { x: element.x + element.width - this.LABEL_MIN_WIDTH - this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-right - { x: element.x + element.width / 2 - this.LABEL_MIN_WIDTH / 2, y: element.y + element.height / 2 } // Center + { x: element.x + this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT + this.LABEL_PADDING }, // Top-left + { x: element.x + element.width - w - this.LABEL_PADDING, y: element.y + this.LABEL_HEIGHT + this.LABEL_PADDING }, // Top-right + { x: element.x + this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-left + { x: element.x + element.width - w - this.LABEL_PADDING, y: element.y + element.height - this.LABEL_PADDING }, // Bottom-right + { x: element.x + element.width / 2 - w / 2, y: element.y + element.height / 2 } // Center ]; @@ - const labelBounds = { - x: pos.x, - y: pos.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - }; + const labelBounds = { + x: pos.x, + y: pos.y - this.LABEL_HEIGHT - this.LABEL_PADDING, + width: w, + height: this.LABEL_HEIGHT + this.LABEL_PADDING * 2 + };@@ private createExternalLabels( @@ - const occupiedPositions = existingLabels.map(label => ({ - x: label.x, - y: label.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - })); + const occupiedPositions = existingLabels.map(label => ({ + x: label.x, + y: label.y - this.LABEL_HEIGHT - this.LABEL_PADDING, + width: this.getLabelWidth(label.ref), + height: this.LABEL_HEIGHT + this.LABEL_PADDING * 2 + })); @@ - if (labelPos) { + if (labelPos) { labels.push({ ref: element.ref, x: labelPos.x, y: labelPos.y, lineToElement: (() => { - const actualLabelWidth = this.calculateLabelWidth(element.ref); - const padding = 4; + const actualLabelWidth = this.getLabelWidth(element.ref); + const padding = this.LABEL_PADDING; const connections = this.getBestConnectionPoints( labelPos.x - padding, // Account for padding labelPos.y, actualLabelWidth, - this.LABEL_HEIGHT + 4, // Account for vertical padding + this.LABEL_HEIGHT + this.LABEL_PADDING * 2, // Account for vertical padding element.x, element.y, element.width, element.height ); return { x1: connections.start.x, y1: connections.start.y, x2: connections.end.x, y2: connections.end.y }; })() }); @@ - occupiedPositions.push({ - x: labelPos.x, - y: labelPos.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - }); + const w = this.getLabelWidth(element.ref); + occupiedPositions.push({ + x: labelPos.x, + y: labelPos.y - this.LABEL_HEIGHT - this.LABEL_PADDING, + width: w, + height: this.LABEL_HEIGHT + this.LABEL_PADDING * 2 + });@@ - // Clamp Y position to screen bounds - const clampedY = Math.max(this.LABEL_HEIGHT, Math.min(screenshotHeight, label.y)); + // Clamp Y position to screen bounds (respect padding) + const clampedY = Math.max(this.LABEL_HEIGHT + this.LABEL_PADDING, Math.min(screenshotHeight - this.LABEL_PADDING, label.y)); label.y = clampedY; @@ - const labelBounds = { - x: label.x, - y: label.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - }; + const labelBounds = { + x: label.x, + y: label.y - this.LABEL_HEIGHT - this.LABEL_PADDING, + width: this.getLabelWidth(label.ref), + height: this.LABEL_HEIGHT + this.LABEL_PADDING * 2 + };@@ - // Check if right side placement would go off-screen - if (labelX + this.LABEL_MIN_WIDTH > screenshotWidth) { + // Check if right side placement would go off-screen + const maxW = Math.max(...group.map(e => this.getLabelWidth(e.ref))); + if (labelX + maxW > screenshotWidth) { // Try left side instead - const leftX = groupBounds.x - this.LABEL_MIN_WIDTH - 20; + const leftX = groupBounds.x - maxW - 20; if (leftX >= 0) { labelX = leftX; useLeftSide = true; } else { // Can't place on either side return labels; } } @@ - const labelBounds = { - x: labelX, - y: labelYPosition - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - }; + const w = this.getLabelWidth(sortedGroup[i].ref); + const labelBounds = { + x: labelX, + y: labelYPosition - this.LABEL_HEIGHT - this.LABEL_PADDING, + width: w, + height: this.LABEL_HEIGHT + this.LABEL_PADDING * 2 + }; @@ - const actualLabelWidth = this.calculateLabelWidth(sortedGroup[i].ref); - const padding = 4; + const actualLabelWidth = this.getLabelWidth(sortedGroup[i].ref); + const padding = this.LABEL_PADDING; const connections = this.getBestConnectionPoints( labelX - padding, // Account for padding labelYPosition, actualLabelWidth, - this.LABEL_HEIGHT + 4, // Account for vertical padding + this.LABEL_HEIGHT + this.LABEL_PADDING * 2, // Account for vertical padding sortedGroup[i].x, sortedGroup[i].y, sortedGroup[i].width, sortedGroup[i].height );@@ - if (!foundValidPosition) { + if (!foundValidPosition) { sortedGroup.forEach((element, index) => { const labelYPosition = Math.max(this.LABEL_HEIGHT, Math.min(screenshotHeight, labelY + (index * this.MIN_LABEL_SPACING))); - - const actualLabelWidth = this.calculateLabelWidth(element.ref); - const padding = 4; + const actualLabelWidth = this.getLabelWidth(element.ref); + const padding = this.LABEL_PADDING; const connections = this.getBestConnectionPoints( labelX - padding, // Account for padding labelYPosition, actualLabelWidth, - this.LABEL_HEIGHT + 4, // Account for vertical padding + this.LABEL_HEIGHT + this.LABEL_PADDING * 2, // Account for vertical padding element.x, element.y, element.width, element.height );@@ private findBestExternalPosition( @@ - ): {x: number, y: number} | null { - // Try positions around the element - const positions = [ - { x: element.x + element.width + 10, y: element.y + element.height / 2 }, // Right - { x: element.x - this.LABEL_MIN_WIDTH - 10, y: element.y + element.height / 2 }, // Left - { x: element.x + element.width / 2 - this.LABEL_MIN_WIDTH / 2, y: element.y - this.LABEL_HEIGHT - 10 }, // Top - { x: element.x + element.width / 2 - this.LABEL_MIN_WIDTH / 2, y: element.y + element.height + 10 + this.LABEL_HEIGHT } // Bottom - ]; + ): {x: number, y: number} | null { + const w = this.getLabelWidth(element.ref); + // Try positions around the element + const positions = [ + { x: element.x + element.width + 10, y: element.y + element.height / 2 }, // Right + { x: element.x - w - 10, y: element.y + element.height / 2 }, // Left + { x: element.x + element.width / 2 - w / 2, y: element.y - 10 }, // Top (baseline at gap above element) + { x: element.x + element.width / 2 - w / 2, y: element.y + element.height + 10 + this.LABEL_HEIGHT } // Bottom + ]; @@ - const labelBounds = { - x: pos.x, - y: pos.y - this.LABEL_HEIGHT, - width: this.LABEL_MIN_WIDTH, - height: this.LABEL_HEIGHT - }; + const labelBounds = { + x: pos.x, + y: pos.y - this.LABEL_HEIGHT - this.LABEL_PADDING, + width: w, + height: this.LABEL_HEIGHT + this.LABEL_PADDING * 2 + };@@ - /** - * Calculate actual label width including padding - */ - private calculateLabelWidth(ref: string): number { - const padding = 4; - const textWidth = this.estimateTextWidth(ref, 11, 'Arial, sans-serif'); - return Math.max(this.LABEL_MIN_WIDTH, textWidth + padding * 2); - } + /** + * Compute final label box width used consistently in placement and rendering + */ + private getLabelWidth(ref: string): number { + const textWidth = this.estimateTextWidth(ref, this.FONT_SIZE, this.FONT_FAMILY); + return Math.max(this.LABEL_MIN_WIDTH, textWidth + this.LABEL_PADDING * 2); + }@@ - const textColor = '#0096FF'; - const fontSize = 11; - const fontFamily = 'Arial, sans-serif'; - const padding = 4; + const textColor = '#0096FF'; @@ - // Calculate dynamic width based on text content - const textWidth = this.estimateTextWidth(label.ref, fontSize, fontFamily); - const labelWidth = Math.max(this.LABEL_MIN_WIDTH, textWidth + padding * 2); + // Dynamic width based on text content and padding + const labelWidth = this.getLabelWidth(label.ref); @@ - <rect x="${label.x - padding}" y="${label.y - this.LABEL_HEIGHT - 2}" - width="${labelWidth}" height="${this.LABEL_HEIGHT + 4}" + <rect x="${label.x - this.LABEL_PADDING}" y="${label.y - this.LABEL_HEIGHT - this.LABEL_PADDING}" + width="${labelWidth}" height="${this.LABEL_HEIGHT + this.LABEL_PADDING * 2}" fill="white" opacity="0.9" rx="2"/> - <text x="${label.x}" y="${label.y}" font-family="${fontFamily}" - font-size="${fontSize}" fill="${textColor}" font-weight="bold">${escapedRef}</text> + <text x="${label.x}" y="${label.y}" font-family="${this.FONT_FAMILY}" + font-size="${this.FONT_SIZE}" fill="${textColor}" font-weight="bold">${escapedRef}</text>Why this matters:
- Eliminates geometry drift between placement and rendering.
- Prevents false-negative overlap checks and off-screen label backgrounds.
- Corrects the “Top” position baseline so labels don’t jump above the viewport.
Please sanity-check on a dense page (e.g., a form with many short labels). Ensure no labels overlap and none render off-screen when “Top” is chosen.
Also applies to: 97-127, 142-147, 164-196, 205-221, 316-327, 351-356, 401-432, 447-468, 656-676, 591-598
🧹 Nitpick comments (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (2)
332-400
: Group-column placement: consider left-side early and clamp the whole column rather than per-label retriesYou already try column shifts, but you clamp per-label and stop on first failure. A column-level clamp (ensuring first and last label stay within [LABEL_HEIGHT+PADDING, screenshotHeight-PADDING]) reduces attempts and avoids jitter. If the right column is tight against the edge, try left before entering shift attempts.
If helpful, I can propose a compact column-fit function that returns the best shift or null.
238-264
: Grouping heuristic might under-cluster dense UIsA fixed threshold=50 px may under-cluster on high-DPI screenshots or oversized canvases. Consider scaling threshold by screenshot dimensions (e.g., min(width,height)*0.03) or by median element size for more robust grouping.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/hybrid-browser-toolkit.ts
(3 hunks)camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- camel/toolkits/hybrid_browser_toolkit/ts/src/hybrid-browser-toolkit.ts
🔇 Additional comments (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts (2)
609-621
: LGTM: element overlaysBlue rectangle styling for element bounds is consistent and readable. Rounded corners and 2px stroke are a good balance.
496-553
: LGTM: connection point selectiongetBestConnectionPoints is clear and correct: computes candidate edge centers and picks the shortest pair. Works well with the earlier change to pass dynamic labelWidth/Height.
camel/toolkits/hybrid_browser_toolkit/ts/src/screenshot-labeler.ts
Outdated
Show resolved
Hide resolved
@coderabbitai review |
✅ Actions performedReview triggered.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (3)
48-53
: Avoid logging secrets/PII (mask sensitive args in action_logger).Currently, keystrokes (e.g.,
text
) and other sensitive inputs are logged verbatim.@@ - # Log inputs (skip self) - inputs = { - "args": args, - "kwargs": kwargs, - } + # Log inputs (skip self) — mask sensitive fields + SENSITIVE_KEYS = { + "text", "password", "passwd", "token", "api_key", "apikey", + "authorization", "auth", "secret", "cookie", "session" + } + masked_kwargs = { + k: ("***" if isinstance(k, str) and k.lower() in SENSITIVE_KEYS else v) + for k, v in kwargs.items() + } + inputs = {"args": args, "kwargs": masked_kwargs} @@ - await self._log_action( + await self._log_action( action_name=action_name, inputs=inputs, outputs=result, execution_time=execution_time, page_load_time=page_load_time, ) @@ - await self._log_action( + await self._log_action( action_name=action_name, inputs=inputs, outputs=None, execution_time=execution_time, error=error_msg, )Also applies to: 65-71, 80-86
121-146
: Sanitize session_id in log filenames to prevent path/name injection.Use a safe slug when composing paths.
@@ - self.session_id = (config or {}).get('session_id', 'default') + self.session_id = (config or {}).get('session_id', 'default') + # restrict to [A-Za-z0-9_-] + session_id_safe = ''.join( + (c if (c.isalnum() or c in ('-', '_')) else '_') for c in self.session_id + ) @@ - self.log_file_path = os.path.join( + self.log_file_path = os.path.join( log_dir, - f"hybrid_browser_toolkit_ws_{timestamp}_{self.session_id}.log", + f"hybrid_browser_toolkit_ws_{timestamp}_{session_id_safe}.log", ) @@ - self.ts_log_file_path = os.path.join( + self.ts_log_file_path = os.path.join( log_dir, - f"typescript_console_{timestamp}_{self.session_id}.log", + f"typescript_console_{timestamp}_{session_id_safe}.log", )
414-421
: Make ping bounded; don't hang on dead connections.Bound the health check with a timeout.
- await self.websocket.ping() + await asyncio.wait_for(self.websocket.ping(), timeout=5.0)
♻️ Duplicate comments (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts (1)
124-130
: Remove dataset-specific hard-coded pair debugging.Hard-coding ref IDs leaks dataset artifacts and creates noisy logs. Use the DEBUG flag if you must log, but never bake specific refs.
Apply this diff:
- // Debug specific pairs - if ((parentRef === 'e296' && childRef === 'e297') || - (parentRef === 'e361' && childRef === 'e363') || - (parentRef === 'e371' && childRef === 'e373') || - (parentRef === 'e344' && childRef === 'e346') || - (parentRef === 'e348' && childRef === 'e350')) { - console.log(`[Debug] Found pair: ${parentRef} (${parentNode.type}) -> ${childRef} (${childNode.type})`); - }camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
426-431
: Boolean coercion for onclick is correct now.Prevents string|boolean leakage into later logic.
🧹 Nitpick comments (9)
camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts (2)
154-159
: Gate debug logs behind DEBUG_SNAPSHOT_PARSER.Unconditional console logging will spam consumers. Respect the existing DEBUG flag.
Apply this diff:
- filtered.delete(child); - console.log(`[Hierarchy Filter] Filtered ${child} (${childType}) - keeping parent ${parent} (${parentType})`); + filtered.delete(child); + if (DEBUG_SNAPSHOT_PARSER) { + console.log(`[Hierarchy Filter] Filtered ${child} (${childType}) - keeping parent ${parent} (${parentType})`); + } @@ - filtered.delete(parent); - console.log(`[Hierarchy Filter] Filtered ${parent} (${parentType}) - keeping child ${child} (${childType})`); + filtered.delete(parent); + if (DEBUG_SNAPSHOT_PARSER) { + console.log(`[Hierarchy Filter] Filtered ${parent} (${parentType}) - keeping child ${child} (${childType})`); + } @@ - filtered.delete(childRef); - console.log(`[Hierarchy Filter] Filtered ${childRef} (${childType}) contained in ${currentParent} (${parentType})`); + filtered.delete(childRef); + if (DEBUG_SNAPSHOT_PARSER) { + console.log(`[Hierarchy Filter] Filtered ${childRef} (${childType}) contained in ${currentParent} (${parentType})`); + } @@ - filtered.delete(ref); - console.log(`[Hierarchy Filter] Filtered ${ref} (generic wrapper around button ${clickableChildren[0]})`); + filtered.delete(ref); + if (DEBUG_SNAPSHOT_PARSER) { + console.log(`[Hierarchy Filter] Filtered ${ref} (generic wrapper around button ${clickableChildren[0]})`); + }Also applies to: 195-197, 221-222
90-90
: Remove unused variabledebugInfo
.It’s never read.
Apply this diff:
- const debugInfo: any[] = [];
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (3)
248-260
: Snapshot parsing helper: broaden role regex and normalize case.Align with other modules that allow hyphens/digits and use lower-case.
Apply this diff:
- const typeMatch = line.match(/^\s*-?\s*([\w-]+)/); - const role = typeMatch ? typeMatch[1] : undefined; + const typeMatch = line.match(/^\s*-?\s*([a-z0-9_-]+)/i); + const role = typeMatch ? typeMatch[1].toLowerCase() : undefined;
885-909
: Use evaluate instead of evaluateHandle and drop unused local; tiny leak/readability.No need for a handle; also remove unused
lastChangeTime
.Apply this diff:
- const startTime = Date.now(); - const stabilityThreshold = 100; // Consider stable if no changes for 100ms - let lastChangeTime = Date.now(); + const startTime = Date.now(); + const stabilityThreshold = 100; // Consider stable if no changes for 100ms @@ - // Monitor DOM changes - await page.evaluateHandle(() => { + // Monitor DOM changes + await page.evaluate(() => { let changeCount = 0; (window as any).__domStabilityCheck = { changeCount: 0, lastChange: Date.now() }; @@ - (window as any).__domStabilityObserver = observer; - }); + (window as any).__domStabilityObserver = observer; + }); @@ - await page.evaluate(() => { + await page.evaluate(() => { const observer = (window as any).__domStabilityObserver; if (observer) observer.disconnect(); delete (window as any).__domStabilityObserver; delete (window as any).__domStabilityCheck; }).catch(() => {});Also applies to: 927-935
1269-1275
: Guard against non-positive scrollPositionScale.A scale of 0 breaks viewport math. Normalize to > 0.
Apply this diff:
- const scale = Number.isFinite(browserConfig.scrollPositionScale) - ? browserConfig.scrollPositionScale - : 1; + const rawScale = Number(browserConfig.scrollPositionScale); + const scale = Number.isFinite(rawScale) && rawScale > 0 ? rawScale : 1;camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (4)
369-373
: Harden JSON logging for non-serializable objects.Use
default=str
to avoid crashes on exotic types.- f.write( - json.dumps(log_entry, ensure_ascii=False, indent=2) + '\n' - ) + f.write(json.dumps( + log_entry, ensure_ascii=False, indent=2, default=str + ) + '\n')
721-731
: Ruff SIM115: persistent file handle is intentional; annotate.Since the handle must stay open, add a
noqa
to silence the false positive.- self.ts_log_file = open( - self.ts_log_file_path, 'w', encoding='utf-8' - ) + self.ts_log_file = open( # noqa: SIM115 - persistent handle across loop + self.ts_log_file_path, 'w', encoding='utf-8' + )
539-549
: Type hint correctness for _ensure_ref_prefix.Function returns
None
whenref
is falsy; update hints accordingly.- def _ensure_ref_prefix(self, ref: str) -> str: + def _ensure_ref_prefix(self, ref: Optional[str]) -> Optional[str]: @@ - if not ref: - return ref + if not ref: + return ref
156-170
: Use shutil.which for tool checks (faster, style-aligned).Avoid spawning processes just to test presence.
+ import shutil - # Check if npm is installed - npm_check = subprocess.run( - ['npm', '--version'], - capture_output=True, - text=True, - ) - if npm_check.returncode != 0: + # Check if npm is installed + if shutil.which('npm') is None: raise RuntimeError( "npm is not installed or not in PATH. " "Please install Node.js and npm from https://nodejs.org/ " "to use the hybrid browser toolkit." ) @@ - # Check if node is installed - node_check = subprocess.run( - ['node', '--version'], - capture_output=True, - text=True, - ) - if node_check.returncode != 0: + # Check if node is installed + if shutil.which('node') is None: raise RuntimeError( "node is not installed or not in PATH. " "Please install Node.js from https://nodejs.org/ " "to use the hybrid browser toolkit." )Also applies to: 171-183
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
(10 hunks)camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts
(1 hunks)camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
(7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the python code against the google python style guide and point out any mismatches.
Files:
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
🪛 Ruff (0.12.2)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
722-722: Use a context manager for opening files
(SIM115)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: pytest_package_fast_test
- GitHub Check: pytest_package_llm_test
- GitHub Check: pytest_examples
- GitHub Check: pytest_apps
- GitHub Check: pre-commit
🔇 Additional comments (3)
camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts (1)
51-55
: Indent-based parent resolution: confirm indent semantics.Using “>= indent” pops siblings and any ancestor at the same indentation. Ensure the snapshot emitter uses fixed-width indentation (e.g., 2 spaces per level). If tabs or mixed widths appear, consider normalizing indentation to levels.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (2)
67-75
: Applying stealth headers/UA for CDP contexts looks good.Keeps behavior consistent across connection modes.
137-145
: Applying stealth headers/UA for launched contexts looks good.Parity with CDP path maintained.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
Outdated
Show resolved
Hide resolved
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
Outdated
Show resolved
Hide resolved
camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts
Outdated
Show resolved
Hide resolved
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (4)
250-255
: Cleanup on start-time failure is incomplete (log reader/task/file leak).If server readiness times out, only the process is killed; the log-reader task remains and the file handle may stay open. Cancel the reader, await it, close the file, and clear handles.
Apply this diff:
if not server_ready: - self.process.kill() - raise RuntimeError( - "WebSocket server failed to start within timeout" - ) + self.process.kill() + with contextlib.suppress(Exception): + # Ensure the process fully exits + self.process.wait(timeout=2) + # Cancel and await the log reader task + if self._log_reader_task and not self._log_reader_task.done(): + self._log_reader_task.cancel() + with contextlib.suppress(asyncio.CancelledError): + await self._log_reader_task + # Close TS log file if open + if getattr(self, 'ts_log_file', None): + with contextlib.suppress(Exception): + self.ts_log_file.close() + self.ts_log_file = None + self.process = None + raise RuntimeError("WebSocket server failed to start within timeout")
265-269
: Also cleanup when WebSocket connect fails after process start.Mirror the above cleanup to avoid a dangling reader and file handle if connect throws.
Apply this diff:
except Exception as e: - self.process.kill() - raise RuntimeError( - f"Failed to connect to WebSocket server: {e}" - ) from e + self.process.kill() + with contextlib.suppress(Exception): + self.process.wait(timeout=2) + if self._log_reader_task and not self._log_reader_task.done(): + self._log_reader_task.cancel() + with contextlib.suppress(asyncio.CancelledError): + await self._log_reader_task + if getattr(self, 'ts_log_file', None): + with contextlib.suppress(Exception): + self.ts_log_file.close() + self.ts_log_file = None + self.process = None + raise RuntimeError(f"Failed to connect to WebSocket server: {e}") from e
420-427
: Bound ping to a short timeout to avoid indefinite await.Awaiting
ping()
can hang if the server never responds. Useasyncio.wait_for
.Apply this diff:
- # Check if connection is still alive - try: - # Send a ping to check connection - await self.websocket.ping() + # Check if connection is still alive + try: + # Send a ping to check connection (bounded wait) + await asyncio.wait_for(self.websocket.ping(), timeout=5.0)
722-786
: Open the TS log file with a context manager and simplify finalization.Prevents accidental leaks and matches linter guidance.
Apply this diff:
- async def _read_and_log_output(self): + async def _read_and_log_output(self): """Read stdout from Node.js process & handle SERVER_READY + logging.""" if not self.process: return try: - if self.ts_log_file_path: - self.ts_log_file = open( - self.ts_log_file_path, 'w', encoding='utf-8' - ) - self.ts_log_file.write( - f"TypeScript Console Log - Started at " - f"{time.strftime('%Y-%m-%d %H:%M:%S')}\n" - ) - self.ts_log_file.write("=" * 80 + "\n") - self.ts_log_file.flush() - - while self.process and self.process.poll() is None: - try: - line = await asyncio.get_running_loop().run_in_executor( - None, self.process.stdout.readline - ) - if not line: # EOF - break - - # Check for SERVER_READY message - if line.startswith('SERVER_READY:'): - try: - self.server_port = int( - line.split(':', 1)[1].strip() - ) - logger.info( - f"WebSocket server ready on port " - f"{self.server_port}" - ) - if ( - self._server_ready_future - and not self._server_ready_future.done() - ): - self._server_ready_future.set_result(True) - except (ValueError, IndexError) as e: - logger.error(f"Failed to parse SERVER_READY: {e}") - - # Write all output to log file - if self.ts_log_file: - timestamp = time.strftime('%H:%M:%S') - self.ts_log_file.write(f"[{timestamp}] {line}") - self.ts_log_file.flush() - - except Exception as e: - logger.warning(f"Error reading stdout: {e}") - break - - except Exception as e: - logger.warning(f"Error in _read_and_log_output: {e}") - finally: - if self.ts_log_file: - self.ts_log_file.write("\n" + "=" * 80 + "\n") - self.ts_log_file.write( - f"TypeScript Console Log - Ended at " - f"{time.strftime('%Y-%m-%d %H:%M:%S')}\n" - ) - self.ts_log_file.close() - self.ts_log_file = None + import contextlib + with contextlib.ExitStack() as stack: + if self.ts_log_file_path: + self.ts_log_file = stack.enter_context( + open(self.ts_log_file_path, 'w', encoding='utf-8') + ) + self.ts_log_file.write( + f"TypeScript Console Log - Started at " + f"{time.strftime('%Y-%m-%d %H:%M:%S')}\n" + ) + self.ts_log_file.write("=" * 80 + "\n") + self.ts_log_file.flush() + + while self.process and self.process.poll() is None: + try: + line = await asyncio.get_running_loop().run_in_executor( + None, self.process.stdout.readline + ) + if not line: # EOF + break + + # Check for SERVER_READY message + if line.startswith('SERVER_READY:'): + try: + self.server_port = int( + line.split(':', 1)[1].strip() + ) + logger.info( + f"WebSocket server ready on port " + f"{self.server_port}" + ) + if ( + self._server_ready_future + and not self._server_ready_future.done() + ): + self._server_ready_future.set_result(True) + except (ValueError, IndexError) as e: + logger.error(f"Failed to parse SERVER_READY: {e}") + + # Write all output to log file + if self.ts_log_file: + timestamp = time.strftime('%H:%M:%S') + self.ts_log_file.write(f"[{timestamp}] {line}") + self.ts_log_file.flush() + + except Exception as e: + logger.warning(f"Error reading stdout: {e}") + break + + # Footer if we had a file + if self.ts_log_file: + self.ts_log_file.write("\n" + "=" * 80 + "\n") + self.ts_log_file.write( + f"TypeScript Console Log - Ended at " + f"{time.strftime('%Y-%m-%d %H:%M:%S')}\n" + ) + # ExitStack closes file; clear handle + self.ts_log_file = None + except Exception as e: + logger.warning(f"Error in _read_and_log_output: {e}")camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
342-371
: Timing fields are misleading: coordinate enrichment is reported as 0.You integrated coordinate calc into mappingTime but set coordinate_enrichment_time_ms to 0. Split the timings.
Apply this diff:
- const mappingTime = Date.now() - mappingStart; + const baseMappingTime = Date.now() - mappingStart; + let coordinateTime = 0; @@ - return { + return { snapshot: finalSnapshot, elements: finalElements, metadata: { elementCount: Object.keys(finalElements).length, url: page.url(), timestamp: new Date().toISOString(), }, timing: { total_time_ms: totalTime, snapshot_time_ms: snapshotTime, - coordinate_enrichment_time_ms: 0, // Integrated into mapping - aria_mapping_time_ms: mappingTime, + coordinate_enrichment_time_ms: coordinateTime, + aria_mapping_time_ms: baseMappingTime, }, };And above the includeCoordinates block, measure coordinate time:
- if (includeCoordinates) { + if (includeCoordinates) { + const coordStart = Date.now(); // Get coordinates for each ref using aria-ref selector for (const ref of refs) { try { const selector = `aria-ref=${ref}`; - const element = await page.locator(selector).first(); - const exists = await element.count() > 0; + const element = page.locator(selector).first(); + const exists = (await element.count()) > 0; @@ } catch (error) { // Failed to get coordinates for element } } + coordinateTime = Date.now() - coordStart; }
♻️ Duplicate comments (2)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (1)
225-231
: Good fix: create the readiness future before starting the reader.This removes the race and aligns with earlier guidance.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
452-461
: Correlate popup to the clicking page; prefer page.waitForEvent('popup') with non-throwing wait.Using context().waitForEvent('page') can capture unrelated tabs; correlating to the page avoids flakiness. Also keep the wait short and non-throwing.
Apply this diff:
- // Set up popup listener before clicking - const popupPromise = page.context().waitForEvent('page', { timeout: browserConfig.popupTimeout }); + // Set up popup listener correlated to this page + const popupPromise = page + .waitForEvent('popup', { timeout: Math.min(browserConfig.popupTimeout, 700) }) + .catch(() => null); @@ - try { - // Wait for new page to open - const newPage = await popupPromise; + try { + // Wait for new page (if any) + const newPage = await popupPromise; + if (!newPage) { + return { success: true, method: 'playwright-aria-ref' }; + }
🧹 Nitpick comments (4)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (4)
248-260
: Remove unused parseElementFromSnapshot to avoid duplication.Mapping now uses buildSnapshotIndex; this helper appears dead and risks divergence.
Apply this diff:
- private parseElementFromSnapshot(snapshotText: string, ref: string): { role?: string; text?: string } { - const lines = snapshotText.split('\n'); - for (const line of lines) { - if (line.includes(`[ref=${ref}]`)) { - const typeMatch = line.match(/^\s*-?\s*([\w-]+)/); - const role = typeMatch ? typeMatch[1] : undefined; - const textMatch = line.match(/"([^"]*)"/); - const text = textMatch ? textMatch[1] : undefined; - return { role, text }; - } - } - return {}; - }
311-339
: Minor: avoid unnecessary awaits and clarify existence checks.
locator().first()
is sync; awaiting it yields no benefit. Add parentheses for readability on count comparison.Apply this diff:
- const element = await page.locator(selector).first(); - const exists = await element.count() > 0; + const element = page.locator(selector).first(); + const exists = (await element.count()) > 0;Repeat similarly in performType/performSelect for consistency.
430-439
: Behavioral change: forcing all links to open in new tabs. Verify product intent.This alters site behavior and can spawn many tabs. Confirm this is desired; otherwise gate behind a config flag.
I can add a
forceLinksOpenInNewTabs
boolean in config and default it to false if you want.
110-117
: UA placement: set userAgent on contexts, not on launch().
chromium.launch()
ignores userAgent; you already set it onnewContext
. Remove the launch-level userAgent to avoid confusion.Apply this diff:
- if (stealthConfig.enabled) { - launchOptions.args = stealthConfig.args || []; - - // Apply stealth user agent if configured - if (stealthConfig.userAgent) { - launchOptions.userAgent = stealthConfig.userAgent; - } - } + if (stealthConfig.enabled) { + launchOptions.args = stealthConfig.args || []; + }Also applies to: 132-148
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
(7 hunks)camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts
(1 hunks)camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the python code against the google python style guide and point out any mismatches.
Files:
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
🪛 Ruff (0.12.2)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
729-729: Use a context manager for opening files
(SIM115)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: pre-commit
- GitHub Check: pytest_examples
- GitHub Check: pytest_package_llm_test
- GitHub Check: pytest_apps
- GitHub Check: pytest_package_fast_test
🔇 Additional comments (4)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (3)
233-238
: Helpful log: surface TS console log path.This improves operability and debugging.
295-331
: Orderly shutdown looks solid now.Shutdown command first, then process termination and finally task cancellation with suppress — good sequencing to avoid blocked
.readline()
.
441-444
: Correct: bind futures to the running loop.Prevents cross-loop hazards.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
300-309
: Good switch to one-pass snapshot indexing.This removes O(n²) parsing and reduces latency on large pages.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
Outdated
Show resolved
Hide resolved
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (3)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (2)
225-231
: Good: readiness future created before starting the log reader.This eliminates the race flagged earlier; the guard in the reader also looks correct.
467-469
: Good: futures are created on the running loop.This resolves cross-loop pitfalls.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
452-481
: Scope popup detection to the clicked page and make it non-blocking short wait.Context-level
page
event can capture unrelated tabs and adds latency. Prefer a short, non-throwingpage.waitForEvent('popup')
and handle the no-popup case.Apply:
- const popupPromise = page.context().waitForEvent('page', { timeout: browserConfig.popupTimeout }); + const popupPromise = page + .waitForEvent('popup', { timeout: Math.min(browserConfig.popupTimeout, 700) }) + .catch(() => null); @@ - try { - // Wait for new page to open - const newPage = await popupPromise; + // Wait for new page to open (if any) + const newPage = await popupPromise; + if (!newPage) { + return { success: true, method: 'playwright-aria-ref' }; + } @@ - return { success: true, method: 'playwright-aria-ref-newtab', newTabId }; - } catch (popupError) { - return { success: true, method: 'playwright-aria-ref' }; - } + return { success: true, method: 'playwright-aria-ref-newtab', newTabId };
🧹 Nitpick comments (6)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (4)
24-25
: Add precise type annotations for new attributes (minor style + clarity).Annotate the new attributes and import IO to match Google style and improve IDE/type checking.
-from typing import TYPE_CHECKING, Any, Dict, List, Optional +from typing import TYPE_CHECKING, Any, Dict, List, Optional, IO @@ - self._server_ready_future = None # Future to track server ready state + self._server_ready_future: Optional[asyncio.Future[bool]] = None # Future to track server ready state @@ - self.ts_log_file_path: Optional[str] = None - self.ts_log_file = None # File handle for TypeScript logs - self._log_reader_task = None # Task for reading and logging stdout + self.ts_log_file_path: Optional[str] = None + self.ts_log_file: Optional[IO[str]] = None # File handle for TypeScript logs + self._log_reader_task: Optional[asyncio.Task[None]] = None # Task for reading and logging stdoutAlso applies to: 119-131
220-223
: Pin subprocess text encoding to UTF-8.Avoid locale-dependent decoding of Node output.
- text=True, + text=True, + encoding="utf-8",
243-249
: Make server start timeout configurable.Ten seconds can be tight on CI or cold machines. Read from config with a sane default.
- timeout = 10 # 10 seconds timeout + timeout = int((self.config or {}).get('server_start_timeout_sec', 10))
304-356
: Shutdown ordering looks solid; also fail fast any pending futures.Nice sequencing: send shutdown → close WS → stop process → cancel tasks → close files. Add a small block to cancel outstanding command futures to avoid hung awaiters during shutdown.
# Close TS log file if open if getattr(self, 'ts_log_file', None): with contextlib.suppress(Exception): self.ts_log_file.close() self.ts_log_file = None # Ensure process handle cleared self.process = None + + # Reject any pending command futures to avoid hangs on shutdown + if self._pending_responses: + err = RuntimeError("Shutting down: pending command cancelled") + for fut in list(self._pending_responses.values()): + if not fut.done(): + fut.set_exception(err) + self._pending_responses.clear()camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (2)
1243-1249
: Viewport math fix — LGTM; drop unused scrollPos param.
filterElementsInViewport
no longer usesscrollPos
. Remove it here and at the call site to avoid confusion.Apply:
- private filterElementsInViewport( - elements: Record<string, SnapshotElement>, - viewport: { width: number, height: number }, - scrollPos: { x: number, y: number } - ): Record<string, SnapshotElement> { + private filterElementsInViewport( + elements: Record<string, SnapshotElement>, + viewport: { width: number, height: number } + ): Record<string, SnapshotElement> {And at usage (Line 351):
- finalElements = this.filterElementsInViewport(playwrightMapping, viewport, scrollPos); + finalElements = this.filterElementsInViewport(playwrightMapping, viewport);
248-260
: Remove unused private methodparseElementFromSnapshot
This helper isn’t referenced anywhere; delete it to prevent future drift.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
(8 hunks)camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
(10 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the python code against the google python style guide and point out any mismatches.
Files:
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: pytest_package_fast_test
- GitHub Check: pytest_package_llm_test
- GitHub Check: pytest_apps
🔇 Additional comments (6)
camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py (1)
747-816
: Log reader and SERVER_READY handling look robust.
- ExitStack-managed file handle with footer, flush per line: good.
- SERVER_READY parsing sets the future with guard: good.
- run_in_executor with get_running_loop: good.
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (5)
137-145
: Stealth headers/UA on new context — LGTM.
262-276
: Single-pass snapshot index — LGTM.
300-309
: O(n) mapping from the prebuilt index — LGTM.
324-333
: Coordinate enrichment merges without clobbering role/text — LGTM.
857-913
: Remove unusedlastChangeTime
and avoid leaking JSHandle
Usepage.evaluate
instead ofevaluateHandle
to prevent leaking the handle and drop the unusedlastChangeTime
.@@ camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts:857-913 - let lastChangeTime = Date.now(); @@ - await page.evaluateHandle(() => { + await page.evaluate(() => {
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (2)
121-128
: Persistent-context path misses stealth headers; UA-only applied.launchPersistentContext supports extraHTTPHeaders and viewport. Add them so stealth is consistent across all launch modes.
Apply this diff:
if (stealthConfig.enabled) { launchOptions.args = stealthConfig.args || []; - - // Apply stealth user agent if configured + // Apply stealth user agent/headers if configured if (stealthConfig.userAgent) { launchOptions.userAgent = stealthConfig.userAgent; } + if (stealthConfig.extraHTTPHeaders) { + launchOptions.extraHTTPHeaders = stealthConfig.extraHTTPHeaders; + } } if (browserConfig.userDataDir) { - this.context = await chromium.launchPersistentContext( + // Ensure viewport is honored in persistent context + launchOptions.viewport = browserConfig.viewport; + this.context = await chromium.launchPersistentContext( browserConfig.userDataDir, launchOptions );Also applies to: 130-141
112-114
: CDP mode inconsistency: earlier “newPage is not supported”, later tries newPage().Either support newPage() in CDP or remove the fallback creation in getCurrentPage(). Right now behavior is contradictory and will surprise callers.
Apply this diff to align with “no newPage in CDP”:
- if (browserConfig.connectOverCdp && this.context) { - console.log('[CDP] No active page found, attempting to create new page...'); - try { - const newPage = await this.context.newPage(); - const newTabId = this.generateTabId(); - this.registerNewPage(newTabId, newPage); - this.currentTabId = newTabId; - // Set page timeouts - newPage.setDefaultNavigationTimeout(browserConfig.navigationTimeout); - newPage.setDefaultTimeout(browserConfig.navigationTimeout); - console.log(`[CDP] Created new page with tab ID: ${newTabId}`); - return newPage; - } catch (error) { - console.error('[CDP] Failed to create new page:', error); - throw new Error('No active page available and failed to create new page in CDP mode'); - } - } + if (browserConfig.connectOverCdp) { + throw new Error('No active page available in CDP mode; frontend must pre-create blank tabs.'); + }Also applies to: 199-217
🧹 Nitpick comments (4)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (4)
259-271
: Remove now-unused parseElementFromSnapshot to avoid divergence.buildSnapshotIndex replaced per-ref parsing and normalizes role casing; keeping this duplicate parser risks inconsistent behavior.
Apply this diff:
- private parseElementFromSnapshot(snapshotText: string, ref: string): { role?: string; text?: string } { - const lines = snapshotText.split('\n'); - for (const line of lines) { - if (line.includes(`[ref=${ref}]`)) { - const typeMatch = line.match(/^\s*-?\s*([\w-]+)/); - const role = typeMatch ? typeMatch[1] : undefined; - const textMatch = line.match(/"([^"]*)"/); - const text = textMatch ? textMatch[1] : undefined; - return { role, text }; - } - } - return {}; - }
273-287
: Minor: precompile regexes to reduce per-line allocations.Hoist refRe, roleRe, textRe as private readonly fields to reuse across calls.
Example (outside this method):
// At class level private readonly reRef = /\[ref=([^\]]+)\]/i; private readonly reRole = /^\s*-?\s*([a-z0-9_-]+)/i; private readonly reText = /"([^"]*)"/; // And use them here instead of new RegExp each loop.
359-364
: Avoid extra round-trip: reuse cached scrollPosition since viewport math ignores scroll.You call getCurrentScrollPosition() but filterElementsInViewport now operates purely in viewport space. Use the cached field to save an eval.
Apply this diff:
- const viewport = page.viewportSize() || { width: 1280, height: 720 }; - const scrollPos = await this.getCurrentScrollPosition(); - finalElements = this.filterElementsInViewport(playwrightMapping, viewport, scrollPos); + const viewport = page.viewportSize() || { width: 1280, height: 720 }; + finalElements = this.filterElementsInViewport(playwrightMapping, viewport, this.scrollPosition);
377-381
: Timing metric misreports coordinate cost as 0.coordinate_enrichment_time_ms should reflect work done when includeCoordinates=true.
Apply this diff:
- coordinate_enrichment_time_ms: 0, // Integrated into mapping - aria_mapping_time_ms: mappingTime, + coordinate_enrichment_time_ms: includeCoordinates ? mappingTime : 0, // approx + aria_mapping_time_ms: mappingTime,If you want precise breakdown, time the coordinate loop separately.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
(8 hunks)camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
(10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: pytest_package_llm_test
- GitHub Check: pytest_package_fast_test
- GitHub Check: pytest_examples
- GitHub Check: pre-commit
- GitHub Check: pytest_apps
🔇 Additional comments (6)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (6)
62-72
: Good fix: apply stealth headers on reused CDP context; warn for immutable UA.This closes the gap when reusing a pre-existing context.
78-86
: Consistent stealth on new CDP context creation.Applying headers and UA via contextOptions is correct.
311-321
: LGTM: one-pass snapshot indexing eliminates O(n²) scans.The mapping defaults are sensible and stable.
335-338
: LGTM: merge coordinates without losing role/text.Prevents accidental overwrites.
1254-1260
: LGTM: viewport math fixed to pure viewport coordinates.This resolves the scroll-offset double-count.
Also applies to: 1273-1273
1235-1241
: No action needed: browser.close() on CDP sessions only disconnects the client
Playwright docs: calling Browser.close() on an instance obtained via chromium.connectOverCDP clears all contexts and disconnects from the browser server without terminating the remote browser process. (playwright.dev)
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
348-354
: viewportLimit without coordinates yields no real filtering; force-enable coords.If viewportLimit is true and includeCoordinates is false, elements without coords pass through.
- private async getSnapshotForAINative(includeCoordinates = false, viewportLimit = false): Promise<SnapshotResult & { timing: DetailedTiming }> { + private async getSnapshotForAINative(includeCoordinates = false, viewportLimit = false): Promise<SnapshotResult & { timing: DetailedTiming }> { const startTime = Date.now(); const page = await this.getCurrentPage(); try { + if (viewportLimit && !includeCoordinates) { + includeCoordinates = true; + }
♻️ Duplicate comments (1)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (1)
452-482
: New-tab detection can hang; bind to page 'popup' with short, non-throwing window.Context 'page' listens globally and waits full popupTimeout; prefer page.waitForEvent('popup') with a short timeout and catch.
- // Set up popup listener before clicking - const popupPromise = page.context().waitForEvent('page', { timeout: browserConfig.popupTimeout }); + // Set up popup listener with a short, non-throwing window + const popupPromise = page + .waitForEvent('popup', { timeout: Math.min(browserConfig.popupTimeout, 700) }) + .catch(() => null); @@ - try { - // Wait for new page to open - const newPage = await popupPromise; - + // Wait for new page to open (if any) + const newPage = await popupPromise; + if (!newPage) { + return { success: true, method: 'playwright-aria-ref' }; + } + - // Generate tab ID for the new page + // Generate tab ID for the new page const newTabId = this.generateTabId(); this.registerNewPage(newTabId, newPage); @@ - return { success: true, method: 'playwright-aria-ref-newtab', newTabId }; - } catch (popupError) { - return { success: true, method: 'playwright-aria-ref' }; - } + return { success: true, method: 'playwright-aria-ref-newtab', newTabId };
🧹 Nitpick comments (5)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (5)
124-131
: extraHTTPHeaders on launchOptions is ineffective for chromium.launch (noise).LaunchOptions ignores extraHTTPHeaders; keep it only for persistent context or add a clarifying comment.
Apply this to avoid confusion and scope headers to persistent contexts only:
if (stealthConfig.userAgent) { launchOptions.userAgent = stealthConfig.userAgent; } - if (stealthConfig.extraHTTPHeaders) { - launchOptions.extraHTTPHeaders = stealthConfig.extraHTTPHeaders; - }
200-206
: Nit: comment in visitPage contradicts this behavior.getCurrentPage throws in CDP mode; update the comment in visitPage (Lines 938-941) to match.
- // If no active page is available, getCurrentPage() will create one in CDP mode + // In CDP mode, getCurrentPage() will NOT create a page; frontend must pre-create blank tabs
248-276
: Remove unused duplicate snapshot parser.parseElementFromSnapshot is dead code after adopting buildSnapshotIndex; keeping both invites divergence.
- private parseElementFromSnapshot(snapshotText: string, ref: string): { role?: string; text?: string } { - const lines = snapshotText.split('\n'); - for (const line of lines) { - if (line.includes(`[ref=${ref}]`)) { - const typeMatch = line.match(/^\s*-?\s*([\w-]+)/); - const role = typeMatch ? typeMatch[1] : undefined; - const textMatch = line.match(/"([^"]*)"/); - const text = textMatch ? textMatch[1] : undefined; - return { role, text }; - } - } - return {}; - }
289-295
: Deduplicate refs before mapping to cut coordinate lookups.Repeated refs cause redundant locator/boundingBox work.
- const refs: string[] = []; + const refs: string[] = []; let match; while ((match = refPattern.exec(snapshotText)) !== null) { refs.push(match[1]); } + const uniqueRefs = Array.from(new Set(refs));And use uniqueRefs in subsequent loops.
431-441
: Opening all links in new tabs should be configurable.Forcing target=_blank changes site semantics; guard with a config flag.
- // Open ALL links in new tabs + // Optionally open links in new tabs (configurable) + const alwaysOpenLinksInNewTabs = this.configLoader.getBrowserConfig().alwaysOpenLinksInNewTabs === true; @@ - const shouldOpenNewTab = naturallyOpensNewTab || isNavigableLink; + const shouldOpenNewTab = naturallyOpensNewTab || (alwaysOpenLinksInNewTabs && isNavigableLink);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts
(10 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: pytest_examples
- GitHub Check: pre-commit
- GitHub Check: pytest_apps
- GitHub Check: pytest_package_fast_test
- GitHub Check: pytest_package_llm_test
🔇 Additional comments (6)
camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts (6)
63-72
: CDP reuse: headers applied correctly; UA warning is appropriate.Good application of setExtraHTTPHeaders on an existing context and guarding UA.
78-86
: New context: stealth injection LGTM.Context options wiring for headers and UA is correct.
153-161
: Non-persistent: stealth in context options LGTM.
300-310
: Snapshot index + coordinate merge LGTM.Single-pass index and non-destructive coordinate enrichment look good.
Also applies to: 311-339
857-904
: DOM stability wait refactor LGTM.Switch to evaluate + waitForFunction with cleanup avoids handle leaks and busy-wait.
1235-1241
: Viewport math fix LGTM.Using viewport-relative coords and removing scroll offsets is correct.
Also applies to: 1253-1254
Description
Current implemented SOM screenshot:

other open source implementation

This pr generated SOM screenshot

Checklist
Go over all the following points, and put an
x
in all the boxes that apply.Fixes #issue-number
in the PR description (required)pyproject.toml
anduv lock
If you are unsure about any of these, don't hesitate to ask. We are here to help!