Library and command-line tools to traverse the macOS accessibility tree and simulate user input actions. Allows interaction with UI elements of other applications.
Highlight whatever is happening on the computer: text elements, clicks, typing
Listen for changes in the UI: elements changed, text changed
To build the command-line tools provided by this package, navigate to the root directory (MacosUseSDK) in your terminal and run:
swift build
This will compile the tools and place the executables in the .build/debug/ directory (or .build/release/ if you use swift build -c release). You can run them directly from there (e.g., .build/debug/TraversalTool) or use swift run <ToolName>.
All tools output informational logs and timing data to stderr. Primary output (like PIDs or JSON data) is sent to stdout.
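Because logs and primary output travel on separate streams, a script can capture one without the other. The following is a minimal sketch of that idea; run_logged is a hypothetical helper name (not part of MacosUseSDK), and the log path in the usage comment is only an illustration.

```shell
#!/bin/sh
# Run a command so that stdout passes through untouched while stderr
# (informational logs and timing data) is appended to a log file.
# run_logged is a hypothetical helper, not something the package provides.
run_logged() {
    log_file="$1"
    shift
    "$@" 2>>"$log_file"
}

# Example (assumes the package has been built and Calculator is installed):
#   pid=$(run_logged /tmp/macos-use.log swift run AppOpenerTool Calculator)
#   echo "Calculator PID: $pid"
```

Keeping the logs in a file rather than discarding them makes it easier to inspect timing data after a scripted run.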
- Purpose: Opens or activates a macOS application by its name, bundle ID, or full path. Outputs the application's PID on success.
- Usage:
AppOpenerTool <Application Name | Bundle ID | Path>
- Examples:
# Open by name
swift run AppOpenerTool Calculator
# Open by bundle ID
swift run AppOpenerTool com.apple.Terminal
# Open by path
swift run AppOpenerTool /System/Applications/Utilities/Terminal.app
# Example output (stdout)
# 54321
- Purpose: Traverses the accessibility tree of a running application (specified by PID) and outputs a JSON representation of the UI elements to stdout.
- Usage:
TraversalTool [--visible-only] <PID>
- Options:
--visible-only: Only include elements that have a position and size (are geometrically visible).
- Examples:
# Get only visible elements for Messages app
swift run TraversalTool --visible-only $(swift run AppOpenerTool Messages)
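The one-liner above passes whatever AppOpenerTool prints straight to TraversalTool. A slightly more defensive sketch is shown below: it checks that the captured value looks like a PID before traversing, and saves the JSON to a file. The function name dump_visible_tree and its arguments are hypothetical, and discarding stderr is just one choice for this illustration.

```shell
#!/bin/sh
# Open an app, verify that AppOpenerTool printed a numeric PID, then write
# the visible-element JSON from TraversalTool to a file.
# dump_visible_tree is a hypothetical helper, not part of MacosUseSDK.
dump_visible_tree() {
    tree_app="$1"
    tree_out="$2"
    # stderr (logs, timing data) is discarded here; redirect it to a file to keep it.
    tree_pid=$(swift run AppOpenerTool "$tree_app" 2>/dev/null) || return 1
    case "$tree_pid" in
        ''|*[!0-9]*)
            echo "unexpected AppOpenerTool output: $tree_pid" >&2
            return 1
            ;;
    esac
    swift run TraversalTool --visible-only "$tree_pid" 2>/dev/null >"$tree_out"
}

# Example (assumes the package has been built):
#   dump_visible_tree Messages messages-tree.json
```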
- Purpose: Traverses the accessibility tree of a running application (specified by PID) and draws temporary red boxes around all visible UI elements. Also outputs traversal data (JSON) to stdout. Useful for debugging accessibility structures.
- Usage:
HighlightTraversalTool <PID> [--duration <seconds>]
- Options:
--duration <seconds>: Specifies how long the highlights remain visible (default: 3.0 seconds).
- Examples:
Note: This tool needs to keep running for the duration specified to manage the highlights.
# Combine with AppOpenerTool to open Messages and highlight it
swift run HighlightTraversalTool $(swift run AppOpenerTool Messages) --duration 5
- Purpose: Simulates keyboard and mouse input events without visual feedback.
- Usage: See swift run InputControllerTool --help (or just run without args) for actions.
- Examples:
# Press the Enter key
swift run InputControllerTool keypress enter
# Simulate Cmd+C (Copy)
swift run InputControllerTool keypress cmd+c
# Simulate Shift+Tab
swift run InputControllerTool keypress shift+tab
# Left click at screen coordinates (100, 250)
swift run InputControllerTool click 100 250
# Double click at screen coordinates (150, 300)
swift run InputControllerTool doubleclick 150 300
# Right click at screen coordinates (200, 350)
swift run InputControllerTool rightclick 200 350
# Move mouse cursor to (500, 500)
swift run InputControllerTool mousemove 500 500
# Type the text "Hello World!"
swift run InputControllerTool writetext "Hello World!"
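The individual actions above can be chained into a small sequence. The sketch below clicks a point, types some text, and presses Enter, stopping at the first action that fails. The function name click_and_type is hypothetical; depending on the target application, a short pause between steps may also be needed, which this sketch leaves out.

```shell
#!/bin/sh
# Click at (x, y), type the given text, then press Enter.
# Each step runs only if the previous one succeeded.
# click_and_type is a hypothetical helper, not part of MacosUseSDK.
click_and_type() {
    input_x="$1"
    input_y="$2"
    input_text="$3"
    swift run InputControllerTool click "$input_x" "$input_y" &&
    swift run InputControllerTool writetext "$input_text" &&
    swift run InputControllerTool keypress enter
}

# Example (assumes the package has been built and a text field sits at 100, 250):
#   click_and_type 100 250 "Hello World!"
```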
- Purpose: Simulates keyboard and mouse input events with visual feedback (currently a pulsing green circle for mouse actions).
- Usage: Similar to InputControllerTool, but adds a --duration option for the visual effect. See swift run VisualInputTool --help.
- Options:
--duration <seconds>: How long the visual feedback effect lasts (default: 0.5 seconds).
- Examples:
Note: This tool needs to keep running for the duration specified to display the visual feedback.
# Left click at (100, 250) with default 0.5s feedback
swift run VisualInputTool click 100 250
# Right click at (800, 400) with 2 second feedback
swift run VisualInputTool rightclick 800 400 --duration 2.0
# Move mouse to (500, 500) with 1 second feedback
swift run VisualInputTool mousemove 500 500 --duration 1.0
# Keypress and writetext (currently NO visualization implemented)
swift run VisualInputTool keypress cmd+c
swift run VisualInputTool writetext "Testing"
To run the full test suite, use swift test. To run only specific tests or test classes, use the --filter option; to run a single test method, provide the full identifier TestClassName/testMethodName:
swift test
# Example: Run only the multiply test in CombinedActionsDiffTests
swift test --filter CombinedActionsDiffTests/testCalculatorMultiplyWithActionAndTraversalHighlight
# Example: Run all tests in CombinedActionsFocusVisualizationTests
swift test --filter CombinedActionsFocusVisualizationTests
You can also use MacosUseSDK as a dependency in your own Swift projects. Add it to your Package.swift dependencies:
dependencies: [
    .package(url: "/* path or URL to your MacosUseSDK repo */", from: "1.0.0"),
]
And add MacosUseSDK to your target's dependencies:
.target(
    name: "YourApp",
    dependencies: ["MacosUseSDK"]),
Then import and use the public functions:
import MacosUseSDK
import Foundation // For Dispatch etc.
// Example: Get elements from Calculator app
Task {
    do {
        // Find Calculator PID (replace with actual logic or use AppOpenerTool output)
        // let calcPID: Int32 = ...
        // let response = try MacosUseSDK.traverseAccessibilityTree(pid: calcPID, onlyVisibleElements: true)
        // print("Found \(response.elements.count) visible elements.")
        // Example: Click at a point
        let point = CGPoint(x: 100, y: 200)
        try MacosUseSDK.clickMouse(at: point)
        // Example: Click with visual feedback (needs main thread for UI)
        DispatchQueue.main.async {
            do {
                try MacosUseSDK.clickMouseAndVisualize(at: point, duration: 1.0)
            } catch {
                print("Visualization error: \(error)")
            }
        }
    } catch {
        print("MacosUseSDK Error: \(error)")
    }
}
// Remember to keep the run loop active if using async UI functions like highlightVisibleElements or *AndVisualize
// RunLoop.main.run() // Or use within an @main Application structure
This project is licensed under the MIT License - see the LICENSE file for details.