Skip to content

Conversation

peterfoldes
Copy link
Contributor

@peterfoldes peterfoldes commented Jun 3, 2025

$ time uv run python tests/smoke.py \
  --api-key-file ~/prod.key --wide --debug --region aws-us-west-2 --api-endpoint api.cloud.wherobots.com --version latest -s --single --presigned-url \
  "SELECT timestamp, geometry FROM wherobots_pro_data.osm.osm_all_nodes LIMIT 100"
2025-09-04 19:41:43.207 INFO                 root: Query 57f7d53f-e70d-46b8-9371-5652ba0e2b01 succeeded; full message is {'kind': 'state_updated', 'execution_id': '57f7d53f-e70d-46b8-9371-5652ba0e2b01', 'state': 'succeeded', 'result_uri': 'https://wbts-temp-user-data-aws-us-west-2-prod.s3.amazonaws.com/dfqlwcrxlk/vit9eoc0upejpv/sql-session-queries/57f7d53f-e70d-46b8-9371-5652ba0e2b01/part-00000-105dd55e-be15-49e6-aa7d-4c7235e4b9e8-c000.snappy.parquet?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAUZT33PSSQSDR7UJH%2F20250905%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20250905T024143Z&X-Amz-Expires=14400&X-Amz-SignedHeaders=host&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEAMaCXVzLXdlc3QtMiJGMEQCICxfgi59w3A%2Ff4VXttIugjySYZXr1LGy8AK25q9Z8TYhAiBl7nVYTcpZYZnqoEioaJVLX%2FvY%2FXU5U%2Bnk26OKaHjCryqUBQhsEAAaDDMyOTg5ODQ5MTA0NSIMXbLUBVLio2dduVgyKvEENgCTkoulZfix31oz%2FkdIDXzI9zGVs3L1bllkfcFEKdZwvsytjTwilFq06mbeVxXQKQ%2F8CL2bP8MZaPtrI6nAlCIvRNX8s6%2FL3Unr5ALXwoqCUtdH3F%2BI7NpFbGk3OJgfrqep4yg39A8ZjwygcyPlcGbjTEg8hK1iKKuJjk4n%2FIEchf7QU4lnIhIvAKUZRurM7qLfvmWYIQ4QmPHLuvKB8Ih8ufEwZdlZ%2B%2BOFQbXkbXgF1EswISR86ZbBWJ7TjdVE6r88b%2FmheYmH3Emaw7O7qZhZdI70kWOXVNKmAbEupLilsBhh5ieo8s2NwLyenPzwaz121CeNSN6GyYRGh32Oae2EaKvXgVI9lTd4NsTMAURwo3MkURKBA5PL49ehjoF90z5Ls9zMpZqlGe%2FV8G5gg%2FgUzLFsWM4uPiTp3NVjG9ivl9j2MAfOdJCrbakJzZSMFsbJ6CeBXRRr0eJgtYo3VvT1ufiN4VMhcWWQN5YZk1eKArOGCcn7jEoys8tDbXl5HSroM%2BjmLca5N%2F8fzx1fRs%2FmXxanA6JjGEUJ4SjD8GYrAesW%2FtQPJ6dBhtOe7Q2aPgFwW50LznQKgV26%2BUH27OnpC0IfhReEEDoKr%2F9vgyd0Dj5TPwuURqH6rU7ZgGbiyI4K6O%2BVS8zmeWQ5pIalPFpwhj6lIjpukWzv%2Bi4ikv7V2Vug520Qw94%2FpmEUpQIaUvnA1KiX2ZGkyIAQ8i9kOF7%2BIe%2BQk%2B5rZIVfaVsG71lbXcruHZM5%2Biw7E3KZBrrQ0%2F2I2ypv62NKFrrlwgBGHaxXKn7O3X0CZnORwVgStjAalmKLT5Z3lGvYJ4mu7fcZLzDnm%2BnFBjqcAaQWSFs6fQlOvKOOzfHNCGwusyNUmfRs%2BbeqE0h3UtBbtPaQxQMCB%2F98dekptGlR59O%2BDHB4gYH7q0jXpEd3qo9m%2Fw%2BEkqbmuyMM72wxasYHAtIqptnC5OquQ7eAtBgOau5XAst0yfRg%2BaLIRCMlcMhncH9Db7ufa%2BvKhLy3a%2FgnNDCtd43X1KdRJrtjjQgWDCOUFA8UK%2BAmMJ5aIA%3D%3D&X-Amz-Signature=1f7e7d97114bd658da86c2f04d768e3bd8f64402e3b62a3017eb26474bbd0664', 'size': 2899}

@sfishel18
Copy link
Contributor

i'm getting some errors running this sql statement: SHOW SCHEMAS IN wherobots_open_data

it fails if i try to use either geoparquet or geojson as the storage format (ex. poetry run python tests/smoke.py --api-endpoint api.staging.wherobots.com --api-key-file ./api_key.txt "SHOW SCHEMAS IN wherobots_open_data" --store --storage-format geoparquet)

@sfishel18
Copy link
Contributor

so iiuc this change doesn't provide a way to get access to the results URI from the sdk, correct? but you can opt into behind-the-scenes temp storage. which you would do because it increases the limit on the number of results you can get back? and/or you get more efficient cursoring through the results because sql-session can pull them from its s3 cache based on the execution id?

sfishel18
sfishel18 previously approved these changes Jun 5, 2025
@sfishel18
Copy link
Contributor

another interesting behavior i discovered, when passing boolean values in the execute sql web socket message, they have to be stringified. i.e. this is valid:

{
    kind: "execute_sql"
    execution_id: "bf1ffba3-bc64-485d-8bbe-475264884676"
    statement: "..."
    store: {
      "format": "geoparquet",
      "single": "true",
      "generate_presigned_url": "true"
    }
}

but this is not (note no quotes around true in store.single and store.generate_presigned_url:

{
    kind: "execute_sql"
    execution_id: "bf1ffba3-bc64-485d-8bbe-475264884676"
    statement: "..."
    store: {
      "format": "geoparquet",
      "single": true,
      "generate_presigned_url": true
    }
}

@@ -134,6 +137,9 @@ def __listen(self) -> None:
# On a state_updated event telling us the query succeeded,
# ask for results.
if kind == EventKind.STATE_UPDATED:
logging.info(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to keep that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no. Since result_uri and size is currently part of the state updated event (and not the retrieve result) we might want to have a way to surface that info programatically. Right now it's an easy way to just log dump it, but honestly it's a poor way to surface it.

We should talk about really making it part of a request result call even if it's another extra round or have a nicer way to handle it from the client side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. My recommendation is to extend the interface of Connection beyond what the DB-API interface calls for, as that part is specifically for "DB driver"-like work where you get query results directly through the Cursor from the cursor() method.

Maybe we provide another, different type of handler:

def cursor(self) -> Cursor:
    return Cursor(self.__execute_sql, self.__cancel_query)

def stored(self, format: ResultFormat, single: bool, presigned: bool) -> StoredResult:
    return StoredResult(self.__execute_sql, self.__cancel_query, format, single, presigned)

Which exposes:

class StoredResult:
    def __init__(self, exec_fn, cancel_fn, format: ResultFormat, single: bool, presigned: bool):
        # ...

    def execute(self, operation: str, parameters: dict[str, Any] = None):
        # ... call exec_fn with the correct store{} parameters

    def get_result_url(self) -> str:
        # ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants