Add Hive and Iceberg Load benchmark #55

PingLiuPing · 2025-04-24T14:58:50Z

A loading (insert) benchmark is missing from pbench. This PR adds the initial files for one. It includes test files for the Hive and Iceberg connectors, for both native and Java.
The data is loaded from the tpch connector on the fly.

Future enhancements are required so that the benchmark can run in stages, such as a prepare stage, a main stage, and a cleanup stage.

PingLiuPing · 2025-08-13T08:24:58Z

@wanglinsong @ethanyzhang Sorry for the late response, this PR slipped my mind. I have addressed your comments. Could you please take another look? Thanks.

wanglinsong

I believe the DDL to create the tables is the same across all scale factors. Can you parameterize or remove the hardcoded schema name tpch.sf100.?

FROM tpch.sf100.customer;
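
For illustration only, here is a minimal sketch of what parameterizing the scale factor could look like if the SQL were generated outside the benchmark framework. It is not how pbench works, and nothing in this PR does this. The function name `render_load_sql`, the template, and the target schema names such as `hive.load_sf100` are hypothetical; only the source schema `tpch.sf100` and the `customer` table come from the review comment.

```python
import re
from string import Template

# Hypothetical CTAS template. The source side (tpch.<scale>.<table>) mirrors
# the reviewed line "FROM tpch.sf100.customer;"; the target side is assumed.
CTAS_TEMPLATE = Template(
    "CREATE TABLE ${target_schema}.${table} AS\n"
    "SELECT * FROM tpch.${scale}.${table};"
)


def render_load_sql(table: str, scale: str, target_schema: str) -> str:
    """Render one load statement for the given table and tpch scale schema."""
    # Accept only schema names of the form sf<digits>, e.g. sf10, sf100, sf1000.
    if not re.fullmatch(r"sf\d+", scale):
        raise ValueError(f"unexpected scale schema: {scale!r}")
    return CTAS_TEMPLATE.substitute(
        table=table, scale=scale, target_schema=target_schema
    )


if __name__ == "__main__":
    # Render the same statement for several scale factors.
    for scale in ("sf10", "sf100", "sf1000"):
        print(render_load_sql("customer", scale, f"hive.load_{scale}"))
        print()
```

With a generator like this, a single template per table would cover every scale factor, at the cost of an extra rendering step before the benchmark runs.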

benchmarks/tpch-load/schema/create_sf100.sql

PingLiuPing · 2025-08-21T19:19:49Z

Thanks. With the current framework, I think supporting that would need a lot of work.

wanglinsong · 2025-08-21T19:30:11Z

Oh, this is an embedded connector. This is not an issue at all. Please ignore.

PingLiuPing · 2025-08-25T08:12:26Z

Hi @wanglinsong, thanks for your comments. Do you think this PR is ready to be merged? Is there anything else you would like me to change?

PingLiuPing added 10 commits April 24, 2025 15:51

Add workload for hive iceberg insertion

8f5c569

Remove schema

d313dba

Change to hive.tpch_sf10_parquet becaues of column name change in prestissimo

7152357

Rename files and add workaround for prestissimo tpch column names

123dd9d

Add iceberg benchmark

71b953c

Add iceberg

02d5a42

plit schema

2d82ce2

Add sf1000 for hive

a41c35f

iceberg native

524c962

iceberg native

012e185

PingLiuPing marked this pull request as ready for review April 24, 2025 16:55

PingLiuPing requested review from ethanyzhang, wanglinsong, xpengahana and FelixYBW as code owners April 24, 2025 16:55

PingLiuPing changed the title ~~Add TPCH Load benchmark~~ Add Hive and Iceberg Load benchmark Apr 24, 2025

wanglinsong requested changes Apr 24, 2025

benchmarks/tpch-load/create-table/customer.sql (outdated, resolved)

PingLiuPing added 3 commits April 24, 2025 20:29

Add missing files for scaling factor 100

3b883a0

Rename folder

b7364f3

Fix review comments and adding newline to the end of each file

9edb174

PingLiuPing force-pushed the lpingbj_load_tpch branch from a8371c9 to 9edb174 Compare April 24, 2025 19:32

ethanyzhang reviewed Apr 29, 2025

benchmarks/tpch-load/cleanup_sf100.json (outdated, resolved)

Remove expected_row_counts when there is no expected row counts

4ab0745

PingLiuPing self-assigned this Aug 4, 2025

Change s3 bucket name and use timestamp as suffix for schema name

aee3644

wanglinsong requested changes Aug 21, 2025

benchmarks/tpch-load/schema/create_sf100.sql (outdated, resolved)
