@aheev aheev commented Aug 20, 2025

@github-actions github-actions bot added triage PRs from the community tools labels Aug 20, 2025
@aheev
Author

aheev commented Aug 20, 2025

@AndrewJSchofield can you please review this?

@aheev aheev changed the title from "KAFKA-19487: Improving consistency of command-line arguments for consumer performance tests" to "KAFKA-19624: Improving consistency of command-line arguments for consumer performance tests" Aug 20, 2025
@AndrewJSchofield AndrewJSchofield added ci-approved and removed triage PRs from the community labels Aug 22, 2025
@brandboat
Member

ping @aheev, could you please fix the conflicts?

# Conflicts:
#	tools/src/main/java/org/apache/kafka/tools/ConsumerPerformance.java
#	tools/src/test/java/org/apache/kafka/tools/ConsumerPerformanceTest.java
@aheev
Author

aheev commented Aug 25, 2025

> ping @aheev, could you please fix the conflicts?

done

Member

@brandboat brandboat left a comment

Thanks for the patch, left some comments below.

@aheev aheev requested a review from brandboat August 25, 2025 06:27
@aheev
Author

aheev commented Aug 25, 2025

FAILED ❌ RestoreIntegrationTest > "shouldInvokeUserDefinedGlobalStateRestoreListener(boolean).useNewProtocol=false"

This test seems to be flaky: on my machine it passes on some runs and fails on others. The changes in this PR are also unrelated to the test.

Just tested: the same issue occurs on trunk too.

Collaborator

@m1a2st m1a2st left a comment

Thanks for this patch. I left some comments.

@AndrewJSchofield AndrewJSchofield self-requested a review August 25, 2025 14:42
Member

@AndrewJSchofield AndrewJSchofield left a comment

Thanks for the PR. Please add some tests for the various combinations of new and old options to make sure the rules in the KIP are correctly implemented.

Member

@brandboat brandboat left a comment

Thanks for the update, some minor comments.

@@ -136,7 +174,9 @@ public void testConfigWithoutTopicAndInclude() {

@Test
public void testClientIdOverride() throws IOException {
File tempFile = Files.createFile(tempDir.resolve("test_consumer_config.conf")).toFile();
Path configPath = tempDir.resolve("test_consumer_config.conf");
Files.deleteIfExists(configPath);

Member

This is not required; tempDir is cleaned up automatically after the JUnit test finishes.
The same applies in other places.

Author

While I was testing, a test ended abruptly and caused FileAlreadyExistsException on subsequent runs, so I added this code.

Member

I see, we have multiple test files with the same name. We could give them distinct names—for example, by adding a random suffix or simply choosing different names. This is just my personal nit though.

Author

Changed the file names to test_name_test_class.conf
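
As a rough illustration of the "distinct names" idea discussed in this thread, the sketch below uses plain `java.nio.file` to give every config file a random component in its name, so a leftover file from an aborted run cannot collide with the next one. It is not the code from this PR: the class name `DistinctConfigFileSketch`, the helper `newConfigFile`, and the directory prefix are hypothetical, and the real tests rely on JUnit's temporary directory support rather than `Files.createTempDirectory`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the PR's test code.
final class DistinctConfigFileSketch {

    // Creates an empty config file whose name starts with the test name.
    // Files.createTempFile inserts a random component between prefix and suffix,
    // so two calls with the same test name yield two different paths.
    static Path newConfigFile(Path dir, String testName) throws IOException {
        return Files.createTempFile(dir, testName + "_", ".conf");
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the directory that a test framework would provide.
        Path dir = Files.createTempDirectory("perf-test-sketch");
        Path first = newConfigFile(dir, "testClientIdOverride");
        Path second = newConfigFile(dir, "testClientIdOverride");
        System.out.println(first.equals(second)); // prints "false"

        // Clean up explicitly, since nothing manages this directory for us.
        Files.delete(first);
        Files.delete(second);
        Files.delete(dir);
    }
}
```

With unique names, an explicit `Files.deleteIfExists` before each test is no longer needed to avoid the collision described above.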

Member

@brandboat brandboat left a comment

LGTM, thanks!

Member

@AndrewJSchofield AndrewJSchofield left a comment

Thanks for the updates.

I see a NullPointerException from kafka-consumer-perf-test.sh if I fail to provide --bootstrap-server. Since that's a required argument, I would expect an error message.
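
One way to picture the behaviour requested here is a check that runs before any option value is dereferenced and reports a readable message instead of failing with a NullPointerException. The sketch below is plain Java under stated assumptions: the class `RequiredOptionSketch`, the method `missingRequired`, the map of parsed options, and the message wording are all hypothetical, and the actual tool uses its own option parser and error reporting.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not the tool's real validation code.
final class RequiredOptionSketch {

    // Returns an error message when the named option is absent or blank,
    // and an empty Optional when a usable value was supplied.
    static Optional<String> missingRequired(Map<String, String> parsedOptions, String name) {
        String value = parsedOptions.get(name);
        if (value == null || value.isBlank()) {
            return Optional.of("Missing required argument: --" + name);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulates a command line that supplied --topic but not --bootstrap-server.
        Map<String, String> parsed = Map.of("topic", "test");
        missingRequired(parsed, "bootstrap-server").ifPresent(System.err::println);
    }
}
```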

Member

@AndrewJSchofield AndrewJSchofield left a comment

Apart from adding --command-property to both kafka-consumer-perf-test.sh and kafka-share-consumer-perf-test.sh, this looks good to me.

commandPropertiesOpt = parser.accepts("command-property", "Kafka consumer related configuration properties like client.id. " +
        "These configs take precedence over those passed via --command-config or --consumer.config.")
    .withRequiredArg()
    .describedAs("prop1=val1,prop2=val2...")

Member

I do not think the code handles the format as described. For example, if I use --command-property client.id=c1,check.crcs=false, then the client ID is "c1,check.crcs=false". The tricky part is that kafka-producer-perf-test uses a different argument parsing library and it supports --command-property client.id=c1 bootstrap.servers=localhost:9092 with a space used as a separator. That's very non-standard but it's already there for that tool at least.

Personally, I think the easiest approach is to support prop=val only and permit multiple instances of --command-property, so someone can do --command-property client.id=c1 --command-property check.crcs=false. I didn't think the details would be subtle. wdyt?

The same comment applies to the share consumer perf test also.
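
The proposal above (one `prop=val` per occurrence, with the option repeatable) can be sketched in plain Java as follows. This is only an illustration, not the change that went into the PR: `CommandPropertySketch`, `parse`, and `merge` are hypothetical names, and the list passed to `parse` stands in for whatever the option parser returns for the repeated option. Splitting at the first `=` only also shows why a comma-separated argument ends up as a single value, and `merge` mirrors the documented rule that these properties take precedence over those loaded from a config file.

```java
import java.util.List;
import java.util.Properties;

// Hypothetical sketch, not the code from this PR.
final class CommandPropertySketch {

    // Each element is the argument of one --command-property occurrence, e.g. "client.id=c1".
    static Properties parse(List<String> values) {
        Properties props = new Properties();
        for (String value : values) {
            int eq = value.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + value);
            }
            // Split at the first '=' only, so the value may itself contain '=' or ','.
            props.setProperty(value.substring(0, eq).trim(), value.substring(eq + 1).trim());
        }
        return props;
    }

    // Overlays command-line properties on top of those read from a config file.
    static Properties merge(Properties fromFile, Properties fromCommandLine) {
        Properties merged = new Properties();
        merged.putAll(fromFile);
        merged.putAll(fromCommandLine); // later entries win
        return merged;
    }

    public static void main(String[] args) {
        Properties repeated = parse(List.of("client.id=c1", "check.crcs=false"));
        System.out.println(repeated.getProperty("client.id"));  // prints "c1"
        System.out.println(repeated.getProperty("check.crcs")); // prints "false"

        // A single comma-separated argument is treated as one property.
        Properties single = parse(List.of("client.id=c1,check.crcs=false"));
        System.out.println(single.getProperty("client.id"));    // prints "c1,check.crcs=false"
    }
}
```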

Author

Makes sense, as the option name implies a single property. I will make the change, test it, and ask for review.

@aheev
Author

aheev commented Sep 2, 2025

> Overall, LGTM. Could you run the e2e test and share the results?

What do you mean by e2e test?

@m1a2st
Collaborator

m1a2st commented Sep 2, 2025

Since you have modified the e2e test file consumer_performance.py, you can validate it by running the following command to ensure the test still works:
TC_PATHS="tests/kafkatest/benchmarks/core/benchmark_test.py" bash tests/docker/run_tests.sh

FYI: https://github.com/apache/kafka/blob/trunk/tests/README.md

@aheev
Author

aheev commented Sep 2, 2025

> Since you have modified the e2e test file consumer_performance.py, you can validate it by running the following command to ensure the test still works: TC_PATHS="tests/kafkatest/benchmarks/core/benchmark_test.py" bash tests/docker/run_tests.sh
>
> FYI: https://github.com/apache/kafka/blob/trunk/tests/README.md

I will run them after Andrew's comments are resolved.

@AndrewJSchofield
Member

@aheev Please resolve conflicts

Member

@AndrewJSchofield AndrewJSchofield left a comment

Thanks for the PR. Once we have a green build, I'm ready to merge this.

@aheev
Author

aheev commented Sep 3, 2025

> Thanks for the PR. Once we have a green build, I'm ready to merge this.

Shouldn't we wait for this? I am trying to run the tests, but running into some issues

> Since you have modified the e2e test file consumer_performance.py, you can validate it by running the following command to ensure the test still works:
> TC_PATHS="tests/kafkatest/benchmarks/core/benchmark_test.py" bash tests/docker/run_tests.sh

# Conflicts:
#	tools/src/main/java/org/apache/kafka/tools/ConsumerPerformance.java
@AndrewJSchofield
Member

> > Thanks for the PR. Once we have a green build, I'm ready to merge this.
>
> Shouldn't we wait for this? I am trying to run the tests, but running into some issues
>
> > Since you have modified the e2e test file consumer_performance.py, you can validate it by running the following command to ensure the test still works:
> > TC_PATHS="tests/kafkatest/benchmarks/core/benchmark_test.py" bash tests/docker/run_tests.sh

Yes, we should. I'll wait then.

@aheev
Author

aheev commented Sep 4, 2025

After a lot of struggle, I managed to run the test by using a different JDK (I will file a ticket for this later). Here are the results, @m1a2st. Three tests failed, all of which are related to producer-perf.
benchmark_test.log

@m1a2st
Collaborator

m1a2st commented Sep 5, 2025

> Here are the results @m1a2st . Three tests failed, all of which are related to producer-perf

Thanks, could you file a ticket to track this issue?

# Conflicts:
#	tools/src/main/java/org/apache/kafka/tools/ShareConsumerPerformance.java
@aheev
Author

aheev commented Sep 5, 2025

> > Here are the results @m1a2st . Three tests failed, all of which are related to producer-perf
>
> Thanks, could you file a ticket to track this issue?

https://issues.apache.org/jira/browse/KAFKA-19672

@aheev
Author

aheev commented Sep 5, 2025

This test seems to be flaky and unrelated to the changes. It succeeded in a couple of runs on my local machine. @AndrewJSchofield should we trigger a CI re-run?

FAILED ❌ SmokeTestDriverIntegrationTest > "shouldWorkWithRebalance(boolean, boolean, boolean).stateUpdaterEnabled=false, processingThreadsEnabled=false, streamsProtocolEnabled=true"
FAILED ❌ SmokeTestDriverIntegrationTest > "shouldWorkWithRebalance(boolean, boolean, boolean).stateUpdaterEnabled=true, processingThreadsEnabled=false, streamsProtocolEnabled=true"
FAILED ❌ SmokeTestDriverIntegrationTest > "shouldWorkWithRebalance(boolean, boolean, boolean).stateUpdaterEnabled=true, processingThreadsEnabled=true, streamsProtocolEnabled=true"
Found 2 flaky test failures:
FLAKY ⚠️  SmokeTestDriverIntegrationTest > "shouldWorkWithRebalance(boolean, boolean, boolean).stateUpdaterEnabled=true, processingThreadsEnabled=true, streamsProtocolEnabled=false"
FLAKY ⚠️  SmokeTestDriverIntegrationTest > "shouldWorkWithRebalance(boolean, boolean, boolean).stateUpdaterEnabled=true, processingThreadsEnabled=false, streamsProtocolEnabled=false"

@AndrewJSchofield
Member

@aheev I would merge the latest changes into this branch. I find that the test fails on your branch and passes on trunk. I can't see how this PR would cause the tests to fail, so I'd update the branch and let the CI run again.

@@ -115,7 +115,7 @@ def start_cmd(self, node):
         for key, value in self.args(node.version).items():
             cmd += " --%s %s" % (key, value)

-        cmd += " --consumer.config %s" % ConsumerPerformanceService.CONFIG_FILE
+        cmd += " --command-config %s" % ConsumerPerformanceService.CONFIG_FILE

Collaborator

@Yunyung Yunyung Sep 5, 2025

Will this break e2e testing with older versions? (See #20471)

@@ -100,7 +100,7 @@ def start_cmd(self, node):
         for key, value in self.args().items():
             cmd += " --%s %s" % (key, value)

-        cmd += " --consumer.config %s" % ShareConsumerPerformanceService.CONFIG_FILE
+        cmd += " --command-config %s" % ShareConsumerPerformanceService.CONFIG_FILE

Collaborator

@Yunyung Yunyung Sep 5, 2025

Ditto. The share consumer is very new, so maybe it's not a big deal, though I'm not sure.

Member

Share consumer was in 4.1 and we are now in 4.2.
