genai-perf out of bounds error when choices array is null when setting "include_usage": true #8082

sre42 · 2025-03-21T10:46:27Z

Description
genai-perf fails with an out-of-bounds error (IndexError) when a streamed chunk has an empty choices array. This happens when the request sets:
"stream_options": {
  "include_usage": true
}

choices can be empty for the last chunk when include_usage is set, as per the spec here: https://platform.openai.com/docs/api-reference/chat-streaming/streaming#chat-streaming/streaming-choices
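For illustration, here is a minimal sketch of the two chunk shapes involved. The ids and token counts are made up for the example and are not captured from the deployment described in this issue. The function `first_choice_unguarded` is a hypothetical stand-in that only mirrors the `data["choices"][0]` access pattern from the log.

```python
import json

# Illustrative SSE payloads; ids and token counts are invented for the example.
content_chunk = json.loads(
    '{"id": "chatcmpl-example", "object": "chat.completion.chunk",'
    ' "choices": [{"index": 0, "delta": {"content": "Hi"}, "finish_reason": null}],'
    ' "usage": null}'
)
usage_chunk = json.loads(
    '{"id": "chatcmpl-example", "object": "chat.completion.chunk",'
    ' "choices": [],'
    ' "usage": {"prompt_tokens": 5, "completion_tokens": 20, "total_tokens": 25}}'
)


def first_choice_unguarded(data):
    # Hypothetical stand-in: same indexing as the failing line in the log.
    return data["choices"][0]


# Works: the content chunk has exactly one choice.
first_choice_unguarded(content_chunk)

try:
    first_choice_unguarded(usage_chunk)
    failed = False
except IndexError:
    # The usage-only chunk has choices == [], so index 0 does not exist.
    failed = True
```

Note that the usage-only chunk carries an empty list rather than a JSON null, which is why the failure surfaces as an IndexError and not a TypeError.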

Log from running genai-perf

File "/usr/local/lib/python3.12/dist-packages/genai_perf/profile_data_parser/llm_profile_data_parser.py", line 229, in _preprocess_response
  text = self._extract_openai_text_output(r)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/genai_perf/profile_data_parser/llm_profile_data_parser.py", line 341, in _extract_openai_text_output
  completions = data["choices"][0]
                ~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
2025-03-20 09:12 [ERROR] genai_perf.main:58 - list index out of range

Triton Information
using container: nvcr.io/nvidia/tritonserver:25.02-py3-sdk

To Reproduce

Host a NIM enterprise backend serving the model meta/llama-3.3-70b-instruct, deployed via the standard NIM operator.

Our API gateway appends the following to each request:
"stream_options": {
  "include_usage": true
}

genai-perf command used:

genai-perf profile \
  -m $MODEL \
  --service-kind openai \
  --endpoint-type chat \
  --streaming \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Accept: text/event-stream" \
  -u OUR-ENDPOINT \
  --synthetic-input-tokens-mean $INPUT_SEQUENCE_LENGTH \
  --synthetic-input-tokens-stddev $INPUT_SEQUENCE_STD \
  --concurrency $CONCURRENCY \
  --output-tokens-mean $OUTPUT_SEQUENCE_LENGTH \
  --extra-inputs max_tokens:$OUTPUT_SEQUENCE_LENGTH \
  --extra-inputs min_tokens:$OUTPUT_SEQUENCE_LENGTH \
  --warmup-request-count 1 \
  --measurement-interval 100000 \
  --extra-inputs ignore_eos:true \
  -- \
  -v \
  --max-threads=256

Custom curl to view the empty choices array:

curl -X POST 'llama-70b-instruct-load-test.nim-namespace:8000/v1/chat/completions' --header 'Content-Type: application/json' --header "accept: text/event-stream" --data '{
  "messages": [
    {
      "role": "user",
      "content": "generate something cool"
    }
  ],
  "model": "meta/llama-3.3-70b-instruct",
  "stream": true,
  "max_tokens": 20,
  "min_tokens": 20,
  "ignore_eos": true,
  "stream_options": {
    "include_usage": true
  }
}'

Expected behavior

genai-perf should skip, or otherwise handle, streamed chunks whose choices array is empty instead of raising an error.
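One possible shape for such a guard is sketched below. The helper name `extract_text_or_empty` is hypothetical and this is not genai-perf's actual implementation; it only shows a check on `choices` before indexing, assuming the OpenAI chat chunk layout where text lives under `delta` (streaming) or `message` (non-streaming).

```python
from typing import Any, Dict


def extract_text_or_empty(data: Dict[str, Any]) -> str:
    """Return the text of the first choice, or "" when there is no choice.

    Hypothetical helper for illustration; not part of genai-perf.
    """
    # Covers a missing key, an explicit null, and an empty list.
    choices = data.get("choices") or []
    if not choices:
        return ""
    first = choices[0]
    # Streaming chunks carry text under "delta"; full responses under "message".
    body = first.get("delta") or first.get("message") or {}
    return body.get("content") or ""
```

Returning an empty string lets a caller treat the usage-only chunk like any other chunk without text. Whether the parser should instead drop such chunks entirely is a design decision for the maintainers.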
