[CB] Split token streaming and generation to different threads for all CB based pipelines #6996
Triggered via pull request
January 17, 2025 15:25
Status
Failure
Total duration
6h 1m 32s
Artifacts
–
causal_lm_cpp.yml
on: pull_request
Matrix: cpp-beam_search_causal_lm-ubuntu
cpp-multinomial-greedy_causal_lm-ubuntu
18m 53s
cpp-greedy_causal_lm-windows
36m 2s
cpp-greedy_causal_lm-Qwen-7B-Chat
11m 12s
cpp-beam_search_causal_lm-Qwen1_5-7B-Chat
16m 6s
cpp-beam_search_causal_lm-Phi-2
11m 15s
cpp-beam_search_causal_lm-notus-7b-v1
32m 9s
cpp-speculative_decoding_lm-ubuntu
6h 0m
cpp-prompt_lookup_decoding_lm-ubuntu
6h 0m
cpp-Phi-1_5
9m 46s
cpp-greedy_causal_lm-redpajama-3b-chat
12m 42s
cpp-chat_sample-ubuntu
15m 24s
visual_language_chat_sample-ubuntu-minicpm_v2_6
8m 1s
visual_language_chat_sample-ubuntu-llava_1_5 / visual_language_chat_sample-ubuntu-llava
14m 30s
visual_language_chat_sample-ubuntu-llava_next / visual_language_chat_sample-ubuntu-llava
18m 6s
visual_language_chat_sample-ubuntu-internvl2
23m 55s
cpp-continuous-batching-ubuntu
14m 26s
cpp-continuous-batching-windows
24m 26s
cpp-continuous-batching-macos
19m 41s
visual_language_chat_sample-ubuntu-qwen2vl
13m 9s
ci/gha_overall_status_causal_lm
0s
Annotations
5 errors and 1 warning
cpp-prompt_lookup_decoding_lm-ubuntu
The job running on runner ubuntu-20.04-16-cores_51880b9ee4ca has exceeded the maximum execution time of 360 minutes.
cpp-prompt_lookup_decoding_lm-ubuntu
The operation was canceled.
cpp-speculative_decoding_lm-ubuntu
The job running on runner ubuntu-20.04-16-cores_5d98ff194f6b has exceeded the maximum execution time of 360 minutes.
cpp-speculative_decoding_lm-ubuntu
The operation was canceled.
ci/gha_overall_status_causal_lm
Process completed with exit code 1.
ci/gha_overall_status_causal_lm
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636