Merge releases/2024/3 into master #731

@Wovchena Wovchena commented Aug 2, 2024

No description provided.

Wovchena and others added 30 commits July 15, 2024 13:48
Work around Python_VERSION_MAJOR and MINOR not being set by replacing
Python3 with Python

Disable generation of some of the COMPONENTs not needed for GenAI. There
are still unwanted empty archives, but they are generated
unconditionally by rapidjson.
…envinotoolkit#604)

That allows LLMPipeline to create ContinuousBatchingPipeline as a
backend. There's also a constructor accepting ireq, which could be used if
the model was already transformed appropriately for
ContinuousBatchingPipeline. But that would likely be misleading, and
it is simpler just to throw if that constructor is called with the
ContinuousBatchingPipeline backend.
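
The backend selection described above can be pictured with a small sketch. This is not the GenAI implementation: `PipelineFacade`, `StatefulBackend`, `ContinuousBatchingBackend`, and the `use_continuous_batching` flag are all hypothetical names chosen for illustration. It only shows the shape of the decision: a path-based constructor may pick either backend, while the constructor that takes an already prepared infer request refuses the continuous batching backend.

```python
class StatefulBackend:
    """Hypothetical stand-in for the default backend."""

    def __init__(self, source):
        self.source = source


class ContinuousBatchingBackend:
    """Hypothetical stand-in for a continuous batching backend."""

    def __init__(self, model_path):
        self.model_path = model_path


class PipelineFacade:
    """Hypothetical facade that forwards work to one of the backends."""

    def __init__(self, backend):
        self.backend = backend

    @classmethod
    def from_model_path(cls, model_path, use_continuous_batching=False):
        # A model path can be handed to either backend.
        if use_continuous_batching:
            return cls(ContinuousBatchingBackend(model_path))
        return cls(StatefulBackend(model_path))

    @classmethod
    def from_infer_request(cls, infer_request, use_continuous_batching=False):
        # Throw instead of guessing whether the request was prepared
        # for continuous batching.
        if use_continuous_batching:
            raise ValueError(
                "an infer request cannot be used with the continuous batching backend"
            )
        return cls(StatefulBackend(infer_request))
```

Failing early keeps the contract simple: callers never end up with a pipeline whose backend silently differs from the one they asked for.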
Updated default configurations based on results from CVS-143530.

(cherry picked from commit f460002)
Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>
…#642)

OpenVINOGenAITargets.cmake was excluded from packaging because
CPACK_COMPONENTS_ALL is custom now and doesn't install Unspecified
component
Co-authored-by: Pavel Esir <pavel.esir@gmail.com>
…oop for greedy sampling (openvinotoolkit#607)

Searching for max element in a custom loop gives better performance than
using std::max_element
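
For readers unfamiliar with greedy sampling, the operation being optimized is a plain argmax over the logits. The sketch below shows the single-pass loop in Python; the actual change is in C++, `greedy_token` is a hypothetical name, and nothing here reproduces or measures the performance difference the commit reports.

```python
def greedy_token(logits):
    """Return the index of the largest logit using one explicit pass."""
    if not logits:
        raise ValueError("logits must not be empty")
    best_index = 0
    best_value = logits[0]
    for index in range(1, len(logits)):
        value = logits[index]
        # Strict '>' keeps the first maximum when several logits tie.
        if value > best_value:
            best_value = value
            best_index = index
    return best_index
```

On ties the loop returns the first maximal index, which is also what `std::max_element` does.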
- Added Readme for python tests
- Added `--model_ids` option to run only on specific models

---------

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>
Symbols that cause errors:
- `\u0643`
- `\u25aa`
… optional plugin_config in tokenizer (openvinotoolkit#669)

This improves performance of CB lib when tested within OVMS.
Already merged to master:
openvinotoolkit#651
This is a cherry-pick.
…oolkit#670)

[mixtral-8x7b-instruct-v0.1-int4-ov](https://huggingface.co/OpenVINO/mixtral-8x7b-instruct-v0.1-int4-ov/)
didn't have `generation_config.json`, so generation continued
indefinitely. EOS_TOKEN_ID was read correctly, but it was never
produced during generation.

Updated docs so that every generate call sets max_new_tokens, either in
the arguments or via the default generation config:
`pipe.set_generation_config({'max_new_tokens': 100, 'num_beam_groups':
3, ...})`
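
Why the cap matters can be seen in a toy decoding loop. This is a minimal sketch, not the GenAI generation code: `generate`, `next_token`, and `never_eos` are hypothetical names, and the "model" is just a function that returns a token id. If the model never produces the EOS token, `max_new_tokens` is the only thing that stops the loop.

```python
def generate(next_token, prompt_ids, eos_token_id, max_new_tokens):
    """Toy decoding loop: stop on EOS or after max_new_tokens tokens."""
    tokens = list(prompt_ids)
    generated = []
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        tokens.append(token)
        generated.append(token)
        if token == eos_token_id:
            break
    return generated


def never_eos(tokens):
    """Hypothetical model that always returns token 7 and never emits EOS."""
    return 7


# Without the cap this call would never return; with it, exactly 100 tokens come back.
output = generate(never_eos, prompt_ids=[1, 5], eos_token_id=2, max_new_tokens=100)
```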

tickets: CVS-146933 CVS-146324
Co-authored-by: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>
popovaan and others added 21 commits July 26, 2024 06:51
- Added performance metrics and updated Readme with a description of how to
use them
- Added cpp and python samples for benchmarking

Sample to calculate and visualize performance metrics.
```
import matplotlib.pyplot as plt
import openvino_genai as ov_genai
import pandas as pd
import tqdm

pipe = ov_genai.LLMPipeline('TinyLlama-1.1B-Chat-v1.0/')
config = ov_genai.GenerationConfig(max_new_tokens=15)

num_iter = 3
rows = []  # one row of aggregated metrics per batch size
for batch_size in tqdm.tqdm([1, 2, 4, 16, 32, 64, 128]):
    prompts = ["The Sky is blue because"] * batch_size
    res = pipe.generate(prompts, config)
    metrics = res.perf_metrics

    # Accumulate metrics over the remaining iterations.
    for _ in range(num_iter - 1):
        res = pipe.generate(prompts, config)
        metrics += res.perf_metrics
    rows.append({
        'batch_size': batch_size,
        'throughput': metrics.get_throughput().mean, 'ttft': metrics.get_ttft().mean, 'tpot': metrics.get_tpot().mean,
        'std_throughput': metrics.get_throughput().std, 'std_ttft': metrics.get_ttft().std, 'std_tpot': metrics.get_tpot().std,
    })

# Build the frame once instead of appending row by row.
metrics_df = pd.DataFrame(rows)

fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(6, 8), sharex=True)

# One panel per metric, all sharing the batch-size axis.
for ax, column, label in zip(axes, ['throughput', 'ttft', 'tpot'], ['Throughput', 'TTFT', 'TPOT']):
    ax.plot(metrics_df['batch_size'], metrics_df[column], '-o')
    ax.set_ylabel(label)
    ax.grid(True)
axes[2].set_xlabel('Batch Size')
plt.tight_layout()
```


![image](https://github.com/user-attachments/assets/021a94b4-fc75-4b5f-90e6-60db471a3810)

ticket: CVS-132859
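
As a rough guide to what the three metrics above mean, the sketch below derives them from per-token timestamps. It uses the common conventions (TTFT is the delay before the first token, TPOT is the gap between consecutive tokens, throughput is tokens per second) and is an assumption for illustration, not a description of how openvino_genai computes them; `summarize` and its arguments are hypothetical.

```python
from statistics import mean, pstdev


def summarize(start_time, token_times):
    """Derive TTFT, TPOT and throughput from token arrival times (seconds)."""
    if not token_times:
        raise ValueError("at least one token timestamp is required")
    ttft = token_times[0] - start_time
    # Gaps between consecutive tokens; empty when only one token was produced.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    total_time = token_times[-1] - start_time
    return {
        "ttft": ttft,
        "tpot_mean": mean(gaps) if gaps else 0.0,
        "tpot_std": pstdev(gaps) if gaps else 0.0,
        "throughput": len(token_times) / total_time,
    }


# Four tokens: the first after 0.5 s, then one every 0.1 s.
stats = summarize(0.0, [0.5, 0.6, 0.7, 0.8])
```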
Removing dockerfile from release branch due to process requirements.
Docstring for generation time metrics
Ticket: CVS-132859
Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>
@Wovchena Wovchena requested a review from TolyaTalamanov August 2, 2024 13:39
@ilya-lavrenov ilya-lavrenov added this to the 2024.3 milestone Aug 2, 2024
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Aug 5, 2024
@ilya-lavrenov ilya-lavrenov self-assigned this Aug 5, 2024
Merged via the queue into openvinotoolkit:master with commit dc9ef33 Aug 5, 2024
27 checks passed