Merge releases/2024/3 into master #731
Merged
ilya-lavrenov
merged 56 commits into
openvinotoolkit:master
from
Wovchena:merge-releases/2024/3-into-master
Aug 5, 2024
Work around Python_VERSION_MAJOR and Python_VERSION_MINOR not being set by replacing Python3 with Python. Disable generation of some of the COMPONENTs not needed for GenAI. There are still unwanted empty archives, but they are generated unconditionally by rapidjson.
…envinotoolkit#604) That allows LLMPipeline to create ContinuousBatchingPipeline as a backend. There's also a constructor accepting ireq, which can be used if the model was already transformed appropriately for ContinuousBatchingPipeline. But that is likely to be misleading, and it is simpler just to throw if that constructor is called with the ContinuousBatchingPipeline backend.
Updated default configurations based on results from CVS-143530. (cherry picked from commit f460002)
Remove unwanted archives
Co-authored-by: Yaroslav Tarkan <yaroslav.tarkan@intel.com>
…#642) OpenVINOGenAITargets.cmake was excluded from packaging because CPACK_COMPONENTS_ALL is now custom and doesn't install the Unspecified component.
Co-authored-by: Pavel Esir <pavel.esir@gmail.com>
…oop for greedy sampling (openvinotoolkit#607) Searching for max element in a custom loop gives better performance than using std::max_element
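The commit's actual loop is not shown here, so the following is only a minimal sketch of the single-pass scan that greedy sampling needs: pick the index of the largest logit. The function name `greedy_argmax` is hypothetical, and this Python version illustrates the logic only; it makes no claim about the C++ performance difference reported above.

```python
from typing import Sequence


def greedy_argmax(logits: Sequence[float]) -> int:
    """Return the index of the largest logit using one explicit pass."""
    if not logits:
        raise ValueError("logits must not be empty")
    best_index = 0
    best_value = logits[0]
    for index in range(1, len(logits)):
        value = logits[index]
        # Strict comparison keeps the first maximum on ties,
        # which is also what std::max_element returns.
        if value > best_value:
            best_value = value
            best_index = index
    return best_index
```

The chosen index is the token id that greedy decoding emits for that step.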
Cherry picked from master
@Wovchena, retarget to OV 24.3 release branch
- Added Readme for python tests
- Added `--model_ids` option to run selectively only on specific models

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>
Symbols that cause errors:
- `\u0643`
- `\u25aa`
… optional plugin_config in tokenizer (openvinotoolkit#669) This improves performance of the CB lib when tested within OVMS. Already merged to master: openvinotoolkit#651. This is a cherry-pick.
…oolkit#670) [mixtral-8x7b-instruct-v0.1-int4-ov](https://huggingface.co/OpenVINO/mixtral-8x7b-instruct-v0.1-int4-ov/) didn't have `generation_config.json`, so generation continued indefinitely. EOS_TOKEN_ID was read correctly, but it was never met during generation. Updated docs so that in every generate call max_new_tokens is set either in arguments or via the default generation config: `pipe.set_generation_config({'max_new_tokens': 100, 'num_beam_groups': 3, ...})`

tickets: CVS-146933 CVS-146324
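To illustrate why a `max_new_tokens` cap matters when the EOS token is never produced, here is a minimal, library-independent sketch of a decoding loop. It does not reproduce the OpenVINO GenAI implementation; `generate_greedy` and the `next_token` callback are hypothetical names standing in for a real model.

```python
from typing import Callable, List, Optional


def generate_greedy(
    next_token: Callable[[List[int]], int],
    prompt_ids: List[int],
    max_new_tokens: int,
    eos_token_id: Optional[int] = None,
) -> List[int]:
    """Generate tokens until EOS is emitted or max_new_tokens is reached."""
    generated: List[int] = []
    # The range bound is the only guaranteed stop condition:
    # without it, a model that never emits EOS would loop forever.
    for _ in range(max_new_tokens):
        token = next_token(prompt_ids + generated)
        generated.append(token)
        if eos_token_id is not None and token == eos_token_id:
            break
    return generated
```

With a stand-in model that always returns the same non-EOS token, the loop stops only because of the cap, which mirrors the situation described above.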
Co-authored-by: Anastasiia Pnevskaia <anastasiia.pnevskaia@intel.com>
- Added performance metrics and updated the Readme with a description of how to use them
- Added cpp and python samples for benchmarking

Sample to calculate and visualize performance metrics:

```
import openvino_genai as ov_genai
import tqdm
import pandas as pd
import matplotlib.pylab as pl

pipe = ov_genai.LLMPipeline('TinyLlama-1.1B-Chat-v1.0/')
config = ov_genai.GenerationConfig(max_new_tokens=15)

num_iter = 3
rows = []

for batch_size in tqdm.tqdm([1, 2, 4, 16, 32, 64, 128]):
    prompts = ["The Sky is blue because"] * batch_size

    # Accumulate metrics over num_iter runs for this batch size.
    res = pipe.generate(prompts, config)
    metrics = res.perf_metrics
    for _ in range(num_iter - 1):
        res = pipe.generate(prompts, config)
        metrics += res.perf_metrics

    rows.append({
        'batch_size': batch_size,
        'throughput': metrics.get_throughput().mean,
        'ttft': metrics.get_ttft().mean,
        'tpot': metrics.get_tpot().mean,
        'std_throughput': metrics.get_throughput().std,
        'std_ttft': metrics.get_ttft().std,
        'std_tpot': metrics.get_tpot().std,
    })

# Build the frame once instead of appending row by row.
metrics_df = pd.DataFrame(rows, columns=[
    'batch_size', 'throughput', 'ttft', 'tpot',
    'std_throughput', 'std_ttft', 'std_tpot',
])

# One subplot per metric, sharing the batch-size axis.
fig, axes = pl.subplots(nrows=3, ncols=1, figsize=(6, 8), sharex=True)
axes[0].plot(metrics_df['batch_size'], metrics_df['throughput'], '-o')
axes[1].plot(metrics_df['batch_size'], metrics_df['ttft'], '-o')
axes[2].plot(metrics_df['batch_size'], metrics_df['tpot'], '-o')
axes[0].set_ylabel('Throughput')
axes[1].set_ylabel('TTFT')
axes[2].set_ylabel('TPOT')
axes[2].set_xlabel('Batch Size')
for ax in axes:
    ax.grid(True)
pl.tight_layout()
```

![image](https://github.com/user-attachments/assets/021a94b4-fc75-4b5f-90e6-60db471a3810)

ticket: CVS-132859
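For readers unfamiliar with the three metrics plotted above, the sketch below shows one common way to derive them from raw token timestamps. The definitions are assumptions (TTFT as time to the first token, TPOT as the mean gap between consecutive tokens, throughput as tokens per unit time), and they may differ from how the library computes or scales its values; `summarize_run` is a hypothetical helper, not part of `openvino_genai`.

```python
import statistics
from typing import Dict, List


def summarize_run(start: float, token_times: List[float]) -> Dict[str, float]:
    """Derive TTFT, TPOT and throughput from per-token completion timestamps.

    All inputs share one time unit (for example seconds); outputs use the same unit.
    """
    if not token_times:
        raise ValueError("token_times must not be empty")
    # Time to first token: delay between request start and the first token.
    ttft = token_times[0] - start
    # Time per output token: mean gap between consecutive tokens after the first.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    tpot = statistics.mean(gaps) if gaps else 0.0
    # Throughput: generated tokens divided by total elapsed time.
    total = token_times[-1] - start
    throughput = len(token_times) / total if total > 0 else float("inf")
    return {"ttft": ttft, "tpot": tpot, "throughput": throughput}
```

For example, `summarize_run(0.0, [0.5, 0.75, 1.0])` describes a run whose first token arrived after half a time unit and whose later tokens arrived a quarter unit apart.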
Removing dockerfile from release branch due to process requirements.
Docstring for generation time metrics Ticket: CVS-132859
Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>
ilya-lavrenov
approved these changes
Aug 2, 2024