Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature(lvgl_port): Initial support for ppa rendering in lvgl (BSP-563) #409

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ThunderDai
Copy link

@ThunderDai ThunderDai commented Oct 15, 2024

  Mainly supports the following features:
   - color blend (simple fill)
   - color blend with opa 
   - blend normal (blend with argb8888) 
   - blend normal with color (color convert / memcpy)

This feature can only used in direct/full mode

@CLAassistant
Copy link

CLAassistant commented Oct 15, 2024

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot changed the title feature(lvgl_port): Initial support for ppa rendering in lvgl feature(lvgl_port): Initial support for ppa rendering in lvgl (BSP-563) Oct 15, 2024
@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch from 0984d25 to 9723801 Compare October 15, 2024 06:26
@ThunderDai ThunderDai marked this pull request as draft October 15, 2024 06:44
@ThunderDai ThunderDai marked this pull request as ready for review October 15, 2024 07:34
@ThunderDai
Copy link
Author

@peter-marcisovsky @espzav hi, PTAL, this is my first commit for esp-bsp. Therefore, some code related rules are not familiar enough

Copy link
Collaborator

@peter-marcisovsky peter-marcisovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you consider adding some tests to your feature? If you take a look at my #357 , there are functionality and benchmark unit tests. Would it be possible to do the same for this feature?

Also, can you take a look at the failing CI?

@@ -55,6 +55,21 @@ typedef struct {

extern int lv_color_blend_to_argb8888_esp(asm_dsc_t *asm_dsc);

// Just for compatibility with v9.2.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not recommend this approach of solving compatibility issues with LVGL 9.2.
I tried to build lvlgl_port test app on your branch, with the HW accelerating enabled and the test app would not build, neither with LVGL9.1, nor with LVGL9.2.

peter@peter ➜ ~/esp/esp-bsp/components/esp_lvgl_port/test_apps/lvgl_port git:(pr/409) ✗ idf.py -DSDKCONFIG_DEFAULTS='sdkconfig.defaults;sdkconfig.ci.asm_render' set-target esp32s3 build

Is your feature working with LVGL9.1? If yes, could you consider using just LVGL9.1 in this MR?

We have agreed to make the HW acceleration support working with LVGL9.1, for now here, until we figure out what to do with this brekaing changes which was introduced in LVGL9.2.

@ThunderDai
Copy link
Author

Could you consider adding some tests to your feature? If you take a look at my #357 , there are functionality and benchmark unit tests. Would it be possible to do the same for this feature?

Also, can you take a look at the failing CI?

yes, can add some test. I have already included the newly submitted lvgl_port component as part of my project and have run the benchmark without any issues. I will also try running the existing test_apps. It is worth noting that because PPA has implemented some of your current functions, it conflicts with the current header file. I have added a new header file myself. The following configuration is required:

CONFIG_LV_DRAW_SW_ASM_CUSTOM=y
CONFIG_LV_DRAW_SW_ASM_CUSTOM_INCLUDE="esp_lvgl_port_lv_blend_ppa.h"
CONFIG_LV_USE_CUSTOM_MALLOC=y

@ThunderDai ThunderDai marked this pull request as draft October 21, 2024 08:47
@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch 2 times, most recently from 24e762a to 0b1609e Compare October 24, 2024 02:24
@ThunderDai ThunderDai marked this pull request as ready for review October 24, 2024 02:26
@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch 6 times, most recently from 461a0f8 to 47b925e Compare October 28, 2024 06:32
Copy link
Collaborator

@tore-espressif tore-espressif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ThunderDai I'm sorry for the late start of review. I left few comments for start

Comment on lines 2 to 4
espressif/esp32_p4_function_ev_board:
version: "*"
override_path: "../../../../../bsp/esp32_p4_function_ev_board"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For esp_lvgl_port, we try to avoid dependencies on any BSP. Can we removed it?

If not, we can place the example in different location,(not directly into esp_lvgl_port component)

@@ -286,7 +296,11 @@ static lv_display_t *lvgl_port_add_disp_priv(const lvgl_port_display_cfg_t *disp
buff_caps |= MALLOC_CAP_DEFAULT;
}

/* Use RGB internal buffers for avoid tearing effect */
#if CONFIG_IDF_TARGET_ESP32P4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if CONFIG_IDF_TARGET_ESP32P4
#if SOC_CACHE_INTERNAL_MEM_VIA_L1CACHE

Would this be better for forward compatibility?

Comment on lines 359 to 403
if (buf1) {
free(buf1);
if (disp_ctx->draw_buffs[0]) {
free(disp_ctx->draw_buffs[0]);
}
if (buf2) {
free(buf2);
if (disp_ctx->draw_buffs[1]) {
free(disp_ctx->draw_buffs[1]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this fix! It was already fixed in the master branch, so you might have to resolve some conflicts...

@@ -77,12 +77,12 @@ if("usb_host_hid" IN_LIST build_components)
endif()

# Include SIMD assembly source code for rendering, only for (9.1.0 <= LVG_version < 9.2.0) and only for esp32 and esp32s3
if((lvgl_ver VERSION_GREATER_EQUAL "9.1.0") AND (lvgl_ver VERSION_LESS "9.2.0"))
if(CONFIG_IDF_TARGET_ESP32 OR CONFIG_IDF_TARGET_ESP32S3)
if((lvgl_ver VERSION_GREATER_EQUAL "9.1.0") AND (lvgl_ver VERSION_LESS_EQUAL "9.2.0"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really work with 9.2.0? we had some compatibility problems with this version. If yes, why can't support 9.2.2 too?

Comment on lines 9 to 12
#define LVGL_VERSION_MAJOR 9
#define LVGL_VERSION_MINOR 1
#define LVGL_VERSION_PATCH 0
#define LVGL_VERSION_INFO ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we use lv_version.h from the lvgl component?

Comment on lines 93 to 94
dma2d_descriptor_t *tx_link_buffer = (dma2d_descriptor_t *)heap_caps_aligned_calloc(64, 1, 64, MALLOC_CAP_SPIRAM);
dma2d_descriptor_t *rx_link_buffer = (dma2d_descriptor_t *)heap_caps_aligned_calloc(64, 1, 64, MALLOC_CAP_SPIRAM);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache line size is configurable. Can be 64 or 128. Can we take the value from sdkconfig.h?

Comment on lines 142 to 143
esp_cache_msync((void *)tx_dsc, 64, ESP_CACHE_MSYNC_FLAG_DIR_C2M | ESP_CACHE_MSYNC_FLAG_UNALIGNED);
esp_cache_msync((void *)rx_dsc, 64, ESP_CACHE_MSYNC_FLAG_DIR_C2M | ESP_CACHE_MSYNC_FLAG_UNALIGNED);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, you assume cache line size == 64. But it can be 128 too.

Is the unaligned sync sage here? If yes, please provide better comment

@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch 2 times, most recently from 9c82bb2 to b92dfa2 Compare March 27, 2025 09:44
@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch 2 times, most recently from c76daf9 to 7a10e47 Compare March 27, 2025 12:42
@espzav
Copy link
Collaborator

espzav commented Mar 27, 2025

I am not sure, why the benchmark not printed to comment, I checked in artifacts and it seems, that this optimization is not good.

Benchmark for BOARD esp32_p4_function_ev_board

DATE: 27.03.2025 14:06

LVGL version: 9.2.2

Name Avg. CPU Avg. FPS Avg. time render time flush time
Empty screen 88% (+27) 66 (-21) 12 (+7) 5 7 (+7)
Moving wallpaper 99% (+7) 22 (-49) 40 (+30) 38 (+31) 2 (-1)
Single rectangle 97% (+74) 69 (-20) 12 (+11) 0 (-1) 12 (+12)
Multiple rectangles 97% (+53) 69 (-29) 11 (+5) 7 (+3) 4 (+2)
Multiple RGB images 98% (+6) 22 (-29) 37 (+23) 27 (+15) 10 (+8)
Multiple ARGB images 99% (+1) 16 (-8) 53 (+18) 51 (+18) 2
Rotated ARGB images 99% 3 280 (+33) 267 (+22) 13 (+11)
Multiple labels 98% (-1) 33 (+1) 23 (-3) 18 (-6) 5 (+3)
Screen sized text 10% (+3) 89 (+1) 8 (+4) 2 (-2) 6 (+6)
Multiple arcs 90% (-2) 34 (-5) 23 (+1) 19 (-1) 4 (+2)
Containers 99% (+43) 22 (-67) 34 (+29) 28 (+23) 6 (+6)
Containers with overlay 99% (+2) 13 (-4) 67 (+16) 57 (+9) 10 (+7)
Containers with opa 99% (+28) 14 (-62) 56 (+47) 49 (+40) 7 (+7)
Containers with opa_layer 99% 8 (-2) 117 (+34) 109 (+28) 8 (+6)
Containers with scrolling 99% 12 (-6) 72 (+22) 68 (+20) 4 (+2)
Widgets demo 99% 16 51 (-2) 43 (-8) 8 (+6)
All scenes avg. 91% (+15) 31 (-19) 55 (+17) 49 (+12) 6 (+5)

@ThunderDai
Copy link
Author

ThunderDai commented Mar 31, 2025

I am not sure, why the benchmark not printed to comment, I checked in artifacts and it seems, that this optimization is not good.

Benchmark for BOARD esp32_p4_function_ev_board

DATE: 27.03.2025 14:06

LVGL version: 9.2.2

Name Avg. CPU Avg. FPS Avg. time render time flush time
Empty screen 88% (+27) 66 (-21) 12 (+7) 5 7 (+7)
Moving wallpaper 99% (+7) 22 (-49) 40 (+30) 38 (+31) 2 (-1)
Single rectangle 97% (+74) 69 (-20) 12 (+11) 0 (-1) 12 (+12)
Multiple rectangles 97% (+53) 69 (-29) 11 (+5) 7 (+3) 4 (+2)
Multiple RGB images 98% (+6) 22 (-29) 37 (+23) 27 (+15) 10 (+8)
Multiple ARGB images 99% (+1) 16 (-8) 53 (+18) 51 (+18) 2
Rotated ARGB images 99% 3 280 (+33) 267 (+22) 13 (+11)
Multiple labels 98% (-1) 33 (+1) 23 (-3) 18 (-6) 5 (+3)
Screen sized text 10% (+3) 89 (+1) 8 (+4) 2 (-2) 6 (+6)
Multiple arcs 90% (-2) 34 (-5) 23 (+1) 19 (-1) 4 (+2)
Containers 99% (+43) 22 (-67) 34 (+29) 28 (+23) 6 (+6)
Containers with overlay 99% (+2) 13 (-4) 67 (+16) 57 (+9) 10 (+7)
Containers with opa 99% (+28) 14 (-62) 56 (+47) 49 (+40) 7 (+7)
Containers with opa_layer 99% 8 (-2) 117 (+34) 109 (+28) 8 (+6)
Containers with scrolling 99% 12 (-6) 72 (+22) 68 (+20) 4 (+2)
Widgets demo 99% 16 51 (-2) 43 (-8) 8 (+6)
All scenes avg. 91% (+15) 31 (-19) 55 (+17) 49 (+12) 6 (+5)

In fact, our PPA/DMA2D acceleration is currently only useful for specific scenarios:

  1. Color Conversion (RGB565<-->RGB888)
  2. Memory Copy
  3. Data filling
  4. Overlay transparent colors

And the above scenarios all need to have better effects on larger areas, while benchmarks are a comprehensive evaluation, and there are many small areas that need to be drawn separately.
If the above scenarios are evaluated separately, the acceleration effect is better, such as flashing the decoded JPEG data stream separately or filling the background color when sliding the menu.
In addition, it is recommended to enable tear resistance for all tests(CONFIG_BSP_DISPLAY_LVGL_AVOID_TEAR=y), otherwise it will not have much value in practical applications

@ThunderDai
Copy link
Author

ThunderDai commented Mar 31, 2025

Benchmark Summary (9.2.2 ) --------- All scenes avg fps (based on avoid tear and direct mode)

color using PPA not use PPA
RGB565 31fps 23fps
RGB888 28fps(*25 fps) 19fps

From this table we can see, the avg fps can add about 8-9fps after using PPA/DMA2D

If we use PPA in RGB888, then will dsi underrun, so we need limit DMA2D speed by these code, and avg fps drop from 28 to 25fps, but still higher 6fps then not use ppa

#include "hal/axi_dma_ll.h"
#include "hal/axi_icm_ll.h"
#include "soc/mipi_dsi_bridge_struct.h"

int peak_level = 5; // 4 means 800MB/s, 5 means 400MB/s, 6 means 200MB/s
axi_icm_ll_set_qos_burstiness(AXI_ICM_MASTER_DMA2D, 256, AXI_ICM_ACCESS_WRITE);
axi_icm_ll_set_qos_peak_transaction_rate(AXI_ICM_MASTER_DMA2D, peak_level, peak_level + 1, AXI_ICM_ACCESS_WRITE); 

@ThunderDai
Copy link
Author

I am not sure, why the benchmark not printed to comment, I checked in artifacts and it seems, that this optimization is not good.

Benchmark for BOARD esp32_p4_function_ev_board

DATE: 27.03.2025 14:06

LVGL version: 9.2.2

Name Avg. CPU Avg. FPS Avg. time render time flush time
Empty screen 88% (+27) 66 (-21) 12 (+7) 5 7 (+7)
Moving wallpaper 99% (+7) 22 (-49) 40 (+30) 38 (+31) 2 (-1)
Single rectangle 97% (+74) 69 (-20) 12 (+11) 0 (-1) 12 (+12)
Multiple rectangles 97% (+53) 69 (-29) 11 (+5) 7 (+3) 4 (+2)
Multiple RGB images 98% (+6) 22 (-29) 37 (+23) 27 (+15) 10 (+8)
Multiple ARGB images 99% (+1) 16 (-8) 53 (+18) 51 (+18) 2
Rotated ARGB images 99% 3 280 (+33) 267 (+22) 13 (+11)
Multiple labels 98% (-1) 33 (+1) 23 (-3) 18 (-6) 5 (+3)
Screen sized text 10% (+3) 89 (+1) 8 (+4) 2 (-2) 6 (+6)
Multiple arcs 90% (-2) 34 (-5) 23 (+1) 19 (-1) 4 (+2)
Containers 99% (+43) 22 (-67) 34 (+29) 28 (+23) 6 (+6)
Containers with overlay 99% (+2) 13 (-4) 67 (+16) 57 (+9) 10 (+7)
Containers with opa 99% (+28) 14 (-62) 56 (+47) 49 (+40) 7 (+7)
Containers with opa_layer 99% 8 (-2) 117 (+34) 109 (+28) 8 (+6)
Containers with scrolling 99% 12 (-6) 72 (+22) 68 (+20) 4 (+2)
Widgets demo 99% 16 51 (-2) 43 (-8) 8 (+6)
All scenes avg. 91% (+15) 31 (-19) 55 (+17) 49 (+12) 6 (+5)

The defaults mode maybe not use avoid tear solution? so they can not compare

@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch 2 times, most recently from db70cfb to 642ef56 Compare March 31, 2025 11:57
      Mainly supports the following features:
       - color blend (simple fill)
       - color blend with opa
       - blend normal (blend with argb8888)
       - blend normal to color (color convert / memcpy)
@ThunderDai ThunderDai force-pushed the feat/add_ppa_for_lvgl_render branch from 642ef56 to cb78c54 Compare April 8, 2025 03:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants