Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: pixel: towards gstreamer/ffmpeg/opencv #86671

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

josuah
Copy link
Collaborator

@josuah josuah commented Mar 5, 2025

This is a proposed implementation for:

Implementation notes:

  • Functions named pixel_<a>_to_<b>() or pixel_<operation>_<a>(), where <a> and <b> are pixel formats name with frame/line/hist/avg suffix (no reliance on <zephyr/drivers/video.h> or <zephyr/drivers/display.h> formats)
  • Heavy use of static inline: to allow the compiler to optimize the constants away and build operations that work on whole-lines without function call inside.
  • No unit tests yet, as maybe using VIDEO_SW_GENERATOR to build the test data was interesting?

@josuah josuah added area: Drivers area: Samples Samples RFC Request For Comments: want input from the community area: Video Video subsystem labels Mar 5, 2025
@kartben kartben assigned josuah and unassigned kartben Mar 5, 2025
@josuah
Copy link
Collaborator Author

josuah commented Mar 5, 2025

To test the samples on the native simulator:

west build -p -b native_sim samples/lib/pixel/print/ && west flash
west build -p -b native_sim samples/lib/pixel/resize/ && west flash
west build -p -b native_sim samples/lib/pixel/stats/ && west flash
west build -p -b native_sim samples/lib/pixel/stream/ && west flash

@josuah josuah force-pushed the pr-lib-pixel branch 2 times, most recently from 72a038e to ca0261e Compare March 5, 2025 17:15
@nashif
Copy link
Member

nashif commented Mar 5, 2025

west build -p -b native_sim samples/lib/pixel/print/ && west flash

wow, this is nice

image

size_t src_i = src_w * bits_per_pixel / BITS_PER_BYTE;
size_t dst_i = dst_w * bits_per_pixel / BITS_PER_BYTE;

memcpy(&dst_buf[dst_i], &src_buf[src_i], bits_per_pixel / BITS_PER_BYTE);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More of a suggestion:
If it is guaranteed that src and dst addresses never overlap, then we are good but for cases where src and dst addresses might overlap, i have seen memcpy leading to pixel data corruption So it is good to either handle such scenario or use safe version of memcpy (memcpy_s) which inherently checks for address overlap.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function could be called to shrink a buffer in-place, and this would lead to a bug!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memcpy_s() seemed like unsupported, so I switched to memmove() to allow incrementally improve performance as use-cases shows-up. This would only copy 1~3 bytes in typical usage anyway I think.

}

#ifdef CONFIG_POLL
static int video_sw_isp_set_signal(const struct device *dev, enum video_endpoint_id ep,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion:
Seems like one of the input parameter (video_endpoint_id ep) is not getting used in function, if we do not have plan to use it, it is good to remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was missing input validation on it!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a side note. I think all things related to video_endpoint_id should be removed from the video framework. Video port / enpoint is already described in the devicetree and up to now, all the video_endpoint_id parameters passed in video functions are not used at all, just served as a check (which is not necessary). I plan to make a PR to remove them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A no-op everywhere in current drivers that is true... This PR uses multiple endpoints, should I try to make it single-endpoint (i.e. 1 input device, 1 output device)? Did you already have some plan you wanted to try for this scenario?

Copy link
Collaborator

@ngphibang ngphibang Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have.

should I try to make it single-endpoint (i.e. 1 input device, 1 output device)?

It depends on the complexity of the real hw device that we should expose it as a single or multiple devices or not. But I think it's not really related to the discussed point here.

In the v4z framework, the video_endpoint_id parameter put in each API function has nothing to do with its meaning. There are current drivers such as mipicsi2rx which also has multiple endpoints : ep_in is the camera sensor and ep_out is the csi (dma) but why it does not need to distinguish the ep direction like the ISP (as in the current PR) ?

So, the difference is not because a device has multiple endpoints (direction) or not, but it's because the ISP (and other devices like the PxP) falls into another catergory which (Linux) calls m2m device. For this kind of devices, the driver needs two separate buffer queues and hence needs to distinguish between input (which Linux falsely call "output") and output (which Linux calls "capture") side. To support this kind of devices, we can do as follow depending on each API:

  • get/set_format() : add a "type" (input or output) field in struct video_format
  • enqueue/dequeue() : add a "type" field in struct video_buffer
  • set_stream() : add a type parameter in the function API. I see that you didn't implement this callback. But streaming need to be done separately for each side.
  • set/get_ctrl() : nothing to do as control IDs are already different for input and output (and also, we rarely has to set controls on both input and output)
  • set/get/enum_frmival() : no need.

I will push the PR this weekend. I plan to do more reworks in the framework to support m2m device as we also need it for rewriting PxP driver as said in my last comment here.

Copy link
Collaborator Author

@josuah josuah Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the insight, I understand the underlying challenges a little bit better.

I do my best to make these drivers follow the APIs you introduce. :) Good thing that the PRs got split!

P.S.: m2m reminds of this other PR (few months ago):

@josuah josuah force-pushed the pr-lib-pixel branch 2 times, most recently from fcd4b52 to 9efe118 Compare March 6, 2025 19:22
@josuah
Copy link
Collaborator Author

josuah commented Mar 6, 2025

force-push:

  • Switch resize function from memcpy() to memmove() to allow in-place resizing (thank you @IAmAnkur)
  • Add missing input validation from video_sw_isp_set_signal (thank you @IAmAnkur)
  • Try to address CI issues at the same time, after reproducing them locally

@josuah
Copy link
Collaborator Author

josuah commented Mar 8, 2025

force-push:

  • print the performance statistics on the stream sample
  • switch samples from printf() to LOG_INF()
  • Try to address CI issues at the same time, after reproducing them locally (different issues from last time)

@josuah
Copy link
Collaborator Author

josuah commented Mar 9, 2025

force-push:

  • Rebase on latest main (to go past Zephyr 4.1 and base other PRs on this)

@josuah
Copy link
Collaborator Author

josuah commented Mar 25, 2025

force-push:

  • More consistent naming scheme for macros (STREAM_XXX_TO_YYY -> XXXSTREAM_TO_YYYSTREAM like function names)
  • Improve the doc grouping of functions (this was only documenting a single line)
  • Do not expose the single-pixel conversion functions, which:
    • Greatly reduce the number of functions
    • Still permits to convert a single pixel by specifying a width of 1 to the line conversion functions
    • Simplifies handling of formats that are not defined at a single-pixel level (i.e. YUYV)
  • Introduce the RGB332 format.
  • Simplify how the pixel printing functions are defined
  • Rebase on latest main

The global goal is to make defining a new format of operation much less bureaucratic: only a line-conversion function, wrap it in a frame-conversion function (stream), and declare a top-level macro for it.

This also reduces the API surface to just the essential:

  • Line conversion functions (the kernel operations that do the heavy lifting)
  • Stream-based frame conversion functions (stitching the kernels together)

@josuah josuah force-pushed the pr-lib-pixel branch 3 times, most recently from 2cd0a24 to 1c4bd3a Compare March 26, 2025 12:45
@josuah
Copy link
Collaborator Author

josuah commented Mar 26, 2025

force-push:

  • Enable UTF-8 back again
  • Introduce a choice Kconfig to select the printing method (printf, printk, shell_print), including "none" to skip printing (CI timed-out due to large prints)
  • Switch docstrings from @defgroup to @copydetails to document functions in batch, it seemed more intuitive and appropriate.

@josuah josuah force-pushed the pr-lib-pixel branch 4 times, most recently from bbde553 to 6dfddbc Compare March 27, 2025 23:28
@josuah
Copy link
Collaborator Author

josuah commented Mar 27, 2025

force-push:

  • Syntax sugar: pipeline one-liners in C (+ macros)
  • Add __weak attribute to line conversion functions to allow optimized version to be opted-in (i.e. SIMD).
  • minor bugfixes
  • Document the functions used by pipeline module developers
  • Add a MAINTAINERS entry to allow notifications (how to create the new area?)

The documentation is somewhat better now, but needs a proper top-level introduction outside of Doxygen.

I hope these force-push are not distracting. I add them for the sake of avoiding extra PRs after the merge, in the meantime that review happens. I'll try to batch the changes more than currently done to reduce CI load and notifications.

@josuah josuah force-pushed the pr-lib-pixel branch 3 times, most recently from d2fa47a to 14379b6 Compare March 29, 2025 19:15
@josuah
Copy link
Collaborator Author

josuah commented Mar 29, 2025

Force-push:

  • CI bugfixes
  • Rebase on main
  • Add kernel processing for both convolutional kernels (i.e. sharpen mask) or custom kernels (i.e. "median" for denoising).

Unit test result for gaussian Blur kernel:
scrot_20250329_193647_222x443

Unit test result for median (denoise) kernel:
scrot_20250329_192121_222x442

@josuah josuah force-pushed the pr-lib-pixel branch 5 times, most recently from b4765ed to f45a634 Compare March 31, 2025 15:47
josuah added 4 commits April 3, 2025 00:48
The "pixel" library aims facilitating the implementation of all image
processing tasks, such as pixel conversion, with a focus on low-resource
environments. The processing functions scale down to per-pixel "kernels"
to line-based conversion to full streams of several frames.

Signed-off-by: Josuah Demangeon <me@josuah.net>
The newly introduced lib/pixel features utilities that help composing
video pipelines together for the purpose of stream processing, as well
as debug utilities.

Signed-off-by: Josuah Demangeon <me@josuah.net>
The tests use data generated by the ffmpeg command line utilty in order
to avoid bias by having the same person implementing the tests and the
source code.

Signed-off-by: Josuah Demangeon <me@josuah.net>
Add "Pixel library" area, which covers the newly introduced lib/pixel
and associated tests and samples.

Signed-off-by: Josuah Demangeon <me@josuah.net>
@josuah
Copy link
Collaborator Author

josuah commented Apr 3, 2025

force-push:

  • I could not reproduce job 39768403923 locally: it runs well when I test on this sample platform, so removing it for now. All tests are kept and pass CI.

New features will be proposed on other branches to reduce review burden, now using 1240bfd as stable checkpoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Drivers area: Process area: Samples Samples area: Video Video subsystem RFC Request For Comments: want input from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants