Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Psa crypto dcache #20260

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

karstenkoenig
Copy link
Contributor

@karstenkoenig karstenkoenig commented Feb 10, 2025

Enable dcache on the crypto using code for nrf54h20

test_crypto: PR-762

@karstenkoenig karstenkoenig added DNM CI-disable Disable CI for this PR labels Feb 10, 2025
@github-actions github-actions bot added the changelog-entry-required Update changelog before merge. Remove label if entry is not needed or already added. label Feb 10, 2025
@NordicBuilder
Copy link
Contributor

NordicBuilder commented Feb 10, 2025

CI Information

To view the history of this post, clich the 'edited' button above
Build number: 26

Inputs:

Sources:

sdk-nrf: PR head: b57d288b1d33763dbbf8d042d518edcf4eba8350

more details

sdk-nrf:

PR head: b57d288b1d33763dbbf8d042d518edcf4eba8350
merge base: 20a48098b3ed3aff7cd06312b7a143d8b9514142
target head (main): 8aefc9fe6ce9f9c14e815f19bfa604647eb1c714
Diff

Github labels

Enabled Name Description
ci-disabled Disable the ci execution
ci-all-test Run all of ci, no test spec filtering will be done
ci-force-downstream Force execution of downstream even if twister fails
ci-run-twister Force run twister
ci-run-zephyr-twister Force run zephyr twister
List of changed files detected by CI (22)
include
│  ├── sdfw
│  │  ├── sdfw_services
│  │  │  │ crypto_service.h
samples
│  ├── crypto
│  │  ├── aes_cbc
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── aes_ccm
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── aes_ctr
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── aes_gcm
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── chachapoly
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── ecdh
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── ecdsa
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── ecjpake
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── eddsa
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── hkdf
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── hmac
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── pbkdf2
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── psa_tls
│  │  │  ├── boards
│  │  │  │  ├── nrf54h20dk_nrf54h20_cpuapp.conf
│  │  │  │  │ nrf54h20dk_nrf54h20_cpurad.conf
│  │  ├── rng
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── sha256
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
│  │  ├── spake2p
│  │  │  ├── boards
│  │  │  │  │ nrf54h20dk_nrf54h20_cpuapp.conf
subsys
│  ├── nrf_security
│  │  ├── src
│  │  │  ├── ssf_secdom
│  │  │  │  │ ssf_crypto.c
│  ├── sdfw_services
│  │  ├── services
│  │  │  ├── psa_crypto
│  │  │  │  ├── Kconfig
│  │  │  │  ├── psa_crypto_service.c
│  │  │  │  │ psa_crypto_service.cddl

Outputs:

Toolchain

Version: 4ffa2202d5
Build docker image: docker-dtr.nordicsemi.no/sw-production/ncs-build:4ffa2202d5_8bf7ca4353

Test Spec & Results: ✅ Success; ❌ Failure; 🟠 Queued; 🟡 Progress; ◻️ Skipped; ⚠️ Quarantine

  • ◻️ Toolchain - Skipped: existing toolchain is used
  • ✅ Build twister - Skipped: Skipping Build & Test as it succeeded in a previous run: 25
  • ✅ Integration tests
    • ✅ test-fw-nrfconnect-chip - Skipped: Job was skipped as it succeeded in a previous run
    • ✅ test-fw-nrfconnect-nrf-iot_cloud - Skipped: Job was skipped as it succeeded in a previous run
    • ✅ test-fw-nrfconnect-nrf_crypto - Skipped: Job was skipped as it succeeded in a previous run
    • ✅ test-fw-nrfconnect-tfm - Skipped: Job was skipped as it succeeded in a previous run
    • ✅ test-sdk-find-my - Skipped: Job was skipped as it succeeded in a previous run
    • ✅ test-sdk-dfu - Skipped: Job was skipped as it succeeded in a previous run
Disabled integration tests
    • desktop52_verification
    • doc-internal
    • test_ble_nrf_config
    • test-fw-nrfconnect-apps
    • test-fw-nrfconnect-ble_mesh
    • test-fw-nrfconnect-ble_samples
    • test-fw-nrfconnect-boot
    • test-fw-nrfconnect-fem
    • test-fw-nrfconnect-nfc
    • test-fw-nrfconnect-nrf-iot_libmodem-nrf
    • test-fw-nrfconnect-nrf-iot_lwm2m
    • test-fw-nrfconnect-nrf-iot_mosh
    • test-fw-nrfconnect-nrf-iot_positioning
    • test-fw-nrfconnect-nrf-iot_samples
    • test-fw-nrfconnect-nrf-iot_serial_lte_modem
    • test-fw-nrfconnect-nrf-iot_thingy91
    • test-fw-nrfconnect-nrf-iot_zephyr_lwm2m
    • test-fw-nrfconnect-proprietary_esb
    • test-fw-nrfconnect-ps
    • test-fw-nrfconnect-rpc
    • test-fw-nrfconnect-rs
    • test-fw-nrfconnect-thread
    • test-low-level
    • test-sdk-audio
    • test-sdk-mcuboot
    • test-sdk-pmic-samples
    • test-sdk-wifi
    • test-secdom-samples-public

Note: This message is automatically posted and updated by the CI

@NordicBuilder
Copy link
Contributor

You can find the documentation preview for this PR at this link.

Note: This comment is automatically posted by the Documentation Publish GitHub Action.

@karstenkoenig karstenkoenig marked this pull request as ready for review February 19, 2025 15:35
@karstenkoenig karstenkoenig requested review from a team as code owners February 19, 2025 15:35
@karstenkoenig karstenkoenig removed DNM CI-disable Disable CI for this PR labels Feb 20, 2025
@karstenkoenig karstenkoenig force-pushed the psa_crypto_dcache branch 2 times, most recently from 6ae8aa0 to 51a1eff Compare February 21, 2025 10:01
Copy link
Contributor

@tomi-font tomi-font left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥵

Copy link

After documentation is built, you will find the preview for this PR here.

Preview links for modified nRF Connect SDK documents:

https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/fem/fem_nrf21540_gpio.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/fem/fem_nrf21540_gpio_spi.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/fem/fem_nrf2220.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/fem/fem_simple_gpio.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/fem/index.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/nrf54h/index.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/nrf54h/ug_nrf54h20_architecture_lifecycle.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/nrf54h/ug_nrf54h20_gs.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/app_dev/device_guides/wifi_coex.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/dev_model_and_contributions/adding_code.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/libraries/networking/downloader.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/protocols/bt/bt_mesh/dfu_over_bt_mesh.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/protocols/lte/index.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/protocols/matter/index.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/abi_compatibility.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/known_issues.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/migration/migration_guide_2.8.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/migration/migration_guide_2.9.0-nrf54h20-rc1.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/release_notes.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/releases/release-notes-2.9.0-nrf54h20-rc1.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/releases_and_maturity/releases/release-notes-changelog.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/samples/matter.html
doc/nrf/samples/matter.rst
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/samples/bluetooth/fast_pair/locator_tag/README.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/samples/cellular/nrf_cloud_multi_service/README.html
https://ncsdoc.z6.web.core.windows.net/PR-20260/nrf/samples/suit/recovery/README.html

tomi-font
tomi-font previously approved these changes Feb 24, 2025
Copy link

github-actions bot commented Mar 4, 2025

You can find the documentation preview for this PR here.

@karstenkoenig karstenkoenig force-pushed the psa_crypto_dcache branch 3 times, most recently from d4c3de6 to 854256e Compare March 5, 2025 13:53
@karstenkoenig
Copy link
Contributor Author

@jonathannilsen @tomi-font thanks for the previous review, can you please have another look? I tried making it fully correct now, that means we need to create an extra buffer for pointers to data getting written from the remote, as that could lead to a situation where the local domain and the remote domain both could write to the same dataunit, e.g. a static uint8_t buffer[5] and bool in the same struct. It tries avoiding the copy by checking for alignment. I tried having it in a way that it gets optimized away if the referenced structure is already aligned, but couldn't have it look acceptable so figured this way it is at least consistent.
I feel this out buffer bouncing should be default, but have an option to turn it off, can this be left out for an eventual follow up? We don't even know yet if that'd be something impacting performance that badly

@karstenkoenig karstenkoenig requested a review from a team as a code owner March 10, 2025 15:26
@karstenkoenig karstenkoenig force-pushed the psa_crypto_dcache branch 6 times, most recently from 63c8f9a to 56d2a3a Compare March 17, 2025 14:15
@karstenkoenig karstenkoenig force-pushed the psa_crypto_dcache branch 2 times, most recently from 47a0d0c to 137c871 Compare March 20, 2025 13:47
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
13.2% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@karstenkoenig karstenkoenig requested a review from tomi-font March 21, 2025 08:56
@karstenkoenig
Copy link
Contributor Author

@tomi-font @jonathannilsen hei, please check again now, CI is green after the testcases were adjusted to not pass in full buffer sizes. The bounce buffer handling can be optionally disabled.

We can skip remote psa init, so remove the call but leave the handler
intact to not break the interface.
Also add new functions returning not supported.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
Removed pointless whitespaces all over the service.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
Need to writeback and invalidate all buffers that get sent/received from
the remote to make sure we have consistent view of the data for all
involved parties.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
Reduced the cache handling calls based on pointer usage, data only going
in to the psa call only needs to be flushed, data only getting read back
just needs to be invalidated and data going in and out needs both.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
int "Size of the heap buffer used for out buffer"
default 4096
help
Size of the heap buffer used for out buffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should document to the user that with SSF_PSA_CRYPTO, performance will improve when PSA calls use output buffers that are aligned to the DataUnit size.

And specify that we consider a buffer to be aligned when it both starts and ends at an address aligned with the DataUnit size. As opposed to the C concept of aligned, that just deals with the start address.

And also that PSA calls to unaligned buffers larger than SSF_PSA_CRYPTO_SERVICE_OUT_HEAP_SIZE will return PSA_ERROR_INSUFFICIENT_MEMORY.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be a good place:

https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/libraries/security/nrf_security/doc/configuration.html

Although we don't need to do this in this PR...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nrf54h20 doesn't seem to be covered much anyways there, I'll ask around if there is a plan for this.


if ((IS_ALIGNED(original_buffer, CACHE_DATA_UNIT_SIZE)) &&
(IS_ALIGNED(size, CACHE_DATA_UNIT_SIZE))) {
out_buffer = original_buffer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not buffer pointers to uncached memory areas.

Use-case:
Output buffer is too large to practically buffer it.
The protocol has determined the size, and it's not word-aligned.

Maybe this use-case doesn't exist, but if it did, the user would have no choice except to configure the MPU to not cache the buffer. And we would need to check if original_buffer is pointing to an uncached area at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this use-case is rare.

So I'm OK with doing nothing here wrt. the review.

But our team should at least be aware that we could be unnecessarily buffering uncached memory :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only happens on output buffers, and those can be allocated to any size and the server will just fill what is needed and report the length it actually filled, so it is easy for the client to just pass in 4B aligned buffers for output, even if they expect an unaligned size of data back. That'd be a lot easier than adding an MPU region at least.

I guess we could try harder here to figure out if it is uncached, but to my mind that'd already be an active part on the client side and then they might just as well turn off the bounce buffers if they are confident they aren't using unaligned buffers. That also means they get to save 4k of RAM.

Copy link
Contributor

@SebastianBoe SebastianBoe Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some PSA operations where the user cannot pad the output buffer themselves.

i.e.

https://arm-software.github.io/psa-api/crypto/1.1/api/ops/kdf.html#c.psa_key_derivation_output_bytes

For this operation, the output_length must match exactly the amount of bytes that is actually written by PSA. It cannot be an oversized/padded buffer.

Because of:

image

I hope I am mistaken though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"That'd be a lot easier than adding an MPU region at least."

I am hoping that, if not supported by NCS already, then at least eventually, the user can just do:

__nocache my_buffer;

And Zephyr will aggregate all the uncached buffers and put them in the same memory region.

See __nocache in https://docs.zephyrproject.org/latest/hardware/cache/index.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but you'll have to add that memory region into the already fairly convoluted memory layout. But maybe it makes sense for users to do this out of principle anyways for crypto operations as those will always cause an extra copy of the data that potentially also needs to be erased when trying to handle those safely.
I'd still leave that up for a potential follow up performance improvement/simplification in use

psa_status_t ssf_psa_crypto_init(void)
#if defined(CONFIG_SSF_PSA_CRYPTO_SERVICE_OUT_BOUNCE_BUFFERS_ENABLED)

#define CACHE_DATA_UNIT_SIZE 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get this from DT?

Joking aside, we subtly rely on CACHE_DATA_UNIT_SIZE == sizeof(uintptr) as k_heap_alloc returns addresses aligned to the pointer.

I'm worried this requirement will subtly fail if they bump the cache line size on the next platform. Adding an assert and a comment would ease portability I think.

/* k_heap_alloc allocated memory is aligned on a multiple of pointer sizes. The HW's DataUnit size must match this Zephyr behaviour. */
BUILD_ASSERT(DCACHEDATA_DATAWIDTH * 4 == sizeof(uintptr))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice catch, yes I meant to check for this at build time but forgot. Also good catch that DCACHEDATA_DATAWIDTH is in the MDK, I didn't realize we had that DataUnit size information in there. I'll use this instead of harcoding it to 4.

Copy link
Contributor

@tomi-font tomi-font left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed all but psa_crypto_service.c

Comment on lines 19 to 23
psa_status_t ssf_psa_crypto_init(void)
{
int err;
struct psa_crypto_req req = { 0 };
struct psa_crypto_rsp rsp = { 0 };

req.psa_crypto_req_msg_choice = psa_crypto_req_msg_psa_crypto_init_req_m_c;

err = ssf_client_send_request(&psa_crypto_srvc, &req, &rsp, NULL);
if (err != 0) {
return err;
}

return rsp.psa_crypto_rsp_status;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of this function is added in the previous commit, and removed in this commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was to align different versions of the generator script that I wasn't the author of, I just tried to have a synchronized history between the script and the commits here

Comment on lines 33 to 34
help
Size of the heap buffer used for out buffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless help message?

Added bounce buffers for any field written from secure domain to local
domain, as we can't prevent the local domain to have dirty data in the
DataUnit that the secure domain writes to.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
Enabled DCache is now fully supported by the PSA API and hence not
disabled anymore in the respective crypto samples.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
Update the parameter names to follow the change in the header file.
Also used latest zcbor which introduced a style change.

Signed-off-by: Karsten Koenig <karsten.koenig@nordicsemi.no>
@karstenkoenig
Copy link
Contributor Author

@tejlmand @nordicjm Could you have a look please? The Jenkins failure passed on a rerun and the sonarcloud failure is certainly correct in the duplication of code but that's because it is a generated file. We have a plan to migrate to a different solution eventually but for now we need to fix the DCache handling when using cryptography

@nordicjm
Copy link
Contributor

you need to re-run the whole job, not an individual sub-job, it will only re-run the sub-jobs that have failed and update the github status. The sonar one is advisory, it is not mandatory

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-entry-required Update changelog before merge. Remove label if entry is not needed or already added.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants