Flashing ek_ra8m1/mck_ra8t1 may sometimes permanently change the reserved part of the OFS area at 0300A270, preventing the board from ever being flashed correctly again #83300

Closed
piotrrak opened this issue Dec 21, 2024 · 9 comments
Labels
bug, platform: Renesas, priority: low, Stale

Comments

@piotrrak

piotrrak commented Dec 21, 2024

Describe the bug

After flashing a Zephyr binary, the value at address 0x0300A270 in the OFS part of the flash area changes, at first to 0x00_00_00_FF, which makes verification fail after flashing.

According to the specification, the PBPS_SEC words at 0x0300_A260, 0x0300_A264, 0x0300_A268 and 0x0300_A26C are shipped from the factory with the value 0xFF.
Flash writes can be blocked by fusing bits of this area to '0'.

Location 0x0300_A270 is reserved and, according to the specification, should be written with the value 0xFF.
For a reason unknown to me, flashing Zephyr binaries may result in those bits being fused to 0 as well.
After that, it is impossible to flash any binary with verification using J-Link.

Currently Zephyr gap-fills this area with 0xFF, the correct value according to the specification.
The linker script puts the correct values in .option_setting_s at the specific locations.
The reserved locations in between are correctly filled with 0xFF, yet sometimes this area still gets fused, at first to 0x00_00_00_FF.

The issue has appeared on two boards: first on mck_ra8t1 and later on ek_ra8m1.

To Reproduce
Unknown?

Expected behavior
The area at 0x0300_A270:0x0300_A273 is never written to, or is always written with the value 0xFF, so that the board never ends up impossible to flash by the usual means.

Impact
The board won't pass verification during flashing, and it is impossible to flash it correctly using the method Zephyr uses.

Logs and console output

Incorrect value in flash that can't be reverted back to 0xFF_FF_FF_FF:

rfp-cli -d RA -tool jlink -if swd -rv 0300A270 4 
Renesas Flash Programmer CLI V1.11
Module Version: V3.17.00.000

Connecting the tool (J-Link)
      J-Link Firmware: J-Link OB-RA4M2 compiled Oct  9 2024 11:01:47
Tool: J-Link (SEGGER J-Link (unknown))                                          
Interface: SWD

Connecting the target device
Speed: 1,500,000 Hz
Connected to R7FA8M1AHECBD

Reading data from the device
  [Config Area 2]      0300A270 - 0300A273
                                                                                
Address  | 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
---------+------------------------------------------------
0300A270 | 00 00 00 FF 

Disconnecting the tool

Operation successful
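
To make the read-back above easier to interpret, here is a minimal C sketch (not part of Zephyr or rfp-cli) that counts how many bits in a dumped buffer are no longer in the erased state. The function name `count_cleared_bits` is hypothetical, and the sketch assumes, as stated above for this area, that the erased value is 0xFF.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the bits in a read-back buffer that are 0.
 * Assumption: an untouched (erased) byte in this area reads as 0xFF.
 */
size_t count_cleared_bits(const uint8_t *buf, size_t len)
{
    size_t cleared = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t zeros = (uint8_t)~buf[i]; /* 1 wherever the flash bit is 0 */

        while (zeros != 0) {
            cleared += zeros & 1u;
            zeros >>= 1;
        }
    }
    return cleared;
}
```

For the four bytes in the dump above (00 00 00 FF) this returns 24, i.e. three full bytes have been cleared.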

The Zephyr ELF is correct:

eu-readelf -x .option_setting_s /home/prak/DEV/ra8-head/zephyr/build/zephyr/zephyr.elf
Hex dump of section [22] '.option_setting_s', 208 bytes at offset 0x6290:
  0x00000000 fffffffd ffffffff ffffffff ffffffff ................
  0x00000010 ffffffff ffffffff ffffffff ffffffff ................
  0x00000020 ffffffff ffffffff ffffffff ffffffff ................
  0x00000030 ffffffff ffffffff ffffffff ffffffff ................
  0x00000040 ffffffff ffffffff ffffffff ffffffff ................
  0x00000050 ffffffff ffffffff ffffffff ffffffff ................
  0x00000060 ffffffff ffffffff ffffffff ffffffff ................
  0x00000070 ffffffff ffffffff ffffffff ffffffff ................
  0x00000080 00000000 ffffffff ffffffff ffffffff ................
  0x00000090 ffffffff ffffffff ffffffff ffffffff ................
  0x000000a0 ffffffff ffffffff ffffffff ffffffff ................
  0x000000b0 ffffffff ffffffff ffffffff ffffffff ................
  0x000000c0 ffffffff ffffffff ffffffff ffffffff ................

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: Zephyr SDK
  • SHA: f786ecc

Additional context

The same issue was observed earlier with mck_ra8t1, and exactly the same value was written there.
Please see issue #83010, where I described hitting this exact problem for the first time.
I can't tell whether this issue is specific to Zephyr, since I have too little exposure to this MCU family.
However, it has happened to me twice with Zephyr so far, and in my eyes the consequences are pretty severe.

I asked for help on the Renesas forum a few days ago, before I knew a little more, but the post is still pending moderation. So for now I have no input from the board manufacturer.

@piotrrak piotrrak added the bug label Dec 21, 2024
@piotrrak
Author

I can't confirm that it was a binary that caused it; the issue becomes observable the next time I flash, with a different binary.
This part of the hex file also looks fine to me. I don't know the format, but it looks pretty simple and makes sense.

Relevant part of zephyr.hex:

:020000040300F7
:10A10000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5F
:04A11000FFFFFFFF4F
:04A13400FFFFFFFF2B
:10A20000FFFFFFFDFFFFFFFFFFFFFFFFFFFFFFFF60
:10A21000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF4E
:10A22000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF3E
:10A23000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF2E
:10A24000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF1E
:10A25000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0E
:10A26000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE
:10A27000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEE
:10A2800000000000FFFFFFFFFFFFFFFFFFFFFFFFDA
:10A29000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFCE
:10A2A000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBE
:10A2B000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAE
:10A2C000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF9E
:040000050200119D47
:00000001FF
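
For readers who also don't know the format: these are Intel HEX records of the form `:LLAAAATT<data>CC` (byte count, 16-bit address, record type, data, checksum). A type 04 record supplies the upper 16 bits of the address for the data records that follow, and all bytes of a record, checksum included, sum to 0 modulo 256. The sketch below decodes one record and checks it. The function names are hypothetical, and it assumes the line carries no trailing newline.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Decode one Intel HEX record into raw bytes: LL, AA, AA, TT, data..., CC.
 * Returns the number of decoded bytes, or -1 if the line is malformed,
 * does not fit into out, or fails the checksum.
 */
int ihex_decode(const char *line, uint8_t *out, size_t out_size)
{
    size_t len = strlen(line);
    uint8_t sum = 0;

    if (len < 11 || line[0] != ':' || (len - 1) % 2 != 0) return -1;

    size_t nbytes = (len - 1) / 2;
    if (nbytes > out_size) return -1;

    for (size_t i = 0; i < nbytes; i++) {
        int hi = hex_nibble(line[1 + 2 * i]);
        int lo = hex_nibble(line[2 + 2 * i]);

        if (hi < 0 || lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
        sum = (uint8_t)(sum + out[i]);
    }

    /* LL counts only the data bytes; 5 more bytes frame every record. */
    if (nbytes != (size_t)out[0] + 5) return -1;

    /* All bytes, including the checksum, must sum to 0 modulo 256. */
    return (sum == 0) ? (int)nbytes : -1;
}

/* Absolute address of a decoded record, given the upper 16 address bits
 * taken from the most recent type 04 record. */
uint32_t ihex_abs_addr(uint16_t upper, const uint8_t *rec)
{
    return ((uint32_t)upper << 16) | ((uint32_t)rec[1] << 8) | rec[2];
}
```

Applied to the excerpt above, the first record (`:020000040300F7`) sets the upper address bits to 0x0300, so the `:10A27000...EE` record covers 0x0300A270 to 0x0300A27F and contains only 0xFF bytes.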

@duynguyenxa
Member

Hi @piotrrak, have you flashed these boards with e2studio before?

@piotrrak
Author

@duynguyenxa
That is true for the ra8m1 boards (I have two of them and tested them with e2studio around Feb 2024, once I got them).
One of them is affected; I don't dare use the second one now (I'm keeping it as a spare).
I sometimes used them with e2studio for AI inference with MVE, before I got access to a Cortex-M55.
I've also used them with Zephyr and custom patches (e.g. my own clock driver), without hal_renesas, since it was not yet upstreamed.

I am a bit worried about using them now, because if this reproduced on, for example, PBPS_SEC at 0x0300_A260, the boards would be pretty much useless to me, at least as I understand it right now.

For the mck-ra8t1, the answer is no.
I used it only to verify #83010, and I will need it for a 6-phase motor control algorithm development project.
However, after it happened on the ra8t1, I tried a minimal e2studio template on it and tried to use that to restore the flash, without success.
I did that while trying to figure out what was really going on (why I can't flash it anymore) and whether it happens only with Zephyr binaries.

I am getting a spare mck-ra8t1 (just the board); one became available on DigiKey and is already being shipped. With a bit of luck I should have it soonish.

Would you mind telling me why you ask?

@duynguyenxa
Member

@piotrrak, could you try using the "Renesas Device Partition Manager" on the EKRA8M1 to reset the board to its factory state, and then flash the Zephyr image?

When e2studio is used for flashing, it automatically calculates the memory boundaries of the image and passes them to the e2-server-gdb program, which then applies those settings to the device's TZ IDAU. This prevents flashing any other image whose boundaries fall outside those of the image previously flashed with e2studio. For reference, see: https://community.renesas.com/mcu-mpu/ra/f/forum/18083/tz-idau-memory-boundaries-calculation/59655

@piotrrak
Author

@duynguyenxa I can't do that today, but I'll try it tomorrow.
I attempted just that with the RA8T1 before, without success, but maybe I messed something up.
Anyway, I will read up on it and check.

Please do note that after it happens, command:

rfp-cli -d RA -tool jlink -if swd -write32 0300A270 ffffffff

i.e. a flash write to this location, won't revert bits that are already 0 back to 1.
From my limited observations, this flash location seems to act as if it consisted of fuse bits.
After issuing this command, the value at 0300A270 does not change for me.

In C, I would model the behavior I observed as:

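#include <stdint.h>

/* Model of the observed behavior: a write can only clear bits, never set them. */
void write_0300_A270(uint32_t new_value32)
{
  volatile uint32_t *word = (volatile uint32_t *)0x0300A270;
  uint32_t prev_flash_value = *word;
  *word = prev_flash_value & new_value32;
}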
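
Taking that model at face value, it also explains why verification can never pass again once bits have been cleared. The sketch below is a host-side illustration only, with hypothetical function names; it assumes the AND behavior described above and works on byte buffers rather than touching the device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed model: programming without an erase can only clear bits (1 -> 0). */
uint8_t program_byte(uint8_t current, uint8_t wanted)
{
    return current & wanted;
}

/*
 * Returns true if programming image over flash would read back as image,
 * i.e. no byte of the image needs a 0 -> 1 transition.
 */
bool verify_would_pass(const uint8_t *flash, const uint8_t *image, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (program_byte(flash[i], image[i]) != image[i]) {
            return false;
        }
    }
    return true;
}
```

With the read-back bytes 00 00 00 FF as the current contents and FF FF FF FF as the image, `verify_would_pass` returns false, which matches the verification failure I see. With untouched contents (all 0xFF) it returns true.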

Also, by mistake I bought an RA6T2 instead of an RA8T1 in my pre-Xmas euphoria (I was happy that one had become available).

@fabiobaltieri fabiobaltieri added the priority: low label Jan 7, 2025
@fabiobaltieri
Member

@KhiemNguyenT @soburi (RA platform) could you take a look?

@piotrrak
Author

piotrrak commented Jan 7, 2025

@fabiobaltieri Hi, this one is puzzling. While searching for an answer, I checked the /dev/kmsg system logs from that period for USB errors. There was nothing out of the ordinary there. I'll set up this machine with a CONFIG_DYNAMIC_DEBUG-enabled kernel for extra traces anyway, in the expectation that it will occur again.


github-actions bot commented Mar 9, 2025

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label Mar 9, 2025
@github-actions github-actions bot closed this as not planned Mar 23, 2025
@soburi
Member

soburi commented Mar 23, 2025

@KhiemNguyenT

I don't have the RA8 evaluation environment, but can you give any insight?
