memcpy fails on S5D9

Hello,

I am running into a situation where memcpy does not copy correctly when the copy crosses the 0x20000000 RAM boundary on the S5D9. I did apply the workaround suggested in

http://renesasrulz.com/synergy/synergy_tech_notes/f/technical-bulletin-board-notification-postings/15610/gcc-memory-operations-memcpy-and-memmove-cause-incorrect-data-read-when-crossing-sram-boundaries but that does not seem to be enough.

I use newlib (not newlib-nano).

Is there anything else that needs to be applied, in addition to the compiler flag mentioned in the above technical bulletin?

My environment is e2studio 7.7.0, SSP 1.7.5, toolchain GCC ARM Embedded 7.2.1.20170904, board - PK-S5D9.

Attached is a sample project that demonstrates the problem: it uses memcpy to copy some Base64-encoded data across the 0x20000000 RAM boundary and then checks that the copy contains only valid Base64 characters.

Thank you,

Dimitar

MemAlignTest.zip

  • The SRAM in the S5D9 is located at 0x1FFE0000 to 0x2007FFFF.

     

    In your code :-

     

        const char* src = getData();
        uint32_t srcLen = strlen(src);
        char* p1 = (char*) 0x1FFFE528U;
        memcpy(p1, src, srcLen + 1);
        if (memcmp(p1, src, srcLen + 1) == 0)
        {
            char* p2 = (char*) 0x2007C9D0U;
            memcpy(p2, p1, srcLen + 1);

    In the second memcpy the destination, p2, is set to 0x2007C9D0U and the length is 26809 (0x68B9) bytes, so this memcpy runs past the end of the available SRAM on the S5D9 (0x2007FFFF).
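
The arithmetic above can be checked with a small helper. This is only a sketch: the names `range_in_sram`, `SRAM_START` and `SRAM_END` are made up for illustration, and the limits are simply the addresses quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* S5D9 SRAM limits as quoted in this thread (hypothetical macro names). */
#define SRAM_START 0x1FFE0000u
#define SRAM_END   0x2007FFFFu /* last valid byte */

/* Returns true if the range [addr, addr + len) lies entirely inside SRAM.
 * 64-bit arithmetic avoids wrap-around when addr + len exceeds 32 bits. */
static bool range_in_sram(uint32_t addr, uint32_t len)
{
    uint64_t end = (uint64_t)addr + len; /* one past the last byte */
    return addr >= SRAM_START && end <= (uint64_t)SRAM_END + 1u;
}

/* The two destinations from the test project, with the 26809-byte length. */
static void sram_range_examples(void)
{
    assert(range_in_sram(0x1FFFE528u, 26809u));  /* p1: fits */
    assert(!range_in_sram(0x2007C9D0u, 26809u)); /* p2: runs past 0x2007FFFF */
}
```

A check like this could be placed in front of test copies that use hard-coded addresses, so that an out-of-range pointer is caught before the copy runs.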

  • Hi,

    Yes, you are correct. The second pointer is wrong, and I failed to notice that it points outside the RAM region.
    I tried the sample project with correct pointers, but I cannot reproduce the problem with it.

    When it happens in the code I have, it is always the three bytes starting at 0x20000000 that are copied incorrectly. For example, if the three bytes are 0x32, 0x4A, 0x33, what is copied is 0x48, 0x56, 0x07. All bytes before and after these three are OK.

    Any ideas?

    Thank you,

    Dimitar
  • You are doing a memcpy of 26809 bytes, so by the details in the link :-

    renesasrulz.com/.../gcc-memory-operations-memcpy-and-memmove-cause-incorrect-data-read-when-crossing-sram-boundaries

    the library (newlib) version of memcpy will be used. Setting -mno-unaligned-access does not cause newlib to be rebuilt (the newlib library is only built when the toolchain is built), so it was probably built to allow unaligned accesses.

    If the target address of the first memcpy is set to an odd address, the call to the newlib library version of memcpy will generate unaligned accesses (the source address in your test project is a 4 byte aligned address, so the write to the destination has caused the unaligned access) :-

        SCB->CCR = SCB->CCR | SCB_CCR_UNALIGN_TRP_Msk; /* trap unaligned accesses */

        const char* src = getData();
        uint32_t srcLen = strlen(src);
        char* p1 = (char*) 0x1FFFE529U;
        memcpy(p1, src, srcLen + 1);

    This code will end up in the default handler, because an unaligned access has occurred, and been trapped.

    You could provide your own version of the function memcpy, so the version from newlib is not used (or do the copy another way).
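
As a rough illustration of "doing the copy another way", the sketch below copies one byte at a time, so it never issues a multi-byte access that could be unaligned or straddle the boundary. The name `memcpy_bytewise` is hypothetical and not part of newlib or the SSP. The `volatile` qualifiers are there on the assumption that they stop the compiler from merging the byte accesses or turning the loop back into a call to memcpy. It will be much slower than an optimised routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-wise copy: every access is a single byte, so no
 * access can be unaligned. Like memcpy, the regions must not overlap. */
static void *memcpy_bytewise(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    while (n-- > 0u)
    {
        *d++ = *s++;
    }
    return dst;
}

/* Copies into a destination that is deliberately offset by one byte from
 * the start of its buffer, then compares the result with the source. */
static void memcpy_bytewise_selftest(void)
{
    char src[] = "MkoxMjM0NTY3ODk=";
    char dst[sizeof src + 1];

    memcpy_bytewise(dst + 1, src, sizeof src);
    assert(memcmp(dst + 1, src, sizeof src) == 0);
}
```

Whether such a routine is fast enough depends on how hot the copy paths are; the optimised assembly routine built from source is the other route discussed in this thread.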

  • Hi,

    Thank you for the detailed explanation.
    In our application (our code plus third-party libraries, all built with e2studio), we have a lot of memcpy invocations. Most have a size that is not known at compile time, and addresses that may or may not be word aligned, which is also not known at compile time.
    Are there plans to make the toolchain-provided memcpy work in cases with unaligned access (for example, with a rebuilt newlib)?
    Or is implementing our own memcpy the only option?

    Thank you,

    Dimitar
  • The issue seems to be caused by the use of newlib, rather than newlib-nano. I am not aware of any plans to change the toolchain.
  • Unfortunately, switching to newlib-nano is not a solution for us. We found that memcpy is two to two and a half times slower with newlib-nano, so we cannot use it.
    It is not very encouraging that this is not going to be fixed in the toolchain.
  • Here is a simplified version of your test project that includes the optimised ARM memcpy() routine for the ARMv7-M architecture (Cortex-M3/M4) (src\memcpy-armv7m.S). Since this is built from source, it correctly does aligned accesses to memory.

    S5D9_PK_memcpy_newlib.zip

  • Thank you for the sample project with the newlib ARM optimized memcpy.
    This will resolve the issue where memcpy is called.

    But what about the compiler builtin memcpy? In that case the C library memcpy is not going to be called.
    Do you know if the compiler builtin function is affected by the RAM boundary issue?

    Also, what about memcmp, strcpy, strcmp, etc.? Are these functions also affected, and do they need to be rebuilt with unaligned access disabled?

  • Hello Dimitar,

    Have you tried relocating your buffers such that neither one of them crosses the SRAMHS-SRAM0 boundary? This sounds like the best solution as it should solve the problem for any toolchain and library version. Depending on how much memory you need and how it's used, it may be better to move it all to SRAMHS or SRAM0/1 but not both (as is the case now).

    Also, per Jeremy's recommendation, it is possible to supersede any library function with your own implementation, including the functions you listed.

    Regards
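
When relocating buffers, a check along the following lines could help confirm that a given buffer does not straddle the boundary. It is a sketch only: `crosses_sram_boundary` and `SRAMHS_SRAM0_BOUNDARY` are hypothetical names, and the boundary value is the 0x20000000 address discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SRAMHS-SRAM0 boundary discussed in this thread (hypothetical macro name). */
#define SRAMHS_SRAM0_BOUNDARY 0x20000000u

/* Returns true if [addr, addr + len) starts below the boundary and
 * ends above it, i.e. the buffer occupies both memory regions. */
static bool crosses_sram_boundary(uint32_t addr, uint32_t len)
{
    uint64_t end = (uint64_t)addr + len; /* one past the last byte */
    return addr < SRAMHS_SRAM0_BOUNDARY && end > SRAMHS_SRAM0_BOUNDARY;
}

static void crosses_sram_boundary_examples(void)
{
    assert(crosses_sram_boundary(0x1FFFE528u, 26809u)); /* p1 buffer from the test project */
    assert(!crosses_sram_boundary(0x1FFFFFF0u, 16u));   /* ends exactly at the boundary */
    assert(!crosses_sram_boundary(0x20000000u, 16u));   /* starts at the boundary */
}
```

On the target, the address argument would typically come from casting a buffer pointer to `uint32_t`, for example in a debug-only check at start-up.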
  • For the built-in version of memcpy() in GCC, you should specify -mno-unaligned-access for the project (gcc.gnu.org/.../ARM-Options.html). To stop the use of the built-in memcpy, specify -fno-builtin or -fno-builtin-memcpy (gcc.gnu.org/.../Other-Builtins.html).

    I am not aware of any issues with memcmp, strcpy, strcmp, etc.
  • We are exploring the option of partitioning RAM to avoid the RAM boundary situation in the first place. That seems to be the safest option for eliminating the issue.
    Thank you, Karol and Jeremy, for your help and input in this discussion.