S7G2 Custom board : Ethernet issue

Hi all,

 

We're using an S7G2 on a custom board.

Ethernet works for some time (for example, 5 or 10 minutes, or sometimes a few hours) and then stops working.

We investigated further in the Ethernet driver and found that, after Ethernet has been working for some time, packet_type in nx_receive gets a wrong value. This continues, and the driver never returns to a working condition. On further debugging, we found that the data buffer received in nx_receive is shifted by 4 bytes (32 bits).
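For anyone trying to catch the same symptom, a shift like this can be spotted by looking for a known EtherType at the expected offset (bytes 12 and 13 of the frame) and, failing that, a few bytes later. This is only a debugging sketch: `find_frame_shift` is a hypothetical helper, not part of NetX or the SSP driver, and it can give false positives if MAC address bytes happen to match an EtherType.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debug helper (not an SSP/NetX API).
 * Scans for a plausible EtherType (IPv4, ARP, IPv6) at offset 12 of the
 * frame, trying shifts 0..max_shift. Returns the first shift at which a
 * known EtherType is found, or -1 if none is found. */
static int find_frame_shift(const uint8_t *buf, size_t len, size_t max_shift)
{
    for (size_t shift = 0; shift <= max_shift; shift++) {
        if (len < shift + 14U) {
            break; /* not enough bytes left for an Ethernet header */
        }
        uint16_t type = (uint16_t)((buf[shift + 12U] << 8) | buf[shift + 13U]);
        if (type == 0x0800U || type == 0x0806U || type == 0x86DDU) {
            return (int)shift;
        }
    }
    return -1;
}
```

A result of 0 means the header is where it should be; a non-zero result (4 in our case) means the frame starts that many bytes late in the buffer.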

Has anyone faced this (or a similar) issue on the S7G2?

On the forum I found a similar issue posted some time ago, where increasing the pool size to 1568 was reported to solve it, but that change does not solve our issue.

I've also seen somewhere that there is a known issue like this on some S5D5 silicon. It sounds similar. Does anyone have any idea?

 

This is a long-standing issue in our project. Please let me know if anyone from Renesas knows about this, and whether there is (at least) a workaround I can implement.

  • Any ideas on this?
  • In reply to gnk:

    Hi GNK-
    What version of SSP are you using? What NetX Frameworks?

    Thanx,
    Warren
  • In reply to WarrenM:

    Hi Warren,

     

    SSP version 1.4.0
    We're using the sf_el_nx framework for Ethernet.

  • In reply to gnk:

    Hi GNK-
    Can you run your program on a Synergy Kit? That would help us better understand what is going on.

    One thought: Have you checked the drive level of the Ethernet signals to your PHY? I believe they are set for high output drive on our kits, and that seems to make a difference.

    Warren
  • In reply to WarrenM:

    Hi Warren,

    1. It works on the dev kit and the starter kit.
    2. Drive levels are set to high.
  • In reply to gnk:

    Hi GNK-
    It looks like there is something going on with your board if the program works on the Synergy Kits. You might compare the layout of our kits to your board to see if there are any issues that might impact signal integrity. I think your best approach is to run a test setup with Wireshark or something similar tracking activity on the network, and then review what was captured prior to your error. Since we can't duplicate your setup, it is difficult for us to suggest anything other than normal debugging approaches...

    Warren
  • In reply to WarrenM:

    Hi GNK, Warren,

    I am also facing a similar issue. Ethernet works fine for several hours on a custom S7 board, and then it stops working. After debugging, I noticed that I am getting an 8-byte shift. The received packet is prefixed with eight zeros. Have you found any solution for this issue?

    Thanks,
    Meenanath
  • In reply to Meenanath:

    Hi Meenanath-
    Do you have a Wireshark log that shows where the issue started? Any indication that the PHY is having trouble? How about buffer and stack sizes? Have you checked to see if they are staying within the expected bounds and not overflowing or being incorrectly updated (by a runaway pointer)? Can you check the design on a Renesas board to see if it works OK?

    Since we don't have your board, all I can really suggest are some generally helpful debugging techniques- some of which you may already have used...

  • In reply to WarrenM:

    Warren,

    There is a flag, g_hw_padding_enable, that is disabled for S5D5. Not sure if that is related but it would certainly affect shifts in the packet buffer processing:

    /* Test for S5D5 Mask Rev. 2 to disable padding */
    if ((product_info_ptr->product_name[4] == (uint8_t) '5') &&
        (product_info_ptr->product_name[5] == (uint8_t) 'D') &&
        (product_info_ptr->product_name[6] == (uint8_t) '5') &&
        ((uint8_t)(product_info_ptr->mask_revision) == 2U))
    {
        g_hw_padding_enable = false;
    }
    else
    {
        g_hw_padding_enable = true;
    }

    If it is set to false, the padding is done in nx_receive. Odd that one person would see 4 bytes shifted, and another sees 8 bytes.

    The problem is, if that is not right, no packets would be received. I see this thread started in March 2019 but I seem to remember someone else running into this byte shift problem in another thread. I'll dig around and see if I can find it.

    Janet
  • In reply to JanetC:

    Hi Warren, Janet,

    Thanks for your replies.

    Warren - I made sure there is no stack overflow and no packet or buffer descriptor corruption when this problem occurs. I still receive the packets, but with eight leading zeros.

    I have not been able to reproduce the issue on the DK-S7 yet. The applications running on the two boards are different, though. The custom board runs our full application with Ethernet, while the DK-S7 runs only the Ethernet part of it.

    There might be some issue with our layout, as you suggested above. The hardware engineer is looking into that. As the product is intrinsically safe, we have to add a few diodes, which add capacitance on the incoming lines from the Ethernet connector. That might be affecting the communication. We have now reduced the speed from 100 Mbps to 10 Mbps, as suggested by the hardware engineer and people on the forum. It has been working for a couple of hours now. I will keep testing it for a few days.

    I still don't understand one thing: communication works fine at 100 Mbps for several hours on our hardware, sometimes for 2 to 3 days. So what causes the data shift after such a long time of working, and why does it never recover?

    Thanks,
    Meenanath
  • In reply to Meenanath:

    Hi Warren, Janet,

    There were a couple of hardware issues in our design.
    1. There was a voltage drop at the KSZ8081RNA power supply due to a current-limiting fuse. We measured the voltage at the IC as 2.8 volts, which is outside the recommended range. We fixed this by removing the fuse.
    2. We had a few resistors in the path between the KSZ8081RNA and the S7G2 for IS (intrinsic safety) reasons. The waveform did not look good, so we removed the resistors, and now the waveform looks good.

    After fixing the above hardware issues, we have not seen the eight-leading-zeros issue so far.

    However, we are facing another issue with Ethernet reception. When we keep the system running for 2 to 3 days connected to the company network, Ethernet reception stops working, and when we debugged it we saw that the EDRRR.RR bit is cleared and is not set again. Usually we run the test over the weekend without any communication with the system over Ethernet. As the system is connected to the company network, it keeps receiving company network traffic. None of that goes to our application, so it is either ignored or answered by the NetX stack. On Monday morning, most of the time the system does not respond to any messages, and debugging shows that the RR bit is 0.

    We saw that the nx_rx_interrupt() function contains code to set the RR bit to 1 if it is cleared, but it only runs when an interrupt occurs. Our application does not transmit anything by itself and only replies to received queries, so the interrupt does not occur.

    /** If reception is in suspended state, resume it. */
    if (nx_rec_ptr->edmac_ptr->EDRRR_b.RR == 0U)
    {
        /** Resume reception. */
        nx_rec_ptr->edmac_ptr->EDRRR_b.RR = 1U;
    }
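Because that check only runs from the interrupt handler, one possible workaround is to run the same check from a periodic thread or timer. The following is a minimal sketch under stated assumptions: `edmac_regs_t` is a hypothetical stand-in for the real EDMAC register block (on target, the SSP register definitions would be used instead), and `edmac_resume_rx_if_suspended` is not an SSP function. It would only mask the symptom, not explain why RR was cleared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the EDMAC register block, for illustration only. */
typedef struct {
    volatile uint32_t EDRRR;   /* bit 0 = RR (receive request) */
} edmac_regs_t;

#define EDRRR_RR_MASK (1UL << 0)

/* Intended to be called periodically (for example from a low-priority thread).
 * Returns true if reception was found suspended and a resume was requested. */
static bool edmac_resume_rx_if_suspended(edmac_regs_t *edmac)
{
    if ((edmac->EDRRR & EDRRR_RR_MASK) == 0U) {
        edmac->EDRRR = EDRRR_RR_MASK;   /* request reception again */
        return true;
    }
    return false;
}
```

Counting how often the function returns true would also give a rough measure of how frequently reception is being suspended in the field.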

    Can you please tell me what could cause the EDRRR.RR bit to be cleared at run time?

    Thanks,
    Meenanath
  • In reply to Meenanath:

    Hi Warren, Janet,

    A few more things that I observed while debugging this issue:
    1. The RMFCR register value was 0x0000FFFF.
    2. The EESR.RFCOF flag was set.

    The datasheet says that when the RFCOF flag is set, reception stops. From the above register values, it looks like we are losing frames. Also, the nx_renesas_synergy.c file does not handle RMFCR overflow or the RFCOF flag.

    I am not sure why we are losing frames, though.

    When we manually cleared the RMFCR register, cleared the RFCOF flag, and set the EDRRR.RR flag, Ethernet started working again, although with the Rx buffer descriptors out of sync.
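The register state described above can be classified with a small diagnostic routine, which may help when logging the condition from a monitoring thread. This is a sketch only: `rx_diagnose` and the `RX_DIAG_*` names are hypothetical, and the bit positions (RR as bit 0 of EDRRR, RFCOF as bit 24 of EESR, a 16-bit missed-frame count in RMFCR) are assumptions that should be checked against the S7G2 hardware manual.

```c
#include <stdint.h>

/* Assumed bit positions; verify against the S7G2 hardware manual. */
#define EDRRR_RR_MASK    (1UL << 0)    /* receive request */
#define EESR_RFCOF_MASK  (1UL << 24)   /* receive frame counter overflow */
#define RMFCR_MFC_MASK   0xFFFFUL      /* missed-frame counter field */

/* Hypothetical diagnostic flags, not part of the SSP driver. */
#define RX_DIAG_SUSPENDED      (1U << 0)  /* EDRRR.RR is 0 */
#define RX_DIAG_RFCOF          (1U << 1)  /* EESR.RFCOF is set */
#define RX_DIAG_MFC_SATURATED  (1U << 2)  /* RMFCR count is at 0xFFFF */

/* Takes raw register snapshots and returns a bitmask of RX_DIAG_* flags. */
static unsigned rx_diagnose(uint32_t eesr, uint32_t rmfcr, uint32_t edrrr)
{
    unsigned flags = 0U;

    if ((edrrr & EDRRR_RR_MASK) == 0U) {
        flags |= RX_DIAG_SUSPENDED;
    }
    if ((eesr & EESR_RFCOF_MASK) != 0U) {
        flags |= RX_DIAG_RFCOF;
    }
    if ((rmfcr & RMFCR_MFC_MASK) == RMFCR_MFC_MASK) {
        flags |= RX_DIAG_MFC_SATURATED;
    }
    return flags;
}
```

With the values observed here (RMFCR at 0x0000FFFF, RFCOF set, RR cleared), all three flags would be reported together.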

    Thanks,
    Meenanath