TB-S5D5 FileX on SD Card 4-bit mode not working - SSP 2.0.0

I'm trying to use a 4-bit bus width with FileX on block media, but closing the file (or flushing the media) takes forever.

I'm able to operate in 1-bit mode, using the "Embedded" media type instead of "SD Card" (not sure why that is needed), but I would like to speed this up, since 4-bit mode should be available to me.

I'm using a TB-S5D5 board with a PMOD adapter connected to the SDHI1 channel (SDHI0 won't initialise the media in 4-bit mode, although 1-bit mode is fine).

PMOD adapter -> https://digilent.com/shop/pmod-microsd-microsd-card-slot/

I'm not formatting the media, as that takes forever, and that's OK. I tried 4GB and 32GB microSD cards. In 1-bit mode both SDHI channels (0 and 1) work. In 4-bit mode (with DATA0 through DATA3 all connected), the media initialisation (fx_media_init0() in the code below) only succeeds on channel 1 (SDHI1), and the file close takes forever. If I step through the code writing only a few bytes it works, but it gets stuck if I write hundreds.
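To narrow down the write size at which 4-bit mode starts to hang, a size sweep along the following lines could be used. This is only a sketch under assumptions: `write_fn_t`, `largest_working_write`, `g_last_attempted_length` and `fake_write` are hypothetical names, and the callback would have to wrap fx_file_write() on an already opened file in the real project. Since the driver hangs rather than returning an error, the global records the last attempted length so it can be read after halting in the debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: writes `length` bytes and returns 0 on success.
 * In the real project this would wrap fx_file_write() on an open file. */
typedef int (*write_fn_t)(void *ctx, const uint8_t *data, size_t length);

/* Last length handed to the callback. If the driver hangs instead of
 * returning, halting in the debugger shows the size that triggered it. */
volatile size_t g_last_attempted_length = 0;

/* Doubles the write length from `start` up to `limit` (both in bytes).
 * Returns the largest length that succeeded, or 0 if none did. */
size_t largest_working_write(write_fn_t write_fn, void *ctx,
                             const uint8_t *data, size_t start, size_t limit)
{
    size_t largest_ok = 0;
    size_t length = start;

    while ((length != 0) && (length <= limit))
    {
        g_last_attempted_length = length;
        if (write_fn(ctx, data, length) != 0)
        {
            break;
        }
        largest_ok = length;

        if (length > (limit / 2))
        {
            break; /* the next doubling would exceed the limit */
        }
        length *= 2;
    }
    return largest_ok;
}

/* Stand-in writer so the helper can be tried on a PC: fails for any
 * length above the threshold that ctx points to. */
int fake_write(void *ctx, const uint8_t *data, size_t length)
{
    (void) data;
    return (length > *(const size_t *) ctx) ? 1 : 0;
}
```

With the stand-in writer and a threshold of 300 bytes, a sweep from 16 to 1024 bytes stops at the 512-byte attempt and reports 256 as the largest working size.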

The code below works fine in 1-bit mode on either SDHI0 or SDHI1 (roughly 140 KB/s when writing about 3 MB of data), but not in 4-bit mode: the file close takes forever and doesn't return any error.
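For context on the throughput figure: it is simply bytes written divided by elapsed time. The helper below is a hypothetical sketch of that arithmetic, not part of the project (the actual project code is listed further down, starting at `#define FILE_SIZE 336`). The name `throughput_kb_per_s` and the sample numbers are assumptions; in ThreadX the elapsed ticks could come from two tx_time_get() readings and the tick rate from TX_TIMER_TICKS_PER_SECOND.

```c
#include <stdint.h>

/* Average throughput in kB/s (1 kB = 1000 bytes).
 * elapsed_ticks: difference between two tick readings around the writes.
 * ticks_per_second: configured tick rate of the RTOS.
 * Returns 0 if no time elapsed or the tick rate is 0. */
uint32_t throughput_kb_per_s(uint32_t bytes, uint32_t elapsed_ticks,
                             uint32_t ticks_per_second)
{
    if ((elapsed_ticks == 0u) || (ticks_per_second == 0u))
    {
        return 0u;
    }

    /* 64-bit intermediate so bytes * ticks_per_second cannot overflow */
    uint64_t bytes_per_second =
        ((uint64_t) bytes * ticks_per_second) / elapsed_ticks;
    uint64_t kb_per_second = bytes_per_second / 1000u;

    /* Clamp to the return type's range */
    return (kb_per_second > UINT32_MAX) ? UINT32_MAX : (uint32_t) kb_per_second;
}
```

As an illustration only: 336 bytes written 10000 times is 3,360,000 bytes, and if that took 2400 ticks at an assumed 100 ticks per second (24 s), the helper returns 140.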

Card Detect and Write Protect are disabled in the FileX configuration and are not assigned in the pin configuration. The Card Detect wire from the PMOD is not connected to anything.

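#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include "sd_card_thread.h" /* SSP-generated thread header; file name assumed from the thread entry function */

#define FILE_SIZE 336

ULONG file_create_verify(FX_MEDIA* media_hdl, FX_FILE* file_hdl, char* filename, uint8_t* data, uint32_t length, bool create_verify);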
uint8_t test[FILE_SIZE] = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.\r\n";

/* SD Card Thread entry function */
void sd_card_thread_entry(void)
{
    UINT res;
    ULONG space_available;
    ULONG new_space_available;
    FX_FILE g_file;

    CHAR entry_name[11] = "0000";

    /* Initialise the media through the SSP-generated FileX setup */
    fx_media_init0();

    /* Free space before writing */
    res = fx_media_space_available(&g_fx_media0, &space_available);
    if (FX_SUCCESS != res)
    {
        __BKPT(0);
    }

    /* Create the file from the test data (the read-back verification is not implemented yet) */
    res = (UINT) file_create_verify(&g_fx_media0, &g_file, entry_name, test, (uint32_t) sizeof(test), true);
    if (FX_SUCCESS != res)
    {
        __BKPT(0);
    }

    /* Free space after writing */
    res = fx_media_space_available(&g_fx_media0, &new_space_available);
    if (FX_SUCCESS != res)
    {
        __BKPT(0);
    }

    /* Toggle the pin to show that the thread got this far */
    while (1)
    {
        g_ioport.p_api->pinWrite(IOPORT_PORT_01_PIN_06, IOPORT_LEVEL_HIGH);
        tx_thread_sleep (50);
        g_ioport.p_api->pinWrite(IOPORT_PORT_01_PIN_06, IOPORT_LEVEL_LOW);
        tx_thread_sleep (50);
    }
}

ULONG file_create_verify(FX_MEDIA* media_hdl, FX_FILE* file_hdl, char* filename, uint8_t* data, uint32_t length, bool create_verify)
{
    ULONG res = FX_SUCCESS;
    ULONG file_size; /* reserved for the read-back verification, unused for now */
    uint8_t read_buffer[FILE_SIZE];

    /* Clear the read buffer (reserved for the verification step) */
    memset(read_buffer, 0, sizeof(read_buffer));

    /* Create and write the file only if create_verify is true */
    if (create_verify)
    {
        /* Create the file; an already existing file is fine, it is truncated below */
        res = fx_file_create(media_hdl, filename);
        if ((FX_SUCCESS != res) && (FX_ALREADY_CREATED != res))
        {
            return res;
        }

        /* Open the file for writing */
        res = fx_file_open(media_hdl, file_hdl, filename, FX_OPEN_FOR_WRITE);
        if (FX_SUCCESS != res)
        {
            return res;
        }

        /* Discard any previous contents */
        res = fx_file_truncate(file_hdl, 0u);
        if (FX_SUCCESS != res)
        {
            return res;
        }

        /* Write the data block 10000 times (with length = 336 this is about 3.36 MB) */
        for (uint32_t i = 0; i < 10000; i++)
        {
            res = fx_file_write(file_hdl, data, length);
            if (FX_SUCCESS != res)
            {
                return res;
            }
        }

        /* Close the file (this is the call that takes forever in 4-bit mode) */
        res = fx_file_close(file_hdl);
        if (FX_SUCCESS != res)
        {
            return res;
        }

        /* Flush cached sectors to the media */
        res = fx_media_flush(media_hdl);
        if (FX_SUCCESS != res)
        {
            return res;
        }
    }

    return res;
}
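
Because the file close blocks without returning an error, the thread above cannot report the hang itself. One way to at least detect it from another thread is a progress heartbeat, sketched below under assumptions: `stall_monitor_t`, `stall_monitor_kick` and `stall_monitor_is_stalled` are hypothetical names, and the tick values would come from tx_time_get() in the real project. The worker records a tick before and after each FileX call, and a second thread checks how long ago the last record was made.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct
{
    volatile uint32_t last_progress_tick; /* updated by the worker thread */
    uint32_t timeout_ticks;               /* silence longer than this counts as a stall */
} stall_monitor_t;

/* Called by the worker right before and after each potentially blocking call. */
void stall_monitor_kick(stall_monitor_t *m, uint32_t now_ticks)
{
    m->last_progress_tick = now_ticks;
}

/* Called periodically from a different thread. The unsigned subtraction
 * keeps the comparison correct when the tick counter wraps around. */
bool stall_monitor_is_stalled(const stall_monitor_t *m, uint32_t now_ticks)
{
    return (uint32_t) (now_ticks - m->last_progress_tick) > m->timeout_ticks;
}
```

This does not unblock the stuck call; it only lets a monitoring thread flag that the SD card thread has stopped making progress, for example by setting a breakpoint or an LED.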