HAL Flash API: why write with uint32, read with uint8?

I'm trying to use the Flash module to store string data (ASCII characters) in the data flash (DF) and am running into problems.

Code:
I've set up a char array "ascwrite" which is then filled with 0-9 then A, B, C, etc.  In other words, "ascwrite" is filled with 30h, 31h, 32h, etc.

Next I took the Flash write demo project (see R11AN0087EU0101) and modified it: instead of writing 0xAAAAAAAA into every byte as the demo does, I want to write "ascwrite" into the DF.  I tried two ways of filling the "write" array:

First I tried--

for (j = 0; j < 64; j++) {
    outstring = (uint32_t)(ascwrite[j]);
    write[j] = outstring;
}

where outstring and the 'write' array are both uint32_t.

Upon writing and then reading the data back out I get:
30 0 0 0 31 0 0 0
32 0 0 0 33 0 0 0
34 0 0 0 35 0 0 0
36 0 0 0 37 0 0 0
38 0 0 0 39 0 0 0
41 0 0 0 42 0 0 0
43 0 0 0 44 0 0 0
45 0 0 0 46 0 0 0

which looks like each 32-bit write is putting my data byte in the first position and then putting \0 in the next 3 bytes.

To test this I modified the filling of the 'write' array to repeat the byte so each uint32 element in the array had the byte bit-shifted into all 4 bytes: 

for (j = 0; j < 64; j++) {
    outstring = (uint32_t)(ascwrite[j]) << 24 |
                (uint32_t)(ascwrite[j]) << 16 |
                (uint32_t)(ascwrite[j]) << 8  |
                (uint32_t)(ascwrite[j]);
    write[j] = outstring;
}

Now reading out the data shows:
30 30 30 30 31 31 31 31
32 32 32 32 33 33 33 33
34 34 34 34 35 35 35 35
36 36 36 36 37 37 37 37
38 38 38 38 39 39 39 39
41 41 41 41 42 42 42 42
43 43 43 43 44 44 44 44
45 45 45 45 46 46 46 46   

Which is semi-expected, but it raises the question: do I need to write 4 bytes at a time, filling each uint32_t with sequential data (four array elements)?  This makes no sense.

Moreover, it makes one wonder: other than some internal configuration issue, why do we write 32 bits but only read 8?  Which of the 32 write bits are important (what do the other 24 bits do, or where do they go)?  If this is an endian issue, it is not clear how.

This seems much more complex than it should be, so I'm assuming I'm doing something wrong.  Are there any examples that show a more detailed write process, with something other than a fixed byte?