[ECOS] Re: FTP runs out of JFFS2 nodes and trashes the file system

Jürgen Lambrecht J.Lambrecht@televic.com
Tue Apr 8 08:41:00 GMT 2008


Andrew Lunn wrote:

>> In this test, after FTP'ing 189 files with 3706 kB in total, the file
>>system crashes, but in a "valid" way: indeed, as the counter proves, the
>>raw node pool is empty, *but how is this possible?*
>>    
>>
>
>Jffs2 keeps an in-memory copy of the FS meta information. Each write
>to the FS results in the creation of a node. Smaller writes result in
>more nodes/meta information and so greater RAM usage for the meta
>information. Hence you can get into the state of being out of RAM for
>meta information but still having space on the filesystem. So I would
>suggest writing bigger blocks: use fwrite, not write, and use setbuf to
>set the buffering to 4K or similar.
>  
>
I already use fwrite, but I'll try 'setbuf' (a sketch of what I have in 
mind follows below).
But Gary says 
(http://ecos.sourceware.org/ml/ecos-discuss/2008-04/msg00102.html) that 
jffs2 automatically buffers data to fill a node, and I read something 
along those lines on infradead.org yesterday.
I should add: we adapted the eCos TFTP server to buffer eight 512 B 
packets so that each write is 4 kB, and with TFTP there is no problem 
with running out of raw nodes.
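
Something like this is what I have in mind (a rough, untested sketch; 
I'm assuming the eCos libc provides the standard setvbuf(), which, 
unlike setbuf(), lets me pick the 4 kB size Andrew suggested; the 
helper name and buffer are just placeholders):

#include <stdio.h>

/* Rough sketch: give stdio a 4 kB buffer so that fwrite() only pushes
 * full 4 kB chunks down to JFFS2, instead of many small writes that
 * each cost an extra raw node. The buffer must stay valid for as long
 * as the FILE is open, hence the static. */
static char wbuf[4096];

static FILE *open_buffered(const char *path)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return NULL;

    /* _IOFBF = fully buffered: data is flushed only when the 4 kB
     * buffer is full, or on fflush()/fclose(). */
    if (setvbuf(fp, wbuf, _IOFBF, sizeof(wbuf)) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}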

But something more is wrong: I have 61 MB of flash and 16000 
(statically allocated) raw nodes available.
3706 kB / 16000 = 237.184 bytes per node (OK, a bit more, because I 
have the directories /ERR, /AUDIO/PRM and /AUDIO/JINGLE, and /ERR 
always contains a log file of a few kB).
Is it really possible that there are only approx. 300 B per node? To 
start from a clean flash, I replaced the flash chip on my test board 
with a new one, created the directories in the jffs2 partition and then 
started FTP'ing.

With Ethereal I see that a transfer always shows the same behaviour: 
the data arrives in packets of 512, 1448 and 88 bytes, and then in 
maximum-sized 1448 B packets until the complete file has been 
transferred. One strange thing: Ethereal marks every data packet with 
"TCP segment of a reassembled PDU". I don't understand why, because 
there is no IP fragmentation (the don't-fragment bit is set, and there 
are no fragments).
Of course, with TCP there is no concept of packets anymore on the 
receive side: I just do recv(socket, buf, ...) followed by 
fwrite(buf, ...).

I will now add debugging to my code to see the actual sizes passed to 
fwrite(); a simplified sketch of that receive path, with the debug 
print, follows below.
I will also check the mount - setting the jffs2 debug level to 2 - to 
see the scanning of all nodes.
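
For reference, the data path is essentially this (a simplified sketch, 
not my actual code; the function name, the 1448-byte buffer and the 
diag_printf() are only illustration, assuming the usual eCos BSD socket 
API from <network.h> and diag_printf() from <cyg/infra/diag.h>):

#include <stdio.h>
#include <network.h>           /* eCos BSD socket API: recv() */
#include <cyg/infra/diag.h>    /* diag_printf() */

/* Simplified sketch of the FTP data path: read whatever TCP delivers
 * and hand it straight to stdio. The diag_printf() is the debugging I
 * want to add, to see how big the chunks really are by the time they
 * reach fwrite(). */
static int store_file(int sock, FILE *fp)
{
    char buf[1448];            /* roughly one full-sized TCP segment */
    int  n;

    while ((n = recv(sock, buf, sizeof(buf), 0)) > 0) {
        diag_printf("recv %d bytes -> fwrite\n", n);
        if (fwrite(buf, 1, n, fp) != (size_t) n)
            return -1;         /* write error (e.g. out of raw nodes) */
    }
    return (n == 0) ? 0 : -1;  /* 0 = orderly close, <0 = recv error */
}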

>>Therefore my question about FTP in
>>http://ecos.sourceware.org/ml/ecos-discuss/2008-04/msg00110.html.
>> I have to unmount/mount or reboot to be able to delete files again. If
>>I then don't delete but add files instead, it fails with the same error
>>after a few files. When I repeat this cycle of reboot and adding files a
>>few times (depending on the previous state of the file system), *jffs2
>>"crashes"*: I can no longer delete or add any file - listing
>>directories still works, and the application still runs.
>>I have to format the flash to solve the problem.
>>    
>>
>
>There are a couple of possibilities here. 
>
>umount/mount causes it to rebuild its in-RAM meta information, which
>allows it to remove old nodes that are no longer needed.
>
>It also gets a chance to garbage-collect the filesystem, freeing up
>some meta information and blocks on the FS.
>  
>
Indeed, that's the case.

>Note that JFFS2 has unusual behaviour when full. It needs 4 to 5 free
>erase blocks in order to do garbage collection. So things like df will
>say there is free space, but the filesystem may refuse to write
>because garbage collection does not have enough blocks to be able to
>run.
>
>  
>
That should not be the case at all: I still have 57 MB free! But I'll 
check it.
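
A quick sanity check, assuming an erase block size of, say, 128 kB (an 
assumption - I still have to confirm the exact size for my flash part):

  5 blocks x 128 kB = 640 kB reserved for garbage collection,

which is far below the 57 MB that is still free, so garbage-collection 
headroom alone should not be what stops the writes.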

>> But I read this in the documentation: "However the library may not be
>>interrupt safe. An interrupt must not cause execution of code that is
>>resident in FLASH." If I understand it correctly, *this means the
>>library is thread safe on the condition that all code is always
>>executed from RAM?* This is OK in my case.
>>    
>>
>
>Just watch out for redboot and the virtual vectors. If you are using a
>ROM redboot, diag_printf can jump into the ROM redboot. Similarly, an
>exception, or using the debugger, could cause the gdb stub in the ROM
>redboot to get called.
>
>        Andrew
>
>  
>
As far as I know that is OK: my ROM redboot jumps to the application 
ROMRAM binary, and the ROMRAM eCos configuration sets 
CYGSEM_HAL_VIRTUAL_VECTOR_INIT_WHOLE_TABLE to 1 (I checked). So the 
diag_printf entry in the VVT is overwritten by the RAM version, I guess?
And I don't use gdb stubs; they are even disabled to save code size.
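
One cheap way to make sure that setting never silently disappears would 
be a compile-time check in the application (just an idea; it assumes 
the option ends up in <pkgconf/hal.h>, where the hal/common CDL options 
normally land):

#include <pkgconf/hal.h>

/* Fail the build if the whole virtual vector table is not re-initialised
 * by the ROMRAM application, in which case diag_printf could still end
 * up calling into the ROM redboot. */
#ifndef CYGSEM_HAL_VIRTUAL_VECTOR_INIT_WHOLE_TABLE
# error "VVT not fully claimed: diag_printf may still go via ROM redboot"
#endif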

Kind regards,
Jürgen

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


