File creation time grows with the number of files in a folder

I have created a folder in the NAND flash memory, and the folder contains multiple files.
The time taken to create a new file in the folder is directly proportional to the total number of files already in that folder. Reading or searching for a file in the folder, by contrast, takes hardly any time.

The same holds if I create a new folder (with the previous folder still existing): the time taken to create a file again grows as the total number of files in that folder increases. That is, creating a file in an empty folder takes no noticeable time, but as more files accumulate, each new file takes longer to create.
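
For reference, a minimal sketch of how this growth could be measured. The Keil File System library retargets the standard C fopen/fclose, so timing the fopen call is enough; the drive prefix "N0:", the folder name, and the millisecond tick source ticks_ms() are assumptions for illustration, and printf is assumed to be retargeted (e.g. to the UART):

#include <stdio.h>
#include <stdint.h>

/* Hypothetical millisecond tick source, e.g. driven by SysTick. */
extern uint32_t ticks_ms (void);

/* Create 'count' files in 'dir' (e.g. "N0:\\LOG") and report how long
   each creation takes. Per the observation above, the reported time is
   expected to grow roughly linearly with the number of files already
   present in the directory. */
void measure_create_times (const char *dir, int count) {
  char path[64];
  for (int i = 0; i < count; i++) {
    snprintf (path, sizeof (path), "%s\\F%04d.TXT", dir, i);
    uint32_t t0 = ticks_ms ();
    FILE *f = fopen (path, "w");      /* file creation happens here */
    uint32_t t1 = ticks_ms ();
    if (f != NULL) {
      fclose (f);
    }
    printf ("file %4d: create took %u ms\r\n", i, (unsigned)(t1 - t0));
  }
}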

IMO, ideally this should not happen.

An even stranger observation:
I have a folder with some files in it, say 100. I delete a few files at random [say, the 10th file created (filename: X10), the 20th (XP20), the 25th (XYZ25), the 35th (X35), the 49th (P49), etc.] and then continue to create new files [ab1, Ab3, ab2, ab5]. The newly created file 'ab1' appears in place of 'X10', and 'Ab3' in place of 'XP20'. In other words, the new files take the positions of the files that were deleted (my observation comes from listing the files; the file-listing code simply reads the file names and sends them over UART). IMO, the files should be created after the last created file. No sorting by date, alphabetical order, or order of creation is needed.

Appreciate your patience & time.

PS: LPC1788, 120 MHz core clock, using the Keil File System library.

Parents
  • also, as I said, I have implemented a linked list
    You're comparing the performance of operating on reduced, ordered data in internal RAM to that of operating on expansive, unordered data in external flash. Is it really so hard to understand that you get wildly different results?

    Doesn't date-wise sorting achieve the same result as order of creation? (Considering that the directory list was written sequentially, not overwriting the deleted entries.)
    Any kind of sorting would make your initial problem orders of magnitude worse. In the interest of performance, the only sensible write ordering for file system entries on the medium is no ordering at all, i.e. using the first empty slot found (a sketch of this strategy follows this reply). Any other method would mean that each newly created file might force the system to write a new copy of the entire directory.

    Be careful what you wish for! Odds are you'll like it a lot less than you thought you would.
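
    To make the first-free-slot behaviour concrete, here is a toy model of a FAT-style directory table. This is a sketch, not the Keil library's actual implementation; the slot count and name length are made up. It shows both effects at once: a create scans from slot 0, so its cost grows with the number of occupied entries, and a new file lands in the slot of a previously deleted one, which is why 'ab1' shows up where 'X10' used to be.

    #include <string.h>

    #define DIR_SLOTS  512
    #define NAME_LEN   13

    /* Toy FAT-style directory entry: a slot is free if name[0] is 0x00
       (never used) or 0xE5 (deleted), as in real FAT short entries. */
    typedef struct {
      char name[NAME_LEN];
    } dir_entry_t;

    static dir_entry_t dir[DIR_SLOTS];

    /* Create: scan from slot 0 and take the FIRST free slot. The scan is
       what makes creation slower as the directory fills, and the reuse of
       deleted slots is what changes the listing order. */
    int dir_create (const char *name) {
      for (int i = 0; i < DIR_SLOTS; i++) {
        unsigned char tag = (unsigned char)dir[i].name[0];
        if (tag == 0x00 || tag == 0xE5) {
          strncpy (dir[i].name, name, NAME_LEN - 1);
          dir[i].name[NAME_LEN - 1] = '\0';
          return i;                     /* slot index = position in listing */
        }
      }
      return -1;                        /* directory full */
    }

    /* Delete: mark the slot in place. Nothing else moves, so the slot
       becomes a hole that the very next create will fill. */
    void dir_delete (int slot) {
      dir[slot].name[0] = (char)0xE5;
    }

    Note how a delete touches a single byte of a single entry, and a create never rewrites the rest of the table: that locality is the performance argument for not keeping the directory sorted.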

Children
  • You could always implement your file system code to cache, or maintain state information about, what's going on at various levels of the media, in structures that are a lot more packed and concise.

    This will tend to eat a lot of memory, be more complicated to pull off, and might not pay much in the way of dividends if the use case isn't touching the same deep directory over and over, say when opening/creating files in multiple different directories, or when traversing structures sparsely distributed across the media. You'd also have to be very careful that the internal state information you rely on doesn't become stale, and that it is thread safe. (A toy sketch of such cached state follows at the end of this reply.)

    I'd also expect the structures on the media to be large blobs or arrays, whose only linkage to the next is immediate proximity, or a linking chain to the next large blob. How the software side maintains its own tables is a whole other matter.

    Generally speaking, you want structures on the media to be free-standing: if you change one pointer, it shouldn't require you to go off and fix half a dozen other nodes/sectors to maintain internal integrity. Contain changes to the smallest possible area.

    Flash media is certainly a case where journalling might be used, in which case finding/opening a file may require you to scan quite deeply to find the last or most current version, or to append a new instance at the end. (The second sketch below illustrates this.)

    As Hans suggests, if you focus on one particular use case or optimization, it's likely to degrade many others, so don't focus on human-centric concerns like alphabetical order, or how the data/names are laid out on the media.
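
    To make the caching idea concrete, here is a toy sketch of per-directory state one might keep in internal RAM. Everything here is hypothetical, not the Keil library's internals: a hint of the first known free directory slot, so a create can start its scan there instead of at slot 0, plus a generation counter to detect the staleness problem mentioned above.

    #include <stdint.h>

    /* Hypothetical per-directory cache kept in internal RAM. */
    typedef struct {
      uint32_t dir_id;          /* which directory this state describes       */
      int      free_slot_hint;  /* first slot believed free, -1 = unknown     */
      int      entry_count;     /* occupied entries, for quick size checks    */
      uint32_t generation;      /* bumped on every write to this directory;   */
                                /* a mismatch means the cached state is stale */
    } dir_cache_t;

    /* On delete: if the freed slot is below the current hint, lower the
       hint so the next create starts its scan there. Any write from
       another task must bump 'generation', otherwise the hint silently
       goes stale: exactly the caveat raised above. */
    static inline void dir_cache_on_delete (dir_cache_t *c, int slot) {
      if (c->free_slot_hint < 0 || slot < c->free_slot_hint) {
        c->free_slot_hint = slot;
      }
      c->entry_count--;
      c->generation++;
    }

    And a minimal sketch of the journalling point, again hypothetical rather than how the Keil library actually works: with a log-structured layout, the current version of a file is whichever record for that name appears last in the log, so opening a file can mean scanning the whole log, while creating or updating is always a cheap append at the end.

    #include <string.h>
    #include <stdint.h>

    typedef struct {
      char     name[13];        /* 8.3 name, as a toy example   */
      uint32_t data_block;      /* where this version's data is */
      uint8_t  valid;
    } journal_rec_t;

    /* Find the most current version of 'name': the LAST matching record
       wins, so the scan cannot stop at the first hit. This is why opening
       a file on a journalled medium can mean walking deep into the log. */
    int journal_find (const journal_rec_t *log, int used, const char *name) {
      int found = -1;
      for (int i = 0; i < used; i++) {
        if (log[i].valid && strcmp (log[i].name, name) == 0) {
          found = i;            /* keep going: a later record supersedes it */
        }
      }
      return found;
    }

    /* Create or update: always append a new record at the end of the log. */
    int journal_append (journal_rec_t *log, int *used, int cap,
                        const char *name, uint32_t data_block) {
      if (*used >= cap) return -1;
      journal_rec_t *r = &log[(*used)++];
      strncpy (r->name, name, sizeof (r->name) - 1);
      r->name[sizeof (r->name) - 1] = '\0';
      r->data_block = data_block;
      r->valid = 1;
      return *used - 1;
    }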