I have created a folder in NAND flash memory, and the folder contains multiple files. The time taken to create a file in the folder is directly proportional to the total number of files in that folder, while reading or searching for a file in that folder takes hardly any time.
Also, if I create a new folder (while the previous folder still exists), the time taken to create a file again increases with the total number of files in that folder (i.e. the time to create a file in an empty folder is hardly noticeable, but as more files are created, the time taken per file increases).
IMO, this should not happen ideally.
An even worse observation: I have a few files created in a folder, let's say 100. I delete a few files at random [let's say the 10th file created (filename X10), the 20th (XP20), the 25th (XYZ25), the 35th (X35), the 49th (P49), etc.] and then continue to create new files [ab1, Ab3, ab2, ab5]. The newly created file 'ab1' appears in place of 'X10', and 'Ab3' in place of 'XP20'. So new files are created in the slots where files were last deleted (my observation comes from listing the files; the listing code just reads the file names and sends them over UART). IMO, the files should be created after the last created file. No sorting date-wise, alphabetically, or in order of creation.
Appreciate your patience & time.
PS: LPC1788, 120 MHz core clock, using the Keil file system library.
"IMO, this should not happen ideally." And what is this opinion of yours based on, other than wishful thinking?
How do you propose a resource-constrained algorithm might work that finds out whether a newly opened file actually has the same name as one of the files already existing in the target directory, so the system can overwrite that instead of creating a new file (or throw an error if that's not allowed), other than comparing it against every existing file name?
"IMO, the files should be created after the last created file."
Again, what is that opinion based on?
Actually, the file system does that, among other reasons, to avoid the very behaviour you complained about above: O(number of directory entries) file open/creation time. It does so by recycling discarded directory entries, keeping the total list of directory entries smaller. In other words, the two behaviours you wished for are in conflict with each other.
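As a rough illustration (the entry layout and function below are my own sketch, not the Keil library's internals), a FAT-style create has to scan the whole directory table anyway, both to reject duplicate names and to find the earliest reusable slot — which is exactly why deleted slots get recycled:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical in-RAM model of a FAT-style directory table.
 * Real entries live in flash sectors; names and flags are simplified here. */
#define DIR_SLOTS 8
#define NAME_LEN  13

typedef struct {
    char name[NAME_LEN];   /* "" = never used */
    int  deleted;          /* nonzero = slot freed by a delete */
} dir_entry_t;

/* Create a file: scan every entry to (a) reject duplicate names and
 * (b) remember the first reusable slot. Both needs force the O(n)
 * scan discussed in this thread. Returns the slot index, or <0 on error. */
int dir_create(dir_entry_t *dir, int slots, const char *name)
{
    int free_slot = -1;
    for (int i = 0; i < slots; ++i) {
        if (dir[i].deleted || dir[i].name[0] == '\0') {
            if (free_slot < 0)
                free_slot = i;              /* recycle the earliest hole */
        } else if (strcmp(dir[i].name, name) == 0) {
            return -1;                      /* name already exists */
        }
    }
    if (free_slot < 0)
        return -2;                          /* directory full */
    strncpy(dir[free_slot].name, name, NAME_LEN - 1);
    dir[free_slot].name[NAME_LEN - 1] = '\0';
    dir[free_slot].deleted = 0;
    return free_slot;
}
```

Note how a file created after a delete lands in the deleted file's slot, reproducing the listing order observed in the original post.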
"No sorting date-wise, alphabetically, or in order of creation."
Huh? Just one sentence earlier you expressed a wish that files be kept in order of creation; now you say you don't want that. That's the second time you've contradicted yourself in stating requirements in a single posting.
No implementation in the world can fulfill your wishes if those wishes directly contradict each other like that!
"...other than comparing it against every existing file name?" Searching for a file or reading it doesn't consume much time, though (just a fraction of a second); hence the question arose.
Also, as I said, I have implemented a linked list. At initialization, the linked list is built by reading file names from the file system. A comparison is also performed for the last opened/selected file (strcmp against each file name). The comparison doesn't consume much time: not more than a second, even with more than 250 files (or so), each with a file name of at least 15 characters. So even if a file system performs a character comparison before creating a new file, reading the directory list and comparing file names shouldn't consume much time.
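For reference, the in-RAM name cache described above might look roughly like this (the node layout and function names are my guess at the poster's approach, not anything from the Keil library):

```c
#include <stdlib.h>
#include <string.h>

/* One node per file name, built once at initialization by walking
 * the directory and copying each name into internal SRAM. */
typedef struct name_node {
    struct name_node *next;
    char name[32];
} name_node_t;

/* Prepend a name; returns the new list head (NULL on allocation failure). */
name_node_t *cache_add(name_node_t *head, const char *name)
{
    name_node_t *n = malloc(sizeof *n);
    if (!n)
        return NULL;
    strncpy(n->name, name, sizeof n->name - 1);
    n->name[sizeof n->name - 1] = '\0';
    n->next = head;
    return n;
}

/* Linear strcmp scan -- fast here because every node sits in internal
 * SRAM, unlike the flash-resident directory the file system must scan. */
int cache_contains(const name_node_t *head, const char *name)
{
    for (const name_node_t *n = head; n != NULL; n = n->next)
        if (strcmp(n->name, name) == 0)
            return 1;
    return 0;
}
```

As the reply below points out, this scan is cheap precisely because the data has already been pulled out of external flash into RAM.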
"Just one sentence earlier you expressed a wish that files be kept in _order of creation_; now you say you don't want that." Doesn't date-wise sorting achieve the same result as order of creation (assuming the directory list was written sequentially, not overwriting the deleted entries)?
I don't know how a directory list is maintained in any file system, but until now I had guessed that it would be a singly linked list.
"Also, as I said, I have implemented a linked list." You're comparing the performance of operating on reduced, ordered data in internal RAM against that of operating on expansive, unordered data in external flash. Is it really so hard to understand that you get wildly different results?
"Doesn't date-wise sorting achieve the same result as order of creation (assuming the directory list was written sequentially, not overwriting the deleted entries)?" Any kind of sorting would make your initial problem orders of magnitude worse. In the interest of performance, the only sensible write ordering for file system entries on the medium is no ordering at all, i.e. using the first empty slot found. Any other method would mean that each newly created file might force the system to rewrite a copy of the entire directory.
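A toy model of that write cost (the table sizes and move-counting are mine, purely for illustration): inserting into an alphabetically sorted table forces every later entry to be rewritten, whereas dropping the file into the first free slot rewrites exactly one entry.

```c
#include <assert.h>
#include <string.h>

#define SLOTS 8
#define NAME  13

/* Insert `name` while keeping the table alphabetically sorted.
 * Returns the number of entries that had to be (re)written,
 * i.e. the flash traffic a sorted on-media directory would generate.
 * A first-free-slot scheme would always return 1 instead. */
int insert_sorted(char tab[][NAME], int used, const char *name)
{
    int pos = 0;
    while (pos < used && strcmp(tab[pos], name) < 0)
        pos++;                                /* find insertion point */
    for (int i = used; i > pos; --i)          /* shift the tail up by one */
        memcpy(tab[i], tab[i - 1], NAME);
    strncpy(tab[pos], name, NAME - 1);
    tab[pos][NAME - 1] = '\0';
    return (used - pos) + 1;                  /* entries rewritten */
}
```

In the worst case (a name that sorts first), every existing entry moves, which on flash means rewriting the whole directory for one file creation.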
Be careful what you wish for! Odds are you'll like it a lot less than you thought you would.
You could always implement your file system code to cache, or maintain state information about, what's going on at various levels of the media, where the structures are a lot more packed and concise.
This will tend to eat a lot of memory, be more complicated to pull off, and might not pay off much in dividends if the use case isn't touching the same deep directory over and over — say, by opening/creating files in multiple different directories, or by traversing structures sparsely distributed across the media. You'd also have to be very careful that the internal state information you're relying on doesn't become stale, and that it's thread safe.
I'd also expect the structures on the media to be large blobs or arrays, whose only linkage to the next is immediate proximity, or a linking chain to the next large blob. How the software side maintains its own tables is a whole other matter.
Generally speaking, you want structures on the media to be free-standing: if you change one pointer, it shouldn't require you to go off and fix half a dozen other nodes/sectors to maintain internal integrity. Contain changes to the smallest possible area.
Flash media is certainly one where journalling might be used, in which case finding/opening files may require navigating quite deeply to find the last or most current version, or adding a new instance at the end.
As Hans suggests, if you focus on one particular use case or optimization, it's likely to degrade many others, so don't focus on human-centric cases like alphabetical ordering or how the data/names are ordered on the media.
Just about all operating systems use the first free directory entry when adding files.
Then they leave it to the "dir" or "ls" command (or whatever your OS calls it) to perform any sorting.
The goal is always to minimize the number of "pages" (sectors, blocks, ...) spanned by the directory.
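To make that goal concrete, here is the arithmetic for classic FAT figures (32-byte directory entries in 512-byte sectors — assumed for illustration; the Keil library's actual on-media layout may differ):

```c
#include <assert.h>

/* Classic FAT numbers, assumed for illustration: 32-byte directory
 * entries packed into 512-byte sectors gives 16 entries per sector. */
enum { ENTRY_SIZE = 32, SECTOR_SIZE = 512 };

/* Sectors spanned by a directory of n entries -- the quantity the
 * allocator keeps small by reusing the first free slot instead of
 * always appending past deleted entries. */
int dir_sectors(int n_entries)
{
    int per_sector = SECTOR_SIZE / ENTRY_SIZE;          /* 16 */
    return (n_entries + per_sector - 1) / per_sector;   /* ceiling division */
}
```

With the ~250 files mentioned earlier in the thread, the directory already spans around 16 sectors, each of which the O(n) create scan may have to read.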