11-11-2013 02:02 PM
@QFang wrote:
One big problem with the approach you have outlined above is that when files need to be added to a folder, in your setup you would need to re-scan the whole folder.
This becomes an unacceptable time cost: the time it takes to "list folder and get file properties for all files within" is frequently measured in minutes rather than milliseconds. Multiply that by 300+ folders (not at all an extreme number in our case, e.g. 12 channels * 31 days of the month + 12 = 384 folders) and it could take hours to re-catalogue all the folders, while each active folder continues to grow. In the end the cataloguing falls catastrophically behind, the disk is no longer maintained, and it will eventually overfill/run out of space.
This is why it is important that each file is "scanned" once (and only once), and why we need a managed, persistent index system.
Actually, no. I only read the folder structure once (when you select a base path). When a file is added, you add it to the corresponding array; apart from the file size you don't even need to read from disk (and if you want to check the file size you just get that single file's info).
/Y
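For readers following along, here is a rough Python sketch of the scan-once/incremental-add idea being discussed (Python stands in for the LabVIEW diagram, and the `catalogue`, `initial_scan`, and `add_file` names are invented for illustration):

```python
import os

# Hypothetical in-memory catalogue, built exactly once at startup.
# Maps folder path -> list of (file_name, size) tuples.
catalogue = {}

def initial_scan(base_path):
    """Walk the tree once and record every file's name and size."""
    for folder, _subdirs, files in os.walk(base_path):
        catalogue[folder] = [
            (name, os.path.getsize(os.path.join(folder, name)))
            for name in files
        ]

def add_file(folder, name, size):
    """Incremental update: register a new file without re-listing the
    folder. The size is supplied by the caller, so no disk access at all."""
    catalogue.setdefault(folder, []).append((name, size))
```

The point is that the expensive "list folder" operation happens once per folder; after that, each new file costs one cheap append rather than a full re-scan.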
11-11-2013 02:10 PM
I see...
In order to "add it to the corresponding array", would you use a variant to hold the resulting "parents" array in your example, named by its own path or some such? This would allow quick retrieval of the correct array, and since the names and number of parent folders are static after the initial population/disk scan that happens at boot time, it shouldn't need to re-sort itself (just replace/store a new value in the same named location).
Or did you have a different vision for how to manage the "corresponding folders"?
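In text form, the variant-lookup idea might look roughly like this (a Python dict stands in for LabVIEW variant attributes; all names are hypothetical):

```python
# Lookup table keyed by each parent folder's own path.
parents = {}  # folder path -> list of (file_name, size)

def store_folder(path, file_list):
    # Writing to an existing key replaces the value in place; since the
    # set of parent folders is fixed after the boot-time scan, the key
    # set never changes and nothing ever needs re-sorting.
    parents[path] = file_list

def lookup_folder(path):
    # Retrieval is a direct keyed lookup, not an array search.
    return parents[path]
```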
11-11-2013 02:54 PM
I would probably keep the Parent array as it is, but you're right that the variant will be big. An easy solution is to change it to a DVR, but I don't fully like having two different arrays when it's one file system. I'm trying to wrap my head around a better way.
/Y
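For context: a DVR (Data Value Reference) keeps a single copy of the data and hands out references to it, so the big structure is never duplicated. A loose textual analogue, purely illustrative (Python object references behave similarly; the names and paths are invented):

```python
class FileIndex:
    """Rough analogue of holding the catalogue behind a DVR: all clients
    share one reference to the same data instead of copying a big variant."""
    def __init__(self):
        self.folders = {}  # folder path -> list of (file_name, size)

index = FileIndex()       # created once...
same_index = index        # ...and handed out by reference: no data copy
same_index.folders["/data/ch01/day01"] = [("log_000.tdms", 1_048_576)]
assert index.folders is same_index.folders  # both names see one structure
```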
11-11-2013 05:50 PM
DVRs can be quite fun, and so can recursion (although the file list isn't strictly recursive).
/Y
11-13-2013 04:11 PM
Better implementation, I'm pleasantly surprised. 🙂
/Y
11-14-2013 07:49 AM
That is pretty neat, thank you very much for sharing!
I just noticed this thread went completely off my opening subject, but I have enjoyed the journey!
Your code has evolved in a different direction than my use case, so I probably won't use any of it for this project, but it's a nice piece of work, and for Windows/user-oriented software there are a lot of good examples here!
(For my project, headless autonomous "24/7" operation, I avoid recursion at almost any cost because of the extra overhead and unpredictability in terms of memory and threading, and because of the overall unpredictability of recursion. I also don't like doing a "search array" on every lookup; I would rather take the hit on write/modify, so I gravitate toward variants. I also minimize the data I keep as much as possible: in my version I keep only "size" and "name". The time-stamp is kept implicitly, since it is the owning variant value's name, and the file path can be reconstructed from the parent folder variant value name plus the file name.)
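A minimal Python sketch of the layout QFang describes, assuming an outer table keyed by parent-folder path and an inner table keyed by timestamp (all names and paths here are hypothetical):

```python
import os

# Outer key: parent folder path. Inner key: timestamp string.
# Only the file's name and size are stored; the full path is never kept.
index = {}  # parent folder path -> {timestamp -> (file_name, size)}

def add_file(parent, timestamp, name, size):
    index.setdefault(parent, {})[timestamp] = (name, size)

def full_path(parent, timestamp):
    # The path is reconstructed from the outer key (parent folder)
    # plus the stored file name, exactly as described above.
    name, _size = index[parent][timestamp]
    return os.path.join(parent, name)

add_file("/data/ch01/day01", "2013-11-14T07:49:00", "log_000.tdms", 1_048_576)
print(full_path("/data/ch01/day01", "2013-11-14T07:49:00"))
```

No recursion and no array searching: every lookup is a pair of keyed retrievals, and the write path pays the (small) cost of keeping the keyed store instead.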
11-14-2013 12:22 PM
Come to think of it, since the file cluster has an index field you don't need to search; it's that index I was searching for... 🙂
/Y
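In text form, the difference is roughly this (a dict stands in for the LabVIEW file cluster; the record layout is hypothetical):

```python
# Each record carries its own index into the array, so a lookup becomes
# direct indexing instead of a linear search.
files = [
    {"index": 0, "name": "log_000.tdms", "size": 1_048_576},
    {"index": 1, "name": "log_001.tdms", "size": 2_097_152},
]

def get_record(i):
    return files[i]  # O(1) direct access via the stored index

assert get_record(1)["name"] == "log_001.tdms"  # no search needed
```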
11-14-2013 01:38 PM
OK, some cleaning up: I added the possibility to list things recursively, which also gives the recursive size, and there is no searching for the folder ref anymore.
Since it's a list of lists there has to be some level of recursion, but the individual lists are handled with normal For Loops.
In my tests, after creating a list of some 5k files and activating the recursive function, getting the list and total size is basically instantaneous.
/Y
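A short Python sketch of that traversal pattern, assuming a hypothetical folder structure of (name, files, subfolders): recursion appears only where one list contains another, and each individual list is walked with an ordinary loop, mirroring the For Loops in the LabVIEW version.

```python
def recursive_list(folder):
    """Return (flat file listing, recursive total size) for a folder tree."""
    _name, files, subfolders = folder
    listing, total = [], 0
    for fname, size in files:        # plain loop over this folder's files
        listing.append(fname)
        total += size
    for sub in subfolders:           # recursion only across nested lists
        sub_listing, sub_size = recursive_list(sub)
        listing.extend(sub_listing)
        total += sub_size
    return listing, total

tree = ("root", [("a.txt", 100)], [("child", [("b.txt", 250)], [])])
names, total_size = recursive_list(tree)
assert total_size == 350
```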