Variant (in-memory) byte size?

QFang · ‎11-11-2013

@Yamaeda wrote:

Making the Flatten+String length on the bottom cluster should be fast, and multiply with array size should be easy enough. If you have several you'll need to loop it ...

/Y

The top level variant contains named attributes where the name is "parent folder path" converted to string (variable lengths, size not part of any array).

Each "parent folder" variant contains named attributes (YYYYMMDDHHMMSS names) where the data is an array of clusters. Since the size of each element in the array is also non-trivial (cluster contains a path data type), inferring size from array length will not work.
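For readers who don't have the VI in front of them, here is a rough Python analogue of that two-level layout (not LabVIEW code). Nested dictionaries stand in for the two variant levels, and the `FileEntry` fields are assumptions, since the thread only says the cluster contains a path. Because every entry carries a variable-length path, a byte estimate has to visit each element rather than multiply one element's size by the array length.

```python
from dataclasses import dataclass

@dataclass
class FileEntry:
    # Hypothetical cluster fields; the thread only confirms a path is present.
    path: str
    size_bytes: int

# Top level: parent folder path (as string) -> second-level "variant".
# Second level: "YYYYMMDDHHMMSS" name -> array of clusters.
Index = dict[str, dict[str, list[FileEntry]]]

def estimate_bytes(index: Index) -> int:
    """Rough payload estimate: key lengths plus per-entry data.

    Ignores container overhead, so treat it as a lower bound, not a measurement.
    """
    total = 0
    for parent, stamps in index.items():
        total += len(parent.encode("utf-8"))
        for stamp, entries in stamps.items():
            total += len(stamp.encode("utf-8"))
            for entry in entries:
                # Path length varies per entry, so each one must be visited.
                total += len(entry.path.encode("utf-8")) + 8  # 8 bytes assumed for the size field
    return total

index: Index = {
    "/c/data/ch01": {
        "20131111132952": [FileEntry("/c/data/ch01/a.bin", 1024)],
        "20131111133010": [FileEntry("/c/data/ch01/longer_name.bin", 2048)],
    }
}
print(estimate_bytes(index))
```

This only counts payload bytes; whatever overhead the variant itself adds in memory is not modelled here.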

QFang
-------------
CLD LabVIEW 7.1 to 2016

Yamaeda · ‎11-11-2013

Maybe I misunderstood, but a general two-level file structure could be handled with something like what I've attached. Or am I way off?

/Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems

Yamaeda · ‎11-11-2013

Added datetime sorting and total folder size as you click on a parent folder.

/Y


QFang · ‎11-11-2013

You are somewhat close.

However, in my code, "parents" in your diagram is itself a variant, since I need fast lookup by parent folder name.

Also, I need to quickly get the file(s) with the oldest time-stamp, so the format you are storing the file details in is not ideal. Instead, I name each variant attribute with the file's modify time converted to a YYYYMMDDHHmmSS string (e.g. 20131111132952).
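The naming scheme works because a zero-padded, fixed-width timestamp string sorts as text in the same order as the underlying times, so a name-ordered lookup doubles as a by-age lookup. A small Python sketch of that property (the `stamp` helper is a made-up name for illustration, not part of the original code):

```python
from datetime import datetime

def stamp(modified: datetime) -> str:
    """Format a modify time as a fixed-width YYYYMMDDHHmmSS string."""
    return modified.strftime("%Y%m%d%H%M%S")

times = [
    datetime(2013, 11, 11, 13, 29, 52),
    datetime(2013, 2, 3, 4, 5, 6),
    datetime(2012, 12, 31, 23, 59, 59),
]
names = [stamp(t) for t in times]

# Zero-padded digits sort as text in the same order as the times themselves.
assert sorted(names) == [stamp(t) for t in sorted(times)]

# The smallest name is therefore the oldest file.
oldest = min(names)
print(oldest)
```

Note that two files modified within the same second get the same name, which is presumably why the data stored under each name is an array of clusters rather than a single cluster.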

And it needs to operate on all subfolders so the VI you posted would need to be called on each subfolder.

Lastly, I saw the note in your VI description about not deleting variants and doing that maintenance at a higher level rather than on the variant data. How would you propose managing data sorted by time-stamp without using variants? I know I can sort a 1D array, but since the sort has no output that "links" associated data and makes that data "move with" the position changes, I would need to custom-code that. In the end I would still be doing inserts/replaces on growing arrays of data. At least with variants I get to keep the code fairly readable.

I'm not dead-set on variants, but I am still not seeing a solution that would work better?

I wish I could post some code, but 1) my boss would not be too happy about it, and 2) it has undergone several hardware optimizations at the expense of readability. Examples: list folder takes a significant number of ms to process, so I need to add a "wait ms" after it to allow thread swapping to other tasks. The same goes for Get File/folder properties (can be ~15 ms to execute for a single file), etc.

I stripped out an example of the core. Please note this is already quite far from how it is actually implemented, but the data structure is preserved. See snippet:

In the snippet above, the "top level variant" is presumably maintained in a shift register in the calling VI (and I forgot to add the top-level variant output at the end). The snippet also doesn't show the "status" variant, which simply tracks a cluster with "total folder size in MB" and "total files in folder" for each ParentFolder name. This could be a candidate for adding size information, as I could account in detail for the size change every time I add or remove data in the main variant, the same way I account for changes to the content of PFolder.
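The "status" bookkeeping can be sketched in Python roughly as follows. This is an illustration under assumptions, not the actual implementation: the `FolderStatus` and `account` names are invented here, and a binary megabyte is assumed.

```python
from dataclasses import dataclass

@dataclass
class FolderStatus:
    total_mb: float = 0.0   # "total folder size in MB"
    total_files: int = 0    # "total files in folder"

MB = 1024 * 1024  # assumption: binary megabytes

# One status entry per ParentFolder name.
status: dict[str, FolderStatus] = {}

def account(parent: str, size_bytes: int, added: bool) -> FolderStatus:
    """Update the running totals when one file is added to or removed from the index."""
    entry = status.setdefault(parent, FolderStatus())
    sign = 1 if added else -1
    entry.total_mb += sign * size_bytes / MB
    entry.total_files += sign
    return entry

account("/c/data/ch01", 2 * MB, added=True)
account("/c/data/ch01", 1 * MB, added=True)
account("/c/data/ch01", 2 * MB, added=False)
print(status["/c/data/ch01"])
```

The point of the design is that totals change only at the moment the main index changes, so no folder ever has to be re-listed just to find out how big it is.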

In concept, this (managing a vast number of files across a whole drive, based on modify date, size and folder file count) is one of those things that seems very simple to implement, until you try to work within the constraints of a resource-limited RT system, keeping in mind that the whole thing should be self-maintaining and a side/background task that interrupts/uses as few resources as possible. 🙂

That's when you start realizing just how bad a hit you take for e.g. file/disk function calls. Even a "get file size" call is ~5 ms.

QFang

QFang · ‎11-11-2013

@Yamaeda wrote:

Added datetime sorting and total folder size as you click on a parent folder.

/Y

Please note that the event-driven design in your examples precludes any easy benchmarking of your suggestions on my cRIO. If you rewrite it to not use events, I would be happy to run benchmarks on one of our cRIOs.

EDIT: Also, the example you posted does have folder size, but not date/time sorting, at least not that I can see?

QFang

QFang · ‎11-11-2013

I see the sort now. How would I go about verifying that this will always sort in the expected manner? You are doing the sort on an array of clusters; what's to say some combination of file name/path/count won't invalidate the sort order you expect and want? I ask because I have many times been tempted to sort complex data like that, but I have never felt comfortable with the assumption that "it performed as expected in this one test case, so I'm sure it will always work". So if you have any insight/detail on how Sort 1D Array works when presented with complex (cluster) data like in your example, that would be great!

Thanks also for your time, energy and examples. I greatly appreciate and value them. My comments and questions are made exclusively either to help me understand something or to give feedback on why I have or have not done something a certain way. Apologies in advance if any come across as confrontational!

QFang

Yamaeda · ‎11-11-2013

When you click on a parent, the child list is sorted on datetime. I didn't include it on the parent, but it should be easy to add (just add the datetime as the cluster's first element and sort the array).

The event design is of course there because I only show one child folder at a time. The generation (path change event) can easily be benchmarked, as can the extraction time when clicking a parent.

I haven't worked with variant attributes, so I can't really give any feedback on them. But to back it all up to the start: the reason you're trying to recreate the entire file structure in memory is so you can see if a folder is too big or has too many files?

/Y


Yamaeda · ‎11-11-2013

If sorting on an array of clusters, it gets sorted in the tab order of the cluster. In this case, datetime is #0. 🙂
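As an analogy only (this is Python, not Sort 1D Array itself), tuples follow the same element-by-element rule: the first field is compared first, and later fields only break ties. The `FileInfo` fields below are assumed for illustration.

```python
from typing import NamedTuple

class FileInfo(NamedTuple):
    # Field order matters: comparison goes field by field, first to last.
    datetime: str
    name: str
    size_bytes: int

files = [
    FileInfo("20131111132952", "b.bin", 10),
    FileInfo("20131110080000", "z.bin", 30),
    FileInfo("20131111132952", "a.bin", 20),
]

# Sorted by datetime; name decides between the two entries with equal datetime.
ordered = sorted(files)
for f in ordered:
    print(f)
```

With this rule the result is deterministic for any data: no combination of later fields can override a difference in an earlier field.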

/Y


QFang · ‎11-11-2013

One big problem with the approach you have outlined above is that when files need to be added to a folder, in your setup you would need to re-scan the whole folder.

This becomes an unacceptable time cost, since the time it takes to "list folder and get file properties for all files within" is frequently measured in minutes rather than milliseconds. Multiply that by 300+ folders (not at all an extreme number in our case, e.g. 12 channels * 31 days of the month + 12 = 384 folders) and it could potentially take hours to re-catalogue all the folders. Meanwhile each active folder continues to grow. In the end, the cataloguing falls catastrophically behind, the disk is no longer maintained, and it will eventually overfill/run out of space.
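To put numbers on that, a back-of-the-envelope calculation in Python. The one minute per folder is an assumption taken as the low end of "measured in minutes"; it is not a measured value.

```python
channels = 12
days_per_month = 31
extra_folders = 12  # the "+ 12" from the post

folder_count = channels * days_per_month + extra_folders

# Assumption: one minute per folder, the low end of "measured in minutes".
minutes_per_folder = 1.0
full_rescan_hours = folder_count * minutes_per_folder / 60

print(folder_count, full_rescan_hours)
```

Even at that optimistic rate a full re-catalogue takes several hours, and it scales linearly with whatever the real per-folder time turns out to be.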

This is why it is important that each file is "scanned" once and only once, and why we need a managed, persistent index system.

QFang

QFang · ‎11-11-2013

@Yamaeda wrote:

If sorting on an array of clusters, it gets sorted in the tab order of the cluster. In this case, datetime is #0. 🙂

/Y

...just re-read the sort help entry, where they do in fact detail this. Very convenient! Making the name the second entry will sort by time, then by file name, so yeah, that will work great!

BTW, keeping a "flat" variant with one entry for each file as clusters, then just sorting the clusters as you have shown, will be slow because the top variant becomes very large (and add/remove operations become very costly). You would also lose track of the "by folder" requirement. For removal that isn't a big deal (removing the oldest file(s) on disk vs. the oldest files in each folder is OK-ish), but tracking folder count would need to be a separate function, and you would not be able to list folder content without resorting to a folder "list" function, which is very heavy. (That's why I have two levels of variants.)

That said, maybe your suggested structure on a per-folder basis can be made to work now. But if PFolder contains an array of "file-property clusters" that is sortable by date/time, removing/replacing/adding files in that array becomes cumbersome. <Goes back to think about it>
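One way to picture the per-folder sorted array, sketched in Python with the standard `bisect` module rather than anything from the actual VI. The entry layout (datetime string, file name, size) and the function names are assumptions for illustration.

```python
import bisect

Entry = tuple[str, str, int]  # (datetime string, file name, size in bytes)

# One sorted list per parent folder.
by_folder: dict[str, list[Entry]] = {}

def add_file(parent: str, entry: Entry) -> None:
    """Insert while keeping the folder's list sorted (binary search, then shift)."""
    bisect.insort(by_folder.setdefault(parent, []), entry)

def remove_file(parent: str, entry: Entry) -> bool:
    """Remove an exact entry if present; returns whether it was found."""
    entries = by_folder.get(parent, [])
    i = bisect.bisect_left(entries, entry)
    if i < len(entries) and entries[i] == entry:
        del entries[i]
        return True
    return False

def pop_oldest(parent: str) -> Entry:
    """The oldest entry is always at index 0 because the list stays sorted."""
    return by_folder[parent].pop(0)

add_file("ch01", ("20131111132952", "b.bin", 10))
add_file("ch01", ("20131110080000", "z.bin", 30))
add_file("ch01", ("20131111132952", "a.bin", 20))
remove_file("ch01", ("20131111132952", "b.bin", 10))
print(pop_oldest("ch01"), by_folder["ch01"])
```

Finding the position is a binary search, but every insert or delete still shifts the elements behind it, so this keeps the code short without removing the cost of editing a growing array.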

QFang
