LabVIEW


Variant (in-memory) byte size?

Solved!
Go to solution

Hi all,

 

Is there an elegant/efficient way of getting the in-memory byte-size of a Variant? Or at least an estimate?

I have an "index" variant which grows and shrinks conditionally during run-time, and I would like to occasionally check into the memory size of it, at least approximately. 

 

I have only found two ways to do this, both sub-optimal: "Flatten To String" followed by "String Length", or "Write to Binary File" followed by checking the file size.
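Since LabVIEW is graphical, here is a rough Python analogue of the flatten-then-measure idea (the `pickle` serializer and the function name are illustrative stand-ins, not LabVIEW equivalents): serialize the structure and take the length of the result. Like Flatten To String, it makes a full copy of the data, which is exactly the cost in question.

```python
import pickle

def flattened_size(value) -> int:
    """Estimate a value's size by serializing it and measuring the
    result -- analogous to LabVIEW's Flatten To String followed by
    String Length. Like the LabVIEW version, this makes a full
    in-memory copy of the data just to learn the number."""
    return len(pickle.dumps(value))

# Illustrative index shape: folder -> timestamp -> list of (path, size)
index = {"PFolder": {"20160101090000": [("/logs/a.xml", 1024)]}}
print(flattened_size(index))
```

The copy is the drawback: for a ~1.5 MB structure this still allocates roughly another 1.5 MB just to measure it.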

 

This code is running headless on a real-time controller. This means memory is limited and disk-writes take a lot of resources (relatively) to execute.  Flattening to string makes a whole copy, writing to disk uses disk IO.

 

I'm hoping someone here has a better suggestion. I looked at a variant "indicator" for property or invoke node options without finding any likely candidates.

 

Any suggestions are welcome; however, "don't use variants" is not helpful. A lot of time was spent trying to come up with a different way to solve the problem, and the variant was the only efficient and workable solution despite the current question/problem.

 

Thanks,

-Q

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 1 of 28
(7,827 Views)

The only other method I can think of is similar to one you already have: use a Type Cast going to an array of U8, then get the array size. I doubt this is any better on memory than the other method you mentioned. It looks like nothing in OpenG, or the variant functions in vi.lib\Utility, can find the in-memory size of a variant. Unless I missed something.

0 Kudos
Message 2 of 28
(7,810 Views)

I'd say flatten variant + string length is the only generic way (and it seems like a good solution), but if you know the format of your index, e.g. an array of I32, you can convert to that array and check the array length.
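Sketching that point in Python (the standard `array` module standing in for a typed LabVIEW array; an assumption, not the poster's code): when the element type is known and fixed, the byte size is just element count × element size, with no flattening or copying.

```python
from array import array

# If the variant's contents are known to be, e.g., an array of I32,
# the size can be computed from the element count alone -- no
# serialization and no data copy needed.
samples = array("i", range(1000))           # "i" = C int, an I32 stand-in
size_bytes = len(samples) * samples.itemsize
print(size_bytes)  # 4000 on platforms where a C int is 4 bytes
```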

 

/Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems
Certified-LabVIEW-Developer
0 Kudos
Message 3 of 28
(7,805 Views)
Solution
Accepted by topic author QFang

At least in LabVIEW 2011, Type Cast does not allow "variant" as the input wire; it breaks the output even with the "type" input set to an array of U8. Did I miss something there?

 

Also, yes, I looked at the variant lib myself and don't see any way of getting that number... it seems like this must be a relatively low-cost operation at some low level. I guess I can ask NI if they have, or can make available, that function for me to call.

 

With a real-case example variant previously generated and saved to file (~1.5 MB binary file size), it takes about 947 ms (average of 10) to flatten to string and get the string length (not counting the binary load to variant, of course), and that's at 100% CPU. This is less than ideal, to say the least.

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 4 of 28
(7,804 Views)

Yamaeda, 

 

Unfortunately, the underlying structure is more complicated than that. The top-level variant contains a (named) list of "sub-variants", and each of these sub-variants contains a (named) list of array(cluster).

 

So: Topindex.PFolder.TimeStamp[array of cluster] or some such in pseudo-code.

 

At this point, the only way I can see to avoid a significant processing hit to obtain the size is to track the size as variant elements are added/removed. However, that almost defeats the purpose, as I wanted to check that I don't have a bug in the variant add/remove logic that causes the variant to grow indefinitely.
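The bookkeeping idea above can be sketched (in Python, purely illustrative) as a running total updated on every add/remove. It is worth noting that the total still works as a leak check: if the add/remove logic is balanced, the total should plateau at a steady state rather than grow without bound.

```python
class SizeTrackingIndex:
    """Illustrative sketch: track the index's approximate byte size
    incrementally as entries are added and removed, instead of
    re-measuring the whole structure each time."""

    def __init__(self):
        self._entries = {}
        self.approx_bytes = 0   # running total, updated on add/remove

    def add(self, key, payload: bytes):
        self.approx_bytes += len(payload)
        self._entries[key] = payload

    def remove(self, key):
        payload = self._entries.pop(key)
        self.approx_bytes -= len(payload)
```

If `approx_bytes` keeps climbing past the expected steady-state point, the add/remove logic is unbalanced, which is exactly the bug being hunted here.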

 

<puts thinking hat back on>

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 5 of 28
(7,803 Views)

900+ ms?! Wowsa. Then you need to make a function that extracts the array lengths and sums them up. Can you post the cluster?

/Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems
Certified-LabVIEW-Developer
0 Kudos
Message 6 of 28
(7,800 Views)

@QFang wrote:

Yamaeda, 

 

Unfortunately, the underlying structure is more complicated than that. The top-level variant contains a (named) list of "sub-variants", and each of these sub-variants contains a (named) list of array(cluster).

 

So: Topindex.PFolder.TimeStamp[array of cluster] or some such in pseudo-code.

 



Doing the flatten + string length on the bottom cluster should be fast, and multiplying by the array size should be easy enough. If you have several, you'll need to loop it...
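A Python sketch of this sampling approach (`pickle` again standing in for Flatten To String; the helper name is invented): flatten one representative element and multiply by the element count. It is only an estimate when elements vary in size, e.g. paths of different lengths.

```python
import pickle

def estimate_array_size(arr) -> int:
    """Estimate an array's flattened size by measuring a single
    representative element and multiplying by the element count,
    rather than flattening the whole array. Approximate when
    elements differ in size (e.g. variable-length paths)."""
    if not arr:
        return 0
    return len(pickle.dumps(arr[0])) * len(arr)
```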

 

/Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems
Certified-LabVIEW-Developer
0 Kudos
Message 7 of 28
(7,797 Views)

Is there a specific reason for having a nested data structure like this instead of keeping the obvious data types?

Also, you might want to move the "sub-variants" to variant "attributes", as these are normally faster to access than array indexing and "bundling".....

 

Also, why does size concern you at all? Do you have issues with running out of memory?

 

And last but not least: allocation of memory kills determinism. So why do you (obviously) re-allocate memory repeatedly when working on an RT target?

 

Norbert

Norbert
----------------------------------------------------------------------------------------------------
CEO: What exactly is stopping us from doing this?
Expert: Geometry
Marketing Manager: Just ignore it.
0 Kudos
Message 8 of 28
(7,789 Views)

@QFang wrote:

At least in LabVIEW 2011, Type-Cast does not allow "variant" as the input wire, it breaks the output, even with the "type" input as Array of U8.. Did I miss something there?

..


My bad, I assumed this would work... by the way, NI, why doesn't this work? I guess the only time I would want to do this instead of Variant To Data is to get the array of raw bytes the variant represents anyway.

0 Kudos
Message 9 of 28
(7,785 Views)

Norbert, you ask a lot of great questions... 

I will try to answer briefly but succinctly, as best I can.

 

We do not use a cRIO for its determinism(!); we do, however, use them for low-power, rugged, reliable operation. Stated differently, for the most part it is okay to have occasional jitter on the order of a few seconds, as long as it runs "forever".

 

Size didn't concern me until "available memory" dropped below 1/5th of total memory over a weekend "stress beyond normal operations" test (running 7 instruments, serving Modbus, and writing log files at 2x the max "field" configuration). There is no reason memory usage should increase past a certain steady-state point "x". Now I will have to re-run (with the RAM limit adjusted to 1/10th of total) and track variant sizes to see whether they never hit steady state or something is wrong in my "variant clean-up" portion.

 

Trust me when I say I pre-allocate and use static allocations wherever possible.

 

Now, the question of why variants, and why the data structure I have?  Please stay with me while I try to explain. Also, if anyone can think of a great alternative to handling the below stated problem, I'm all ears.

 

Requirements are paraphrased on the fly by me, so try to go by intent, not by letter of the language, if that makes sense.

System shall be able to boot up and recover from a power outage or reboot; meaning that after 60 days of running, filling the disk to the allocated maximum limits (see below), on reboot it needs to re-catalog and continue normal operations.

System shall retain both types of log output files for as long as possible, deleting the oldest files first only when available disk space drops to unacceptable levels (25% of total) and/or when the number of files in a given folder exceeds 120 (due to performance issues caused by the Reliance file system in folders with large numbers of files: listing a folder with ~100 files is not too bad, but listing a folder with ~350 files takes a LOT longer).

Files can be downloaded/removed via FTP during/after bootup.

(Files can be added via FTP, but this would be a violation of the operating instructions.)

We have no control over connected devices (other than our instruments) and cannot install "host" applications anywhere. As such, the controllers must be headless.

 

One required file output generates 1 file per channel per "scan". Luckily, scans are done sequentially, so the worst-case generation rate is 1 file per 60 seconds. However, the file system (Reliance) used, combined with the requirement to keep files for as long as possible, led us to maintain a folder structure of ..\Scan\channel-name\[01..31]\date-time.xml, where 01..31 are folders, one for each day. This allows ~120 files for each day of the month, which at least gives a snapshot over time in the event of the client failing to retrieve data. The other log type is TDMS, one file per day, and as such uses a much simpler (separate) folder structure.
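For illustration only (Python, with invented names; the post writes the layout with backslashes, forward slashes are used here), the per-channel/day-of-month layout described above might be built like this:

```python
from datetime import datetime
from pathlib import PurePosixPath

def scan_file_path(root: str, channel: str, ts: datetime) -> PurePosixPath:
    """Build a log-file path in the layout described above:
    <root>/Scan/<channel-name>/<01..31>/<date-time>.xml, where the
    day-of-month folder caps any one directory at ~120 files per day.
    Names are illustrative, not the poster's actual code."""
    return PurePosixPath(root, "Scan", channel,
                         f"{ts.day:02d}",
                         ts.strftime("%Y%m%d-%H%M%S") + ".xml")

print(scan_file_path("/logs", "ch1", datetime(2016, 3, 7, 12, 0, 0)))
# /logs/Scan/ch1/07/20160307-120000.xml
```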

 

One thing to note before I continue: ALL disk/file operations take a significant time on an RT target, and this includes listing folder contents, getting file/folder properties, etc. Therefore it is absolutely necessary to minimize (e.g. do only once) any file property reads and the like.

 

So, after a lot of thinking, the solution (which keeps up, is robust, and worked great until the weekend stress test) is, in a nutshell, this:

 

A QSM at boot-up grabs a list of all files in the log trees before allowing the rest of the application to spin up (this prevents new files from being generated while taking a "current status snapshot").

Once the files are listed (about 120 seconds for 9988+ files on systems that "recover" from power outages), the machine goes to an "index all in temporary list" state. (New files that are generated while processing the list are queued up as "add file" actions to be done afterwards. This works well, since the number of files generated while processing 9000+ files is manageable.) This state takes an array of file paths and organizes the "last modified" and "size" attributes of each file by "parent folder". It also creates/maintains a separate variant (a "status variant") which is only one level deep and tracks the total number of files and the total size for each folder (based on the files in the index). The main variant is organized such that it is very easy to get a list of the oldest file(s) for any given folder (and also easy to delete the oldest files first until the given folder is below both the size and file-count limits).

 

 

So, the main reason I used variants is that I cannot think of a better way to continually keep a sorted list (of modified times/dates) that also tracks the relevant size and file-path data, does this on a per-folder basis, and still makes look-ups low-cost/low-overhead operations.

Variants give me the sorting for free (because at the "time-stamp" level I name my entries with YYYYMMDDHHMMSS strings). They also provide fast lookups, and after the initial bulk index of ~10k files we do only infrequent adds. Adds are also relatively fast because of the two-level nature of the file-index variant, which limits the "depth" of the tree at each level. For example, the top-level variant typically has some 400 to 500 named (parent folder) entries, and each parent folder has on the order of ~100 to 200 named entries (the time-stamps). The data type of the time-stamp variant is an array of clusters (because the resolution on last-modified is 1 second, we have to cover the possibility of multiple files in a folder with the same modified time-stamp, which typically only happens if files are added via FTP or if mass file operations are made programmatically). The cluster contains the file path (path) and another cluster consisting of the size (I64) and the number of files (I64, to allow some neat time/block diagram cluster add/subtract operations).
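A Python sketch of that two-level index (nested dicts standing in for named variant attributes; all names are invented): because YYYYMMDDHHMMSS keys sort lexicographically in chronological order, "oldest first" falls out of a plain key sort, and the per-key list absorbs same-second collisions.

```python
# folder -> "YYYYMMDDHHMMSS" -> list of (path, size_bytes, file_count);
# the inner list handles multiple files with the same 1-second
# modified time-stamp (e.g. files added via FTP).
index = {
    "PFolder": {
        "20160101090000": [("/logs/a.xml", 1024, 1)],
        "20160101090500": [("/logs/b.xml", 2048, 1)],
    }
}

# Lexicographic order of the keys IS chronological order, so the
# oldest entry in a folder is simply the minimum key.
oldest_key = min(index["PFolder"])
print(oldest_key)  # 20160101090000
```

Deleting oldest-first until a folder is under its size and file-count limits is then just iterating `sorted(index[folder])` from the front.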

 

Also, I managed to write the initial file/folder listing and the "index files", "add file", "check folders for overlimit conditions", and "trim parent folders" states such that processor usage stays between 20 and 50% while still completing each task in satisfactory time-frames, even when managing hundreds of folders and over 11k (test case) files...

 

Again, if anyone has suggestions for low-overhead (CPU/RAM) ways to manage disk content in a fashion similar to the above on a cRIO (VxWorks) PowerPC (limited CPU) platform, I am all ears! 🙂

 

 

 

Thanks for reading!

-Q

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 10 of 28
(7,776 Views)