recognizing data clusters

Shades · ‎08-03-2006

I have an exercise in computer logic. I have an array of XY values that, when plotted on an XY graph, form clusters. Sometimes there is one cluster of points, sometimes two, sometimes three. The cluster shapes are somewhat irregular, and sometimes the clusters slightly overlap.

Anyway, my program needs to be able to recognize, for any given x,y point, which cluster that point belongs to. This is so my program can analyze each cluster's points separately.

The possible solutions I'm investigating are classical logic, fuzzy logic, and a simulated neural network. Frankly, I'm not much of a mathematician, so a classical logic solution would probably be easiest for me.

johnsold · ‎08-04-2006

If the clusters overlap, there will always be some non-zero probability that a point will be assigned to the wrong cluster.

That said, I think I would look at the distance between points. Calculating every possible distance between pairs of points could be time consuming if you have large numbers of points. A cluster would be a group of points with small distances from each other and with larger distances to other points. Once you have a preliminary clustering selection, it could be refined by locating the centroid of the points in the cluster and recalculating distances from the centroid(s) to the data points and re-assigning marginal points to another cluster if they are a better fit.
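The centroid refinement step described above can be sketched in Python (LabVIEW itself is graphical, so this is only an illustration of the logic). It is essentially the standard k-means iteration, which is a named technique swapped in here, not necessarily what the poster had in mind. The function names and the assumption that rough starting centroids are already available (for example from a preliminary grouping or from knowledge of the system) are hypothetical.

```python
import math

def centroid(points):
    """Mean x and mean y of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def refine_clusters(points, start_centroids, max_rounds=100):
    """Re-assign points to the nearest centroid until nothing changes.

    start_centroids is an assumed preliminary guess, one per cluster.
    Returns (labels, centroids); labels[i] is the cluster index of points[i].
    """
    centroids = list(start_centroids)  # do not modify the caller's list
    labels = None
    for _ in range(max_rounds):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so we are done
        labels = new_labels
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = centroid(members)
    return labels, centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = refine_clusters(points, [(0, 0), (10, 10)])
```

Note that this only compares each point against a handful of centroids, so it avoids computing every pairwise distance. It does assume the number of clusters is known in advance, and it favors roughly round clusters, so irregular shapes may be split badly.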

Many of the details may depend upon the exact nature of your data and what causes the clusters. If you have any way to predict from the known behavior of the system or a theoretical model where the clusters should be, that information could be useful in defining the clusters. Are all data points to be assigned to some cluster or are there outliers which do not belong to any cluster? Such outliers could distort the determination of clusters.

Lynn

Shades · ‎08-04-2006

There will on occasion be outliers, but not many. I do have some info on the likely patterns of the clusters (are you familiar with ionograms?). Sometimes the patterns can be unusual, though.

What you're saying makes sense. I think the key is determining the relationship between points by distance. Does anyone have any code that would help me examine arrays in this way? (V.7.1 please.)

Shades · ‎08-04-2006

I'm considering another idea as well. First, I divide my array into squares. Then I evaluate each square according to the density of plots inside it. Squares without a high enough density could be considered empty squares.

Then I evaluate non-empty squares according to how they are connected to the squares around them. The program could go through the squares in an orderly manner, and evaluate square by square. Then I could assign a 'cluster code' to each non-empty square (and each data point in the square.)

The squares might end up looking something like this:

0 3 3 0 0
0 3 0 0 0
0 0 0 0 2
0 0 2 2 2
0 0 0 0 0
0 1 1 1 0

Zeros representing empty squares.
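The grid idea above amounts to counting points per square and then labeling connected groups of non-empty squares. A minimal Python sketch follows, only to illustrate the logic (it is not LabVIEW code). The names `label_grid` and `cluster_of`, the `cell_size` and `min_count` parameters, and the choice to treat diagonal neighbors as connected are all assumptions made for this example.

```python
from collections import deque

def label_grid(points, cell_size, min_count=1):
    """Map each dense grid square to a cluster code (1, 2, 3, ...).

    Squares holding fewer than min_count points are treated as empty
    and do not appear in the result.
    """
    counts = {}
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] = counts.get(cell, 0) + 1
    dense = {cell for cell, n in counts.items() if n >= min_count}

    codes = {}
    next_code = 0
    for start in sorted(dense):       # visit squares in an orderly manner
        if start in codes:
            continue                  # already part of a labeled cluster
        next_code += 1
        codes[start] = next_code
        queue = deque([start])
        while queue:                  # flood fill over touching squares
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in codes:
                        codes[nb] = next_code
                        queue.append(nb)
    return codes

def cluster_of(point, codes, cell_size):
    """Cluster code for a point, or 0 if its square is empty (outlier)."""
    x, y = point
    return codes.get((int(x // cell_size), int(y // cell_size)), 0)

pts = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.3, 0.4),
       (5.1, 5.2), (5.3, 5.4), (9.0, 0.5)]
codes = label_grid(pts, cell_size=1, min_count=2)
```

Here the lone point at (9.0, 0.5) falls in a square below the density threshold, so it gets code 0. The result depends heavily on the square size: squares that are too large merge overlapping clusters, and squares that are too small break one cluster into several.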

Anyone have any ideas to go along with this?

Shades · ‎08-04-2006

Hmm... nobody seems to have any ideas...

TCPlomp · ‎08-05-2006

Easy, Shades,

I've got one question: it looks like the value in your array shows which region each square belongs to. Is this right? Then it is easy:

Ton

Message Edited by TonP on 08-05-2006 02:13 PM

Free Code Capture Tool! Version 2.1.3 with comments, web-upload, back-save and snippets!
Nederlandse

LabVIEW user groep www.lvug.nl
My LabVIEW Ideas

LabVIEW, programming like it should be!

Shades · ‎08-07-2006

Thank you. Your code should save me some time. Would you be willing to send me a block diagram of the 'remove duplicates' subVI? I couldn't find it in the functions palette for 7.1. You could send me the subVI itself, but you indicated you are using 8.1.

TCPlomp · ‎08-10-2006

I'm sorry, but that VI is from the OpenG library and is part of the array package (free to use!). I don't have those installed at this station, but what it basically does is eliminate every duplicate item. You could do this by sorting the items and then stepping through them, checking whether each item is the same as the previous one.

But in my opinion the OpenG tools are necessary, because they take a lot of standard tasks out of your programming!
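The sort-and-compare approach described above can be sketched in Python as follows. This only illustrates that description and is not the OpenG implementation; the function name is hypothetical, and the output comes back sorted, which may differ from the order the OpenG VI returns.

```python
def remove_duplicates(items):
    """Return the distinct items in sorted order.

    After sorting, equal items sit next to each other, so each item
    only needs to be compared with the last one that was kept.
    """
    result = []
    for item in sorted(items):
        if not result or item != result[-1]:
            result.append(item)
    return result

unique_codes = remove_duplicates([3, 1, 3, 2, 1])
```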

Ton

