I have several 20,000+ point scatter plots that I'd like to graph with a best fit. What's the best way to do this?

dave56 · ‎07-20-2004

Since they're so large and occupy the same range of y values, I'm reluctant to plot them one over another, so I think a best-fit line would be the most helpful for analysing the data. However, I'm not quite sure how to go about this, since the data can be random.

altenbach · ‎07-20-2004

Your question is not entirely clear. Do you just want to show the best fit or do you want to show both?

The fit is easy. Do you have any kind of model? LabVIEW has many fit models built in (linear, polynomial, etc.), or you can use Levenberg-Marquardt with more complex mathematical models. Afterwards, just evaluate your fit function at each x-value.

To actually show the scatter plot, I would recommend generating a 2D histogram (e.g. 200x200) and displaying it on an intensity graph of the same dimensions. This solves the problem of overlapping data: if a pixel contains several data points, it will be proportionally brighter.
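The 2D histogram idea can be sketched outside LabVIEW as well. The following is a minimal standard-library Python sketch, not the poster's VI: the function name `histogram_2d`, the synthetic data, and the choice to clamp the maximum value into the last bin are all assumptions made for illustration.

```python
import random

def histogram_2d(xs, ys, bins=200):
    """Count how many (x, y) points fall into each cell of a bins x bins grid."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Guard against a zero-width range so the divisions below are safe.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        # Map each coordinate to a bin index; the maximum value lands in the last bin.
        col = min(int((x - x_min) / x_span * bins), bins - 1)
        row = min(int((y - y_min) / y_span * bins), bins - 1)
        counts[row][col] += 1
    return counts

# Synthetic stand-in for one 20,000-point scatter plot (assumed data).
random.seed(0)
xs = [random.uniform(0.0, 10.0) for _ in range(20000)]
ys = [0.5 * x + random.gauss(0.0, 1.0) for x in xs]
image = histogram_2d(xs, ys, bins=200)
```

Each entry of `image` is the number of points in that pixel, so mapping the counts to brightness gives the "proportionally brighter" effect described above.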

LabVIEW Champion.

dave56 · ‎07-21-2004

I would like to only show the best fit.

I don't have any particular model in mind, although I think that a polynomial would be best. I doubt any kind of complex mathematical models would be necessary, but you never know.

I'm not quite sure how to find the best fit line using something such as General Polynomial Fit. I feed in the times and their respective values, but how do I take the outputs and display them?

altenbach · ‎07-21-2004

You also need to wire the desired polynomial order. The output will be the polynomial coefficients. (Ignore the "best fit" output; it will contain way too many data points in your case.)

Now simply create an array with a suitable selection of x-values and feed it, together with the best-fit coefficients, to the polynomial evaluation function.

See the attached example (LabVIEW 7.0). It generates 10000 noisy xy points, then fits them to a polynomial, which is then displayed using only 32 points.
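For readers without the attachment, the same workflow (fit all points, then evaluate the polynomial at only 32 x-values) can be sketched in plain Python. This is not the attached VI: `poly_fit` and `poly_eval` are hypothetical helpers, the fit solves the normal equations directly (LabVIEW's General Polynomial Fit may use a different algorithm), and the quadratic test data is made up.

```python
import random

def poly_fit(xs, ys, order):
    """Least-squares polynomial coefficients, lowest power first."""
    n = order + 1
    # Power sums build the normal equations (A^T A) c = A^T y.
    sx = [sum(x ** k for x in xs) for k in range(2 * order + 1)]
    sy = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(n)]
    m = [[sx[i + j] for j in range(n)] + [sy[i]] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - s) / m[i][i]
    return coeffs

def poly_eval(coeffs, x):
    """Evaluate the polynomial at x with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# 10000 noisy points around an assumed quadratic, y = 1 + 2x - 3x^2.
random.seed(1)
xs = [random.uniform(-1.0, 1.0) for _ in range(10000)]
ys = [1.0 + 2.0 * x - 3.0 * x * x + random.gauss(0.0, 0.2) for x in xs]
coeffs = poly_fit(xs, ys, order=2)

# 32 evenly spaced x-values are enough to draw the smooth curve.
x_lo, x_hi = min(xs), max(xs)
plot_x = [x_lo + (x_hi - x_lo) * i / 31 for i in range(32)]
plot_y = [poly_eval(coeffs, x) for x in plot_x]
```

The normal-equations approach is only reasonable for low orders and well-scaled x-values; it becomes numerically unreliable as the order grows.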


dave56 · ‎07-22-2004

I can't seem to get my program to match yours. The best I can manage is a slightly sloped line which, unfortunately, doesn't match what a sample of the data looks like. I've included what I have so far and a sample input.

altenbach · ‎07-22-2004

There is no way to fit your data with any kind of polynomial; it is too discontinuous. Couldn't you just do a smoothing (low-pass filtering)?

Example: In the attached image, I have convolved the raw Y data with a Gaussian of 0.5% width, then decimated the data to 209 points total. Looks pretty good to me ;-). You could also try just a simple running average.
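A rough Python equivalent of "smooth with a Gaussian, then decimate" might look like the sketch below. Several things here are assumptions rather than details from the post: "0.5% width" is read as a standard deviation of 0.5% of the number of samples, the kernel is truncated at three standard deviations and renormalized at the array edges, and the step-shaped test data is synthetic.

```python
import math
import random

def gaussian_smooth(ys, sigma):
    """Convolve ys with a normalized Gaussian kernel (sigma given in samples)."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(ys)
    smoothed = []
    for i in range(n):
        acc = 0.0
        weight = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < n:  # near the edges, renormalize instead of padding
                acc += w * ys[j]
                weight += w
        smoothed.append(acc / weight)
    return smoothed

def decimate(ys, n_out):
    """Keep n_out evenly spaced samples, including the first and the last."""
    step = (len(ys) - 1) / (n_out - 1)
    return [ys[round(i * step)] for i in range(n_out)]

# Synthetic discontinuous data: a noisy step from 0 to 5 (assumed shape).
random.seed(2)
raw = [(0.0 if i < 1000 else 5.0) + random.gauss(0.0, 0.5) for i in range(2000)]
sigma = 0.005 * len(raw)  # assumed reading of "0.5% width"
reduced = decimate(gaussian_smooth(raw, sigma), 209)
```

Unlike a polynomial, the smoothed curve follows the jump in the data because each output point depends only on nearby samples.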


dave56 · ‎07-22-2004

Wow, that looks pretty good. I've a ways to go in learning LabVIEW :(. Would that method also hold up for more linear data? The sample I gave was on the more extreme side, but discontinuous data definitely is a case that needs to be accounted for.

Would you mind stepping me through the process or posting the VI? Could you also explain the differences between smoothing and averaging the data as far as visualizing it goes? I apologize for asking such sophomoric questions, but this is only my second week of LabVIEW.

altenbach · ‎07-22-2004

Well, here's a very primitive way to decimate the data without any fancy algorithm. For this particular case, it looks pretty good. Each loop makes the array 3x smaller, averaging 3 adjacent elements. 4 iterations seem to be perfect. (I have included your raw data in diagram constants).
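The repeated 3x averaging described here is simple enough to sketch in a few lines of Python. This is an illustration rather than the attached VI: the helper names are hypothetical, and dropping up to two leftover samples at the end of each pass is an assumption about how the remainder is handled.

```python
def average_by_three(ys):
    """Replace each group of 3 adjacent samples with their mean."""
    usable = len(ys) - len(ys) % 3  # drop up to 2 leftover samples at the end
    return [sum(ys[i:i + 3]) / 3 for i in range(0, usable, 3)]

def decimate_by_averaging(ys, iterations=4):
    """Apply the 3x averaging pass repeatedly; 4 passes shrink the data 81x."""
    for _ in range(iterations):
        ys = average_by_three(ys)
    return ys

# Placeholder data standing in for a 20,000-point trace.
data = [float(i % 7) for i in range(20000)]
reduced = decimate_by_averaging(data, iterations=4)
```

With 20,000 input points, four passes leave 246 points (20000, 6666, 2222, 740, 246), which is in the same range as the 209 points used for the Gaussian-smoothed plot.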

For a polynomial fit, the more complex the data, the higher the polynomial order needed to describe it. If it is a simple "banana" (my example above), you're OK with an order of 2 (up to quadratic); if it is S-shaped, you need at least 3rd order, etc. I tried your data with much higher orders (up to 40!) and it did not work.


altenbach · ‎07-22-2004

Dave,
Since you just started with LabVIEW, I thought I'd point out some improvements to your file reading and parsing methods. Basically, you do way too much work! You can do the entire thing in one simple loop (see attached VI in LabVIEW 7.1). On my machine, reading the data is now about 30x faster (much less than a second). It is very inefficient to read a file line by line if you need the entire file anyway. Enjoy!
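The "read everything once, then parse in a single loop" pattern looks roughly like this in Python. The file layout is an assumption (two whitespace-separated numeric columns, time then value, one pair per line), since the sample input is not reproduced in the thread; `parse_xy` and `read_xy` are hypothetical names.

```python
from pathlib import Path

def parse_xy(text):
    """Parse two whitespace-separated numeric columns in one pass over the text."""
    xs, ys = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        xs.append(float(fields[0]))
        ys.append(float(fields[1]))
    return xs, ys

def read_xy(path):
    """Read the whole file with a single call, then parse it in memory."""
    return parse_xy(Path(path).read_text())
```

The point is the same as in the VI: one read for the whole file, followed by in-memory parsing, instead of a separate file access per line.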

