Gaussian Peak Fit Algorithm Bug

Hi Peter,

 

The convergence problems are likely due to the large number of zeros in the data and the relatively small number of non-zero values. This makes it more likely to converge to a degenerate solution (a flat line due to amplitude = 0, std. dev. = 0, etc.). I have noticed a few changes that can make things behave better.

1. Don't supply an initial parameter guess. If this input is not supplied, the VI will estimate the starting point for the fit internally.

 

2. Give better initial parameters: The sensitive parameter seems to be the std. dev. value. The estimate in your VI is 12.75, which is rather wide considering the data. I suspect the fitting algorithm is taking a large step to reduce the std. dev. and may reduce it too far towards 0. It then can't recover. I implemented a simple array threshold looking for 0.5 of the max value from the left and then from the right. This gives a width at "half height" that seems to be a better estimate for the std. dev. and results in convergence to a more reasonable result (see the sketch after this list). I found that keeping the initial std. dev. value between 0.11 and 9.5 yields more reasonable behavior for the "bad" dataset.

 

3. Down-weight the data values that are zero. I created a weight array with weight[i] = {0.1 if data[i] < 1E-6, 1.0 otherwise} and wired it to the weight input of the VI (also shown in the sketch below). This may also be a better way to manage noise from the start.
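To make items 2 and 3 concrete, here is a minimal Python/NumPy sketch of both ideas. The function names, the 1E-6/0.1 constants mirrored from above, and the FWHM-to-sigma conversion factor are my own illustrative choices, not the internals of the Gaussian Peak Fit VI:

```python
import numpy as np

def guess_sigma_half_height(y):
    """Item 2: use the width at half of the maximum as a std. dev. guess.

    Thresholds at 0.5*max from the left and from the right (in samples);
    dividing by 2.355 uses FWHM = 2*sqrt(2*ln 2)*sigma to turn that width
    into a std. dev. estimate.
    """
    above = np.nonzero(y >= 0.5 * np.max(y))[0]
    fwhm = above[-1] - above[0]          # width at half height, in samples
    return max(fwhm / 2.355, 1e-3)       # keep the guess away from zero

def make_weights(y, eps=1e-6, low_weight=0.1):
    """Item 3: down-weight the (numerically) zero data values.

    Mirrors weight[i] = {0.1 if data[i] < 1E-6, 1.0 otherwise}; the
    equivalent array would be wired to the VI's weight input.
    """
    return np.where(np.abs(y) < eps, low_weight, 1.0)
```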

 

Keep in mind that Least Absolute Residual and Bi-square start from the least-squares solution. They then iterate by changing the weights based on the previous solution and solving the new reweighted LS problem. The least-squares solution for your "bad" data is also not a reasonable fit, so reweighting (LAR, Bi-square) will probably not help convergence.
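For readers unfamiliar with how those robust options work, here is a rough Python sketch of the reweight-and-refit loop described above, using Tukey bisquare weights and SciPy's curve_fit as stand-ins; the tuning constants and helper names are illustrative, not the VI's implementation. The point is visible in the structure: every reweighted pass starts from the previous solution, so a degenerate initial least-squares fit poisons all later iterations.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a, mu, sigma):
    return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def bisquare_fit(x, y, p0, n_iter=5, k=4.685):
    p, _ = curve_fit(gauss, x, y, p0=p0)              # plain least squares first
    for _ in range(n_iter):
        r = y - gauss(x, *p)                          # residuals of previous solution
        s = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust scale estimate (MAD)
        u = np.clip(np.abs(r) / (k * s), 0.0, 1.0)
        w = (1.0 - u**2) ** 2                         # Tukey bisquare weights
        sigma_i = 1.0 / np.sqrt(np.maximum(w, 1e-6))  # curve_fit takes per-point sigmas
        p, _ = curve_fit(gauss, x, y, p0=p, sigma=sigma_i)
    return p
```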

 

-Jim

Message 11 of 13

@PeterMek wrote:

 I would argue that zeros are very usable as they tell you where the value of the Gaussian should be close to, well, zero. 


The zeroes will become important once you also allow the fit to include an offset (which you don't). In that case they will dramatically improve the accuracy of the "offset" parameter. In the presence of noise, these values are NOT zero and don't really help to fit an offset-free Gaussian. You already know (do you really?) that the offset is zero, which automatically improves your fit. In your case, the number of nonzero points is more important; the extra zeroes don't really matter.

 

As Jim said, good parameter estimates can be taken directly from the data, and that's what you should use. I get a loop time of about 50 microseconds using simulated Gaussians similar to your data (using least squares; that includes simulation, guessing, etc.). Note that I allow a non-integer x-ramp. If you know that x0, dx is always 0, 1, things simplify slightly.
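As a rough, text-based analog of that simulate–guess–fit loop (not Altenbach's LabVIEW demo), here is a Python sketch on an arbitrary x-ramp; the x0, dx, peak parameters, and noise level are made up for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a, mu, sigma):
    return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

rng = np.random.default_rng(0)

# Simulate one dataset on a non-integer x-ramp (x0, dx chosen arbitrarily).
x0, dx = -2.0, 0.25
x = x0 + dx * np.arange(200)
y = gauss(x, 5.0, 18.0, 3.0) + 0.05 * rng.standard_normal(x.size)

# Guess the parameters directly from the data, then fit.
i_max = int(np.argmax(y))
a0, mu0 = y[i_max], x[i_max]
above = np.nonzero(y >= 0.5 * a0)[0]
sigma0 = max((x[above[-1]] - x[above[0]]) / 2.355, dx)   # half-height width -> sigma

popt, _ = curve_fit(gauss, x, y, p0=[a0, mu0, sigma0])
print(popt)   # should land close to (5, 18, 3)
```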

 

Attached is a quick demo derived from your code. It lets you play around with any desired scenario.

 

Personally, I would probably roll my own and use "Nonlinear Curve Fit". This will give you the full covariance matrix, from which you can estimate parameter errors, i.e. directly tell how much confidence you can place in the result (not shown).
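By way of illustration, here is a rough Python analog of that approach using SciPy's curve_fit (not the LabVIEW Nonlinear Curve Fit VI): the second return value is the covariance matrix of the fitted parameters, and the square roots of its diagonal give one-sigma parameter uncertainties. The data and starting values below are made up:

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a, mu, sigma):
    return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Simulated data for illustration only.
rng = np.random.default_rng(1)
x = np.arange(100.0)
y = gauss(x, 5.0, 40.0, 3.0) + 0.05 * rng.standard_normal(x.size)

# popt: best-fit parameters; pcov: covariance matrix of the estimates.
popt, pcov = curve_fit(gauss, x, y, p0=[y.max(), x[np.argmax(y)], 5.0])

# One-sigma parameter uncertainties come from the diagonal of pcov.
perr = np.sqrt(np.diag(pcov))
for name, val, err in zip(("amplitude", "center", "std. dev."), popt, perr):
    print(f"{name}: {val:.4g} +/- {err:.2g}")
```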

 

 

[Screenshot attachment: altenbach_0-1612307423540.png]

 

Message 12 of 13

Hi Jim,


I added a new data set (above and again here) with no zeros and more points in the peak, which shows the same problem, but I maintain the zeros are not the issue. For a pure Gaussian, an infinite portion of the domain (from -inf to +inf) is, as far as machine precision is concerned, zero. Encountering zeros is what a Gaussian fit should expect. That being said, try the new data if you like. Again, these are illustrative examples I was asked to provide, and fitting these particular cases isn't my concern.

1. In general, for the data I had, supplying no initial point seemed to work worse than supplying one, but I could go back to that.

2. This is a pretty good suggestion. I found before, when using the Gaussian Peak Fit VI years ago, that a larger guess for sigma was better. I reasoned this is because, if my width guess is small and the center is way off, the peaks may not overlap at all, and it would be difficult for the algorithm to know which direction would best lower the residual. However, as you suggest, I can get a pretty good estimate for sigma from the FWHM, which I can compute before the fit to basically hand the answer to the algorithm.

3. This may be helpful in this case, but, much like in the new data provided, I expect there will be a decent amount of noise in the true data. Also, I don't wish to downplay the importance of my Gaussian being zero where the data is zero. With low weights there, I am effectively telling the algorithm that it doesn't have to match so well in those areas, whereas I would actually like it to match there.

Hi Altenbach,

The offset is not allowed because there is no offset. How do I know this? I created the data specifically with a zero offset, in both the old set and the new set with noise (the noise is centered around a zero offset, so the lowest LS-residual will have offset ~ 0). Not allowing it should, if anything, help the algorithm along by asking it to solve for only 3 parameters instead of 4.

The demo code is very impressive, though, and would probably fix my issue fine. Although I have to admit that, since starting this thread, we have found a new way to go with the data and no longer need to fit Gaussians.

I kept the thread up because I have had issues with this VI for years, noticing that it is incredibly sensitive to initial estimates, so, in the absence of basically handing the parameter values to the VI, it wouldn't reliably fit the data. As was the case this time, we always went another way with our processing. That being said, I noticed that at least one parameter was always correct, which made me check the stopping criterion, which, sure enough, asks "if any". This is really the crux of the potential 'bug' I am reporting. I am not sure if it's meant to be that way and I'm wrong, or if it's a minor albeit consequential mistake.
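To make the "if any" concern concrete, here is a toy illustration of the difference between a stopping test that fires when any parameter has stopped changing and one that waits for all of them. This is only my reading of the report above, not the VI's actual code:

```python
import numpy as np

tol = 1e-8
delta = np.array([1e-12, 0.3, 0.05])   # change in each parameter in one iteration

# Fires as soon as ANY parameter has converged (the behavior being questioned):
stop_if_any = np.any(np.abs(delta) < tol)   # True -- quits while two parameters still move

# Fires only when ALL parameters have converged (the usual intent):
stop_if_all = np.all(np.abs(delta) < tol)   # False -- keeps iterating
```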

None of your efforts are in vain, though. I'll keep your code handy for the future, Altenbach, and I think I benefited from the discussions in this thread even if I am still unsure about this "if any" business.

Sorry for the essay,

Peter

Message 13 of 13