LabVIEW

How do I compare two audio bursts and tell whether they are the same (LV 7.1 or 8.2)?

Hi,

I need to compare an audio burst, extracted from a set of similar bursts, with an existing burst in PC memory.

I have tried the correlation VI from the LV 8.2 library, but it does not seem to work.

If anybody has done something similar, please post your answers.

thanks,
sunil

Message 1 of 24

hi there

i have some questions:

- What has to be compared: energy, spectrum (relative or absolute), or amplitude (relative or absolute)?
- Are the measurement equipment and the measurement setup identical for all bursts? (This affects the measured data.)
- Are there any defined criteria for "bursts differ" and "bursts equal"?

Best regards
chris

CL(A)Dly bending G-Force with LabVIEW

famous last words: "oh my god, it is full of stars!"
Message 2 of 24
Hi Chrisger,
For the present, it does not matter which parameter is compared.
However, it would be logical to compare the frequency spectrum of the burst, which is essentially a spoken word. That means the frequency bandwidth would be under about 3 to 4 kHz.
The actual application is to compare a live capture against a lookup table of words, decide whether it matches one of them, and then output a string. For example:

I have an audio file (in PC memory) containing stripped audio data for: G, E, There, is, a, Fault. This would serve as our lookup table in array form and would look like:
G (stripped data)
E
There
is
a
Fault
Now we do a fresh acquisition of audio that contains one of these words (this is a controlled environment, so it has to be one of these), but we do not know which one. We compare this burst of audio with each entry in the array of words and use a case structure to obtain the appropriate string in text form.
E.g.:
The acquired audio is, say, 'Fault' (audio data). When we run the compare through the array of audio strips in PC memory, we would get a match only on the 'Fault' entry, and we then output 'Fault' (in text format). This is the work that needs to be done.
As for the criterion for equal or not equal, it is left to us: whatever is statistically soundest.
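
The lookup-table scheme described above could be sketched roughly as follows. This is written in Python only because a LabVIEW block diagram cannot be shown as text. The function names (`normalize`, `similarity`, `recognize`, `tone`), the 0.8 acceptance threshold, and the synthetic tone "words" are all illustrative assumptions, and the bursts are assumed to be already trimmed to equal length and aligned in time.

```python
import math

def normalize(samples):
    """Remove the mean and scale to unit energy so overall loudness drops out."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    energy = math.sqrt(sum(c * c for c in centered))
    return [c / energy for c in centered] if energy > 0.0 else centered

def similarity(a, b):
    """Zero-lag correlation coefficient (-1 to 1) of two equal-length bursts."""
    if len(a) != len(b):
        raise ValueError("bursts must have the same length")
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def recognize(burst, lookup, threshold=0.8):
    """Return the text of the best-matching entry, or None if no score reaches the threshold."""
    best_word = max(lookup, key=lambda word: similarity(burst, lookup[word]))
    if similarity(burst, lookup[best_word]) >= threshold:
        return best_word
    return None

def tone(freq_hz, n=400, rate_hz=8000):
    """Synthetic stand-in for a recorded word (hypothetical test data)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

# Lookup table: text string -> stored audio data (assumed layout).
lookup = {"G": tone(200), "E": tone(300), "Fault": tone(450)}

live = [0.5 * s for s in tone(450)]    # same "word", recorded more quietly
print(recognize(live, lookup))          # best match among the table entries
print(recognize(tone(1000), lookup))    # a signal that is not in the table
```

The threshold is what turns "closest entry" into an equal / not-equal decision; a suitable value would have to be found from real recordings.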
I hope you got the logic; my explanation may not be clear in its details!

thanks,
sunil



Message 3 of 24

Hi Sunil,
It sounds like you are trying to write a voice recognition program.
In general, if you have the Advanced Signal Processing Toolkit or the Sound and Vibration Toolkit, you get VIs that provide the signal processing needed for frequency spectrum analysis. What tools do you have available?

Yi Y.
Applications Engineer
National Instruments
http://www.ni.com/support

 
Message 4 of 24
hi Yi,

Exactly, but I was trying to do it as simply as possible, especially since the voice files are recorded in a controlled environment, that is, under test conditions. This means noise and other unpredictable variations in the voice are not a concern. Each time it is the same voice, and the quality matches that of the reference in PC memory.

Since the voice bursts are identical, is there any way to do this using filters, correlations, or any such VIs available with LabVIEW 7.1 or 8.2?
A simplified approach would do; it need not be very robust, since the test is controlled.
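
For reference, one simple correlation-based check is the peak of the normalized cross-correlation, which tolerates the burst starting a little earlier or later than the reference. The sketch below is in Python rather than LabVIEW and is not what any particular LabVIEW correlation VI computes; `ncc_peak` and `decaying_tone` are hypothetical names, and the decaying tones are made-up test data.

```python
import math

def ncc_peak(reference, burst):
    """Slide `burst` across `reference`; return (peak normalized correlation, lag at peak).

    A lag of L means reference[i] lines up with burst[i - L].
    """
    def centered(x):
        mean = sum(x) / len(x)
        return [v - mean for v in x]

    ref, sig = centered(reference), centered(burst)
    norm = math.sqrt(sum(v * v for v in ref) * sum(v * v for v in sig))
    if norm == 0.0:
        return 0.0, 0
    best_score, best_lag = -1.0, 0
    for lag in range(-(len(sig) - 1), len(ref)):
        start = max(0, lag)
        stop = min(len(ref), lag + len(sig))
        score = sum(ref[i] * sig[i - lag] for i in range(start, stop)) / norm
        if score > best_score:
            best_score, best_lag = score, lag
    return best_score, best_lag

def decaying_tone(period, n=200):
    """Synthetic stand-in for a spoken word (hypothetical test data)."""
    return [math.exp(-i / 60.0) * math.sin(2 * math.pi * i / period) for i in range(n)]

word = decaying_tone(20)
reference = [0.0] * 50 + word + [0.0] * 50
shifted = [0.0] * 80 + word + [0.0] * 20      # same word, starting 30 samples later
other = [0.0] * 50 + decaying_tone(7) + [0.0] * 50

same_score, same_lag = ncc_peak(reference, shifted)
other_score, _ = ncc_peak(reference, other)
print(same_score, same_lag, other_score)
```

A peak near 1 suggests the same burst; a raw (unnormalized) correlation is much harder to threshold because its size depends on recording level.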

Presently I do not have any of those toolkits. We could look into getting them, though not immediately.


rgds,
sunil


Message 5 of 24
hi Yi,
I have followed the instructions in the link you provided. One file, 'Read Text Info (one file).vi', is missing.
Also, I have not found any readme.txt after unzipping.
How do I install and configure the MSAgent?

thks and regards,
sunil

Message 6 of 24
Hi
I have got it going with MS Agent installed. The VI had a missing text-read file VI, and I found this on ni.com.
When I went into the MS Agent code, the logic looked very similar to what my application needs. Though the VI runs, it does not change based on the speech, since none exists. It looks like the speech directory path is for LV 6i, and I am using LV 8.2.
Do you have a solution for this?
thanks,

sunil

Message 7 of 24
hi,
What is the path ...\..\LV 6i in the Create MSAgent Voice.vi subVI in the example MSAgent Signal Generator and Processor.vi?

If I am using LV 8.2, what should this be?


thks
sunil
 
Message 8 of 24

Hi Sunil,

All the VIs in that example are included in the libraries. They are under ...\vi.lib\addons\MSAgent Voice\MSAgent.llb in the folder that you extracted the exe to. It appears that the article is quite old, though, and the link to Dragon NaturallySpeaking is broken, so you need to have MSAgent installed. The readme file is located at ...\vi.lib\addons\MSAgent Voice

If you simply want to do frequency spectrum analysis, you'll probably need at least the Advanced Signal Processing Toolkit, or you'll have to program your own. The Sound and Vibration Toolkit would probably suit your needs better; more information is available at https://www.ni.com/en-us/shop/product/labview-sound-and-vibration-toolkit.html.

I've never tried to write a speech recognition program, but I would try two things:

1. Ignore the frequencies and just look at the amplitude vs. time graphs. This might not work well at all, and you have to take into account words that are said at different speeds. Say you get a "burst" (the audio for a word) that lasts 0.5 seconds: resize all the audio samples to last 0.5 seconds and run a compare between the burst and each of the samples. Take as the answer the sample with the best compare value, where the compare value is the integral of the absolute value of the difference between the two (smoothed) curves, so the best value is the one closest to zero. I think one of the correlation VIs may do something like that, so maybe you have tried it already.
2. The other thing I'd try is to sample the audio at a fairly high rate and take about 100 FFTs per second, each on a 10 ms chunk of a "burst". Say the burst is 0.5 seconds long: scale all samples to 0.5 seconds, then compare the burst's segment FFTs (all 50 of them) with each sample's set of segment FFTs. Do the comparison the same way, first for each segment, then summed over all segments, and take the result whose compare value (difference from the sample) is closest to zero.
I'm not sure if this will work. The key may be smoothing the data a bit before it's filtered; otherwise it can be spiky and crappy and impossible to deal with. You might want to smooth the data after taking the FFT, NOT before; otherwise the FFT will lose the higher-frequency results (depending on how much smoothing is applied and on the audio sample rate).
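
The two approaches above might look roughly like this in text form. It is a Python sketch under several assumptions, not a LabVIEW implementation: all function names are hypothetical, "resizing" is done by linear interpolation, the smoothing is a plain moving average, the post-FFT smoothing step is left out, and a direct DFT is used so that the block needs only the standard library. The defaults are kept small for the demo; for a 0.5 s burst at an assumed 8 kHz sample rate, `length=4000` and `chunk=80` would give the 50 segments of 10 ms mentioned above.

```python
import cmath
import math

def resample(samples, length):
    """Linearly interpolate `samples` to exactly `length` points."""
    if length == 1 or len(samples) == 1:
        return [samples[0]] * length
    step = (len(samples) - 1) / (length - 1)
    out = []
    for k in range(length):
        pos = k * step
        i = min(int(pos), len(samples) - 2)
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out

def smooth(samples, half_width):
    """Moving average over up to 2 * half_width + 1 neighbouring points."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - half_width): i + half_width + 1]
        out.append(sum(window) / len(window))
    return out

def amplitude_distance(a, b, length=400, half_width=5):
    """Approach 1: summed |difference| of the resized, smoothed curves."""
    sa = smooth(resample(a, length), half_width)
    sb = smooth(resample(b, length), half_width)
    return sum(abs(x - y) for x, y in zip(sa, sb))

def dft_magnitudes(chunk):
    """Magnitude spectrum of one chunk via a direct DFT (slow but dependency-free)."""
    n = len(chunk)
    return [abs(sum(chunk[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectral_distance(a, b, length=400, chunk=40):
    """Approach 2: summed |difference| of per-chunk magnitude spectra."""
    ra, rb = resample(a, length), resample(b, length)
    total = 0.0
    for start in range(0, length - chunk + 1, chunk):
        ma = dft_magnitudes(ra[start:start + chunk])
        mb = dft_magnitudes(rb[start:start + chunk])
        total += sum(abs(x - y) for x, y in zip(ma, mb))
    return total

def closest(burst, references, distance):
    """Name of the reference whose distance to `burst` is nearest zero."""
    return min(references, key=lambda word: distance(burst, references[word]))

def tone(cycles, n):
    """Synthetic stand-in for a recorded word (hypothetical test data)."""
    return [math.sin(2 * math.pi * cycles * i / (n - 1)) for i in range(n)]

references = {"G": tone(10, 400), "E": tone(15, 400), "Fault": tone(22, 400)}
burst = tone(22, 440)   # the same "word", spoken about 10% more slowly
print(closest(burst, references, amplitude_distance))
print(closest(burst, references, spectral_distance))
```

One caveat on the design: resizing a burst by interpolation also shifts its pitch, so the spectral comparison only lines up when the slower recording really is a stretched copy of the reference, as it is in this synthetic example.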

Hope this helps.

Yi Y.
Applications Engineer
National Instruments
http://www.ni.com/support

 

Message Edited by Yi Y on 08-05-2007 11:15 PM

Message Edited by Yi Y on 08-05-2007 11:19 PM

Message 9 of 24
Hi Yi,

Your lead has taken me to a possible solution without really going into the SA or Audio Analysis. I went through some tutorials on MSAgent and tools for speech, now I think what I need can be done by modifying the existing example. Sure it is a bit of work but I see light at the end of the tunnel!!
Thanks for your inputs, will keep in touch.

rgds,
sunil

Message 10 of 24