How can you crossfade (fade in/out) arrays?

madgreek · ‎03-24-2007

I am afraid you lost me at some point.

What are you using to pull out the optimum marker locations on your original voiced signals?
What do you mean by optimum marker locations?

How long are the signals and how much are you shifting the pitch by?

When you talk about signals, do you mean the original signal or the voiced/unvoiced segments?

About the pitch shifting: whenever you shift the pitch up or down, you also have to compensate for the resulting change on the time axis with a time scaling, so that the original signal keeps its duration. For example, if you raise the original pitch by 50% (which increases the overlap between the voiced segments), you also have to use 50% more voiced segments by repeating some of them (see attached graph), and vice versa when lowering the pitch. I also have to be able to do time scaling on its own, just by removing or repeating an integer number of voiced segments.
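The repeat/remove bookkeeping described above can be sketched in Python with NumPy (used here only for illustration, since the thread itself is about LabVIEW). The function name `segment_indices` and the floor-based mapping are assumptions, not something the thread specifies; it is one simple way to decide which segments get repeated or dropped for a given scale factor.

```python
import numpy as np

def segment_indices(n_segments, factor):
    """Pick which source segments to use when scaling by `factor`.

    factor > 1 repeats some segments, factor < 1 drops some.
    The rounding and floor mapping are illustrative assumptions.
    """
    n_out = max(1, int(round(n_segments * factor)))
    # Map each output slot back to the source segment it falls in.
    idx = np.floor(np.arange(n_out) / factor).astype(int)
    return np.clip(idx, 0, n_segments - 1)

up = segment_indices(8, 1.5)    # 12 slots, some segments used twice
down = segment_indices(8, 0.5)  # 4 slots, every other segment dropped
```

With 8 segments, a factor of 1.5 gives the indices `[0, 0, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7]` and a factor of 0.5 gives `[0, 2, 4, 6]`. For pitch scaling the selected segments would then be placed closer together or further apart so the duration stays about the same; for pure time scaling the spacing would stay unchanged.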

If you can help me fade the voiced segments in and out, I think the rest is easy to deal with (it's just removing or repeating windowed voiced segments for both time and pitch scaling, then overlap-adding them together). I don't know if I made it easier or harder for you with this.

Milqman · ‎03-24-2007

What are you using to pull out the optimum marker locations on your original voiced signals?
What do you mean by optimum marker locations?
I read this little tidbit on pitch shifting:

http://cnx.org/content/m11711/latest/

It seems to indicate that, unless you have a very tight bandwidth on your signal, your marker locations are determined by a tradeoff between two possibly opposing guides:

you want to place them at the peaks of the fundamental frequency
you also want to place them at exact intervals

If the portion of the signal that you are working with has noise or a slow drift in pitch (from beginning to end, let's say), then you can't fulfill both of those marker placement conditions at the same time. The link shows their (somewhat in-depth and possibly beyond your scope) way of doing it.

When I say optimum marker locations, I am just talking about the locations that would be derived by considering both of these conditions. If you have a tight bandwidth on your expected signals then fulfilling both conditions does not take any extra work.
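To make the tradeoff concrete, here is a rough Python/NumPy sketch of a greedy placement: step forward by the nominal period, then snap to the largest sample within a small tolerance. This is not the method from the linked page; `place_markers`, `period` and `tolerance` are hypothetical names chosen for illustration.

```python
import numpy as np

def place_markers(signal, period, tolerance):
    """Greedy pitch-marker placement (illustrative only).

    Each marker is expected one `period` after the previous one and is
    then moved to the largest sample within +/- `tolerance` samples.
    """
    signal = np.asarray(signal, dtype=float)
    # First marker: the largest sample within the first period.
    markers = [int(np.argmax(signal[:period]))]
    while True:
        expected = markers[-1] + period
        lo = max(expected - tolerance, markers[-1] + 1)
        hi = min(expected + tolerance + 1, len(signal))
        if expected >= len(signal) or lo >= hi:
            break
        markers.append(lo + int(np.argmax(signal[lo:hi])))
    return markers
```

A small `tolerance` favors exact intervals, a large one favors sitting on the peaks. With a clean, narrow-band signal the two agree and the tolerance hardly matters.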

How long are the signals and how much are you shifting the pitch by?

When you talk about signals, do you mean the original signal or the voiced/unvoiced segments?

I was just trying to get a sense of how sophisticated the algorithm would have to be. If you are shifting the signal by a lot, and it is a long signal, then you end up with a lot of duplicated segments. The alignment/registration between the desired portion markers and the original portion markers might turn into a headache if you made the algorithm dumb but then asked it to expand or contract the signal substantially. That's all.

So you just want to add the segments together again? End to end? Is that all you need to know? If that is your question, then use a "Build Array" block on the Array Palette. Right-click it and check "concatenate inputs". I thought you were asking a much much more complicated question. I guess that happens.

~milq

madgreek · ‎03-24-2007

I see what you meant earlier.

I followed that link, and there are some small differences between that and what I want. For us it is enough to keep a standard size of the pitch period in any segment.

Up to this point in my code, I am able to break the original signal into chunks of unvoiced segments (512 samples long) and into voiced segments that are broken down further to the period length. Up to here, everything works correctly.

As for the length of the signal: I am going to use the sound signal I already sent you. It's about 4.5 seconds, I think, sampled at 11025 samples per second, so it's not that big. I probably won't need to scale it by more than 50%-200% up or down, so I believe it won't be that scary.

What I want is to take the voiced periods (which I already get from my code) and multiply each one by a Hanning window so that they overlap by 50%. If I do that, then by repeating or deleting these overlapped periods and adding them together at the end, you are able to scale the signal.
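A minimal Python/NumPy sketch of this windowing and overlap-add step follows. It assumes the segments are equal-length rows of a 2D array and that each segment already spans twice the hop, so neighbors overlap by 50%; the name `overlap_add` and the use of a periodic Hann formula are illustrative choices, not taken from the thread's code.

```python
import numpy as np

def overlap_add(segments, hop):
    """Hann-window each equal-length segment and overlap-add them.

    `segments` is a 2D array (one segment per row); consecutive
    segments are placed `hop` samples apart in the output.
    """
    segments = np.asarray(segments, dtype=float)
    n_seg, seg_len = segments.shape
    # Periodic Hann window, written out explicitly.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(seg_len) / seg_len)
    out = np.zeros(hop * (n_seg - 1) + seg_len)
    for i, seg in enumerate(segments):
        start = i * hop
        out[start:start + seg_len] += window * seg
    return out

# 6 segments of 64 samples at 50% overlap (hop = 32)
out = overlap_add(np.ones((6, 64)), hop=32)
```

Repeating or deleting rows of `segments` before calling the function changes the output length, and changing `hop` changes the spacing between segments.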

Kevin_Price · ‎03-26-2007

This is a very interesting-sounding app, but I haven't had time to really try to wrap my head around details. Too bad for me. Just wanted to chime in with a tiny little tidbit that might or might not be relevant. Tidbit first, story second.

Tidbit: when you perform a cross fade on signals that represent a sound waveform, and if there's value in maintaining constant human-perceived volume during the fade interval, you probably aren't going to want to use most of the Window functions or a simple linear fade function.

Story:

Fact 1. Certain members of my household who shall remain nameless had developed the habit of running a fan all night primarily for the sort-of white-ish noise.

Fact 2. I'm a notorious cheapskate and part-time curmudgeon. I hate the idea of paying for the electricity to run a biggish fan motor all night if I could generate the same noise from a little boombox. Plus, we wear out our fans faster than we need to so there's replacement costs too.

So I says to myself, "Self, you could record a fan sound to computer, then expand the time out to fill a CD that can be set for repeat play all night." Well, I quickly learned that things I recorded personally sounded nothing at all like what my ears heard. So I scoured the net a bit for fan noise samples, and the only thing I found for free was about a 10 second sample of good-sounding fan noise.

Next step - Audacity, a free open source sound editor. I wanted to append 2 of these 10 second samples back-to-back, blending them with a cross-fade. Then I'd take the ~19 second result and repeat, then repeat again, etc.

However, it turned out to be fairly difficult to cross-fade the sound in such a way that it was difficult to detect an anomaly in overall volume. Simple linear fades produced a net perception of attenuation. I spent too much time fiddling and too little time thinking and researching, but eventually arrived at a satisfactory result. It was definitely a large pain in the neck though. With thinking and researching, there's probably a mathematical function out there that can produce the correctly weighted average of two voltage waves such that an ear which perceives volume logarithmically would detect an overall constant volume. But I wouldn't trust myself to guess what that function is exactly.

Anyway, just a cautionary note in case you may also need to maintain constant perceived volume during a cross-fade. Maybe someone out there knows the right function to apply?
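One commonly used answer to this question is an equal-power crossfade, where the two gains follow a quarter cycle of cosine and sine so that their squares always sum to 1. When the two signals are uncorrelated (as two stretches of noise usually are), their powers add, so a linear fade with gains of 0.5 and 0.5 at the midpoint dips by about 3 dB, which would fit the attenuation described above. The Python/NumPy sketch below is only an illustration; `crossfade` and its parameters are hypothetical names, and `overlap` is assumed to be at least 1 and no longer than either input.

```python
import numpy as np

def crossfade(a, b, overlap, equal_power=True):
    """Join `a` and `b`, blending the last/first `overlap` samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    t = np.linspace(0.0, 1.0, overlap)
    if equal_power:
        # Gains satisfy fade_out**2 + fade_in**2 == 1 (constant power).
        fade_out = np.cos(t * np.pi / 2)
        fade_in = np.sin(t * np.pi / 2)
    else:
        # Gains satisfy fade_out + fade_in == 1 (constant amplitude).
        fade_out = 1.0 - t
        fade_in = t
    mixed = a[-overlap:] * fade_out + b[:overlap] * fade_in
    return np.concatenate([a[:-overlap], mixed, b[overlap:]])
```

The choice depends on the material: if the two signals are identical or strongly correlated in the overlap, the linear version keeps the amplitude constant and the equal-power version produces a bump of up to about 3 dB instead.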

-Kevin P.

ALERT! LabVIEW's subscription-only policy came to an end (finally!). Unfortunately, pricing favors the captured and committed over new adopters -- so tread carefully.

madgreek · ‎03-26-2007

Kevin

You sure made my day.

The crossfading I am talking about is not my goal in itself, just one step in a bigger piece of code I am building. Perhaps crossfade was the wrong term to use, and concatenation would have been more correct. The problem you had is that you couldn't maintain a constant volume level in your signal after crossfading it, is that it? I don't know exactly why this happened. My very first thought is that, since crossfading fades one signal out and the other in (in your case the same one twice), I would expect to hear a diminishing volume level in the area where the signals meet. On the other hand, now that I think about it, your signal had a constant volume from beginning to end. I can't say anything more on this right now. Thank you for your tip, though.

madgreek

Milqman · ‎03-26-2007

Hey guys,

Back in on Monday 😞

Couple items here:

Volume/intensity follows the log of the input pressure/power

Kevin, if you changed the crossfade from 1 second to 5 seconds, the final result would be in a constant state of crossfade. Also (and possibly in addition), I think Audacity has an EQ that you can vary within the file. If so, you could pump up the low spots with it. If the issue is only present at the beginning/end of the file, you could also truncate the file to chop it off. Both of these ideas make your base file smaller (higher % crossfade and truncation), but if you are just going to be stacking copies next to each other, even shrinking them by a factor of 2 would only result in 1 extra operation at the end.

Madgreek, since we are windowing and adding on the timescale of a single wavelength (and because the portions of the period that are typically added are almost zero before AND after the windowing) I am not sure that this extra windowing complication is entirely value-added.

Give me a bit of fiddling time to think about it, and I will try to come up with a sneaky way to implement the behavior you have in the picture you attached up above (very useful for visualizing the need and the application, btw). I am sure Kevin could do it in his sleep (and will probably have something way before me).

~milq

madgreek · ‎03-26-2007

Milqman

Take your time with it. I am also working on it, and I will let you know when I come up with something.

Kevin

If, instead of crossfading your fan signal, you repeat that 10 second sound as many times as you want and multiply each copy by the same windowing function (e.g. Hanning) that is, let's say, 20 seconds long, so that it includes the two surrounding pieces (left and right) and basically overlaps them by 50% to avoid any "spectral leakage" at their borders, don't you think that might have worked in your case? If you manage to center the very first 10 second signal in a 20 second window by zero-padding it for the first 5 seconds, then I believe you can expand your signal to whatever duration you want.

I don't know if I am saying something stupid here; I am just trying to act like someone smart.

It's quite similar, up to a point, to what I want to do.

Madgreek

madgreek · ‎03-26-2007

Milqman

Please don't take this the wrong way, as if I am trying to take advantage of your kindness in helping me, but if you want a very general idea of pitch scaling, summarized in a paragraph, and you have some "extra" free time on your hands, you could read paragraph 3.2.1 on page 6 of the following paper.

http://www.tc-helicon.tc/Files/helicon_files/Pitch_shifting.pdf

It's the only one of the papers in my possession that summarizes it in a paragraph.

madgreek

Kevin_Price · ‎03-26-2007

I tried to quickly review this thread again. This is my first exposure to this subject matter and I'm having trouble keeping track of the terminology. I can, however, talk in very generic terms about arrays of numeric data that need to be manipulated. I'll be able to follow along better if your answers use generic terms too.

1. So let's say you've got data that used to be an array of 512 samples, but which has been split down into 8 segments of 64 samples each.

2. These 8 segments overlap their neighbors by 50%, right? For example, the 2nd half of the samples of segment 3 are exactly the same as the 1st half of the samples of segment 4?

3. When I look at your diagrams, it appears that pitch shifting would cause you to either delete or duplicate segments, and to overlap either less or more than 50%. It isn't completely clear to me how one determines the correct # of segments to use, or which one(s) to delete or duplicate. It *is* clear enough how that determination would then control the % overlap necessary, I think. Am I right in perceiving that the output of pitch shifting would still be exactly 512 total samples?

4. How do you determine the # of segments to combine in the pitch shifting case? The diagram shows special cases where the fractions work out to integer #'s of segments. But what if you needed to pitch shift by a factor of 5/7 while starting with 8 segments? Would you just round to the nearest integer?

5. The time shifting appears to always want to maintain a 50% overlap. Is this right?

6. It further appears to me that a time shift would produce output with either <512 samples (compression) or >512 samples (expansion). Then what? Don't you need to resample it back to exactly 512 samples?

7. When calculating the merged and overlapped segments, all the overlap regions are combined in some type of cross-fade. You've referred to a Hanning window and I saw it mentioned in one of the links as well. Frankly, it strikes me as an odd functional form to use for a time-domain cross-fade. I wouldn't expect it to preserve "volume" constancy through the fade region. I'd have first thought that a triangular window would be a more natural choice. But then again, my story about fan noise and Audacity was based on my own experience where an (apparently) linear cross-fade envelope shape also did not preserve volume constancy. There was a distinct volume attenuation there. I'm a novice with Audacity though, so it may have been operator error.
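Two of the numbered points can be checked numerically with a short Python/NumPy sketch. The helper names `output_length` and `overlap_sum` are made up for illustration. The first gives the length of an overlap-added result (points 2 and 6); the second sums shifted copies of a window to see whether the combined gain stays constant through the overlap regions (point 7).

```python
import numpy as np

def output_length(n_segments, seg_len, hop):
    """Samples produced by overlap-adding segments spaced `hop` apart."""
    return hop * (n_segments - 1) + seg_len

def overlap_sum(window, hop, n_copies=8):
    """Sum shifted copies of `window`; return the steady-state part."""
    seg_len = len(window)
    total = np.zeros(hop * (n_copies - 1) + seg_len)
    for i in range(n_copies):
        total[i * hop:i * hop + seg_len] += window
    return total[seg_len:-seg_len]  # drop the ramp-up and ramp-down edges

n = np.arange(64)
hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / 64)   # periodic Hann
triangle = 1.0 - np.abs(2.0 * n / 64 - 1.0)     # periodic triangular

print(output_length(8, 64, 32))    # 288
print(output_length(15, 64, 32))   # 512
print(np.allclose(overlap_sum(hann, 32), 1.0))      # True
print(np.allclose(overlap_sum(triangle, 32), 1.0))  # True
```

So 8 segments of 64 samples at 50% overlap cover only 288 samples, and it takes 15 such segments to span 512. Both the Hann and the triangular window sum to a constant amplitude gain at 50% overlap, so on that count neither is more natural than the other; whether constant amplitude gain also means constant perceived volume depends on how correlated the overlapping signals are.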

-Kevin P.


Milqman · ‎03-26-2007

Kevin, Madgreek,

I think this is getting pretty close to where we have to go.

There are almost definitely some rounding problems in the last (inner) loop, and there is a special-case issue involving the first segment to compose (it starts midway through the segment). On my little examples this gives approximately the expected behavior. I would tweak it more or iron it out, but I have other stuff to do 🙂

Data fed into this app is a 2D array where each row is a segment/period (peak in the middle).

Given this, Kevin can probably iron it out to make it work correctly. I would have put up word balloons to "comment" the code, but I realized that I do not know how to do that.

Most of it is pretty straightforward, good luck you guys, post what you end up deciding on!

Best,
~milq
