SmoothSwing V2 Algorithm description
Jan 17, 2018 17:36:48 GMT -5
Post by thexter on Jan 17, 2018 17:36:48 GMT -5
I was going to post this at the bottom of the SmoothSwing V2 demonstration thread, but I think it's different enough in terms of audience that it should be its own thread:
Original thread and demonstration:
SmoothSwing V2 demonstration
Overview:
The SmoothSwing V2 algorithm works by interpolating between different pre-authored looping hum sounds based on readings from the gyroscope.
Requirements:
1. The main requirement is the ability to read and blend 3-4 sounds at once from the sd card.
a. Sound 1 is the main hum.
b. Sounds 2 and 3 are a pair of hum sounds randomly selected from a larger set.
c. The 4th sound is any clash or effect sound that needs to be played over the top of the swing/hum sounds.
i. This is optional if you just want to drop the swing while playing the clash.
Differences from SmoothSwing V1:
SmoothSwing V1 had two looping hum sounds (one pitched high and one pitched low) that were blended over the top of the main hum. This gave a monotonous repetition to the swings and took a lot of control out of sound designers' hands. SmoothSwing V2 uses several different looping hums and randomly selects which ones to use for a particular swing. This approach gives more variation in the swing sounds and gives more control back to the sound designers.
SmoothSwing V1 treated the issue of blending sounds together as a two-dimensional problem. The projection of the swing onto two orthogonal axes determined the influence of each of the two hums. Variation had to be introduced by slowly rotating the basis by which we evaluated swing direction. You could still get dead zones in SmoothSwing V1 based on the current orientation of the basis and the direction of motion. SmoothSwing V2 reduces it to a one-dimensional problem and removes the need to change our frame of reference while providing more variability of the sound and more predictability of the swing response.
Sound font authoring requirements:
SmoothSwing V2 uses a main hum and several pairs of looping hum sounds. The number of pairs can be arbitrarily large, but I chose 8 pairs (16 sounds) for my particular implementation.
Each "swing" is comprised of one of these pairs of looping hum sounds. Each sound in a pair can be any looping hum created by a sound designer, but for the best results, they should generally follow these guidelines:
1. The pair of sounds in each swing should be loosely based on the same hum.
2. The hum used as the base of the swing does not have to be the same as the main hum, and better results can be achieved by picking a hum with more pop, distortion, or tremolo than the main hum.
3. One of the pair is often pitched higher and the other is often pitched lower than the main hum.
4. The "loudness" of the sound is increased via range compression, distortion, or other techniques.
Blending the paired looping hum sounds to make swings:
Basic algorithm overview:
The main hum is always playing.
if not swinging
    Randomly pick one of the looping hum pairs.
    Randomly pick a transition point (angle) that specifies the center of a transition region.
    Set the accumulated swing angle to 0.
if swinging
    As the swing starts, we are playing the first sound of the pair.
    Over the transition region, we blend between the first and second sounds in the pair.
    After the transition region, we are playing the second sound of the pair.
    The volume of this resulting swing sound is modulated based on the strength of the swing.
    The volume of the main hum is ducked based on the strength of the swing as well.
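The blend inside a single swing can be sketched roughly as follows. This is Python for readability, not firmware code, and it is not taken from the original implementation: the function name pair_weights and its parameters are made up for illustration, and it assumes a plain linear cross-fade across the transition region.

```python
import math

def pair_weights(angle, center, width):
    """Gains (first, second) for the two hums of a pair at an accumulated angle in radians."""
    start = center - width / 2.0   # where the transition region begins
    t = (angle - start) / width    # 0 at the start of the region, 1 at its end
    t = max(0.0, min(1.0, t))      # outside the region only one of the sounds plays
    return 1.0 - t, t              # linear cross-fade between the two sounds

# Hypothetical example: transition centered at 45 degrees, 45 degrees wide.
center = math.pi / 4
width = math.pi / 4
for degrees in (0, 45, 90):
    first, second = pair_weights(math.radians(degrees), center, width)
    print(degrees, round(first, 3), round(second, 3))
```

Both gains would then be scaled by the swing strength before mixing with the main hum.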
If you look at the level of each looping hum sound in the pair in terms of angle traversed by the saber during a swing, it looks like this:

If you look at the hum and swing sounds modulated/ducked by the swing strength, you get something like the graph below. Note, this is still over angle, not time, but you get the idea.

Swing strength:
The strength of the swing is a normalized value based on the angular velocity. For example, if you want the swing to be at maximum volume when you're swinging it at 4*PI rad/sec (720 deg/sec), your swing strength is min( 1, angularvelocity / ( 4 * PI ) ). Note, we clamp it to 1. This value is the one that determines how loud to make the swing and how much to duck the main hum.
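As a small sketch of that formula in Python (the names swing_strength and MAX_SWING_SPEED are illustrative, and the input is assumed to be the magnitude of the angular velocity in rad/sec, so never negative):

```python
import math

# Speed at which the swing reaches full volume: 4 * PI rad/sec, as in the example above.
MAX_SWING_SPEED = 4.0 * math.pi

def swing_strength(angular_speed, max_speed=MAX_SWING_SPEED):
    """Normalize the angular speed (rad/sec) to 0..1, clamping at 1."""
    return min(1.0, angular_speed / max_speed)

print(swing_strength(2.0 * math.pi))   # half of the maximum speed
print(swing_strength(8.0 * math.pi))   # faster than the maximum, so clamped
```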
Selecting a new swing pair and transition region:
To get the variation in the swing sounds, we need to randomly select one of the looped hum pairs, randomly select a transition point/region, and reset our accumulated swing angle. These need to be done at a time when the abrupt change will not be noticed. If either of the following conditions is met, then we are between swings and can safely select new values:
1. Our swing angular velocity is below a certain threshold (saber is still or moving too slowly to register)
2. Our swing direction changes based on a certain threshold (direction change during a swing)
a. If our direction changes, our velocity would have had to approach 0, so we don't hear any discontinuities when selecting a new swing pair.
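One possible way to express the two conditions and the re-selection, sketched in Python. The post does not say how the direction change is measured or what the thresholds are, so everything here is an assumption for illustration: the direction test compares the current gyro vector with the one stored when the swing started, and SPEED_THRESHOLD and DIRECTION_COS_THRESHOLD are placeholder values. The 8 pairs, the 10 to 60 degree range, and the 50% swap come from elsewhere in this post.

```python
import math
import random

NUM_PAIRS = 8                   # number of hum pairs used in this post
SPEED_THRESHOLD = 0.5           # rad/sec, hypothetical value
DIRECTION_COS_THRESHOLD = 0.0   # hypothetical: rotation axis turned by more than 90 degrees

def magnitude(v):
    return math.sqrt(sum(c * c for c in v))

def between_swings(gyro, swing_axis):
    """True if a new pair can be selected without an audible jump.

    gyro is the current angular velocity (x, y, z) in rad/sec and swing_axis is
    the angular velocity stored when the current swing started (an assumption).
    """
    speed = magnitude(gyro)
    if speed < SPEED_THRESHOLD:          # condition 1: still or too slow to register
        return True
    ref = magnitude(swing_axis)
    if ref == 0.0:                       # no reference direction stored yet
        return False
    dot = sum(a * b for a, b in zip(gyro, swing_axis))
    return dot / (speed * ref) < DIRECTION_COS_THRESHOLD   # condition 2: direction change

def pick_new_swing(rng=random):
    """Randomly choose the pair, the transition center, and reset the angle."""
    return {
        "pair": rng.randrange(NUM_PAIRS),
        "center": math.radians(rng.uniform(10.0, 60.0)),
        "swapped": rng.random() < 0.5,   # start with the second sound half the time
        "angle": 0.0,
    }

if between_swings((0.1, 0.0, 0.0), (1.0, 0.0, 0.0)):
    print(pick_new_swing())
```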
Spins:
Spins are not uniquely handled nor authored. They are handled automatically with the addition of two small changes to the above algorithm.
1. The accumulated swing angle wraps around to 0 after passing 2 * PI.
2. We add a second transition region that is PI radians out of phase with the initial transition point.
The second transition region allows us to crossfade the swing back to the first sound in the pair. This transition region is generally wider (larger angle) than the first. I use a fixed width of 2.6 radians (150 degrees) for this second transition region, so it's VERY wide and gradual.
Let's look at what the sounds look like in terms of angles traversed while spinning (either ignore swing strength or assume it's 1 for this example):

Every full revolution of the saber involves one sharp transition in sounds (high to low for example), and one longer gradual transition (low back to high for example).
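A rough Python sketch of the spin behavior, with the accumulated angle wrapped at 2 * PI and a second, wider region centered PI radians after the first. The function name spin_weights is hypothetical, both fades are assumed to be linear, and the piecewise logic assumes the two regions do not overlap (half of each width adds up to no more than PI).

```python
import math

TWO_PI = 2.0 * math.pi

def spin_weights(angle, center, width1=math.pi / 4, width2=2.6):
    """Gains (first, second) of the pair for an accumulated angle that may exceed 2 * PI."""
    # Wrapped angle measured from the start of the first transition region.
    psi = (angle - center + width1 / 2.0) % TWO_PI
    # Start of the second region, whose center sits PI radians after the first center.
    start2 = math.pi + width1 / 2.0 - width2 / 2.0
    if psi < width1:                 # sharp transition: first sound to second
        second = psi / width1
    elif psi < start2:               # only the second sound is playing
        second = 1.0
    elif psi < start2 + width2:      # gradual transition back to the first sound
        second = 1.0 - (psi - start2) / width2
    else:                            # only the first sound is playing
        second = 0.0
    return 1.0 - second, second

# Sample two revolutions in 45 degree steps.
for step in range(17):
    a = step * math.pi / 4
    print(round(math.degrees(a)), [round(g, 2) for g in spin_weights(a, math.pi / 4)])
```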
Implementation details:
1. Because each pair of hums (the swing hums, not the main hum) is tightly linked, I actually store them interleaved in the same file.
a. This helps a little bit with loading data off of the sd card. When I read a 512 byte block, I'm actually getting 256 bytes from the first hum and 256 bytes from the second, as opposed to having to read two separate 512 byte blocks, one for each hum.
b. This adds an additional requirement that each pair of looped hums have exactly the same number of samples.
2. Instead of always starting with the first sound in the hum pair and transitioning to the second, sometimes start with the second and transition to the first. I choose to switch these with 50% probability, which effectively doubles my swing variation with no additional work.
3. When transitioning to a new swing hum pair, don't start playing the paired sounds from the beginning. Let them start from wherever they last left off.
a. This allows for authoring paired hum loops that evolve over time and can give even more variation when swinging.
b. If the swings do evolve over time, make sure the sound changes gradually enough that it's not too noticeable within a single swing.
4. For my accumulated swing angle I'm doing a simple Euler integration of the magnitude of the total angular velocity from the gyro.
a. This means I'll get swing response when just twisting the saber in my hand (rotation about Z axis, or whatever one is down the barrel for the particular gyro setup).
i. While I like this effect, it can be eliminated by ignoring that axis when calculating the angular velocity magnitude.
b. The simple integration also means that for spins, you'll accumulate error in the accumulated angle and the transition regions for the swing will appear to drift a bit.
i. The amount of drift will depend on the timestep between gyro samples.
ii. I've not found the drift to be too noticeable, but I'm sampling at a fairly high frequency.
5. When ducking the main hum during a swing, I never go down to 0 volume on the main hum.
a. The lowest I'll go is maybe 25%.
6. Smooth the total angular velocity value calculated from the gyro.
a. I'm sampling the raw gyro data fairly frequently, and the results are often noisy.
b. The accumulated swing angle, which is integrated from this value, is the main input to the algorithm, so smoothing the angular velocity gives you smoother results out the other side.
c. This is essential since it ultimately affects the volume of the swings and hum. Any noise in setting the volume will appear as noise in the resulting sound.
d. Note, I do not use anything fancy. I use a simple box filter.
7. While technically the cross-fade between the two hums should be an equal power cross-fade, it seems to work fine with just a linear cross-fade.
a. I believe the dip in the middle of the sound due to the linear cross-fade helps accentuate the transition.
b. Also, I didn't want to add 2 sqrts when calculating every swing sample, so there's that.
8. Many of these parameters should be controlled per-font. I use txt config files for each font.
a. An initial (first) transition region width of PI/4 (45 degrees) works well for most of the fonts I've played with.
i. Smaller transition zones feel more aggressive. Large ones feel more sweeping. Experiment per-font.
b. I randomly select the center of the first transition region to be between 10 and 60 degrees from the start of the swing.
i. Making this selection range variable per-font means you can have more advanced or more delayed "attack" on the swing.
c. Angular velocity threshold. This is the threshold at which the volume of the swing is at maximum, or how fast you have to swing the saber to get the loudest swing.
i. I normally default this to 4 * PI radians per second. Anything below that will be between 0..1 for the swing volume. Anything above gets clamped to 1.
ii. 4 * PI radians per second requires a rather fast swing for maximum volume.
d. Have a swing sharpness value
i. Instead of modulating the swing volume by SwingStrength, modulate it by SwingStrength ^ SwingSharpness ( pow( SwingStrength, SwingSharpness ) )
ii. This gives a nice non-linear swing response.
iii. I default SwingSharpness to between 1.5 and 2.0 for most of my fonts.
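Several of the notes above (box filter, Euler integration, the hum floor, and swing sharpness) can be tied together in one per-sample update. The Python sketch below is only an illustration under stated assumptions, not the actual firmware: the class name SwingTracker, the filter window of 8 samples, and the exact ducking formula (hum falling linearly with swing strength down to the 25% floor) are all made up for the example.

```python
import math
from collections import deque

TWO_PI = 2.0 * math.pi

class SwingTracker:
    def __init__(self, window=8, max_speed=4.0 * math.pi, sharpness=1.75, hum_floor=0.25):
        self.history = deque(maxlen=window)   # recent speeds for the box filter
        self.max_speed = max_speed            # rad/sec for full swing volume
        self.sharpness = sharpness            # exponent applied to the swing strength
        self.hum_floor = hum_floor            # the main hum never drops below this gain
        self.angle = 0.0                      # accumulated swing angle in radians

    def update(self, gyro, dt):
        """Process one gyro sample (x, y, z in rad/sec); dt is the timestep in seconds.

        Returns (hum_gain, swing_gain).
        """
        speed = math.sqrt(sum(c * c for c in gyro))         # total angular velocity
        self.history.append(speed)
        smoothed = sum(self.history) / len(self.history)    # simple box filter
        self.angle = (self.angle + smoothed * dt) % TWO_PI  # Euler step, wrapped for spins
        strength = min(1.0, smoothed / self.max_speed)
        swing_gain = strength ** self.sharpness             # non-linear swing response
        hum_gain = 1.0 - (1.0 - self.hum_floor) * strength  # assumed ducking curve
        return hum_gain, swing_gain

tracker = SwingTracker()
for _ in range(20):
    print(tracker.update((6.0, 2.0, 0.0), 0.001))
```

To ignore twisting about the blade axis, the corresponding gyro component could be zeroed before calling update.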
Thanks for looking.