I used the meow sounds from https://soundspunos.com/animals/10-cat-meow-sounds.html. I expected to see very little variability in the cat meow sounds, maybe just 4-5 different types for basic emotions. To my surprise, each “cat meow” has an astonishingly colorful, complex and unique structure, unlike human vowels, which follow a more or less predictable pattern: https://soundshader.github.io/vowels.
The algorithm behind these images is fairly simple. It computes the FFT to decompose the sound into a set of A·cos(2πwt+φ) waves and drops the phase φ to align all the cos waves together. This is known as the auto-correlation function (ACF). Before merging them back, it colorizes each wave using its frequency w: the A notes (432·2ⁿ Hz) become red, the C notes green, the E notes blue, and so on. Finally, it merges the colored and aligned cos waves back, using the amplitude A for color opacity, and renders them in polar coordinates, where the radial coordinate is time.
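The "drop the phase, then merge the waves back" step can be sketched in a few lines. This is only an illustration, not the project's code. It assumes the power spectrum |X|² is used as the weight (the Wiener–Khinchin form of the ACF), uses a naive DFT instead of a real FFT so that it needs no libraries, and leaves out the colorization. The function names are made up for this example.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform; enough for a short demo frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def acf_from_spectrum(x):
    """Circular ACF of a real signal: discard every phase, sum the cos waves back."""
    n = len(x)
    power = [abs(c) ** 2 for c in dft(x)]  # only the magnitude survives, phase is gone
    # With zero phase every component is a plain cosine that peaks at lag 0.
    return [sum(power[k] * math.cos(2 * math.pi * k * lag / n) for k in range(n)) / n
            for lag in range(n)]

def acf_direct(x):
    """Reference: circular autocorrelation computed straight from the definition."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) for lag in range(n)]
```

Both functions should agree up to rounding error on any real input, which is a quick way to check that aligning the phases really does produce the autocorrelation.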
Interpreting ACF images:
For example, on one image below there is a green belt with 10 repetitions. One repetition corresponds to 13.5 Hz here (55296 Hz sample rate, 4096 FFT bins), so 10 repetitions is 135 Hz, which roughly corresponds to C3. On another image there is a curious red cross in the center: it’s a red belt with 2 repetitions. That’s 27 Hz, or A0, almost infrasound.
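The arithmetic above is easy to reproduce. The sketch below is a hypothetical helper, not part of the project: it converts a repetition count to a frequency and finds the nearest equal-tempered note, assuming A4 = 432 Hz to match the 432·2ⁿ Hz convention used for the colors.

```python
import math

# Pitch classes counted upwards from A, one semitone per entry.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def belt_frequency(repetitions, sample_rate=55296, fft_bins=4096):
    """Frequency in Hz of a belt with the given number of repetitions."""
    return repetitions * sample_rate / fft_bins

def nearest_note(freq_hz, a4_hz=432.0):
    """Nearest note name and the signed offset from it in semitones.

    a4_hz=432.0 is an assumption taken from the color scheme, not a measured value.
    """
    exact = 12 * math.log2(freq_hz / a4_hz)  # fractional semitones from A4
    nearest = round(exact)
    # Octave numbers increment at C, which sits 9 semitones below A.
    octave = 4 + (nearest + 9) // 12
    return NOTE_NAMES[nearest % 12] + str(octave), exact - nearest
```

With these assumptions, `nearest_note(belt_frequency(2))` lands exactly on A0. For 10 repetitions, 135 Hz falls between C3 and C#3 and comes out slightly nearer to C#3, so note names read off a belt count are best treated as approximate.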
Demo: https://soundshader.github.io/?n=4096&img=2048&acf.lr=5&sr=55.296. Press C on the keyboard to switch between polar and Cartesian coordinates; use Ctrl+Click and Shift+Click to select the left and right boundaries of the sound sample.
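The difference between the two views can be sketched as a pair of coordinate mappings. This is an illustration under stated assumptions rather than the demo's actual rendering code: it assumes the angle encodes the ACF lag (which is what makes a belt show up as repetitions around the circle) and that the image is square. All names and parameters are hypothetical.

```python
import math

def polar_to_pixel(frame, lag, num_frames, num_lags, image_size):
    """Map (time frame, ACF lag) to an (x, y) pixel in the polar view."""
    center = image_size / 2
    radius = (frame + 0.5) / num_frames * center  # time grows outwards from the center
    angle = 2 * math.pi * lag / num_lags          # one full turn covers all lags
    return int(center + radius * math.cos(angle)), int(center + radius * math.sin(angle))

def cartesian_to_pixel(frame, lag, num_frames, num_lags, image_size):
    """Map the same (time frame, ACF lag) pair to a pixel in the Cartesian view."""
    return lag * image_size // num_lags, frame * image_size // num_frames
```

In the polar mapping a component with k periods per frame repeats k times around the circle, which is why counting repetitions of a belt gives its frequency.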
GitHub: https://github.com/soundshader/soundshader.github.io
AGPLv3