
Did you consider using Constant Q Transform instead of STFT?

mpv and ffmpeg come with CQT visualization:

mpv --lavfi-complex="[aid1]asplit[ao][a]; [a]showcqt[vo]" "$@"

You can even get it from the microphone with some piping:

parec --latency-msec=1 | sox -V --buffer 32 -t raw -b 16 -e signed -c 2 -r 44100 - -r 44.1k -b 16 -e signed -c 2 -t wav - | ffplay -fflags nobuffer -f lavfi 'amovie=pipe\\:0,asplit=2[out1][a],[a]showcqt[out0]'



When should one use the Constant Q transform over a Mel or Bark spaced STFT?


Have not tried it -- def worth investigating. Thanks.


The Constant Q Transform uses bins that are spaced on a log scale, like musical notes, so the bins correspond better to how humans perceive pitch. You won't waste hundreds of bins on the high frequencies.
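
To make the bin-spacing point concrete, here is a rough sketch that counts how many bins each octave gets under the two schemes. The parameters are my own assumptions for illustration (lowest bin at C1 = 32.70 Hz, 12 bins per octave, 44.1 kHz audio, a 4096-point FFT), not anything taken from the article:

```python
import math

def cqt_center_frequencies(f_min=32.70, bins_per_octave=12, n_octaves=8):
    """Geometrically spaced centers: f_k = f_min * 2**(k / bins_per_octave)."""
    n_bins = bins_per_octave * n_octaves
    return [f_min * 2 ** (k / bins_per_octave) for k in range(n_bins)]

def stft_bins_in_band(lo_hz, hi_hz, sample_rate=44100, n_fft=4096):
    """Count linearly spaced STFT bins whose center lies in [lo_hz, hi_hz)."""
    step = sample_rate / n_fft  # constant spacing in Hz between STFT bins
    return sum(1 for i in range(n_fft // 2 + 1) if lo_hz <= i * step < hi_hz)

freqs = cqt_center_frequencies()

# Q is the ratio of center frequency to bandwidth; it is the same for every
# bin, which is where the name "constant Q" comes from.
q = 1 / (2 ** (1 / 12) - 1)

for octave in range(8):
    lo = freqs[octave * 12]
    hi = 2 * lo
    print(f"{lo:8.1f} to {hi:8.1f} Hz: "
          f"CQT bins = 12, STFT bins = {stft_bins_in_band(lo, hi)}")
```

Every octave gets the same 12 CQT bins, while the STFT bin count roughly doubles with each octave, so the lowest octaves get only a handful of bins and the top ones get hundreds.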

Calculating the CQT can be roughly as fast as an FFT:

http://academics.wellesley.edu/Physics/brown/pubs/effalgV92P...
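
For reference, this is what the transform looks like when evaluated directly from its definition. It is a minimal pure-Python sketch and deliberately the slow way: it is not the efficient FFT-based approach, just a way to see what is being computed. The function name, the Hamming window, and the test tone are my own choices for illustration:

```python
import cmath
import math

def naive_cqt(signal, sample_rate, f_min, n_bins, bins_per_octave=12):
    """Direct constant-Q magnitudes for the start of `signal` (slow, illustrative)."""
    q = 1 / (2 ** (1 / bins_per_octave) - 1)
    magnitudes = []
    for k in range(n_bins):
        f_k = f_min * 2 ** (k / bins_per_octave)
        # Window length shrinks as frequency rises, so each bin always
        # spans about q cycles of its own center frequency.
        n_k = math.ceil(q * sample_rate / f_k)
        if n_k > len(signal):
            raise ValueError("signal too short for the lowest bin")
        acc = 0j
        for n in range(n_k):
            window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (n_k - 1))  # Hamming
            acc += window * signal[n] * cmath.exp(-2j * math.pi * q * n / n_k)
        magnitudes.append(abs(acc) / n_k)
    return magnitudes

# Hypothetical test input: a pure 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2048)]

# Three octaves of semitone bins starting at 110 Hz (A2).
mags = naive_cqt(tone, sample_rate, f_min=110.0, n_bins=36)
peak = max(range(len(mags)), key=mags.__getitem__)
print(peak, round(110.0 * 2 ** (peak / 12), 1))
```

The strongest bin should be the one centered on the tone's frequency, two octaves above the lowest bin. The cost here grows with the sum of all the window lengths, which is why a faster method matters for real-time use.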

And here are some real musical samples you can use instead of the artificial midi notes:

http://virtualplaying.com/virtual-playing-orchestra/



