[FFmpeg-devel] [RFC] AAC Encoder
Gabriel Bouvigne
bouvigne
Tue Aug 19 18:26:43 CEST 2008
Michael Niedermayer wrote:
>>Regarding the highpass method, if the highpass freq is properly chosen,
>>it's a method that has demonstrated its usefulness. Its main drawback is
>>usually the computation time compared to spectrum-based methods, which
>>can often be done using data that is already computed.
>
>
> And how does one properly choose it?
> Is it content dependent? If so, I really have my doubts about it being a
> good choice of algorithm.
The highpass freq should be chosen in order to:
a) filter out low freq content so that low freq transients do not
trigger short blocks, as humans are not really sensitive to "low" freq
attacks. Some codecs like mp3 and atrac even allow mixed block
sizes, where low freqs are coded with a long block while higher freqs
are coded with short blocks. Usually, a 2 or 3 kHz highpass is fine
for this.
b) filter out low freqs so that you are not fooled by the period of a
low freq signal (you could end up with fake surges within some of the
short windows).
(Lame uses fs/4 as its highpass value)
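To make points (a) and (b) concrete, here is a minimal sketch of a time domain detector. It is not Lame's code nor the proposed encoder's: the first-order filter, the 8 short windows, and the names has_attack, ratio and floor (with their default values) are all assumptions picked for illustration. It highpasses a long frame, sums the energy in each short window, and reports an attack when one window is much louder than the previous one.

```python
import math

def highpass(samples, cutoff_hz, sample_rate):
    """First-order RC-style highpass (an illustrative choice, not Lame's filter)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def short_window_energies(samples, n_windows=8):
    """Energy of each of the n_windows equal slices of the frame."""
    size = len(samples) // n_windows
    return [sum(s * s for s in samples[i * size:(i + 1) * size])
            for i in range(n_windows)]

def has_attack(samples, sample_rate, cutoff_hz=None, ratio=4.0, floor=1e-2):
    """True if some short window has more than `ratio` times the energy of
    the previous one. `ratio` and `floor` are hypothetical tuning values;
    `floor` stands in for an absolute threshold below which energy is ignored.
    cutoff_hz=None skips the highpass, to show what the filter buys you."""
    if cutoff_hz is not None:
        samples = highpass(samples, cutoff_hz, sample_rate)
    e = short_window_energies(samples)
    return any(cur > ratio * max(prev, floor) for prev, cur in zip(e, e[1:]))

fs = 44100
# (a) a 100 Hz tone starting in the middle of a 1024-sample frame
low_attack = [0.0] * 512 + [math.sin(2 * math.pi * 100 * t / fs) for t in range(512)]
# (b) a steady 50 Hz tone: its period is longer than a short window
steady = [math.sin(2 * math.pi * 50 * t / fs) for t in range(1024)]
# a click on top of the low freq attack
click = list(low_attack)
click[600] += 1.0

for name, sig in (("low attack", low_attack), ("steady 50 Hz", steady), ("click", click)):
    print(name, has_attack(sig, fs), has_attack(sig, fs, cutoff_hz=fs / 4))
```

Without the highpass, both low freq signals trip the detector; with the fs/4 highpass only the click does. A first-order filter is very gentle, so with a 2 or 3 kHz cutoff you would need a steeper filter or different thresholds than the ones assumed here.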
From a practical POV, when there is a transient (this is the same for
audio or video), you have some energy spread over a big part of the
spectrum, so a "too high" highpass is not really a big deal.
Of course, if the highpass is too high, you might miss a few cases which
do not really qualify as transients, but would still be more efficiently
coded as short (like fatboy.wav or some harpsichord samples).
In those less easy cases, other methods like detection of a PE
(perceptual entropy) surge help.
(Lame uses both the time domain highpass method and PE surge detection)
There are also frequency based methods, which mainly do a regular
time to freq transform over a long block and look at how the energy is
distributed over the spectrum in order to detect attacks. (You could try
plotting a spectrogram of castanet.wav with something like audacity and
you would quickly grasp it, even though it's probably intuitive enough if
you compare it to how the dct of a big image block looks in case of sharp
transitions like the ones in some anime content)
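As a rough illustration of "how energy is distributed over the spectrum", the sketch below compares a steady tone with a click using spectral flatness (geometric mean over arithmetic mean of the power spectrum). Flatness is only one possible measure, chosen here as an assumption; I am not claiming any particular encoder uses it, and the naive DFT is just there to keep the example self-contained.

```python
import cmath
import math

def power_spectrum(block):
    """Naive DFT power spectrum of a real block, bins 1 .. n/2-1 (DC skipped)."""
    n = len(block)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(block[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectral_flatness(block, eps=1e-12):
    """Near 0 when energy sits in a few bins, near 1 when it is spread evenly.
    eps only avoids log(0) on empty bins."""
    p = [v + eps for v in power_spectrum(block)]
    log_mean = sum(math.log(v) for v in p) / len(p)
    return math.exp(log_mean) / (sum(p) / len(p))

n = 256
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]  # one spectral line
click = [0.0] * n
click[100] = 1.0                                               # a single sharp attack

print("tone :", spectral_flatness(tone))
print("click:", spectral_flatness(click))
```

The tone puts nearly all its energy in one bin, while the click spreads it evenly over every bin, which is the same thing you see as a vertical line in the castanet.wav spectrogram. A real detector would have to cope with an attack on top of tonal content, where a single global measure like this is dominated by the tone.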
Please note that block switching by itself is not enough to prevent all
cases of transient smearing, and that the full aac standard also
provides additional tools to deal with it (TNS and gain_control).
--
Gabriel Bouvigne
www.mp3-tech.org
personal page: http://gabriel.mp3-tech.org