How will you compress the amplitude of power spectrum? How is spectral smoothing done?

Subject: Speech Processing

Topic: Homomorphic Speech Processing

Difficulty: High


How will you convert the power spectrum to the Mel scale? Explain the procedure for calculating MFCC with a block schematic. Clearly explain how the integration of power is done on Mel scale filters. How will you compress the amplitude of the power spectrum? How is spectral smoothing done?

1 Answer

(i) The Mel Frequency Cepstrum (MFC) is a representation of the short-time power spectrum of a speech signal, obtained as a linear cosine transform of the log power spectrum on a non-linear Mel frequency scale.

(ii) In the case of the MFC, the frequency bands are equally spaced on the Mel scale.

(iii) The Mel scale approximates the human auditory system's response more closely than the linearly spaced frequency bands used in the ordinary cepstrum.

(iv) MFCCs can be calculated as follows:

a) Take the FFT of each windowed frame of the signal.

b) Compute its squared magnitude. This gives the power spectrum.

c) Pre-emphasise the spectrum to approximate the unequal sensitivity of human hearing at different frequencies.

d) Integrate the power spectrum within overlapping critical-band filter responses.

This integration is done using triangular overlapping windows called Mel filters: within each filter, the power spectrum values are weighted by the triangular response and summed. This effectively reduces the frequency sensitivity relative to the original spectral estimate, particularly at higher frequencies, where the filter bands are wider.

e) Compress the spectral amplitude by taking the logarithm. Optionally, the integration may instead be performed on the log power spectrum.

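f) Take the inverse DFT (IDFT). Because the log spectrum is real and symmetric, this reduces to a cosine transform, and it gives the cepstral coefficients.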

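g) Perform spectral smoothing by retaining only the lower-order cepstral coefficients, which discards the fine spectral detail. The retained coefficients are the MFCCs.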
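The steps above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under several assumptions that are not stated in the answer: the Hz-to-Mel mapping uses the common convention m = 2595 log10(1 + f/700), a Hamming window is applied to the frame, the inverse transform is written as a DCT-II, the pre-emphasis step (c) is left out, and the function names and the defaults of 26 filters and 13 coefficients are hypothetical choices.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Common Hz -> Mel convention (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular, overlapping filters with edges equally spaced on the Mel scale.
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    edges = mel_to_hz(mel_edges)
    fbank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        # Triangle: min of the two slopes, clipped at zero outside [lo, hi].
        fbank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank

def mfcc_frame(frame, sample_rate, n_filters=26, n_ceps=13):
    # a) window the frame and take the FFT; b) squared magnitude = power spectrum
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    # d) integrate power under each triangular Mel filter (weighted sum)
    band_energy = mel_filterbank(n_filters, len(frame), sample_rate) @ power
    # e) compress amplitude with a log (small offset avoids log(0))
    log_energy = np.log(band_energy + 1e-10)
    # f) cosine transform (unnormalised DCT-II) of the log band energies;
    # g) keeping only the first n_ceps coefficients smooths the spectrum
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    basis = np.cos(np.pi * k * (n + 0.5) / n_filters)
    return basis @ log_energy

# Usage: one 512-sample frame of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(512) / sr
coeffs = mfcc_frame(np.sin(2 * np.pi * 440.0 * t), sr)
```

Here `coeffs` holds 13 values for the single frame; a full front end would slide the frame along the signal and repeat the same computation for each position.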

Block diagram for calculation of MFCC
