written 6.7 years ago by | • modified 6.6 years ago |
Subject: Speech Processing
Topic: Speech Analysis in Time Domain
Difficulty: High
written 6.6 years ago by |
(i) The discrete-time auto-correlation function of a deterministic signal is given by $ \phi (k) = \sum_{m=-\infty}^{\infty} x(m)x(m+k) $
(ii) For a random or periodic signal, the auto-correlation function is given by $$ \phi(k) = \lim_{N \to \infty} \,\, \frac{1}{2N+1} \sum_{m=-N}^{N}x(m)x(m+k) $$
(iii) When the signal is periodic with a period of P samples, i.e. x(m) = x(m+P) for all m, the auto-correlation function is periodic with the same period: $ \phi(k) = \phi (k+P) $.
(iv) Some important properties of the auto-correlation function are:
$ \hspace{0.5cm} $ a) It is an even function: $ \phi(k) = \phi(-k) $
$ \hspace{0.5cm} $ b) Its maximum value is attained at k = 0, i.e. $ |\phi(k)| \leq \phi(0) $ for all k
$ \hspace{0.5cm} $ c) $ \phi(0) $ equals the energy of the signal for deterministic signals, and the average power for random or periodic signals.
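As a quick numerical check of properties (a) to (c), the following sketch evaluates the deterministic auto-correlation of a short finite sequence in plain Python. The helper name `autocorr` and the sample values are illustrative assumptions, not part of the original answer; samples outside the list are taken to be zero.

```python
def autocorr(x, k):
    """Deterministic auto-correlation phi(k) = sum_m x(m) x(m+k).

    x is a finite list; samples outside 0..len(x)-1 are treated as zero.
    """
    n = len(x)
    return sum(x[m] * x[m + k] for m in range(n) if 0 <= m + k < n)


x = [1.0, -2.0, 3.0, 0.5, -1.5]           # arbitrary example sequence
lags = range(-(len(x) - 1), len(x))       # every lag with overlapping samples
phi = {k: autocorr(x, k) for k in lags}

energy = sum(v * v for v in x)            # signal energy

# (a) even symmetry, (b) maximum at k = 0, (c) phi(0) equals the energy
is_even = all(abs(phi[k] - phi[-k]) < 1e-12 for k in lags)
peak_at_zero = all(abs(phi[k]) <= phi[0] for k in lags)
print(is_even, peak_at_zero, phi[0] == energy)
```

All three checks should hold for any real finite-energy sequence, not only for the values chosen here.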
(v) The short-time auto-correlation function is $$ R_n (k) = \sum_{m=-\infty}^{\infty} x(m)W(n-m)x(m+k)W(n-k-m) $$
The equation can be interpreted as:
$ \hspace{0.5cm} $ a) A segment of speech is first selected by multiplying the signal by the window.
$ \hspace{0.5cm} $ b) The deterministic auto-correlation is then applied to the windowed segment of speech.
The short-time auto-correlation is an even function of k, so $ R_n(k) = R_n(-k) $, where
$$ R_n (-k) = \sum_{m=-\infty}^{\infty} x(m)W(n-m)x(m-k)W(n+k-m) $$
$ \hspace{0.5cm} $ First note that $$ R_n(-k) = \sum_{m=-\infty}^{\infty} x(m)x(m-k)[W(n-m)W(n-m+k)] $$
Defining $ h_k(n) = W(n)W(n+k) $, the equation becomes $$ R_n(k) = \sum_{m=-\infty}^{\infty} x(m)x(m-k)h_k(n-m) $$
(vi) Filtering the sequence $ y_k(n) = x(n)x(n-k) $ with a filter having impulse response h$_k$(n) therefore gives the value of the k$^{th}$ auto-correlation lag at time n:
$$ R_n(-k) = R_n(k) = y_k(n) * h_k(n) $$
where $ * $ denotes convolution.
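The equivalence between the direct definition and the filtering view can be checked numerically. The sketch below is only an illustration under stated assumptions: the window is taken to be a causal Hamming window of length 8, the test signal is arbitrary, and the names `direct` and `filtered` are hypothetical helpers.

```python
import math

N = 8  # window length (illustrative choice)


def W(j):
    """Causal Hamming window of length N; zero outside 0..N-1."""
    if 0 <= j < N:
        return 0.54 - 0.46 * math.cos(2 * math.pi * j / (N - 1))
    return 0.0


signal = [math.sin(2 * math.pi * m / 5) + 0.1 * m for m in range(40)]


def x(m):
    """Signal sample, taken as zero outside the stored range."""
    return signal[m] if 0 <= m < len(signal) else 0.0


def direct(n, k):
    """R_n(k) from the definition: sum_m x(m) W(n-m) x(m+k) W(n-k-m)."""
    return sum(x(m) * W(n - m) * x(m + k) * W(n - k - m)
               for m in range(len(signal)))


def filtered(n, k):
    """R_n(k) as y_k convolved with h_k, evaluated at time n."""
    y = lambda m: x(m) * x(m - k)      # y_k(m) = x(m) x(m-k)
    h = lambda m: W(m) * W(m + k)      # h_k(m) = W(m) W(m+k)
    return sum(y(m) * h(n - m) for m in range(len(signal)))


max_diff = max(abs(direct(20, k) - filtered(20, k)) for k in range(N))
print(max_diff < 1e-12)
```

Each lag k needs its own filter h$_k$(n), so in this view a bank of filters, one per lag, produces the full short-time auto-correlation.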
(vii) The short-time auto-correlation function is usually computed using the following equation, where the second line follows by replacing m with n+m:
$$ R_n (k) = \sum_{m=-\infty}^{\infty} x(m)W[-(m-n)]x(m+k)W[-(m-n+k)] \\ = \sum_{m=-\infty}^{\infty} x(n+m)W'(m)x(n+m+k)W'(m+k) \hspace{1cm} [W'(m) = W(-m)]$$
(viii) If the window W' has finite duration, i.e. it is nonzero only for $ 0 \leq m \leq N-1 $, then:
$$ R_n(k) = \sum_{m=0}^{N-1-k} [x(n+m)W'(m)][x(n+m+k)W'(m+k)] $$
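A minimal sketch of the finite-window sum, assuming a rectangular window by default and a synthetic sinusoid with a period of 20 samples standing in for voiced speech. The function name `short_time_autocorr` and all parameter values are illustrative assumptions.

```python
import math


def short_time_autocorr(x, n, N, k, window=None):
    """R_n(k) = sum_{m=0}^{N-1-k} [x(n+m) W'(m)] [x(n+m+k) W'(m+k)]."""
    if window is None:
        window = [1.0] * N                      # rectangular window
    # Window the N-sample segment that starts at sample n.
    seg = [x[n + m] * window[m] for m in range(N)]
    # Correlate the windowed segment with itself at lag k.
    return sum(seg[m] * seg[m + k] for m in range(N - k))


P = 20                                           # assumed period in samples
x = [math.sin(2 * math.pi * m / P) for m in range(200)]
N = 60                                           # window spans three periods

R = [short_time_autocorr(x, 0, N, k) for k in range(N)]

# Search a plausible lag range for the strongest peak away from k = 0.
peak_lag = max(range(10, 41), key=lambda k: R[k])
print(peak_lag)
```

Passing a list of N window values (for example a Hamming window) as `window` applies a tapered window instead of the rectangular one.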
(ix) The choice of N is critical: the window must be long enough to give a good indication of periodicity.
(x) This conflicts with the requirement that, because the properties of the speech signal change with time, N should be as small as possible.
(xi) The window must span at least two periods of the waveform for the auto-correlation function to give any indication of periodicity.
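The effect of the window length can be illustrated with a rough experiment. The sketch below assumes a rectangular window and a pure sinusoid with an assumed period of P = 20 samples, and reports the height of the peak at lag P relative to the value at lag 0 for several hypothetical values of N.

```python
import math


def rect_short_time_autocorr(x, n, N, k):
    """R_n(k) with a rectangular window of N samples starting at sample n."""
    # range() is empty when k >= N, so such a lag contributes nothing.
    return sum(x[n + m] * x[n + m + k] for m in range(N - k))


P = 20  # assumed period in samples
x = [math.sin(2 * math.pi * m / P) for m in range(200)]

ratios = {}
for N in (15, 30, 40, 80):
    r0 = rect_short_time_autocorr(x, 0, N, 0)
    rP = rect_short_time_autocorr(x, 0, N, P)
    ratios[N] = rP / r0            # relative height of the peak at lag P
    print(N, round(ratios[N], 3))
```

With a window shorter than one period the lag-P term is empty, and the peak at lag P grows toward the lag-0 value only as N becomes several periods long, because just N - k products enter the sum at lag k.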
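(xii) The modified short-time auto-correlation function is given by:
$$ \hat{R}_n (k) = \sum_{m=-\infty}^{\infty} [x(m) W_1(n-m)][x(m+k) W_2(n-m-k)] $$
The above expression can also be written as
$$ \hat{R}_n (k) = \sum_{m=-\infty}^{\infty} [x(n+m) \hat{W}_1(m)][x(n+m+k) \hat{W}_2(m+k)] $$
where $ \hat{W}_1(m) = W_1(-m) $ and $ \hat{W}_2(m) = W_2(-m) $. The two windows are chosen with different lengths so that the second one also covers the lagged samples. That is,
$$ \hat{W}_1(m) = \begin{cases} 1 \hspace{1cm} 0 \leq m \leq N-1 \\ 0 \hspace{1cm} \text{otherwise} \end{cases} \\ \hat{W}_2(m) = \begin{cases} 1 \hspace{1cm} 0 \leq m \leq N-1+K \\ 0 \hspace{1cm} \text{otherwise} \end{cases} $$
where K is the largest lag of interest, so that $ 0 \leq k \leq K $.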
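With both windows rectangular as defined above, the first window restricts m to 0..N-1 and the second equals 1 for every lagged sample in range, so the sum reduces to $ \hat{R}_n(k) = \sum_{m=0}^{N-1} x(n+m)x(n+m+k) $. The sketch below evaluates this reduced form on a synthetic sinusoid; the function name `modified_autocorr`, the variable `max_lag`, and the values P = 20 and N = 40 are assumptions chosen for illustration.

```python
import math


def modified_autocorr(x, n, N, k):
    """Modified short-time auto-correlation: sum_{m=0}^{N-1} x(n+m) x(n+m+k).

    The signal must be available up to index n + N - 1 + k.
    """
    return sum(x[n + m] * x[n + m + k] for m in range(N))


P, N, max_lag = 20, 40, 60   # assumed period, window length, largest lag
x = [math.sin(2 * math.pi * m / P) for m in range(N + max_lag)]

r_hat = [modified_autocorr(x, 0, N, k) for k in range(max_lag + 1)]

# Every lag uses the same N products, so peaks at multiples of P do not taper.
print(round(r_hat[0], 6), round(r_hat[P], 6), round(r_hat[2 * P], 6))
```

Because the two segments being multiplied differ, this quantity is strictly a cross-correlation of two finite segments rather than an auto-correlation, but for a periodic signal its peaks at multiples of the period keep the same height as the value at lag 0.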