written 8.0 years ago by | modified 3.0 years ago by |
Mumbai University > Information Technology > Sem 7 > Data Compression & Encryption
Marks: 5M
Year: May 2012
written 8.0 years ago by | • modified 8.0 years ago |
Physical Models: If we know something about the physics of the data generation process, we can use that information to construct a model.
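For example, in speech-related applications, knowledge about the physics of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can then be encoded using this model. Real-life application: residential electrical meter readings, where knowledge of typical usage patterns can be used to predict the readings, so that only the difference between the actual and predicted values needs to be encoded.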
Probability Models: The simplest statistical model for the source is to assume that each letter generated by the source is independent of every other letter, and that each occurs with the same probability. We could call this the ignorance model, as it would generally be useful only when we know nothing about the source. The next step up in complexity is to keep the independence assumption but remove the equal-probability assumption, assigning a probability of occurrence to each letter in the alphabet. For a source that generates letters from an alphabet $A = \{a_1, a_2, \ldots, a_M\}$ we can have a probability model $P = \{P(a_1), P(a_2), \ldots, P(a_M)\}$.
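The difference between the ignorance model and a model with per-letter probabilities can be sketched in a few lines of Python. This is only an illustration: the sample string and the function names `iid_model` and `entropy_bits` are assumptions made for this example, and the probabilities are estimated as simple relative frequencies.

```python
import math
from collections import Counter

def iid_model(text):
    """Estimate P(a_i) as the relative frequency of each letter in text."""
    counts = Counter(text)
    total = len(text)
    return {letter: n / total for letter, n in counts.items()}

def entropy_bits(probabilities):
    """Average number of bits per letter: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)

sample = "abracadabra"  # hypothetical source output, chosen for illustration

# Independent letters with unequal probabilities.
model = iid_model(sample)

# Ignorance model: same alphabet, every letter equally likely.
uniform = {letter: 1 / len(model) for letter in model}

print("ignorance model entropy:", entropy_bits(uniform))
print("probability model entropy:", entropy_bits(model))
```

For this sample the estimated probabilities are unequal, so the entropy under the probability model comes out lower than under the ignorance model, which means fewer bits per letter are needed on average.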
Markov Models: Markov models are particularly useful in text compression, where the probability of the next letter is heavily influenced by the preceding letters. In current text compression, kth-order Markov models are more widely known as finite context models, with the word context being used for what is otherwise called the state of the model.
Consider the word ‘preceding’. Suppose we have already processed ‘precedin’ and we are going to encode the next letter. If we take no account of the context and treat each letter as a surprise, the probability of the letter ‘g’ occurring is relatively low. If we use a first-order Markov model, or single-letter context (that is, we look at the probability model given ‘n’), we can see that the probability of ‘g’ increases substantially. As we increase the context size (go from ‘n’ to ‘in’ to ‘din’ and so on), the probability distribution over the alphabet becomes more and more skewed, which results in lower entropy.
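The effect of context can be illustrated with a small finite context model built from letter counts. This is a rough sketch under stated assumptions: the sample sentence and the helper names `context_counts`, `prob_next` and `conditional_entropy` are made up for this example, and probabilities are plain relative frequencies taken from the sample itself.

```python
import math
from collections import Counter, defaultdict

def context_counts(text, k):
    """Count which letter follows each k-letter context in text."""
    table = defaultdict(Counter)
    for i in range(k, len(text)):
        table[text[i - k:i]][text[i]] += 1
    return table

def prob_next(text, context, letter):
    """Estimate P(letter | context) from the counts observed in text."""
    followers = context_counts(text, len(context)).get(context)
    if not followers:
        return 0.0  # context never seen in the sample
    return followers[letter] / sum(followers.values())

def conditional_entropy(text, k):
    """Empirical bits per letter given the previous k letters."""
    table = context_counts(text, k)
    total = sum(sum(f.values()) for f in table.values())
    h = 0.0
    for followers in table.values():
        n = sum(followers.values())
        for count in followers.values():
            h -= (count / total) * math.log2(count / n)
    return h

# Hypothetical training text, chosen so that 'n' is often followed by 'g'.
sample = "the preceding reading is misleading during testing"

print("P(g) with no context:", prob_next(sample, "", "g"))
print("P(g | n):", prob_next(sample, "n", "g"))
print("order-0 entropy:", conditional_entropy(sample, 0))
print("order-1 entropy:", conditional_entropy(sample, 1))
```

With an empty context, ‘g’ is just one of many letters, but given the context ‘n’ it becomes far more likely in this sample, and the order-1 entropy is lower than the order-0 entropy. On such a short text the estimates are only indicative; a practical compressor would also need a way to handle contexts it has never seen.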