There are several approaches to building mathematical models in data compression:

Physical Model

In speech-related applications, knowledge about the physics of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can then be encoded using this model.

Models for certain telemetry data can also be obtained through knowledge of the underlying process.

For example, if residential electrical meter readings at hourly intervals were to be coded, knowledge about the living habits of the populace could be used to determine when electricity usage would be high and when it would be low. Then, instead of the actual readings, the differences (residuals) between the actual readings and those predicted by the model could be coded.
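As a sketch of this idea, assuming a toy usage model (the predict_reading function and the sample readings below are illustrative inventions, not part of the original discussion), residual coding might look like this:

```python
# Residual (predictive) coding sketch: encode the differences between actual
# meter readings and a model's predictions rather than the readings themselves.

def predict_reading(hour: int) -> int:
    """Toy stand-in for a model of living habits: low usage overnight,
    moderate during the day, high in the evening."""
    if hour < 6:
        return 5
    if 18 <= hour < 23:
        return 40
    return 20

def encode(readings: list[int]) -> list[int]:
    # Residuals cluster near zero when the model is good, so they can be
    # represented with fewer bits than the raw readings.
    return [r - predict_reading(h) for h, r in enumerate(readings)]

def decode(residuals: list[int]) -> list[int]:
    # The decoder applies the same model, so the readings are recovered exactly.
    return [res + predict_reading(h) for h, res in enumerate(residuals)]

readings = [6, 4, 5, 7, 5, 6, 19, 22, 21, 18]  # made-up hourly kWh values
residuals = encode(readings)                    # [1, -1, 0, 2, 0, 1, -1, 2, 1, -2]
assert decode(residuals) == readings
```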


Probability Model

The simplest statistical model for the source is to assume that each letter generated by the source is independent of every other letter and that each occurs with the same probability.

We could call this the ignorance model, as it would generally be useful only when we know nothing about the source.

If we keep the independence assumption but allow the letters to occur with different probabilities, then for a source that generates letters from an alphabet A = {a1, a2, ..., an} we can have a probability model P = {P(a1), P(a2), ..., P(an)}.

Given a probability model (and the independence assumption), we can compute the entropy of the source using this equation:

H(S) = -\sum_{i=1}^{n} P(a_i) \log_2 P(a_i)
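As a minimal sketch of this computation in Python (the four-letter alphabet and its probabilities are made-up values chosen only for illustration):

```python
import math

# Hypothetical probability model for the alphabet A = {a1, a2, a3, a4}.
P = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}

# First-order entropy under the independence assumption, in bits per letter.
H = -sum(p * math.log2(p) for p in P.values() if p > 0)

print(f"H(S) = {H} bits/letter")  # 1.75 for this model
```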

We can also construct some very efficient codes to represent the letters in A.

Of course, these codes are only efficient if our mathematical assumptions are in accord with reality.
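For example, a Huffman code is one standard way to construct such codes from a probability model; the sketch below (reusing the same made-up probabilities as above) assigns shorter codewords to more probable letters:

```python
import heapq
import itertools

def huffman_code(probs: dict[str, float]) -> dict[str, str]:
    """Build a Huffman code: shorter codewords for more probable letters."""
    counter = itertools.count()  # tie-breaker so equal probabilities never force dict comparison
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)  # two least probable subtrees
        p2, _, codes2 = heapq.heappop(heap)
        # Prefix every codeword in the two merged subtrees with 0 / 1.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

print(huffman_code({"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}))
# {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}
```

For this model the expected codeword length, 1.75 bits, equals H(S) exactly because the probabilities are powers of 1/2; in general, Huffman coding comes within one bit of the entropy.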