## Main goals of this section

After reading this section, students should be able to

- Implement an MLSD algorithm for FIR channels in software using the Viterbi algorithm
- Recognize situations other than equalization where the Viterbi algorithm can be used
- Compute the pairwise error probability between any two sequences for a given channel impulse response

## The Discrete-Time Equivalent Model - Structure of the received signals

Let us begin by quickly recalling the equivalent discrete-time channel model observed at the output of the whitened matched filter. Such a channel is shown in the figure, and its output $v_k$ is given by

$$v_k = o_k + \eta_k = \sum_{j=0}^{L} f_j\, I_{k-j} + \eta_k \qquad (1)$$

Here $o_k = \sum_{j=0}^{L} f_j\, I_{k-j}$ is the output of the ISI channel in the absence of any noise. In this model, we will assume that $E[|I_k|^2]=1$. It is important to note that in this model $\eta_k$ and $\eta_j$ are uncorrelated for $k \neq j$. $\eta_k$ is also a zero-mean Gaussian random variable with variance $N_0$, i.e., $N_0/2$ in each dimension. The variance is $N_0/2$ in each dimension instead of $N_0$ because the baseband energy, i.e., the energy in $I_k$, is defined to be 1. The detection problem is now to find the most likely sequence $\underline{I}$ given the sequence $\underline{v}$.
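The model in (1) is easy to simulate. The sketch below (tap values and sequence length are illustrative, not from the notes) passes unit-energy BPSK symbols through a 2-tap FIR channel and adds white complex Gaussian noise with variance $N_0$, i.e., $N_0/2$ per dimension:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unit-energy 2-tap channel response
f = np.array([1/np.sqrt(2), 1/np.sqrt(2)])
N0 = 0.1                                   # noise PSD level; N0/2 per dimension

# BPSK symbols with E[|I_k|^2] = 1
I = rng.choice([-1.0, 1.0], size=100)

# Noiseless ISI output o_k = sum_j f_j I_{k-j} (zero initial state)
o = np.convolve(f, I)[:len(I)]

# Whitened-matched-filter output: v_k = o_k + eta_k
eta = np.sqrt(N0/2) * (rng.standard_normal(len(I)) + 1j*rng.standard_normal(len(I)))
v = o + eta
```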

There are primarily two types of algorithms that try to efficiently solve the detection problem. They can be classified into trellis-based algorithms and equalization algorithms. The former take advantage of the fact that the ISI channel can be thought of as a finite state machine, exploiting the underlying Markov structure, whereas the latter are based on the fact that the ISI channel can be thought of as a "filtering" operation.

### Finite State Machine Representation and Trellis Diagram

The ISI channel can be modeled as a finite state machine (FSM) consisting of $L$ shift registers. The state of the FSM is given by the contents of the shift registers. Thus, at time $k$, the state of the FSM is $s_{k-1} = [I_{k-1}, I_{k-2}, \ldots, I_{k-L}]$. An input $I_k$ arrives at time $k$ and produces the output $o_k$, which can be written as a function of $s_{k-1}$ and $I_k$. This input also causes a transition from the state $s_{k-1}$ to the state $s_k$ at the next time instant, and $s_k$ is likewise a function of $s_{k-1}$ and $I_k$. Thus, we get the state equations for the FSM that represents the ISI channel:

$$s_k = [I_k, I_{k-1}, \ldots, I_{k-L+1}], \qquad o_k = f_0 I_k + \sum_{j=1}^{L} f_j\, I_{k-j} \qquad (2)$$

Notice the state $s_{k-1} = [I_{k-1},\ldots,I_{k-L}]$. There are a total of $M^{L}$ states, which we label $1,2,\ldots,J=M^L$. The evolution of the states and the outputs can be represented using a trellis diagram as shown below. The trellis diagram consists of all possible states at any given time and depicts all possible transitions that can occur from a state $s_{k-1} = j'$ to $s_{k}=j$. Every such transition is called a branch, and with every branch we can associate an input symbol and an output symbol.
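The state equations in (2) can be enumerated directly in software. A minimal sketch (function and variable names are illustrative) that builds every trellis branch $(s_{k-1}, I_k) \to (s_k, o_k)$ for a given tap vector and symbol alphabet:

```python
from itertools import product

def build_trellis(f, symbols):
    """Enumerate trellis branches for an ISI channel with taps f = [f_0, ..., f_L].

    A state is the tuple (I_{k-1}, ..., I_{k-L}); each branch maps
    (state, input symbol) -> (next state, noiseless output o_k)."""
    L = len(f) - 1
    states = list(product(symbols, repeat=L))      # M^L states
    branches = {}
    for s in states:
        for a in symbols:
            # o_k = f_0 I_k + f_1 I_{k-1} + ... + f_L I_{k-L}
            o = f[0]*a + sum(fj*Ij for fj, Ij in zip(f[1:], s))
            next_state = (a,) + s[:-1]             # shift the new symbol in
            branches[(s, a)] = (next_state, o)
    return states, branches

# BPSK over a hypothetical 2-tap channel: M^L = 2^1 = 2 states
states, branches = build_trellis([0.8, 0.6], [-1, +1])
```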

## Channels with Memory

Consider a discrete-time channel whose input is the sequence $\underline{I} = [I_1,I_2,\ldots,I_N]$ and output is $\underline{v} = [v_1,v_2,\ldots,v_N]$. We will define channels with and without memory as follows, and based on this definition, we first note that the ISI channel is a channel with memory.

For a memoryless channel: $P(\underline{v}|\underline{I}) = \prod_{k=1}^N P(v_k|I_k)$

For a channel with memory: $P(\underline{v}|\underline{I}) \neq \prod_{k=1}^N P(v_k|I_k)$

## Optimal Sequence Detection

Intuitively, the reason we cannot make symbol-by-symbol decisions about $I_k$ based only on $v_k$ is that several $v_k$s carry information about any one $I_k$, and each $v_k$ depends on other $I_k$s as well. Mathematically, what we want to do is to compute

$$\underline{\hat{I}} = \arg\max_{\underline{I}} P(\underline{v} \mid \underline{I}) \qquad (3)$$

Suppose that we had no ISI. Then $v_k$ depends only on $I_k$ and $\eta_k$, so the joint probability $P(\underline{v}|\underline{I})$ can be factored into $\prod_k P(v_k | I_k)$, and hence the maximization can be performed separately for each $k$. However, for channels with memory, this is not possible. If one attempts to choose the maximum by exhaustively computing the metric for every $\underline{I}$, there are a total of $M^N$ metrics to compute, where $M$ is the constellation size of each $I_k$. Thus, we have a complexity that is exponential in the length of the sequence we want to detect.
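To make the exponential cost concrete, here is a sketch (names illustrative) of the exhaustive search: it scores all $M^N$ candidate sequences with the squared-distance metric, which is feasible only for tiny $N$.

```python
from itertools import product
import numpy as np

def mlsd_brute_force(v, f, symbols):
    """Exhaustive MLSD: score every one of the M^N candidate sequences.

    For Gaussian noise, minimizing sum |v_k - o_k|^2 is equivalent to
    maximizing P(v | I). Complexity is exponential in N."""
    N = len(v)
    best, best_metric = None, np.inf
    for I in product(symbols, repeat=N):        # M^N candidates
        o = np.convolve(f, I)[:N]               # noiseless channel output
        metric = np.sum(np.abs(v - o)**2)       # -log likelihood up to constants
        if metric < best_metric:
            best, best_metric = I, metric
    return best
```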

Fortunately, the optimal sequence detection problem can be implemented using the Viterbi algorithm, an instance of dynamic programming, with a complexity that is only linear in $N$. Before we describe this algorithm, it is useful to understand the Markov structure of the problem at hand.
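A minimal Viterbi MLSD sketch follows. It assumes squared-distance branch metrics and an unknown initial state (all initial path metrics set to zero); function and variable names are illustrative. Each step keeps one survivor path per state, so the work per received sample is $O(M^{L+1})$ and the total complexity is linear in $N$.

```python
from itertools import product

def viterbi_mlsd(v, f, symbols):
    """MLSD via the Viterbi algorithm for an ISI channel with taps f.

    Complexity O(N * M^(L+1)) instead of O(M^N) for brute force."""
    L = len(f) - 1
    states = list(product(symbols, repeat=L))
    pm = {s: 0.0 for s in states}               # path metric per state
    paths = {s: [] for s in states}             # survivor path per state
    for vk in v:
        new_pm, new_paths = {}, {}
        for s in states:
            for a in symbols:
                # branch output o_k for input a leaving state s
                o = f[0]*a + sum(fj*Ij for fj, Ij in zip(f[1:], s))
                ns = (a,) + s[:-1]              # next state
                m = pm[s] + abs(vk - o)**2      # accumulated metric
                if ns not in new_pm or m < new_pm[ns]:
                    new_pm[ns] = m
                    new_paths[ns] = paths[s] + [a]
        pm, paths = new_pm, new_paths
    # return the survivor of the best final state
    return paths[min(pm, key=pm.get)]
```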

## Performance Analysis of MLSE

Consider a 2-tap channel $[f_0 \; f_1]$ with memory $L=1$. We will analyze the following example for the BPSK modulation scheme.

Suppose the all-ones sequence is transmitted. What is the probability that an error is made at the $k$-th time instant? We must look at several pairwise error probabilities. We will define a first error event at time $k$ as the event that a path $\underline{I'}$ diverges from the state $+1$ and is declared a survivor when it re-merges.

Notice that the last $L$ values have to be the same for both paths at the time they re-merge. We define the error sequence $\epsilon_j = \frac{1}{2d}(I_j - I_j^\prime)$.

For the example in the figure below, $l=2$ (the number of taps) and the error sequence is $\underline{\epsilon} = [\epsilon_k] = [1]$, since $d=1$.

For the event specified in the figure to occur, the following three events are needed:

1. $E_1$: at time $k$, the two paths are in the same state, $s'_{k-1} = s_{k-1}$.
2. $E_2$: $\underline{\epsilon}$ must be a valid error sequence.
3. $E_3$: $\underline{I'}$ must survive over $\underline{I}$.

Therefore, the pairwise error probability is $Q\left(\sqrt{\frac{4d^2(f_0^2+f_1^2)}{4\sigma^2}}\right)$.

If we want a passband signal with unit energy and the noise PSD is $N_0/2$, then the equivalent baseband signal should have energy two. Therefore $d=\sqrt{2}$ and $\sigma^2=N_0$, and thus the pairwise error probability is $Q\left(\sqrt{\frac{2(f_0^2+f_1^2)}{N_0}}\right)$.

The exact bit error rate is tougher to compute. This prompts us to use the pairwise error probability corresponding to the most probable error event as a good approximation at high SNR. Therefore, we can conclude that the probability of bit error $\approx Q\left(\sqrt{\frac{2(f_0^2+f_1^2)}{N_0}}\right)$.
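The high-SNR approximation above is easy to evaluate numerically. A small sketch (function names are illustrative) using the standard relation $Q(x) = \frac{1}{2}\operatorname{erfc}(x/\sqrt{2})$:

```python
from math import erfc, sqrt

def Q(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * erfc(x / sqrt(2))

def pairwise_error_prob_2tap(f0, f1, N0):
    """High-SNR approximation: P(bit error) ~ Q(sqrt(2 (f0^2 + f1^2) / N0))."""
    return Q(sqrt(2 * (f0**2 + f1**2) / N0))
```

For a unit-energy channel ($f_0^2+f_1^2=1$), this reduces to the no-ISI BPSK expression $Q(\sqrt{2/N_0})$.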

### Comparison to channel with NO ISI

For any 2-tap ISI channel with $f_0^2+f_1^2=1$, at high SNR there is no loss in performance relative to a channel with no ISI.

For the current example,

$$P(\text{first error event at time } k) = P(E_1 \cap E_2 \cap E_3) \qquad (4)$$

$P(E_3)$ is the pairwise error probability between $\underline{I}$ and $\underline{I'}$. We will call the squared Euclidean distance between the outputs along $\underline{I}$ and $\underline{I'}$

$$\delta^2(\underline{I}, \underline{I'}) = \sum_k \left| o_k(\underline{I}) - o_k(\underline{I'}) \right|^2.$$

We know from the earlier part of the course that the pairwise error probability depends only on $\delta^2(\underline{I}, \underline{I'})$ and is equal to $Q\left(\sqrt{\frac{\delta^2(\underline{I}, \underline{I'})}{4\sigma^2}}\right)$, where $\sigma^2$ is the variance of the complex Gaussian noise in each dimension (real and imaginary parts separately).

For the example considered in the figures above, $\delta^2(\underline{I}, \underline{I'}) = 4 f_0^2 + 4 f_1^2$.

To recap the general idea: we compute the squared Euclidean distance between the outputs corresponding to $\underline{I}$ and $\underline{I'}$, and then the pairwise probability of error is equal to $Q\left(\sqrt{\frac{\delta^2(\underline{I}, \underline{I'})}{4\sigma^2}}\right)$.

We have defined $\underline{\epsilon} = \frac{1}{2d} (\underline{I}-\underline{I'})$ and $\epsilon_j = \frac{1}{2d}(I_j - I'_j)$. Therefore, $2d\,\underline{\epsilon} = \underline{I}-\underline{I'}$.

Let $O(\underline{I})$ denote the output (without noise) for the input sequence $\underline{I}$. Then, due to the fact that the ISI channel is modeled as an LTI system,

$$O(\underline{I}) - O(\underline{I'}) = O(\underline{I} - \underline{I'}) = 2d\, O(\underline{\epsilon}), \qquad \text{so} \qquad \delta^2(\underline{I},\underline{I'}) = 4d^2 \sum_k \Big| \sum_j f_j\, \epsilon_{k-j} \Big|^2.$$
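Because $O(\underline{\epsilon})$ is just the convolution of the taps with the error sequence, $\delta^2$ can be computed in one line. A sketch (function name illustrative):

```python
import numpy as np

def delta_sq(f, eps, d=1.0):
    """delta^2(I, I') = 4 d^2 * sum_k |(f * eps)_k|^2 for error sequence eps."""
    y = np.convolve(f, eps)                # LTI channel acting on the error sequence
    return 4 * d**2 * np.sum(np.abs(y)**2)
```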

### 3 tap example

Suppose $\underline{I} = [d,d,d,\ldots]$. Now consider $\underline{I'} = [-d,d,d,\ldots]$, so that $\underline{\epsilon} = [1,0,0,\ldots]$. Then

$\delta^2(\underline{I},\underline{I'}) = 4d^2\left[|f_0|^2+|f_1|^2+|f_2|^2\right]$.

However, this is not the $\underline{\epsilon}$ with the smallest Euclidean distance. Consider $\underline{I'} = [-d,-d,d,\ldots]$, i.e., $\underline{\epsilon} = [1,1,0,\ldots]$. Then $\delta^2(\underline{I},\underline{I'}) = 4d^2\left[|f_0|^2+|f_0+f_1|^2+|f_1+f_2|^2+|f_2|^2\right]$, which may be smaller than $4d^2\left[|f_0|^2+|f_1|^2+|f_2|^2\right]$.

Example: $f_0 = \frac{1}{2}$, $f_1 = -\frac{1}{\sqrt{2}}$, $f_2 = \frac{1}{2}$, so $f_0^2+f_1^2+f_2^2 = 1$, and we also have

$$|f_0|^2+|f_0+f_1|^2+|f_1+f_2|^2+|f_2|^2 = \frac{1}{2} + 2\left(\frac{1}{2}-\frac{1}{\sqrt{2}}\right)^2 = 2-\sqrt{2} \approx 0.586 \qquad (8)$$

Since $10\log_{10}(2-\sqrt{2}) \approx -2.32$, we suffer a 2.32 dB loss!
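The 2.32 dB figure can be verified numerically by comparing the normalized distances $\delta^2/(4d^2)$ of the single-error and double-error events via convolution:

```python
import numpy as np

f = np.array([0.5, -1/np.sqrt(2), 0.5])                 # the 3-tap example

# delta^2 / (4 d^2) for a single error (eps = [1]) vs. a double error (eps = [1, 1])
d2_single = np.sum(np.abs(np.convolve(f, [1]))**2)      # f0^2 + f1^2 + f2^2 = 1
d2_double = np.sum(np.abs(np.convolve(f, [1, 1]))**2)   # 2 - sqrt(2) ~ 0.586

loss_dB = 10 * np.log10(d2_double / d2_single)          # ~ -2.32 dB
```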