VLSI IMPLEMENTATION OF CHANNEL ESTIMATION FOR MIMO-OFDM TRANSCEIVER

In this study, a VLSI architecture for a MIMO-OFDM transceiver and an algorithm for implementing MMSE detection in a MIMO-OFDM system are proposed. The implemented MIMO-OFDM system transmits data at high throughput in the physical layer and uses optimized hardware resources while achieving the same data rate. The proposed architecture has low latency, high throughput and efficient resource utilization. The results obtained are compared with MATLAB results for verification. The main aim is to reduce the hardware complexity of the channel estimation.


INTRODUCTION
MIMO refers to radio links with multiple antennas at both the transmitter and the receiver. The antennas at each end of the communication link are combined to minimize errors and optimize data speed. With multiple antennas, the spatial dimension can be exploited to improve the performance of the wireless link. MIMO technology helps to achieve significant improvements in array gain, spatial diversity gain and spatial multiplexing gain, and to reduce interference. With multiple antennas at the transmitter and receiver, a rich scattering channel can be exploited to create a multiplicity of parallel links over the same radio band, either to increase the data rate through multiplexing or to improve system reliability through increased antenna diversity. Performance is high when MIMO is used together with Orthogonal Frequency Division Multiplexing (OFDM) (Haene et al., 2008). OFDM is a multi-carrier transmission technique for high-speed bi-directional wireless data communication. It is based on Frequency Division Multiplexing (FDM), a technology that uses multiple frequencies to transmit multiple signals simultaneously. MIMO-OFDM is a promising technique that combines the advantages of MIMO and OFDM, i.e., immunity to delay spread as well as large transmission capacity (Jungnickel et al., 2005). Different types of space-time codes have been applied to MIMO-OFDM systems to increase the capacity.
The complexity of general MIMO channel estimation is prohibitive in practice. Hence, the pilot signals are chosen to be orthogonal to allow low-complexity channel estimation for multi-antenna transmissions. In this work, channel estimation is implemented to operate in real time with different antenna configurations and mobility conditions. The high hardware cost of such systems motivates redesigning the critical functional blocks in order to satisfy timing and power constraints and to minimize overall circuit complexity and cost.
The objective of this study is to carry out an efficient implementation of the OFDM system (i.e., transmitter and receiver) on a Field Programmable Gate Array (FPGA) in such a manner that the use of hardware resources is minimized and system performance is improved. The FPGA has been chosen as the target platform because OFDM has large arithmetic processing requirements, which can be restrictive if implemented in software on a Digital Signal Processor (DSP). In addition, an FPGA can be reconfigured in response to real-world performance evaluation. The performance improvements of MIMO technology also entail a considerable increase in signal processing complexity, in particular for the separation of the parallel data streams. A major challenge in implementing such wireless communication systems is the design of a low-complexity MIMO detection algorithm and the corresponding VLSI architecture.
The study is organized as follows. Section 2 briefly presents the system model and Section 3 discusses the channel estimation algorithm. Section 4 discusses the VLSI architecture for MIMO-OFDM and MMSE channel estimation. Section 5 contains the conclusion.

Overview and MIMO-OFDM Specification
MIMO refers to the process of transmitting multiple streams of data on multiple antennas at the same frequency in order to increase throughput (Sibille et al., 2010). The essential task of a MIMO receiver is to determine which data stream was sent from which antenna, since the receiver receives data from all the transmitter antennas. In MIMO-OFDM, the serial data at the input is a sequence of samples occurring at interval $T_s$. The high-rate serial input data sequence is first applied to a Serial-to-Parallel (S/P) converter and converted into $M$ low-rate parallel streams in order to increase the symbol duration to $T = M T_s$, where $M$ is the total number of subcarriers. The Packet Error Rate (PER) and throughput in the physical layer were evaluated under multipath fading conditions in a baseband simulation. The low-rate streams, represented by the symbols $b_m[k]$, $m = 0, 1, \ldots, M-1$, $k = 1, 2, \ldots$, are modulated onto different subcarriers. In order to eliminate interference between parallel data streams, each of the low-rate data streams is modulated onto a distinct subcarrier belonging to an orthogonal set with subcarrier spacing $1/T$. The parallel streams are then multiplexed and a cyclic prefix is added to eliminate the effect of Inter-Symbol Interference (ISI). Equation 1 gives the signal $y(t)$ transmitted during the symbol interval, where $b_m[k]$ is the $k$-th data symbol of the $m$-th stream. The transmitted signal $y(t)$ passes through the wireless channel, which introduces signal distortion and additive noise.
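The modulation chain described above can be illustrated with a minimal Python sketch. It assumes that the subcarrier modulation is a plain inverse DFT over one block of $M$ symbols and that the cyclic prefix is a copy of the last samples of that block. The function names, the QPSK-like symbol values and the prefix length are hypothetical and are not taken from the implemented design.

```python
import cmath


def ofdm_modulate(symbols, cp_len):
    """Place M parallel symbols on M orthogonal subcarriers (inverse DFT)
    and prepend a cyclic prefix of cp_len samples."""
    m_total = len(symbols)
    time = [
        sum(symbols[m] * cmath.exp(2j * cmath.pi * m * n / m_total)
            for m in range(m_total)) / m_total
        for n in range(m_total)
    ]
    # The cyclic prefix repeats the tail of the OFDM symbol at its front.
    return time[m_total - cp_len:] + time


def ofdm_demodulate(samples, cp_len):
    """Drop the cyclic prefix and recover the subcarrier symbols (forward DFT)."""
    body = samples[cp_len:]
    m_total = len(body)
    return [
        sum(body[n] * cmath.exp(-2j * cmath.pi * m * n / m_total)
            for n in range(m_total))
        for m in range(m_total)
    ]


# Hypothetical block of M = 8 QPSK-like symbols with a 2-sample prefix.
symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
tx = ofdm_modulate(symbols, cp_len=2)
recovered = ofdm_demodulate(tx, cp_len=2)
```

Over an ideal channel, `recovered` matches `symbols` up to floating-point rounding, which is a direct consequence of the orthogonality of the subcarriers.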
The channel can be modelled as a multipath frequency-selective fading channel using a tapped-delay line with time-varying coefficients and fixed tap spacing (Prasad, 2004). Equation 2 gives the time-varying impulse response of the channel, where $h_l$ and $\tau_l$ are the complex amplitude and delay of the $l$-th path, respectively. For OFDM to be effective, the length of the cyclic prefix should be larger than the maximum multipath delay spread of the channel.
The received signal $r(t)$ can be represented as shown in Equation 3. The received signal is demodulated after cyclic prefix removal. For practical implementation, modulation and demodulation can be achieved by the Inverse Fast Fourier Transform (IFFT) and the Fast Fourier Transform (FFT), respectively.
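The role of the cyclic prefix can be checked numerically with the following sketch. It assumes a noise-free, time-invariant tapped-delay line with sample-spaced taps and a hypothetical 3-path channel whose delay spread is shorter than the prefix. Under these assumptions each subcarrier is scaled by a single complex coefficient after prefix removal and the DFT.

```python
import cmath


def dft(x, sign=-1):
    """Direct DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(sign * 2j * cmath.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]


def multipath(samples, taps):
    """Tapped-delay line: taps[d] is the complex gain of the path delayed by d samples."""
    return [
        sum(taps[d] * samples[n - d] for d in range(len(taps)) if n - d >= 0)
        for n in range(len(samples))
    ]


M, CP = 8, 3
taps = [1.0, 0.5j, -0.25]                 # hypothetical channel, delay spread 2 < CP
b = [1, -1, 1j, -1j, 1, 1, -1, -1j]       # one block of subcarrier symbols

time = [v / M for v in dft(b, sign=+1)]   # OFDM modulation (inverse DFT)
tx = time[M - CP:] + time                 # cyclic prefix insertion
rx = multipath(tx, taps)                  # channel, noise omitted in this sketch
Y = dft(rx[CP:CP + M])                    # prefix removal and demodulation
H = dft(taps + [0] * (M - len(taps)))     # channel frequency response per subcarrier

# Largest deviation from the per-subcarrier model Y[m] = H[m] * b[m].
max_err = max(abs(Y[m] - H[m] * b[m]) for m in range(M))
```

Because the prefix is longer than the delay spread, `max_err` stays at the level of rounding error; with a prefix shorter than the delay spread the same check would fail because of inter-symbol interference.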
Channel estimation is applied to obtain an estimate of the channel fading in each subcarrier so that coherent detection can be achieved. Assuming that the channel impulse response is quasi-static during the $k$-th symbol interval, so that $h(t) = h(kT)$ for $kT \le t \le (k+1)T$, the inter-carrier interference can be neglected compared to the noise. Known data symbols (pilots) are transmitted at the beginning of the session or multiplexed into the user data stream at a later stage, and the initial estimation of the channel parameters is performed using the received pilot signal. The channel is frequency selective and time-varying for wideband mobile communication systems, so dynamic estimation of the channel is important before the demodulation of OFDM signals. The channel estimation can be done by inserting pilot tones into all of the subcarriers of OFDM symbols with a specific period or by inserting pilot tones into each OFDM symbol. The mathematical model for the pilot symbols is as follows. At time $t$, antenna 0 transmits $P_0$ and antenna 1 transmits $P_1$. At the next time $t+T$, antenna 0 transmits $-P_1$ and antenna 1 transmits $P_0$. In Equations 4 and 5, $R_0$ and $R_1$ are the received data from both antennas, and $P_0$ and $P_1$ are the pilot symbols, whose values are calculated using a PRBS generator. By solving the simultaneous Equations 4 and 5, the channel estimates are obtained. The BER and MSE are improved in MIMO-OFDM systems by using pilot-based channel estimation (Jiang et al., 2011).
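The pilot scheme can be solved in closed form, as the following sketch shows. It assumes that Equations 4 and 5 take the form $R_0 = H_0 P_0 + H_1 P_1$ and $R_1 = -H_0 P_1 + H_1 P_0$ for one subcarrier at one receive antenna, with noise omitted. This form is inferred from the transmission pattern above and is not confirmed by the text. The function name and the numerical channel values are hypothetical.

```python
def estimate_two_tx(r0, r1, p0, p1):
    """Solve R0 = H0*P0 + H1*P1 and R1 = -H0*P1 + H1*P0 for H0 and H1.

    Assumes P0**2 + P1**2 != 0, which always holds for real +/-1 pilots
    such as those produced by a PRBS generator.
    """
    denom = p0 * p0 + p1 * p1
    h0 = (r0 * p0 - r1 * p1) / denom
    h1 = (r0 * p1 + r1 * p0) / denom
    return h0, h1


# Hypothetical channel coefficients for one subcarrier and +/-1 pilots.
true_h0, true_h1 = 0.8 - 0.3j, -0.2 + 0.6j
p0, p1 = 1, -1

r0 = true_h0 * p0 + true_h1 * p1       # received at time t
r1 = -true_h0 * p1 + true_h1 * p0      # received at time t + T

h0_hat, h1_hat = estimate_two_tx(r0, r1, p0, p1)
```

With $\pm 1$ pilots the denominator equals 2, so in hardware the division reduces to a shift and the estimator needs only additions and sign changes.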
In MIMO-OFDM systems, pilot symbols are inserted during subcarrier mapping in both the time and frequency directions so that the receiver can estimate time-variant radio channels. The reference signals transmitted from multiple antennas are orthogonal to each other. As a consequence, the channel impulse response between different Tx-Rx antenna pairs can be estimated individually. The received OFDM symbol $y$ at one receive antenna is shown in Equation 6, where $w$ is additive white Gaussian noise with variance $\sigma_w^2$ at the receive antenna and the vector $h$ contains the channel coefficients in the frequency domain. The matrix $X$ comprises the permuted data symbols $x_d$ and pilot symbols $x_p$ on the main diagonal. The permutation is given by the permutation matrix defined in Equations 7-9.
The data rate is controllable by combining puncturing and QAM mapping. When the communication environment changes, the transceiver system calculates the Bit-Error-Rate (BER). If the BER is beyond a threshold value, the system changes the puncturing and QAM mapping options. If the block in use is 64-QAM and the error rate is higher than the threshold, the system replaces the 64-QAM block with the next more reliable block (16-QAM). While the system is changing the QAM mapping, it can also change the puncturing block to one with a different rate. BPSK provides the lowest data rate but is the most reliable. 64-QAM provides the highest data rate but is the least reliable. As another technique to control the data rate, a spatial multiplexing algorithm can be used to parse the kernel when the BER is low.
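The BER-driven fallback can be sketched as a simple selection rule. The sketch assumes a ladder that contains only the three mappings named above and a single fixed threshold. The threshold value, the function name and the absence of a step-up rule are assumptions made for illustration, and the puncturing choice is left out.

```python
# Mappings named in the text, ordered from most reliable to highest data rate.
MODES = ["BPSK", "16-QAM", "64-QAM"]


def next_mode(current, ber, threshold):
    """Step down to the next more reliable mapping when the BER exceeds the threshold."""
    index = MODES.index(current)
    if ber > threshold and index > 0:
        return MODES[index - 1]
    return current


BER_THRESHOLD = 1e-3   # hypothetical value, not taken from the paper

after_bad_channel = next_mode("64-QAM", ber=1e-2, threshold=BER_THRESHOLD)
after_good_channel = next_mode("64-QAM", ber=1e-5, threshold=BER_THRESHOLD)
```

Here `after_bad_channel` is `"16-QAM"` and `after_good_channel` stays at `"64-QAM"`. BPSK is the floor of the ladder, so a high BER in BPSK mode leaves the mapping unchanged.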

CHANNEL ESTIMATION
Channel estimation is necessary for all coherent-detection-aided transceivers and its accuracy has a great impact on the achievable BER. It is normally carried out by incorporating pilot symbols, which are known at the receiver, into the information stream. However, the transmitted pilots reduce the effective data throughput, and this loss may be quite high for high-velocity vehicles and a large number of transmit antennas (Simko et al., 2011). More explicitly, the channel estimation complexity and the pilot overhead required for accurate channel estimation may become particularly high in the context of MIMO systems.
Channel estimation is performed in the frequency domain and realized with multiple correlation circuits operating in parallel in the same FPGA. The correlation is performed over multiple training symbols using a dedicated memory cell for each I and Q branch, for each pair of transmit and receive antennas and for each subcarrier. Due to the additional read-write operations from and to the memory cell, the channel estimator is revised for operation at 100 MHz. The block diagram of the simplified channel estimation process is depicted in Fig. 1.
The estimation error is denoted as $e(n)$. The aim of most channel estimation algorithms is to minimize the Mean Squared Error (MSE), $E[e^2(n)]$, while using as few computational resources as possible in the estimation process. Accurate channel estimates are needed in MIMO-OFDM systems for decoding purposes (Wang et al., 2008).
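As a small numerical illustration of this criterion, the sketch below computes the sample MSE between a set of channel coefficients and their estimates. It assumes complex-valued coefficients, so the squared error is taken as $|e(n)|^2$, and it replaces the expectation by an average over the available samples. The coefficient values are hypothetical.

```python
def mean_squared_error(true_h, est_h):
    """Sample estimate of E[|e(n)|^2] with e(n) = true_h[n] - est_h[n]."""
    if len(true_h) != len(est_h) or not true_h:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - e) ** 2 for t, e in zip(true_h, est_h)) / len(true_h)


# Hypothetical channel coefficients and their estimates on three subcarriers.
true_h = [1 + 0j, 0.5j, -0.5 + 0j]
est_h = [1.1 + 0j, 0.5j, -0.5 - 0.1j]

mse = mean_squared_error(true_h, est_h)   # (0.01 + 0 + 0.01) / 3
```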
The number of degrees of freedom in the estimated frequency-domain channel coefficients depends on the length of the time-domain channel impulse response. Assuming that the channel is shorter than the cyclic prefix and that the number of subcarriers is larger than the cyclic prefix length, the estimated frequency-domain channel coefficients are correlated.

MIMO-OFDM System
Many efforts have been made by VLSI communication groups to implement the MIMO-OFDM physical layer. The most straightforward and efficient way to design a hardware-efficient MIMO-OFDM system is to reduce the hardware resources used for the Fast Fourier Transform (FFT) block. Multiple independent channels need multiple FFT operations. If N independent channels arrive at the same time, N parallel FFT blocks are needed. If each channel arrives at a different time, resource sharing is possible between the channels, which means that the number of FFT blocks can be reduced. As less parallelism is used for FFT processing, the hardware area occupied by the FFT block becomes dramatically smaller. If N FFT operations can be done by a single FFT block without any parallelism, it can be regarded as a conceptually ideal system (Storn, 1994). For the FPGA implementation, the MIMO-OFDM physical layer is partitioned into synchronization, OFDM modulation/demodulation, channel estimation, tracking, MIMO pre-processing and detection, a First-In First-Out (FIFO) buffer and channel de/coding. Since the data flow in transmit mode is trivial, the explanations focus on the receive mode. The synchronization unit is active only when the channel is idle and during reception of the preamble. After successful synchronization, OFDM demodulation starts with the beginning of the training sequence. During this training, the output of the OFDM demodulator is fed to the channel estimation unit. The resulting estimates $\hat{H}[k]$ of the channel matrices for all data and pilot tones are stored in a memory (Mehlfuhrer et al., 2008).
In transmit mode, the OFDM de/modulation unit maps binary data to complex-valued constellation points, computes the superposition of all modulated tones with an IFFT and inserts the cyclic prefix. At the beginning of each frame, the de/modulation unit also outputs the preamble, whose time-domain representation is stored in RAMs instead of being generated at run-time in order to reduce the transmit latency to a minimum. In receive mode, the same unit demodulates the received OFDM symbols by means of FFTs. For the de/modulation of OFDM symbols, a 64-point I/FFT processor with a single radix-4 processing element is shared between the transmit and receive data paths. The different spatial streams are processed in a time-interleaved fashion by the same hardware. The memory unit stores the complex-valued vector to be processed. In order to provide sufficient memory bandwidth, the storage is divided into four separate dual-ported memory banks, each holding up to 16 complex-valued data words. Figure 2 shows the test bench waveform for a 4×4 MIMO-OFDM transceiver.
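A behavioural software model of the shared radix-4 FFT is sketched below. It assumes a recursive decimation-in-time decomposition and models the time-interleaved processing simply as one FFT routine applied to each spatial stream in turn. It does not reflect the memory banking, scheduling or fixed-point word lengths of the actual processor, and the function names are hypothetical.

```python
import cmath


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    q = n // 4
    f0, f1, f2, f3 = (fft_radix4(x[r::4]) for r in range(4))
    out = [0j] * n
    for k in range(q):
        w = cmath.exp(-2j * cmath.pi * k / n)          # twiddle factor W_N^k
        a, b, c, d = f0[k], w * f1[k], w ** 2 * f2[k], w ** 3 * f3[k]
        # Radix-4 butterfly: the only non-trivial factors are +/-1 and +/-j.
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out


def demodulate_streams(streams):
    """Process the spatial streams one after another with the same FFT routine."""
    return [fft_radix4(stream) for stream in streams]


# Four hypothetical 64-sample spatial streams, as in a 4x4 configuration.
streams = [[complex((i + s) % 7, -((i * (s + 1)) % 5)) for i in range(64)]
           for s in range(4)]
spectra = demodulate_streams(streams)
```

A 64-point transform decomposes into three radix-4 stages (64 = 4 × 4 × 4), and the butterfly itself needs no multipliers beyond the twiddle factors, which is one reason a radix-4 element is attractive for this transform size.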
The area report generated upon simulation is presented in Table 1, which shows the device utilization factor. It can be noted that the device utilization factor is low for the implementation. The proposed architecture has a fixed latency and 150 times faster performance (with 512 OFDM subcarriers) due to its completely pipelined architecture. The architecture is thus better suited for high-speed wireless transceivers with a large number of OFDM subcarriers.

MMSE Channel Estimation
For a MIMO-OFDM system with $M_T$ transmit antennas, $M_R$ receive antennas and $K$ data subcarriers, the MIMO channel for the $k$-th data subcarrier is given in the frequency domain by the $M_R \times M_T$ matrix $H_k$. The received vector for the $t$-th data symbol is shown in Equation 10: $y_k(t) = H_k s_k(t) + n_k(t)$, where $s_k(t)$ is the transmitted vector and $n_k(t)$ is the channel noise. $H_k$ is estimated from the training symbols. In MIMO detection, the channel matrix is inverted to extract $s_k(t)$ from the received vector. The inverse matrix given by the MMSE algorithm is shown in Equation 11, where $\sigma^2$ indicates the noise variance and $(\cdot)^H$ denotes the Hermitian transpose. The matrix inversion described by Equation 11 is called "preprocessing." The final step is to decode the approximate transmit vectors using Equation 12, which gives the estimate of the transmitted vector. The MMSE estimator employs the second-order statistics of the channel conditions to minimize the mean-square error. Let $R_{gg}$, $R_{HH}$ and $R_{YY}$ denote the auto-covariance matrices of $g$, $H$ and $Y$ respectively, and $R_{gY}$ the cross-covariance matrix between $g$ and $Y$. The preprocessing represented in Equation 11 is the most costly computation in MMSE detection, which makes the hardware implementation demanding, especially for the matrix inversion. The MMSE-MIMO detector and a 4×4 MIMO-OFDM transceiver were implemented.
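The two detection steps can be illustrated with the following sketch for two transmit antennas. It assumes that Equation 11 has the standard linear MMSE form $G_k = (H_k^H H_k + \sigma^2 I)^{-1} H_k^H$ and that Equation 12 is $\hat{s}_k(t) = G_k y_k(t)$. Neither form is confirmed by the text. The closed-form 2×2 inverse, the function names and the numerical values are chosen for illustration only; the implemented 4×4 detector requires a larger inversion.

```python
def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mmse_preprocess(h, noise_var):
    """Preprocessing step: G = (H^H H + sigma^2 I)^-1 H^H for two transmit antennas."""
    hh = hermitian(h)
    gram = matmul(hh, h)
    for i in range(2):
        gram[i][i] += noise_var
    return matmul(inverse_2x2(gram), hh)


def mmse_detect(g, y):
    """Detection step: s_hat = G y."""
    return [sum(g[i][j] * y[j] for j in range(len(y))) for i in range(len(g))]


# Hypothetical 2x2 channel on one subcarrier, noise-free received vector.
H = [[1 + 0j, 0.5j], [0.2 + 0j, 1 - 0.3j]]
s = [1 + 1j, -1 + 1j]
y = [sum(H[i][j] * s[j] for j in range(2)) for i in range(2)]

G = mmse_preprocess(H, noise_var=0.01)
s_hat = mmse_detect(G, y)
```

The preprocessing result `G` depends only on the channel estimate and the noise variance, so it can be computed once per subcarrier and reused for every data symbol until the channel estimate is updated. With `noise_var` set to zero the filter reduces to the zero-forcing solution.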

Fig. 3. Test bench waveform for MMSE channel estimation
The proposed architecture needs a large word length and has a considerable circuit scale. However, its latency in the MIMO-OFDM system is superior when there is a large number of OFDM subcarriers. The factor $K$ indicates the number of data subcarriers. The latency of the conventional architectures exceeds the OFDM symbol duration when there are more than 128 OFDM subcarriers. Figure 3 shows the simulation of the MMSE estimator.

CONCLUSION
In this study, the MIMO-OFDM transceiver system is designed in VHDL. Each block and its sub-blocks are tested separately and the errors are corrected. Test-bench waveforms are generated for the MIMO-OFDM system and the MMSE channel estimation block. The results are compared with previously calculated MATLAB results to ensure the correctness of the design prior to implementation on the FPGA. The proposed implementation saves hardware resources and minimizes the effect of worst-case disturbance on the channel estimation error under uncertain conditions. This channel estimator offers a good performance-complexity trade-off.