# Real-Time Hardware Demonstration of 180 Gbps DFT-S OFDM Receiver Based on Digital Sub-banding

A. Tolmachev<sup>(1)</sup>, M. Meltsin<sup>(1)</sup>, R. Hilgendorf<sup>(1)</sup>, M. Orbah<sup>(1)</sup>, Y. Birk<sup>(1)</sup>, S. Ben-Ezra<sup>(2)</sup> and M. Nazarathy<sup>(1)</sup>

<sup>(1)</sup> EE Faculty, Technion, Haifa 32000, Israel. nazarat@ee.technion.ac.il
<sup>(2)</sup> Finisar Israel Ltd, 26 Harokmim, Holon, Israel

**Abstract** In just 3 FPGAs we realize fastest (180 Gbps) real-time filter-bank based DFT-S CO-OFDM 16-QAM 25 GHz Rx, at record 1.06 samples/symbol (7.3 b/Hz), demonstrating dual polarization SMF transmission. Extrapolated ASIC would save <50% power.

### Introduction

Coherent optical orthogonal frequency division multiplexing (CO-OFDM), including Discrete Fourier Transform-spread (DFTS-OFDM) variants [1,2], with dual-polarization QAM is a promising technology for optical communication. In recent years, several groups have demonstrated real-time high-speed CO-OFDM implementations. For practical systems, overall complexity must be reduced and efficiency improved. In this work we report the fastest and most efficient *real-time* receiver (Rx) system, based on our novel sub-banding DFTS-OFDM DSP (Tab. 1) [2, 3].

Tab. 1: Some recent CO-OFDM real-time demonstrations in comparison with our current work

| Author [Ref.] | Net Rate | Baud Rate | DAC/ADC     | HW Platform       | Modulation | Spectral Efficiency | Distance |
|---------------|----------|-----------|-------------|-------------------|------------|---------------------|----------|
| S. Chen [4]   | 110 Gbps | 18 GBd    | 6G/1.5G     | 1 FPGA chip       | DP-QPSK    | 1.9 bit/Hz          | 600 km   |
| N. Kaneda [5] | 76 Gbps  | 29 GBd    | 64G/32G     | 4 FPGAs Virtex-7  | DP-QPSK    | 2.9 bit/Hz          | 500 km   |
| X. Xiao [6]   | 100 Gbps | 16.3 GBd  | 63G/42G     | 32 FPGAs Virtex-6 | DP-16QAM   | 6.57 bit/Hz         | 200 km   |
| This Work     | 180 Gbps | 25 GBd    | 26.6G/26.6G | 3 FPGAs* Virtex-6 | DP-16QAM   | 7.3 bit/Hz          | 100 km   |

\*Note: Our system needs just 3 FPGAs to arbitrarily access, on the fly and in real time, any of the 14 data-carrying sub-bands per channel. Extrapolated to a full Rx serving all 15 sub-bands of the 25 GHz channel, the equivalent processing power of 16 Virtex-6 FPGAs would be required to process all sub-bands in parallel; even so, our full extrapolated system would require a factor-of-two fewer FPGAs than the state-of-the-art demo of Xiao [6], while carrying almost twice the bit rate (180 Gbps rather than 100 Gbps in [6]).

This work takes a major step toward more efficient hardware (HW) implementation, with higher spectral efficiency, lower sampling rates and significantly reduced HW computational complexity of the receiver (Rx), as itemized in the note above.

Our demonstrated shorter reach is not an actual system limitation but is due to experimental resource availability. In fact, our attained EVM in principle suffices for competitive extrapolation of distances. Our key advantages: (i) major reduction (~factor-of-two) in the real-time HW requirement, attained due to our unique twice-under-decimated digital sub-banding technique, theoretically detailed in [2]; (ii) maximizing spectral efficiency (SE) and reducing the sampling rate to 1.07 samples/symbol (rather than the conventional 2 samples/symbol), due to the unique digital sub-banding algorithm [2] eliminating spectral guardbands between the sub-bands, e.g. as in [7]; (iii) first real-time, highly HW-efficient FPGA implementation of specific novel DSP algorithms (at the sub-band processor level): *Golay-Complementary-Codes* (GCC) based *timing* recovery and pilot-assisted *Carrier Frequency Offset* (CFO) compensation, where the pilot resides in the mid sub-band, saving HW-expensive band-pass filtering.
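As a rough illustration of why Golay complementary codes suit timing recovery, the sketch below builds a complementary pair and locates a preamble by summing the two correlator outputs. It is a generic textbook construction, not the FPGA design reported in this paper: the function names, the zero gap between the two sequences, the sequence length and the noise level are all assumptions made for the example.

```python
import numpy as np

def golay_pair(order):
    """Return a Golay complementary pair of length 2**order (entries +/-1)."""
    a, b = np.array([1.0]), np.array([1.0])
    for _ in range(order):
        # Standard doubling step: (a, b) -> (a|b, a|-b)
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

def autocorr_sum(a, b):
    """Sum of the two aperiodic autocorrelations of the pair."""
    return np.correlate(a, a, mode="full") + np.correlate(b, b, mode="full")

def estimate_timing(rx, a, b, gap):
    """Estimate where sequence a starts; b is assumed to start
    len(a) + gap samples after a (an assumption of this sketch)."""
    n = len(a)
    ca = np.correlate(rx, a, mode="valid")  # ca[k] = sum_i rx[k+i] * a[i]
    cb = np.correlate(rx, b, mode="valid")
    shift = n + gap
    # Align the two correlator outputs so that both peaks coincide, then add.
    metric = np.abs(ca[:len(ca) - shift] + cb[shift:])
    return int(np.argmax(metric)), metric

# Hypothetical noisy frame: a, a gap of zeros, then b.
rng = np.random.default_rng(1)
a, b = golay_pair(6)
n, gap, start = len(a), len(a), 200
rx = 0.3 * rng.standard_normal(1000)
rx[start:start + n] += a
rx[start + n + gap:start + 2 * n + gap] += b

offset, metric = estimate_timing(rx, a, b, gap)
print("estimated preamble start:", offset)
```

The summed autocorrelation of the pair has no sidelobes, which is what gives the combined metric a single sharp peak. The correlators need only additions and subtractions because the code entries are ±1, which is one reason such sequences are attractive in hardware.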

### System description

The experimental setup (Fig. 1) comprises an optical transmitter (Tx) and Rx for a 180 Gbps data payload over a 100 km SMF optical link.

Fig. 1: (left) System block diagram of experimental setup. (right) Real-time 180 Gbps sub-banded DFT-S OFDM Rx on just 3 Virtex-6 FPGAs

At the Tx side, each transmission band is digitally multiplexed out of 16 *Single Carrier* (SC) sub-bands, realized as DFTS-OFDM 16-QAM per sub-band modulation (Fig. 2). Among the 16 sub-bands, 14 carry uncorrelated PRBS19 data; the other two carry no data and are reserved for special purposes: the 16<sup>th</sup>, wrapped-around sub-band provides the ADC anti-aliasing filter transition; the 8<sup>th</sup> sub-band carries two ±415.6 MHz pilots for CFO/phase recovery. The 14 remaining sub-bands carry data, 64-DFTS-OFDM per sub-band, corresponding to 1024 subcarriers for all 16 sub-bands. The *Cyclic Prefix* (CP) is just 8 samples (0.83%). Overall spectral efficiency is 8 × 14/15 × 0.9917 ≈ 7.4 b/Hz (8 b/symbol for dual-polarization 16-QAM, 14 data sub-bands out of the 15 in-channel sub-bands, and the CP overhead). Taking into account that just 1% of the transmitted data suffices for channel estimation, our final spectral efficiency is 7.34 b/Hz. To reduce the PAPR, we use a low-complexity combination of a *Partial Transmit Sequence* (PTS) algorithm [8] along with optimized clipping (Fig. 2 right). Random phases (±1 factors) are applied to each of the 14 data sub-bands. Out of the 2<sup>14</sup> possible combinations, we verified that 50 random combinations already provide significant improvement of several dB, in exchange for moderate extra complexity and 0.5% SE reduction. The last Tx digital stage performs digital high-pass pre-distortion, to compensate for the Micram ADC30 and DAC30 roll-off (Fig. 2 center-bottom).

Fig. 2: (left) Transmitter scheme (center) Data Spectrum (right) Results of PAPR reduction for DFT-S OFDM algorithm

The novel Rx DSP is detailed in Fig. 3. The basic functions comprise twice-under-decimated analysis *Filter Banks* (FB) [3] incorporating, at the FB outputs, pilot-assisted [9] CFO and nonlinear phase noise mitigation stages. "Classic" OFDM Rx functions are separately performed (in "Sub-band FPGA processors"), on a per-sub-band basis at 8-fold lower rate. In fact, the two X/Y-FB modules are the only ones operating at full rate. The OFDM receiver functions are: Rx timing sync based on Golay sequences [10], DFTS-OFDM de-multiplexing and channel equalization (adaptive LMS based), 16-QAM slicer and BER meter, all efficiently implemented in the FPGA HW with innovative solutions for the Golay timing. Although we afforded implementing just a single "Sub-band FPGA processor", the other 13 processors would be identical, operating in parallel. The sole sub-band processor is switchable to any FB output, enabling full evaluation of system performance by accessing and demodulating at will any sub-band (1.6 GHz slice) out of the 25 GHz channel spectrum.

Fig. 3: Sub-banded DFT-S OFDM 180 Gb/s receiver digital signal processing and its partitioning over 3 FPGAs
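The random ±1 sub-band phase search described in this section can be sketched in a few lines. This is a simplified model under stated assumptions, not the transmitter actually used: the mapping of each sub-band onto 64 contiguous bins of a 1024-point IFFT, the absence of oversampling and clipping, and all function names are hypothetical. Including the all-ones pattern as the starting candidate guarantees the search never returns a PAPR worse than the unmodified signal.

```python
import numpy as np

N_SUB, N_BINS = 16, 64            # 16 sub-bands x 64 points = 1024 subcarriers
# Assumed 0-based indexing: sub-band 8 (pilots) and 16 (ADC transition) carry no data.
DATA_BANDS = [k for k in range(N_SUB) if k not in (7, 15)]

def papr_db(x):
    """Peak-to-average power ratio of a complex sequence, in dB."""
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def partial_sequences(rng):
    """One time-domain partial sequence per data sub-band (DFT-spread 16-QAM)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    parts = np.zeros((len(DATA_BANDS), N_SUB * N_BINS), dtype=complex)
    for row, band in enumerate(DATA_BANDS):
        sym = rng.choice(levels, N_BINS) + 1j * rng.choice(levels, N_BINS)
        spec = np.zeros(N_SUB * N_BINS, dtype=complex)
        spec[band * N_BINS:(band + 1) * N_BINS] = np.fft.fft(sym)  # DFT spreading
        parts[row] = np.fft.ifft(spec)
    return parts

def pts_search(parts, n_trials, rng):
    """Try n_trials random +/-1 patterns; keep the one with the lowest PAPR."""
    best_signs = np.ones(parts.shape[0])
    best = papr_db(best_signs @ parts)       # baseline: no sign flips
    for _ in range(n_trials):
        signs = rng.choice([-1.0, 1.0], parts.shape[0])
        cand = papr_db(signs @ parts)
        if cand < best:
            best, best_signs = cand, signs
    return best_signs, best

rng = np.random.default_rng(0)
parts = partial_sequences(rng)
baseline = papr_db(parts.sum(axis=0))
signs, best = pts_search(parts, 50, rng)
print(f"baseline PAPR {baseline:.2f} dB, after 50 trials {best:.2f} dB")
```

Because each sub-band is transformed once and only the signs change between trials, every candidate costs one weighted sum rather than a new IFFT. The chosen sign pattern has to be conveyed to or recovered by the receiver; how that is done is outside this sketch.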

### Real-Time Implementation

All receiver functions (Fig. 3), except for the duplication of identical processor modules for all sub-bands, are implemented in real time using *just 3 FPGA chips*. The X/Y FPGAs interface to the ADC chips and perform calibration of internal ADC parameters and IQ gain/phase imbalance correction at system bring-up. Following IQ-imbalance correction, the data is de-multiplexed into sub-bands. Sub-banding is performed on sets of 16 input samples; due to the 2x oversampling, a factor-of-8 rate slow-down is attained.
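The rate bookkeeping behind this partitioning can be checked with a few lines of arithmetic. The sketch below takes the 26.6 GS/s ADC rate and the 8 time-parallel modules quoted in this section as inputs; the function name and dictionary keys are hypothetical.

```python
def filter_bank_rates(adc_rate_gsps=26.6, n_subbands=16,
                      oversampling=2, n_parallel=8):
    """Derive per-sub-band rates and the FPGA clock from the ADC rate."""
    decimation = n_subbands // oversampling          # 16 / 2 = 8
    slowdown = decimation * n_parallel               # 8 * 8 = 64
    return {
        "decimation": decimation,
        "slowdown": slowdown,
        # Sample rate at each filter-bank output (2x oversampled sub-band).
        "subband_rate_gsps": adc_rate_gsps / decimation,
        # Nominal width of one spectral slice.
        "slice_width_ghz": adc_rate_gsps / n_subbands,
        # Clock needed when samples are processed slowdown-at-a-time.
        "clock_mhz": adc_rate_gsps * 1e3 / slowdown,
    }

rates = filter_bank_rates()
for name, value in rates.items():
    print(f"{name}: {value}")
```

With these inputs the clock comes out at 415.625 MHz and the slice width at about 1.66 GHz, in line with the figures quoted in the text.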

We use 8 time-parallelized FB HW modules (Fig. 4 left), yielding a temporal slow-down factor of 64 at an FPGA clock rate of 415.625 MHz = (26.6 GS/s)/64. Following the FB stage, CFO is corrected using data extracted from a pilot sub-band, and the signal is de-multiplexed (frequency parallelized) over multiple sub-band processor HW modules; here a single module, the "Sub-Band FPGA" (Fig. 4 right), may be arbitrarily connected to any desired sub-band. This FPGA module also operates at a 415.6 MHz clock, using 8-fold temporal parallelization prior to decimation by a factor of 2 to the sub-band baud rate. Fig. 4 presents the Filter Bank and Sub-Band FPGA chip layouts and Tab. 2 presents FPGA utilization figures of merit (consistent with high HW efficiency).

Fig. 4: FPGA layouts: (left) Filter-bank; (right) Sub-Band

Tab. 2: HW utilization of the Virtex-6 FPGAs

| Parameter       | Filter Bank FPGA | Sub-Band FPGA |
|-----------------|------------------|---------------|
| Device          | XC6VHX565T       | XC6VHX380T    |
| Slices          | 43139 (48%)      | 32789 (54%)   |
| DSP Multipliers | 337 (39%)        | 220 (25%)     |
| Memory/BRAM     | 308 (20%)        | 160 (19%)     |

### Experimental Results

To evaluate system performance we measured EVM/EsN0 vs. OSNR (Fig. 5 left). A dominant impairment was identified as quantization noise. The estimated ENOB for both DAC and ADC was ~4 bits. In actual measurements, the effective ADC noise was higher due to the lack of an adaptive pre-ADC AGC stage. The maximally attainable estimated SNR was ~18 dB. Beyond back-to-back, transmission was tested over 100 km SMF (Fig. 5 middle), yielding minor performance loss, mostly due to setup imperfections. Using both polarizations (POL) yielded an SNR degradation of ~1 dB (likely due to dual-POL PAPR). In addition, we measured the performance as a function of sub-band number (Fig. 5 right) and, as expected, we see some degradation in performance, mainly due to ADC filter attenuation. In all measurements, the tested BER was under the soft FEC 3.8e-3 limit, indicating feasibility for practical optical CO-OFDM implementations at very high spectral and power efficiency.

Fig. 5: Experimental results: (left) Measured EsN0 (MER) for various OSNR conditions (middle) Measured EsN0 (MER) vs reach (right) Measured EsN0 (MER) per sub-band (bar-chart) over the various sub-bands (degradation away from center due to ADC/DAC …)

### Conclusions

We demonstrated a complete *real-time* Rx on the fastest (180 Gbps) *filter-bank* HW in just 3 FPGAs, at a record 1.06 samples/symbol (7.3 b/Hz). The FB complexity and sampling rate savings imply that an ASIC implementation of our method for a long-range optical DFTS-OFDM Rx would save <50% of the power of a conventional ASIC (even with 50% of the ASIC occupied by soft FEC). This work was kindly supported in part by FP7-ICT ASTRON grant n° 318714.

### References

[1] Q. Yang et al., "Coherent optical DFT-Spread OFDM...orthogonal band...," OPEX, 20, 2379 (2012).

[2] M. Nazarathy and A. Tolmachev, "Sub-banded DSP architecture...under-decimated filter-banks...," IEEE Sig. Proc. Mag., 31, 70 (2014).

[3] A. Tolmachev et al., "Real-time FPGA Implementation of Efficient Filter-Banks for Digitally Sub-banded...," OFC'13, OW3B.1 (2013).

[4] S. Chen, Y. Ma, W. Shieh, "110-Gb/s Multi-Band Real-Time Coherent Optical OFDM Reception after 600-km...," OFC'10, OMS2 (2010).

[5] N. Kaneda et al., "Field Demonstration of 100-Gb/s Real-Time Coherent Optical OFDM Detection," JLT, 33, 1365 (2015).

[6] X. Xiao, F. Li, X. Li, and Y. Chen, "100-Gb/s Single-band Real-time Coherent Optical DP-16QAM-OFDM...," OFC'14, Th5C.6 (2014).

[7] X. Liu, P. Winzer, C. Sethumadhavan, R. Sebastian, C. Stephen, "Multi-band DFT-spread-OFDM ...," OFC'13, OW3B.2 (2013).

[8] L. Cimini, N. Sollenberger, "Peak-to-average power ratio reduction...OFDM... partial transmit sequences," IEEE Comm. Lett., 4, 86 (2000).

[9] B. Inan et al., "Pilot-tone-based nonlinearity compensation...OFDM ...," *ECOC'10* (2010).

[10] R. Goldman et al., "...Coherent Optical Time-Domain Reflectometry With Golay Complementary Codes," JLT, 31, 2207 (2013).