Introduction
To keep pace with the ever-increasing bandwidth requirements of high-performance systems, FPGA vendors must continually enhance their device architectures. Even with these advanced architectures, designers often end up implementing their designs with wide on-chip buses. While this approach increases data throughput in the FPGA core, wide buses consume considerable fabric resources and power. Incremental enhancements of this kind also introduce other problems. Put simply, conventional FPGA architectures cannot keep up with tomorrow’s performance demands. The challenges that call for significant architectural improvements include:
- The need for bandwidth
Today’s increasingly high-performance computing applications call for higher bandwidth. While routing architectures make wires more efficient through hierarchy and optimization techniques, die area and power dissipation rise sharply as well.
- The need for efficiency
Designers usually add registers to pipeline a design for higher performance. In conventional architectures, however, significant routing delays mean that pipelining yields diminishing returns.
- The need for improved clocking
Conventional FPGA core architectures rely on balanced clock trees, which considerably reduce deterministic skew. This approach works well for designs running at up to 500 MHz, but beyond 500 MHz a next-generation clocking solution becomes imperative.
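The diminishing returns from pipelining mentioned above can be illustrated with a toy timing model. This is only a sketch under stated assumptions: the function name `fmax_mhz`, the even split of logic delay across stages, and the delay values (8 ns of logic, 1 ns of fixed routing per stage) are all hypothetical and are not taken from any Intel data sheet.

```python
def fmax_mhz(logic_delay_ns: float, routing_delay_ns: float, stages: int) -> float:
    """Toy model of the maximum clock rate of a pipelined path.

    Assumption: the logic delay divides evenly across `stages`, but every
    stage still pays the same fixed routing delay to reach its register.
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    period_ns = logic_delay_ns / stages + routing_delay_ns
    return 1000.0 / period_ns  # a 1 ns period corresponds to 1000 MHz


# Hypothetical numbers: 8 ns of logic, 1 ns of routing per stage.
for stages in (1, 2, 4, 8, 16):
    print(f"{stages:2d} stages -> {fmax_mhz(8.0, 1.0, stages):6.1f} MHz")
```

In this model, going from 1 to 2 stages raises the clock rate by 80%, while going from 8 to 16 stages raises it by only about 33%, and no number of stages can exceed the 1000 MHz ceiling set by the fixed routing delay.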
The Intel HyperFlex FPGA Architecture
The Intel HyperFlex FPGA Architecture supports almost twice the core performance of previous-generation FPGAs, with core clock rates as high as 1 GHz. All Intel Stratix 10 devices use a redesigned core that includes additional registers, known as Hyper-Registers. These registers are present in every interconnect routing segment and at the inputs of the functional blocks. Because Hyper-Registers sit in the interconnect rather than in the logic cells, using them leaves all logic resources available for logic functions. To make the Hyper-Registers convenient to use, the Intel Quartus Prime software has a Hyper-Aware design flow with:
- Post place-and-route tuning to accelerate timing closure
- Hyper-Aware synthesis along with place-and-route to facilitate efficient pipelining
- Fast Forward Compilation to explore performance augmentation options
To keep up with the performance of the core fabric, the dedicated function blocks in the FPGA core, like the M20K memory and the floating-point digital signal processing (DSP) blocks, have been redesigned to accomplish operations at clock speeds of up to 1 GHz.
To provide a flexible clock network, Intel Stratix 10 FPGAs and SoCs offer programmable clock tree synthesis. This ASIC-like clocking reduces net power dissipation and helps mitigate timing uncertainty.
Advantages of Intel HyperFlex
The Hyper-Registers let you apply the conventional performance-enhancement methods of retiming, pipelining, and optimization more effectively. When carried out with Hyper-Registers instead of traditional ALM registers, these methods are referred to as Hyper-Retiming, Hyper-Pipelining, and Hyper-Optimization.
- Hyper-retiming
The HyperFlex core architecture uses Hyper-Registers to enable fine-grained Hyper-Retiming. The Intel Quartus Prime software retimes a path by moving the register out of the logic cell and into the interconnect. Because Hyper-Registers are available in every routing segment, the retiming granularity becomes very fine, on the order of tens of picoseconds of delay. Hyper-Retiming also leaves the existing LABs and ALMs undisturbed, so no additional placement or routing is required and compilation time is not significantly affected.
- Hyper-pipelining
With the Intel HyperFlex FPGA Architecture, you can pipeline at will without bloating the size of your design. This process is referred to as Hyper-Pipelining. Because pipelining comes at almost zero cost, you can use it aggressively, especially in datapaths and feed-forward logic. Since the software automatically retimes the logic, you only need to specify the number of pipeline registers at the input to the clock domain. Designers who make their designs pipeline-friendly will see the greatest benefit.
- Hyper-optimization
Once Hyper-Retiming and Hyper-Pipelining have delivered their gains in some sections of a design, other areas, such as long feedback loops or complex state machines, are exposed as the bottlenecks that hinder further gains. Redesigning these circuits, for example to pre-compute the feedback values, lets them take advantage of Hyper-Registers and reach greater speeds.
- Flexible, high-speed programmable clock tree synthesis
The Intel HyperFlex FPGA Architecture includes an entirely new clock structure with pre-routed clock paths, onto which the design’s clock regions are synthesized. This arrangement offers great flexibility to create small, localized clock domains. It also simplifies skew management: you can take advantage of beneficial skew where it is available and reduce skew where necessary.
- Power efficiency
The higher performance of the Intel HyperFlex Architecture allows a 1024-bit datapath clocked at 350 MHz to be implemented as a 512-bit datapath clocked at 700 MHz. Both configurations carry the same raw bandwidth (358.4 Gbps) with half the bus width. Designers are free to spend part of the performance dividend on a higher clock speed and the rest on power savings, either through a reduced core supply voltage or by using a slower speed grade device.
- Productivity
The higher core performance offered by the Intel HyperFlex FPGA Architecture brings benefits beyond a faster clock rate: earlier timing closure, improved design team productivity, and a shorter time to market.
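To make the difference between Hyper-Retiming and Hyper-Pipelining described above more concrete, here is a minimal Python sketch of a single linear timing path. It is an illustration only, not how the Intel Quartus Prime software works internally: the function `critical_path_ps`, the four-segment path, and all delay values are hypothetical.

```python
def critical_path_ps(segment_delays_ps, register_after):
    """Return the longest register-to-register delay along a linear path.

    segment_delays_ps: delays of consecutive logic/routing segments.
    register_after: set of segment indices that are followed by a register.
    The end of the path is always treated as registered.
    """
    worst = current = 0
    last = len(segment_delays_ps) - 1
    for i, delay in enumerate(segment_delays_ps):
        current += delay
        if i in register_after or i == last:
            worst = max(worst, current)
            current = 0
    return worst


# Hypothetical path with four segments (delays in picoseconds).
delays = [800, 1200, 1000, 1000]

# Original design: one register, poorly placed after segment 2.
original = critical_path_ps(delays, {2})         # 3000 ps

# Retiming: the same register moves to after segment 1, balancing the path.
retimed = critical_path_ps(delays, {1})          # 2000 ps

# Pipelining: extra registers are added, one after every segment.
pipelined = critical_path_ps(delays, {0, 1, 2})  # 1200 ps

for name, ps in (("original", original), ("retimed", retimed), ("pipelined", pipelined)):
    print(f"{name:9s} {ps} ps -> {1e6 / ps:.0f} MHz")
```

In the sketch, retiming keeps the register count (and therefore the latency in clock cycles) unchanged and only balances the delays, whereas pipelining shortens the critical path further at the price of extra cycles of latency, which is why a pipeline-friendly design benefits most.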
Conclusion
Relying blindly on conventional FPGA core architectures to meet the needs of next-generation, high-performance designs is out of the question. Modern problems need modern solutions. Evaluating emerging architectures such as Intel HyperFlex can help you substantially improve the productivity of your design team and your enterprise.