Thursday, September 23, 2010

Difference between DSP and FPGA

Life used to be easy. If you were working on a multiprocessor signal processing application, you would write down the requirements, check the specs of the devices on offer from the major DSP vendors, and just pick the chip that suited best.
Times have changed, and today's engineers are blessed with far more choice. The big FPGA vendors have stepped up their offerings for signal processing, and choosing the best solution can seem complex.
For distributed applications, the choice of interconnect technology can also obviously have a crucial effect on the overall solution. Crunching the data is all very well, but your system needs to have the right interfaces to move it around between the different processors, and off-load the results. What do the DSP and FPGA vendors have to offer in this area?
This article will look at what's available for multiprocessor systems (which inevitably tends to mean the high-performance end of the market), and how you can make the best choice between DSP, FPGA or a hybrid mixture of the two. We'll look fairly briefly at issues involved in the two types of chip, but concentrate more on system-level factors.
For high-performance signal processing applications, of course there are other options beyond DSPs and FPGAs. Massively parallel processors from vendors like picoChip are one alternative, but unfortunately often require the use of the vendor's proprietary toolset. ASICs and ASSPs are also well-suited to certain signal processing tasks, but their high up-front costs rule them out except in high volume applications.

DSPs evaluated

Pretty much since their invention in the 1980s, DSPs have provided excellent performance at reasonable power and cost levels. A large community of experienced DSP engineers has also grown alongside the technology, developing a substantial base of off-the-shelf, field-proven code to run on the DSP cores. There is also well-established support from third-party vendors for debug and optimization tools.
High performance DSPs continue to develop with faster clock speeds and multicore solutions. Very-long instruction word (VLIW) DSPs provide high clock rates and independent execution units to get the maximum speed.
DSP development cost is relatively low, and as a mature technology it can be argued that it has a lower risk and faster time-to-market than FPGAs and other signal processing technologies.
DSPs can be attractive for applications based on emerging standards, which often change frequently and rapidly. As DSP algorithms can be readily implemented in an accessible language such as C, it is easier to update the code to reflect changes in the standards as they occur. In addition, the complex nature of many of the signal processing algorithms in applications such as the latest wireless standards often makes them more suitable for implementation on a DSP: it is much easier for a DSP device to change the processing algorithm on-the-fly by calling a different software routine. While modern FPGAs can be reconfigured quickly, doing so dynamically while continuing to process data is a complex and challenging task.
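The idea of changing the algorithm on-the-fly by calling a different routine can be shown with a minimal sketch. It is written in Python for brevity rather than the C a real DSP would run, and the routine names (`moving_average`, `first_difference`), the mode labels and the `process_stream` helper are all hypothetical, chosen only for illustration. Each block of samples goes to whichever routine is currently selected, so switching algorithm is just a table lookup between blocks.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Block = Sequence[float]
Routine = Callable[[Block], List[float]]


def moving_average(block: Block) -> List[float]:
    """Two-point moving average (hypothetical smoothing routine)."""
    out: List[float] = []
    prev = 0.0  # sample before the block is assumed to be zero
    for x in block:
        out.append((x + prev) / 2.0)
        prev = x
    return out


def first_difference(block: Block) -> List[float]:
    """First difference (hypothetical edge-detecting routine)."""
    out: List[float] = []
    prev = 0.0  # sample before the block is assumed to be zero
    for x in block:
        out.append(x - prev)
        prev = x
    return out


# Dispatch table: the software analogue of "calling a different routine".
# The mode names are illustrative assumptions, not taken from any standard.
ROUTINES: Dict[str, Routine] = {
    "smooth": moving_average,
    "edges": first_difference,
}


def process_stream(blocks: Sequence[Tuple[str, Block]]) -> List[List[float]]:
    """Process each block with the routine named alongside it."""
    return [ROUTINES[mode](block) for mode, block in blocks]


if __name__ == "__main__":
    # The same samples, processed by two different routines back to back.
    print(process_stream([("smooth", [2.0, 4.0]), ("edges", [2.0, 4.0])]))
```

Nothing has to be rebuilt or reloaded to change behaviour between blocks, which is the property that is hard to match with dynamic reconfiguration of an FPGA.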
DSPs are also improving in power consumption. Led by the demands of the hand-held market, some next generation high-performance DSPs are incorporating power management techniques from their little brothers. This allows overall system power dissipation to be reduced during times of low traffic or to prevent over-temperature. A power and temperature-aware FPGA configuration could, of course, manage its clock domains in a similar way, but at the cost of greater development effort.
However, the DSP is not particularly well suited to parallel processing tasks: multiple devices can be required for tasks that fit easily into a single FPGA. For example, in wireless baseband applications, for the processing of WiMAX OFDM Access (OFDMA) channels, a pure DSP solution cannot match an FPGA in the bandwidth and number of channels it can process. Consequently, the DSP solution may have an unacceptable cost and power per channel.
To improve DSP performance in specific algorithms, vendors have introduced hardware cores to handle some processing traditionally off-loaded to FPGAs. For example, TI's TCI6482 DSP includes Viterbi and turbo decoder co-processors for 3GPP and 3GPP2, while the multi-core TCI6487 DSP also includes a direct Common Public Radio Interface (CPRI) / Open Base Station Architecture Initiative (OBSAI) interface which can be chained between DSPs.

The FPGA alternative

FPGAs have one big advantage over DSPs: their efficiency in concurrent applications, achieved by using multiple parallel processing blocks. Coupled with their flexibility to allow the embedded systems designer to tailor the device to match their application's demands as closely as possible, FPGAs can achieve the highest possible throughput with low cost per channel.
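To see why parallel blocks matter for multi-channel work, consider a toy cycle-count model. It assumes one FIR filter per channel (so each output sample needs one multiply-accumulate per tap) and compares a device with a handful of MAC units against one with many. The function name and all the numbers are illustrative assumptions, not vendor figures, and the model ignores clock rate, memory bandwidth and control overhead.

```python
import math


def cycles_per_sample_period(channels: int, taps: int, mac_units: int) -> int:
    """Cycles needed to produce one output sample on every channel.

    Toy model: each output needs `taps` multiply-accumulates, and
    `mac_units` of them can be issued per cycle with perfect utilisation.
    """
    if min(channels, taps, mac_units) < 1:
        raise ValueError("channels, taps and mac_units must be positive")
    total_macs = channels * taps
    return math.ceil(total_macs / mac_units)


# ASSUMPTION: made-up workload and unit counts, for illustration only.
few_units = cycles_per_sample_period(channels=16, taps=64, mac_units=8)
many_units = cycles_per_sample_period(channels=16, taps=64, mac_units=512)

if __name__ == "__main__":
    print(few_units, many_units)
```

With these made-up numbers, eight MAC units need 128 cycles per sample period while 512 units need 2. The model only shows the shape of the trade-off: a device with far more parallel units can afford a lower clock rate and still keep up with many channels.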
The FPGAs' flexibility has traditionally come with an additional cost in power due to the increased gate count and silicon area of non-optimized solutions in comparison to hardwired architectures. However, 65nm technologies and the use of equivalent ASIC technology for volume manufacture mean that FPGAs can be low-power in the lab, and power-reduced further in volume.
The per-channel power of an FPGA may now be well below that of DSPs, even though the chip-level power dissipation is higher. DSPs typically consume 3-4W and FPGAs 7-10W, but FPGAs can handle ten times the channel density.
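A quick calculation makes the per-channel argument concrete. The sketch below uses the chip-level figures quoted above (3-4 W for a DSP, 7-10 W for an FPGA, and ten times the channel density). The baseline of 8 channels per DSP is a made-up number for illustration only, and it cancels out of the final ratio.

```python
DSP_WATTS = (3.0, 4.0)     # chip-level range quoted in the text
FPGA_WATTS = (7.0, 10.0)   # chip-level range quoted in the text
DENSITY_RATIO = 10         # FPGA channel density relative to a DSP, from the text
DSP_CHANNELS = 8           # ASSUMPTION: illustrative baseline, not a real figure


def watts_per_channel(chip_watts: float, channels: int) -> float:
    """Chip-level power divided evenly across the channels it handles."""
    return chip_watts / channels


def per_channel_ranges(dsp_channels: int = DSP_CHANNELS):
    """Return ((dsp_min, dsp_max), (fpga_min, fpga_max)) in watts per channel."""
    fpga_channels = dsp_channels * DENSITY_RATIO
    dsp = tuple(watts_per_channel(w, dsp_channels) for w in DSP_WATTS)
    fpga = tuple(watts_per_channel(w, fpga_channels) for w in FPGA_WATTS)
    return dsp, fpga


def advantage_range(dsp_channels: int = DSP_CHANNELS):
    """(min, max) factor by which FPGA per-channel power is lower."""
    dsp, fpga = per_channel_ranges(dsp_channels)
    worst = dsp[0] / fpga[1]  # most efficient DSP vs. hungriest FPGA
    best = dsp[1] / fpga[0]   # hungriest DSP vs. most efficient FPGA
    return worst, best


if __name__ == "__main__":
    print(per_channel_ranges())
    print(advantage_range())
```

Under these figures the FPGA's per-channel power works out roughly 3 to 5.7 times lower than the DSP's, whatever baseline channel count is assumed.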
In recognition of the advantages of DSPs, recent years have seen a shift towards FPGAs that incorporate DSP technology, for example Xilinx Virtex-5 SXT devices. This enables the FPGA to incorporate DSP algorithmic processing for tasks that are not naturally parallel. Such "DSP-enabled" FPGAs have shown huge throughput advantages for certain types of signal processing, which has been reflected in their success in the high-end processing market. However, FPGAs are in general ill suited to processing sequential conditional data flow.

Programming FPGAs remains difficult, usually requiring a hardware-oriented language such as Verilog or VHDL. FPGA solutions can take an order of magnitude longer to code than DSP solutions, which increases both development cost and time to market.
C-based synthesis tools have yet to deliver the ease of use and performance of C-coded processor solutions. High-level representations such as Simulink block diagram synthesis are not currently widely adopted and old FPGA synthesis methods still persist, especially where maximum performance is required.

Hybrid multi-processor systems

From a design engineer's point of view, this neck-and-neck technology development of FPGAs and DSPs is enabling them to find new and better solutions for signal processing applications. There is no simple answer as to whether FPGAs or DSPs are superior, and for many applications the best approach is a hybrid system, including both technologies to provide a solution that is superior to the sum of its parts.

Recent developments in FPGA technology redress many of the long-held preconceptions about their use, and have met many engineers' concerns about power, cost and complexity. Developing signal processing applications on FPGAs still requires significantly more effort than for DSPs, even with the high-level development tools and libraries available from FPGA vendors. Finding the right engineers with DSP and system-level experience to develop applications on FPGAs can also be tricky.
Independent benchmarks can be a valuable help in choosing the best device. For example, in 2007, BDTI published an analysis of FPGAs in DSP applications. The BDTI tests looked at cost/performance in a typical multi-channel communications application. The results are clear-cut, with the FPGA delivering a cost-per-channel figure more than 20 times better than the DSP's. This does not mean that FPGAs are necessarily best for high-performance signal processing applications, but it certainly demonstrates that they can have clear performance advantages over DSPs in some circumstances.
Another important factor is the IP cores and software libraries geared to particular target applications, which are often provided by vendors. These can reduce the reliance on in-house development of complex algorithms using the vendor tools, and so further cut time-to-market.
The key advantages of the DSP are reduced development time for new and complex algorithms, and flexibility to run many different algorithms. For the FPGA, its number one benefit is efficiency gains from parallel processing. In many applications, such as image processing and wireless baseband processing, there is a mixture of these repetitive, simple processing tasks that are best suited to an FPGA and more complex and less predictable tasks that are perhaps better handled with a DSP. Additionally, as parallel processing blocks implemented in FPGAs become mainstream, they are increasingly likely to be integrated into the DSP vendors' silicon.
Overall, this means that a hybrid system containing DSPs and FPGAs can often provide the best solution for high-performance multi-processing applications, allowing each device to play to its strengths. The key to this particular debate is to look at FPGAs and DSPs as complementary technologies, rather than competition for each other.
