Approved
Design of a Coarse-Grained Reconfigurable Architecture for Digital Signal Processing
Chenxin Zhang (SoC07)
Start
2008-07-15
Presentation
2009-02-05 13:15
Location:
E:2311
Completed:
2009-02-05
Thesis report:
Abstract
Reconfigurable computing is an emerging trend in embedded system design. With a platform containing a reconfigurable architecture, it is possible to accelerate arbitrary algorithms executing on an embedded system. To achieve high performance at a feasible hardware cost, the reconfigurable architecture should strike a trade-off between efficiency and flexibility. This thesis discusses the design and implementation of a coarse-grained reconfigurable architecture targeting digital signal processing applications. The proposed reconfigurable architecture is constructed from a mesh of resource cells, divided into processing and memory cells, which communicate using a combination of local interconnections and a global hierarchical routing network. Processing cells are further divided into generic RISC processor cells and CORDIC cells. High-performance local interconnections provide high communication bandwidth between neighboring cells, while the global network provides flexibility and access to external modules. All cell modules developed for the reconfigurable architecture are design-time configurable, so that different hardware structures can be generated depending on user requirements. In addition, the processing and memory cells are run-time reconfigurable to enable flexible application mapping. A 4-by-2 reconfigurable cell array containing four 16/32-bit RISC processor cells, three smart memory cells and one configurable CORDIC cell has been designed and implemented in HDL, and has been integrated as a coprocessor into an embedded system. Two applications, a time-multiplexed FIR filter and a 32- to 1,024-point time-multiplexed radix-2^2 FFT, have been manually mapped onto the constructed cell array and verified on an FPGA platform, the Virtex-II Pro-30-7ff896 from Xilinx.
It is shown that, for the mapped FFT implementation, the reconfiguration code size on the cell array outperforms that of ordinary DSP processors by a factor of 8, and the number of clock cycles used is reduced by about 20%.
Supervisor: Thomas Lenart (Oticon)
Examiner: Viktor Öwall (EIT)