Ernest Jamro Kat. Elektroniki AGH, Kraków Dep. Of Electronics, AGH

1 Ernest Jamro Kat. Elektroniki AGH, Kraków Dep. Of Elect...

Author: Rościsława Szafarczyk

0 downloads 3 Views

1 Ernest Jamro Kat. Elektroniki AGH, Kraków Dep. Of Electronics, AGHHardware Implementation of Algorithms Sprzętowa Implementacja Algorytmów Układy mnożące, konwolwery Multipliers, convolvers Ernest Jamro Kat. Elektroniki AGH, Kraków Dep. Of Electronics, AGH

2 Mnożenie / Multiplication1 X + 9 x 11= 99

3 Parallel Array Multipliers Mnożenie równoległe

4 FPGA, Muilt-in multiplier DSP48

5 Sequantial Multiplier /Mnożenie sekwencyjne

6 Wallace Tree Multiplier (with Carry Save Adders)W układach FPGA nie zaleca się stosowania CSA In FPGA the CSA are not recommended

7 Mnożenie ze znakiem / Multiplication of Sign numbersZnak, Moduł / Sign-Module Standardowe mnożenie liczb dodatnich / Standard unsigned multiplication Znak= Znak1 XOR Znak2 Sign= Sign1 xor Sign2 W kodzie uzupełnień do dwóch Two’s Complement (a1+a2)*(b1+b2)= a1b1+ a1b2+a2b1+a2b2 C. R. Baugh and B. A.Wooley, “A two’s complement parallel array multiplication algorithm,” IEEE Trans. Comput., vol. C-22, pp. 1045–1047, Dec

8 Mnożenie w kodzie uzupełnień do 2 / Two’s complement multiplication

9 Układ mnożący o zredukowanej szerokości / Reduced-width multiplier

10 Kompensacja błędu redukcji / Truncation error compensation

11 Mnożenie przez stały współczynnik / Constant Coefficient Multiplier Look Up Table (LUT)Example: Y= 5*X Address Data 0 0 1 5 2 10

12 LUT-based Multiplier Constant Coefficient: C Y = CA = CA(0:3) + 24 CA(4:7)

13 Different ROM sizes Input data width = 6 bits

14 Heteregenous memory usage Virtex: 161, 321, 4k1, 2k2, 1k4, 5128, 25616 Input data and coefficient width= 14

15 Exchange distributed RAM to BRAMCLB BRAM

16 Area [CLB] for different input and coeffitinent width KEquvalent cost of 1 BRAM Only CLB, scale 1:10 # of BRAM

17 MM (Multiplierless Multiplication) Mnożenie bezmnożneBinary Representation, example B= 14= 11102 M= AB= (A<<1)+(A<<2)+(A<<3) Sub-structure Sharing (SS) example B= 27= tmp= A + (A<<1) M= AB= tmp + (tmp<<3) Canonic Sign Digit (CSD) set {0, 1, -1} (0 – no operation, 1 – addition, -1 (1) – subtraction) example: B= 7 = B= 1001CSD M=B·A= (A<<2) + (A<<1) + A M= (A<<3)-A

18 BINARNIE  CSD insert symbol ‘1’ only if the total number of operation is reducedStandard Modified

19 Applience of different techniques of MM

20 The MM cost for different coefficients

21 Filters FIR

22 Filter FIR (sposób pośredni/ transposed)

23 FIR 2D

24 Examples of 2D FIR Filters1 2 4 -1 -2 1 2 1 -8 Low-Pass Sobel Laplace

25 FIR Filter N=2 LUT-based multipliersz-1 LUT M0 L0 M1 L1 In 8 4 Adder1 Adder0 Adder2 12 13 4 18 Multiplier 1 Multiplier 2 Adder1 Adder0 Adder2 12 13 9 4 14 18 Adders Block

26 FIR, Arytmetyka w innej kolejności (Parallel) Distributed Arithmeticdifferent bits of the input input coefficient

27 Arytmetyka Rozproszona (Distributed Arithmetic)The same input bit weight (smaller LUT widths)

28 Filtry FIR z liniową fazą / Linear Phase Filters (symetryczne/ symmetric: h(0)=h(N-1), h(1)=h(N-2), ...)

29 FPGA, Muilt-in multiplier DSP48

30 Example of sub-structure sharing for FIR filtersH(z)= z-1 + 5z-2 = z z-2 Example 1: A= 5 = temporary expression H(z)= A + ( A)z-1 + Az-2 Example 2: A= 1 + z-1 H(z)= 5A + 8z-1 + 5z-2

31 Materiały dodatkowe The END

32 Szybkie mnożenie w układach FPGA26·(2·a7 ·b + a6 ·b)

33 Układy mnożące w FPGA (a7 and bi) xor (a6 and bi+1) Przykład: G4 - a7G3 - bi G2 - a6 G1 - bi+1 F4 – a7 F3 – bi-1 F2 – a6 F1 – bi Fragment of Virtex Configurable Logic Block (CLB)

Ernest Jamro Kat. Elektroniki AGH, Kraków Dep. Of Electronics, AGH

Recommend Documents