Open access

Design of Convolutional Neural Network Optimization Algorithm Based on Embedded System and Its Application in Real-Time Image Processing

24 Mar 2025

Figure 1. Convolution diagram

Figure 2. Multiply-accumulate operation diagram

Figure 3. Architecture diagram of the ZCU104

Figure 4. Darknet framework block diagram

Figure 5. Resources used by the CNN accelerator
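Figures 1 and 2 concern convolution and the multiply-accumulate operation at its core. As a minimal sketch of that relationship (not the accelerator's actual implementation), the following pure-Python function computes a stride-1, unpadded 2D convolution in the cross-correlation form that CNN frameworks use. The function name `conv2d_valid` and the toy inputs are illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """Stride-1, no-padding 2D cross-correlation built from multiply-accumulates."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0  # accumulator for one output pixel
            for i in range(kh):
                for j in range(kw):
                    # One multiply-accumulate (MAC) step.
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


# Toy 3x3 input and 2x2 kernel (hypothetical values).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_valid(image, kernel)
print(result)
```

Each output pixel costs `kh * kw` MAC steps, which is why hardware accelerators typically map this inner loop onto parallel DSP resources.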

Logical resource consumption statistics

| Module | LUT | LUTRAM | DSP | BRAM36K | FF |
|---|---|---|---|---|---|
| Convolution accelerator | 28.9K | 14.7K | 331 | 43 | 17.6K |
| ARM soft core | 14.8K | 0 | 5 | 19 | 3.2K |
| AXI DMA | 28.1K | 2.21K | 0 | 31.6 | 6.7K |
| Total | 71.8K | 16.91K | 336 | 93.6 | 27.5K |
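The total row can be cross-checked by summing the per-module figures. The sketch below does that with the values copied from the table, keeping LUT, LUTRAM and FF in thousands. The dictionary layout and the helper name `column_totals` are illustrative assumptions.

```python
# Per-module resource usage from the table (LUT, LUTRAM, FF in thousands).
resources = {
    "Convolution accelerator": {"LUT": 28.9, "LUTRAM": 14.7, "DSP": 331, "BRAM36K": 43, "FF": 17.6},
    "ARM soft core": {"LUT": 14.8, "LUTRAM": 0.0, "DSP": 5, "BRAM36K": 19, "FF": 3.2},
    "AXI DMA": {"LUT": 28.1, "LUTRAM": 2.21, "DSP": 0, "BRAM36K": 31.6, "FF": 6.7},
}
reported_total = {"LUT": 71.8, "LUTRAM": 16.91, "DSP": 336, "BRAM36K": 93.6, "FF": 27.5}


def column_totals(table):
    """Sum each resource type over all modules."""
    totals = {}
    for usage in table.values():
        for resource, amount in usage.items():
            totals[resource] = totals.get(resource, 0) + amount
    return totals


totals = column_totals(resources)
for resource, amount in totals.items():
    # Share of each resource consumed by the convolution accelerator alone.
    share = 100.0 * resources["Convolution accelerator"][resource] / amount
    print(f"{resource}: total {amount:.2f}, reported {reported_total[resource]}, "
          f"accelerator share {share:.1f}%")
```

The accelerator share printed per resource is a derived figure, not one the table reports.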

CNN performance test results

| Test number | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Sample size | 100 | 100 | 100 | 100 | 100 |
| True number of targets | 130 | 159 | 142 | 129 | 155 |
| Number of missed detections | 6 | 17 | 9 | 10 | 19 |
| Missed detection rate | 4.6% | 10.7% | 6.3% | 7.6% | 12.2% |
| Average missed detection rate | 8.28% | | | | |
| Number identified | 128 | 154 | 138 | 128 | 151 |
| Recognition rate | 98.4% | 96.8% | 97.2% | 99.2% | 97.4% |
| Average recognition rate | 97.8% | | | | |
| Time per image (s) | 1.23 | 1.46 | 0.92 | 1.12 | 1.24 |
| Average time per image (s) | 1.19 | | | | |
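The rate rows appear to be simple ratios against the true count in each test. The sketch below recomputes them under that assumption, with the counts copied from the table. The recomputed values agree with the reported ones to within a few tenths of a percentage point. The helper name `percent` is illustrative.

```python
from statistics import mean

# Counts and timings copied from the table, one entry per test.
true_counts = [130, 159, 142, 129, 155]
missed = [6, 17, 9, 10, 19]
identified = [128, 154, 138, 128, 151]
seconds_per_image = [1.23, 1.46, 0.92, 1.12, 1.24]


def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Assumption: both rates are taken relative to the true count of each test.
miss_rates = [percent(m, t) for m, t in zip(missed, true_counts)]
recognition_rates = [percent(i, t) for i, t in zip(identified, true_counts)]

print("missed detection rates:", [f"{r:.1f}%" for r in miss_rates])
print("recognition rates:", [f"{r:.1f}%" for r in recognition_rates])
print(f"average missed detection rate: {mean(miss_rates):.2f}%")
print(f"average recognition rate: {mean(recognition_rates):.2f}%")
print(f"average time per image: {mean(seconds_per_image):.2f} s")
```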

Comparison of CNN operation time on different hardware

| Operation time (s) | Cortex-A9 single-core | Intel CPU | Zynq-7035 |
|---|---|---|---|
| Conv1+Pool1 | 20.5745 | 0.9790 | 0.1619 |
| Conv2+Pool2 | 161.4176 | 5.4650 | 0.2265 |
| Conv3 | 160.3363 | 5.4470 | 0.2023 |
| Conv4+Pool3 | 320.6059 | 10.9780 | 0.3118 |
| Conv5 | 159.0134 | 5.5630 | 0.1875 |
| Conv6+Pool4 | 318.0369 | 11.0760 | 0.3334 |
| Conv7 | 79.7953 | 2.7340 | 0.1552 |
| Conv8+Pool5 | 79.7529 | 2.7350 | 0.1497 |
| FC1 | 4.5438 | 0.0530 | 0.2569 |
| FC2 | 0.0261 | 0.0000 | 0.0021 |
| FC3 | 0.0002 | 0.0000 | 0.0000 |
| Total time | 1304.1029 | 45.03 | 1.9858 |
| Total duration ratio | ×658.12 | ×23.18 | ×1.0 |
| Convolution layer duration ratio | 99.72% | 99.69% | 87.15% |
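The two ratio rows can be approximated from the per-layer times. The sketch below assumes the total duration ratio is each platform's total time divided by the Zynq-7035 total, and that the convolution share counts every layer other than FC1 to FC3. The recomputed figures come out close to, though not identical to, the reported rows, since the exact rounding and layer grouping used for those rows are not stated here. The helper names are illustrative.

```python
# Per-layer times in seconds, copied from the table.
# Tuple order: Cortex-A9 single-core, Intel CPU, Zynq-7035.
PLATFORMS = ("Cortex-A9 single-core", "Intel CPU", "Zynq-7035")
layer_times = {
    "Conv1+Pool1": (20.5745, 0.9790, 0.1619),
    "Conv2+Pool2": (161.4176, 5.4650, 0.2265),
    "Conv3": (160.3363, 5.4470, 0.2023),
    "Conv4+Pool3": (320.6059, 10.9780, 0.3118),
    "Conv5": (159.0134, 5.5630, 0.1875),
    "Conv6+Pool4": (318.0369, 11.0760, 0.3334),
    "Conv7": (79.7953, 2.7340, 0.1552),
    "Conv8+Pool5": (79.7529, 2.7350, 0.1497),
    "FC1": (4.5438, 0.0530, 0.2569),
    "FC2": (0.0261, 0.0000, 0.0021),
    "FC3": (0.0002, 0.0000, 0.0000),
}
reported_total = (1304.1029, 45.03, 1.9858)


def column(index):
    """All layer times for one platform."""
    return [times[index] for times in layer_times.values()]


def conv_share(index):
    """Percentage of the summed layer time spent in non-FC layers."""
    conv = sum(t[index] for name, t in layer_times.items() if not name.startswith("FC"))
    return 100.0 * conv / sum(column(index))


# Assumption: the duration ratio uses the Zynq-7035 total as the baseline.
baseline = reported_total[2]
duration_ratio = {name: reported_total[i] / baseline for i, name in enumerate(PLATFORMS)}

for i, name in enumerate(PLATFORMS):
    print(f"{name}: summed {sum(column(i)):.4f} s, "
          f"ratio x{duration_ratio[name]:.2f}, conv share {conv_share(i):.2f}%")
```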

Comparison of hardware-level acceleration experiments for object detection

| | CPU | GPU | Zynq |
|---|---|---|---|
| Experimental platform | Intel Core i5-10400F | NVIDIA GTX 3060 | ZU5EV |
| Development language | C | C | Verilog HDL |
| mAP | 71% | 71% | 69% |
| Data precision | Float32 | Float32 | INT8 |
| FPS | 30 | 355 | 220 |
| Power consumption (W) | 70 | 185 | 4.5 |
| Throughput (GOP) | 0.94 | 202 | 145 |
| Energy efficiency (GOP/W) | 0.15 | 1.13 | 31.4 |
| Clock frequency | 2.8 GHz | 1321 MHz | 201 MHz |
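An energy efficiency figure in GOP/W is ordinarily throughput divided by power. The sketch below applies that definition to the throughput and power rows and prints the result next to the reported row for comparison. The recomputed values need not coincide with the reported ones, since the measurement conditions behind each row are not given here. Frames per second per watt is an extra illustrative metric, not one the table reports, and the dictionary layout is an assumption.

```python
# Values copied from the comparison table.
platforms = {
    "CPU": {"fps": 30, "power_w": 70.0, "throughput_gop": 0.94, "reported_gop_per_w": 0.15},
    "GPU": {"fps": 355, "power_w": 185.0, "throughput_gop": 202.0, "reported_gop_per_w": 1.13},
    "Zynq": {"fps": 220, "power_w": 4.5, "throughput_gop": 145.0, "reported_gop_per_w": 31.4},
}

# Assumed definition: energy efficiency = throughput / power.
gop_per_w = {name: p["throughput_gop"] / p["power_w"] for name, p in platforms.items()}

# Illustrative extra metric (not in the table): frames per second per watt.
fps_per_w = {name: p["fps"] / p["power_w"] for name, p in platforms.items()}

for name, p in platforms.items():
    print(f"{name}: recomputed {gop_per_w[name]:.2f} GOP/W "
          f"(reported {p['reported_gop_per_w']}), {fps_per_w[name]:.2f} FPS/W")
```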
Language:
English
Publication frequency:
1 issue per year
Journal subjects:
Biological Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other