Design of Convolutional Neural Network Optimization Algorithm Based on Embedded System and Its Application in Real-Time Image Processing
Published Online: Mar 24, 2025
Received: Oct 06, 2024
Accepted: Feb 02, 2025
DOI: https://doi.org/10.2478/amns-2025-0744
© 2025 Baoyuan Liu et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
With the rapid development of artificial intelligence, optimizing the convolution operation of convolutional neural networks (hereinafter CNNs) to fit the resource constraints of embedded systems has become an active research topic. In this paper, we introduce the fundamentals of CNNs and the Zynq embedded platform, and optimize the Im2col-Gemm algorithm within the Darknet framework in order to further optimize the CNN model. The CNN before and after optimization is compared under different hardware configurations through acceleration tests, in which the average time spent on each layer and the total time of the CNN operations are recorded. The results show that Zynq combined with the optimized CNN achieves speedups of 658.12 times and 23.18 times relative to the CPU and the GPU, respectively. In character recognition and traffic sign detection experiments, character recognition on Zynq with the optimized CNN reaches 220 FPS at less than 4.5 W of power consumption, taking only about 4.5 ms to recognize one image. Traffic sign recognition achieves an average recognition rate of 97.8% and a missed-detection rate of 8.28%. These results verify that Zynq with the optimized CNN is fast and consumes little power, which makes it well suited to real-time image processing. Optimizing CNNs for embedded systems thus helps advance the practical deployment of artificial intelligence.
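For readers unfamiliar with the Im2col-Gemm approach named above, the following is a minimal, unoptimized sketch of the general technique, not the optimized implementation described in this paper. It unrolls each receptive field of a single-channel image into a column (im2col) and then computes all kernel responses with one matrix multiplication (GEMM). The function names `im2col`, `gemm`, and `conv2d_im2col` are illustrative and are not taken from Darknet; padding, multiple input channels, and any hardware-specific tiling are omitted by assumption.

```python
def im2col(image, kh, kw, stride=1):
    """Unroll a single-channel 2D image (list of rows) into a column matrix.

    Each column holds one kh x kw receptive field, flattened row by row.
    No padding is applied (an assumption made to keep the sketch short).
    """
    h, w = len(image), len(image[0])
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = [[0] * (out_h * out_w) for _ in range(kh * kw)]
    for oy in range(out_h):
        for ox in range(out_w):
            col = oy * out_w + ox
            for ky in range(kh):
                for kx in range(kw):
                    cols[ky * kw + kx][col] = image[oy * stride + ky][ox * stride + kx]
    return cols, out_h, out_w


def gemm(a, b):
    """Naive matrix multiply: (m x k) times (k x n) gives (m x n)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def conv2d_im2col(image, kernels, stride=1):
    """Apply several kernels to one channel via im2col followed by GEMM.

    As in most CNN frameworks, the kernel is not flipped (cross-correlation).
    Returns one 2D feature map per kernel.
    """
    kh, kw = len(kernels[0]), len(kernels[0][0])
    cols, out_h, out_w = im2col(image, kh, kw, stride)
    # One flattened kernel per row of the weight matrix.
    weights = [[v for row in kern for v in row] for kern in kernels]
    flat = gemm(weights, cols)
    # Reshape each output row back into an out_h x out_w feature map.
    return [[row[y * out_w:(y + 1) * out_w] for y in range(out_h)] for row in flat]


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernels = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
    print(conv2d_im2col(img, kernels))
```

The appeal of this formulation is that the whole convolution layer reduces to a single dense matrix product, which is the part that an embedded accelerator or an optimized GEMM routine can then speed up; the cost is the extra memory needed for the unrolled column matrix.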
