Research on Control Algorithm of Visual Servo Grasping Operating System for Unmanned Aerial Vehicle Robot Arm for Transmission Line Inspection
Published online: 22 Sep 2025
Received: 12 Jan 2025
Accepted: 25 Apr 2025
DOI: https://doi.org/10.2478/amns-2025-0955
© 2025 Shi Ru et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Robotic systems are intelligent, reusable machines that can replace humans in long-distance sensing, exploration, and manipulation tasks in complex and hazardous environments [1]. Unmanned aerial vehicles (UAVs) move freely in three-dimensional space and are more maneuverable than ground-mobile robots, which extends their applications to a wide airspace. Rotary-wing UAVs can take off vertically and hover at a fixed point; they are simple to operate, maneuverable, and compact. They have a natural structural advantage over fixed-wing UAVs, and research on and applications of multi-rotor vehicles are therefore developing rapidly [2–3].
At present, multi-rotor UAVs are mostly used for exploration and remote sensing, monitoring and inspection, security, search and rescue, and terrain mapping. They carry a variety of sensors to perceive the surrounding environment and process the acquired information efficiently and comprehensively. Because they can fly flexibly and rapidly over a wide airspace, they can monitor targets of interest or large areas of the environment [4–5]. These applications, however, do not yet allow the UAV to interact physically with its environment in an effective way, so much of the platform's potential remains unrealized. Inspired by birds swooping to catch prey, researchers have mounted robotic arms on multi-rotor UAVs, giving them the ability to actively operate on and alter their environment. A rotor-flying robotic arm system typically consists of a multi-rotor UAV platform and a multi-axis robotic arm, with the number and configuration of rotors and the arm structure adapted to the application. Such a system combines the advantages of both sub-modules: it can fly, hover, cruise, and maneuver quickly, and it can manipulate targets flexibly [6–7].
In recent years, a growing body of research has addressed flying robotic arm systems, and many innovative results have emerged from academia and industry [8]. If a flying robotic arm can grasp objects accurately and stably, it can take on further tasks such as nursing care and equipment operation. Most fixed-platform robotic arms complete a grasp through a sequence of steps: localizing the target object, determining its attitude, estimating grasping points, planning the grasping path, and controlling the joint movements and the opening and closing of the gripper jaws [9–10]. In the flying robotic arm grasping scenario, on-board visual perception detects and estimates the state of the target objects and supplies information about the surrounding environment to the flight control and autonomous navigation system, for example to its distance measurement, target detection and tracking, and obstacle avoidance modules. With the rapid development and wide application of deep learning, deep learning detectors have removed the need for the manually designed, complex feature representations required by earlier traditional target detection methods. Data-driven general-purpose detectors and improved lightweight detectors based on deep learning have been widely applied on UAV platforms with excellent results [11–12].
With the rapid development of the electric power industry and the continuous expansion of the power grid, electric power inspection faces growing challenges. Traditional inspection relies mainly on manual labor, which is inefficient and carries safety risks. Research on UAV power inspection technology is therefore important for improving the efficiency and safety of inspection and for ensuring the stable operation of the power system [13–14].
The need for research on UAV power inspection technology is reflected first in its efficiency. UAVs are mobile, flexible, fast, and wide-ranging; they can reach the inspection area quickly and inspect power equipment with high precision and efficiency. Compared with traditional manual inspection, drone inspection greatly improves inspection efficiency, shortens the inspection cycle, and allows equipment failures to be detected and handled in time, which protects the normal operation of the power system. Second, UAV power inspection also improves the safety of inspection work [15–16]. Manual inspection often requires working close to high-voltage equipment, with risks of electrocution and falls from height. A drone can complete the inspection task without contacting the equipment, which avoids these risks and protects the lives of inspectors. Finally, drone power inspection helps to promote the intelligent development of the power industry [17–18]. With the continued development of artificial intelligence, machine learning, and related technologies, drone power inspection can become more intelligent and automated. Intelligent analysis of inspection data enables condition monitoring and fault prediction for power equipment and further improves the operational efficiency and reliability of the power system [19–20].
In summary, research on UAV power inspection technology is of great significance for improving the efficiency and safety of power inspection and for promoting the intelligent development of the power industry. As UAV technology continues to develop and be applied, UAV power inspection will play an increasingly important role in the power industry [21–22].
For the UAV robotic arm visual servo grasping operation system applied to transmission line inspection, this paper plans the position of the robotic arm's end effector with Cartesian space trajectories. To deal with obstacles in the transmission line inspection environment, an improved RRT algorithm is designed for collision-free motion planning of the robotic arm; it searches faster and performs better in narrow passages. For grasping motion control, a position-based visual servo control method is adopted. Because this method is sensitive to changes in lighting and viewing angle, a position estimation algorithm based on Extended Kalman Smooth Variable Structure Filtering is added to the visual servo control, and data fusion is used to improve the robustness and stability of the system. Together, these components realize the control of the UAV robotic arm visual servo grasping operation system.
When the UAV robotic arm visual servo grasping operation system is applied to transmission line inspection, the end position of the robotic arm is planned with Cartesian space trajectories. Although controlling the arm ultimately requires an inverse kinematics solution so that each joint is controlled in joint space, the planning target is different: in Cartesian space trajectory planning, the end position is a function of time. This subsection introduces two Cartesian space trajectory planning methods, straight-line trajectory planning [23] and circular arc trajectory planning [24].
The two common interpolation methods are timed interpolation and fixed-distance interpolation. Timed interpolation inserts points at a fixed time step; its drawback is that if the taught distance is too large, the interpolated points are far apart and accuracy drops. Fixed-distance interpolation inserts points at a fixed distance step, so high accuracy only requires a small interpolation distance.
Planning a straight line between two points in Cartesian space means planning the pose, i.e., position and attitude, of the intermediate points. In practice, linear motion usually does not need to change the attitude, so often only the position needs to be planned.
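The difference between the two interpolation schemes can be illustrated with a minimal Python sketch for a position-only straight-line segment. The function names, the constant-speed assumption in the timed variant, and the numeric values are illustrative assumptions and are not taken from the paper.

```python
import math

def _lerp(p0, p1, s):
    """Point at fraction s (0..1) along the segment p0 -> p1."""
    return tuple(a + s * (b - a) for a, b in zip(p0, p1))

def interpolate_timed(p0, p1, duration, dt):
    """Timed interpolation: one point per time step, assuming constant speed.

    The spacing between points grows with the segment length, which is
    why long taught segments lose accuracy under this scheme.
    """
    n = max(1, math.ceil(duration / dt))
    return [_lerp(p0, p1, i / n) for i in range(n + 1)]

def interpolate_fixed_distance(p0, p1, step):
    """Fixed-distance interpolation: spacing never exceeds `step`."""
    length = math.dist(p0, p1)
    n = max(1, math.ceil(length / step))
    return [_lerp(p0, p1, i / n) for i in range(n + 1)]

# Hypothetical segment of length 5 (units arbitrary).
start, end = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
timed = interpolate_timed(start, end, duration=2.0, dt=0.5)
fixed = interpolate_fixed_distance(start, end, step=1.0)
```

With these values the timed variant places points 1.25 units apart, while the fixed-distance variant keeps the spacing at the requested 1.0 regardless of how long the segment is.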
Spatial arc trajectory planning is different from spatial linear planning, because it involves changes in the coordinate system, so it is relatively more complex. First of all, the default robotic arm needs from
Sampling-based planning and optimal planning are common planning problems for robotic arms. In practice, and especially in the transmission line inspection environment, there are generally obstacles in the arm's operating space, so collision-free motion planning, which is also known as the sampling planning problem, must be addressed. The two most common algorithms today are the Probabilistic Roadmap Method (PRM) and the Rapidly-exploring Random Tree (RRT) [25].
The algorithms in this paper are all based on RRT, a single-query planning algorithm. As its name suggests, the tree expands step by step from the root node until the end point is found, and the path found at that moment is the planned connected path. Because PRM must first build a model of the space, it is computationally expensive, time-consuming, and inefficient; RRT is significantly faster and more efficient, which is why it tends to be more widely used for collision-free motion planning of robotic arms.
The main concern in 3D trajectory planning is speed. To avoid the cost of modeling the whole space, collision detection is performed only on the sampled points. For search in high-dimensional space, the tree is guided toward unexplored regions of the transmission line inspection state space in order to connect the start and end of the planned path. This approach gives very good search results in high-dimensional spaces.
RRT is a single-query approach whose goal is to find a connectable path quickly. Its search can be understood as a tree that spreads outward segment by segment. As a random sampling algorithm it is also suited to high-dimensional spaces, but it aims mainly at speed and the initial path it finds is not optimal. Since the bidirectional A* algorithm was used earlier for path planning of the mobile chassis, a bidirectional search is a natural extension here, and it can be expected to speed up planning for transmission line inspection considerably. The bidirectional RRT-Connect algorithm is described next.
Unlike RRT, RRT-Connect searches in both directions. One tree grows from the starting point with the end point as its goal, while a second tree grows from the end point with the starting point as its goal. When the two trees meet, they are connected and planning is considered successful, which significantly improves speed. Because RRT-Connect adds a heuristic connection step, its search is more efficient and it performs better in narrow passages. RRT-Connect is still a single-query approach and its planned trajectory is not optimal, but trajectory optimality is not particularly important for robotic arm operation in transmission line inspection tasks, where speed is often the most important factor.
For grasping motion control of the robotic arm during transmission line inspection, this paper adopts position-based visual servoing (PBVS). The camera feeds back the position of the target object to the robotic arm, and this feedback is used to position the arm precisely on the target and grasp it. The basic flow of PBVS is shown in Figure 1.
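The tree-growing search can be sketched in 2D as follows. This is a minimal illustration of the basic RRT scheme and not the paper's improved algorithm: the circular obstacles, the goal-bias probability, the step size, the sampled segment check, and all function names are assumptions. The node and iteration limits follow the values reported in the simulation section.

```python
import math
import random

def segment_free(p, q, obstacles, checks=10):
    """True if sampled points on segment p-q avoid all circular obstacles."""
    for i in range(checks + 1):
        t = i / checks
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True

def steer(p, q, step):
    """Move from p toward q by at most `step`; return q itself if in reach."""
    d = math.dist(p, q)
    if d <= step:
        return q
    t = step / d
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
        goal_bias=0.1, max_iters=10000, max_nodes=2000, seed=0):
    """Grow a tree from `start`; return waypoints to `goal`, or None."""
    rng = random.Random(seed)
    nodes, parent = [start], [None]
    for _ in range(max_iters):
        if len(nodes) >= max_nodes:
            break
        # Occasionally sample the goal itself to pull the tree toward it.
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        near = min(range(len(nodes)),
                   key=lambda i: math.dist(nodes[i], sample))
        new = steer(nodes[near], sample, step)
        if not segment_free(nodes[near], new, obstacles):
            continue
        nodes.append(new)
        parent.append(near)
        # Stop as soon as the goal can be joined with one short segment.
        if math.dist(new, goal) <= step and segment_free(new, goal, obstacles):
            path = [] if new == goal else [goal]
            i = len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None

# Hypothetical scene: one circular obstacle (cx, cy, radius) between the ends.
scene = [(5.0, 5.0, 1.5)]
path = rrt((1.0, 1.0), (9.0, 9.0), scene)
```

The returned path is the first connection found, so it typically wanders; that is consistent with the remark above that the initial RRT path is not optimal.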
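A minimal 2D sketch of the two-tree search is given below. It follows the commonly described extend/connect structure of RRT-Connect and is not the paper's improved variant; the circular obstacles, step size, uniform sampling, and all names are assumptions made for illustration.

```python
import math
import random

def segment_free(p, q, obstacles, checks=10):
    """True if sampled points on segment p-q avoid all circular obstacles."""
    for i in range(checks + 1):
        t = i / checks
        x, y = p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True

def steer(p, q, step):
    """Move from p toward q by at most `step`; return q itself if in reach."""
    d = math.dist(p, q)
    if d <= step:
        return q
    t = step / d
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def extend(tree, target, step, obstacles):
    """Grow `tree` one step toward `target`."""
    nodes, parent = tree
    near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
    new = steer(nodes[near], target, step)
    if not segment_free(nodes[near], new, obstacles):
        return "trapped"
    nodes.append(new)
    parent.append(near)
    return "reached" if new == target else "advanced"

def connect(tree, target, step, obstacles):
    """Heuristic step: keep extending until the target is reached or blocked."""
    status = "advanced"
    while status == "advanced":
        status = extend(tree, target, step, obstacles)
    return status

def trace(tree, i):
    """Waypoints from node i back to the root of `tree`."""
    nodes, parent = tree
    out = []
    while i is not None:
        out.append(nodes[i])
        i = parent[i]
    return out

def rrt_connect(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
                max_iters=10000, seed=0):
    rng = random.Random(seed)
    tree_a, tree_b = ([start], [None]), ([goal], [None])
    a_is_start = True
    for _ in range(max_iters):
        sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        if extend(tree_a, sample, step, obstacles) != "trapped":
            new = tree_a[0][-1]
            if connect(tree_b, new, step, obstacles) == "reached":
                half_a = trace(tree_a, len(tree_a[0]) - 1)[::-1]  # root_a..new
                half_b = trace(tree_b, len(tree_b[0]) - 1)        # new..root_b
                path = half_a + half_b[1:]
                return path if a_is_start else path[::-1]
        # Swap roles so the two trees grow alternately.
        tree_a, tree_b = tree_b, tree_a
        a_is_start = not a_is_start
    return None

scene = [(5.0, 5.0, 1.5)]  # hypothetical obstacle (cx, cy, radius)
path = rrt_connect((1.0, 1.0), (9.0, 9.0), scene)
```

The `connect` loop is the heuristic step: instead of taking a single step, the second tree keeps advancing toward the newest node of the first tree, which is what lets the two trees join quickly in open space.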

Figure 1. PBVS basic process
Using the dynamic binocular vision calibration method, the position of the target can be obtained and the PBVS system can be constructed. Once the system is constructed, the velocity of each joint must be computed in order to control the joint motion for grasping.
The error of PBVS can be expressed by equation (1), where
Let
In the ideal state,
Let the linear and angular velocity vectors at the end of the robotic arm be
Let
Assuming an exponential reduction in error, the velocity of the jaws is shown in equation (8), where
According to Eq. (4) and Eq. (6) Eq. (8) can be rewritten as shown in Eq. (9).
The linear velocity
The position error
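The assumption of exponentially decreasing error can be illustrated with a small translation-only sketch. It uses the common proportional law v = -λ(p - p*), under which the error satisfies de/dt = -λe. It does not reproduce the paper's equations (8) and (9): orientation, the mapping to joint velocities, the gain value, and all names are assumptions.

```python
import math

def pbvs_velocity(position, target, gain):
    """Proportional law v = -gain * (position - target)."""
    return tuple(-gain * (p - t) for p, t in zip(position, target))

def simulate(position, target, gain=1.0, dt=0.05, steps=200):
    """Integrate the commanded velocity with explicit Euler steps.

    Returns the final position and the error norm recorded at every step.
    """
    errors = []
    for _ in range(steps):
        errors.append(math.dist(position, target))
        v = pbvs_velocity(position, target, gain)
        position = tuple(p + dt * vi for p, vi in zip(position, v))
    errors.append(math.dist(position, target))
    return position, errors

# Hypothetical gripper and target positions in metres.
final, errors = simulate((0.30, -0.10, 0.50), (0.10, 0.05, 0.20))
```

Each Euler step scales the error by (1 - gain * dt), so the recorded error norms fall geometrically, which is the discrete counterpart of the exponential decay assumed in the text.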
Although PBVS can accurately control the gripping position of the robotic arm, it is more sensitive to light changes and view angle changes. Therefore, a position estimation algorithm based on Extended Kalman Smooth Variable Structure Filtering (EK-SVSF) is added to the visual servo control system in this paper.
The Kalman filter provides an optimal estimate of the system state by predicting and updating the state of the system. The state prediction
According to Eq. (11), the prediction error
This results in the smoothed boundary value
The expression for the gain of the EK-SVSF is shown in (15), with
The state prediction
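The predict/update cycle mentioned above can be sketched for the simplest possible case. The code is a plain scalar Kalman filter with a random-walk state model, swapped in for illustration; it is not the EK-SVSF of this paper and contains no smoothing boundary layer. The noise values, the measurement sequence, and all names are assumptions.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle for a scalar random-walk state.

    x, p: previous state estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances (assumed known)
    """
    # Predict: the state is assumed unchanged, uncertainty grows by q.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new, k

# Hypothetical measurements scattered around a true value of 1.0.
measurements = [1.02, 0.98] * 25
x, p = 0.0, 1.0
for z in measurements:
    x, p, k = kalman_step(x, p, z, q=1e-4, r=1e-2)
```

The gain starts close to 1 while the initial uncertainty is large and shrinks as the estimate settles, so later measurements move the estimate less.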
To improve the robustness and stability of the system, data fusion techniques are commonly used to integrate data from multiple sensors. In this paper, the sensors of the visual servo system are the left and right cameras. When the master arm performs the grasping task, the camera of the master arm (the master eye) collects the target information and estimates its position, while the slave arm moves to the side of the master arm, collects information on both the target and the master arm's end effector, and produces its own position estimate. The two position estimates are then fused to obtain the optimal position estimate.
Data fusion can be centralized or distributed. Centralized data fusion gathers the information from all data sources at a central location for integration and analysis. Distributed data fusion first processes the information from each data source locally, according to its own needs and conditions, before further integration and analysis. The distributed approach reduces computation time and data storage burden and is well suited to processing information in real-time dynamic environments. Therefore, this paper uses distributed data fusion to obtain the optimal position estimate, as shown in Fig. 2.

Figure 2. Distributed data fusion for optimal position estimation
To obtain a better position estimate, this paper uses the ordered weighted average (OWA) operator to fuse the EK-SVSF position estimates from the two cameras. The OWA operator aggregates its inputs by a weighted average that depends on their order. The OWA-based distributed data fusion process is shown in Fig. 3.

Figure 3. OWA-based distributed data fusion process
The weights of the OWA are determined based on the covariance matrix
The lower the estimation error the higher the weights and vice versa the lower the weights. Assuming that the first elements of the two covariance matrices
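The rule that a lower estimation error earns a higher weight can be sketched with inverse-variance weights. This is one simple way to realize that rule and is an assumption: the paper's exact OWA weight formula is not reproduced here, and the function names and numeric values are hypothetical.

```python
def fusion_weights(variances):
    """Normalized weights inversely proportional to each estimate's variance."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [w / total for w in inverse]

def fuse(estimates, variances):
    """Weighted average of several position estimates, computed per axis."""
    weights = fusion_weights(variances)
    dim = len(estimates[0])
    return tuple(sum(w * e[k] for w, e in zip(weights, estimates))
                 for k in range(dim))

# Hypothetical estimates from the two cameras; the first is more certain.
estimates = [(1.0, 2.0, 3.0), (2.0, 2.0, 2.0)]
variances = [0.01, 0.04]
weights = fusion_weights(variances)
fused = fuse(estimates, variances)
```

Here the first camera's variance is four times smaller, so it receives four times the weight and the fused position lies closer to its estimate.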
To verify the real-time performance, effectiveness, and path optimization performance of the algorithms in this paper, simulations and comparative experiments were conducted. All algorithms and simulations were run on the same laptop, equipped with an Intel i7-9750H CPU at 2.60 GHz and 16 GB of RAM.
Simulation in 2D space requires little computation and gives intuitive results, so this subsection first simulates the RRT algorithm and the improved bidirectional RRT-Connect algorithm in 2D space and evaluates the proposed algorithm by comparing the simulation results in various environments. The maximum number of nodes is fixed at 2000 and the maximum number of iterations at 10000.
The planning results of the RRT and RRT-Connect algorithms are shown in Fig. 4 and Fig. 5, respectively, where the solid black lines indicate the planned paths. To show the boundary-hugging behavior of the expansion, no safety distance is included in the collision detection. Figure 4 shows that the path found by RRT does not stay near the straight line between the starting point and the target point and takes a long detour. Figure 5 shows that the optimized RRT-Connect path stays close to that line, and its length is further reduced after optimization. Compared with RRT, the optimized RRT-Connect path is clearly shorter.

Figure 4. RRT algorithm planning

Figure 5. RRT-Connect algorithm planning
Search time and path length were recorded to compare the algorithms. The search time statistics are shown in Fig. 6 and the path length statistics in Fig. 7. 100 groups of experiments were conducted, with an average of 50 experiments per group. The comparison shows that in complex environments the planning time of the proposed algorithm stays low, with a mean of 1.68 s, and the paths it finds are shorter than those found by RRT.

Figure 6. Search time

Figure 7. Path length
The system uses a UR3 six-degree-of-freedom robotic arm and a DH AG95 two-finger parallel gripper to perform the grasping action. Communication between the subsystems is handled by the robotic arm operating system, which provides efficient data exchange between the upper computer and the robotic arm. The upper computer integrates the feature detector, the IBVS controller, and the visualization interface developed in this work; together they process the data coming from the robotic arm and send feedback control signals to it. The robotic arm and the upper computer are connected via Ethernet for stable, fast data transmission.
The grasping performance was analyzed using four metrics: grasping success rate, servo success rate, positioning error and attitude similarity.
Grasping Success Rate (GSR): a grasp is successful if the object is grasped and does not fall. GSR is the ratio of the number of successful grasps to the total number of grasps.
Servo Success Rate (SSR): the target keypoints at the camera's desired pose and final pose are each projected onto the image plane; if the root-mean-square (RMS) errors of all keypoints are within the threshold, the trial counts as one successful visual servoing.
Positioning Error (PE): the displacement error between the final position and the desired position of the end effector. PE measures the planar grasping accuracy of the algorithm.
Grasping Posture Similarity (GPS): because the proposed method guides 6D grasping through 2D information, the deviation between the final and desired postures of the end effector is quantified by mapping the rotational and translational similarities to a uniform metric space of [0, 1] and taking their weighted sum. GPS reflects the accuracy of the 6-DoF grasp.
In the grasping experiments, the proposed method is compared with five visual servo grasping models: IBVS, SIFT, DeepMPCVS, KOVIS, and PID-IBVS. IBVS uses ArUco markers from the OpenCV vision library for feature extraction and combines them with a proportional controller to realize basic image-based visual servo grasping. SIFT extracts stable feature points with the scale-invariant feature transform and guides the robotic arm with a proportional controller. DeepMPCVS and PID-IBVS use model predictive control and PID control, respectively; both identify the bounding box of the target with the RTMDet network and use its four corner points as control features for visual servo grasping. KOVIS is a scale-free grasping method that learns the feature representation of the object with an autoencoder.
The following shows a successful grasp by the proposed method in a transmission line inspection setting. From the external camera's viewpoint, the robotic arm approaches the target gradually and completes the grasp. From the arm-mounted camera's view, both the target bounding box and the keypoints are captured and tracked accurately throughout the grasp. The key performance curves of the grasping process were recorded. Figure 8 shows the error convergence curve under the proposed control method: the image feature error decreases smoothly to below the set threshold, which shows that the system converges to the target. Figure 9 shows the feature point trajectories, in which blue stars mark the initial image features and orange triangles mark the desired image features; the features reach the desired positions smoothly.
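The first three metrics can be computed as in the sketch below. The function names, the pixel threshold, and the choice to pool all keypoints into a single RMS value are assumptions; GPS is left out because the weights of its rotational and translational terms are not given in the text.

```python
import math

def grasp_success_rate(outcomes):
    """GSR: fraction of trials in which the object was grasped and held."""
    return sum(1 for ok in outcomes if ok) / len(outcomes)

def rms_keypoint_error(final_pts, desired_pts):
    """RMS image-plane distance between matched keypoints, in pixels."""
    squared = [(fx - dx) ** 2 + (fy - dy) ** 2
               for (fx, fy), (dx, dy) in zip(final_pts, desired_pts)]
    return math.sqrt(sum(squared) / len(squared))

def servo_succeeded(final_pts, desired_pts, threshold_px):
    """One trial counts toward SSR when the RMS error is within the threshold."""
    return rms_keypoint_error(final_pts, desired_pts) <= threshold_px

def positioning_error(final_pos, desired_pos):
    """PE: absolute per-axis displacement of the end effector."""
    return tuple(abs(f - d) for f, d in zip(final_pos, desired_pos))

# Hypothetical trial data.
gsr = grasp_success_rate([True, True, False, True])
rms = rms_keypoint_error([(3.0, 4.0)], [(0.0, 0.0)])
ok = servo_succeeded([(3.0, 4.0)], [(0.0, 0.0)], threshold_px=6.0)
pe = positioning_error((1.0, 2.0, 3.0), (1.5, 2.0, 2.0))
```

SSR is then the fraction of trials for which `servo_succeeded` returns true, computed the same way as GSR.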

Figure 8. Error convergence curve

Figure 9. Feature point trajectory
Table 1 compares the grasping performance of the six visual servoing methods in static cluttered scenes of transmission line inspection. In terms of grasping success rate (GSR), IBVS reaches only 0.49, while the other methods all score 0.73 or higher. The proposed control method and KOVIS achieve the best and second-best GSR, 0.93 and 0.91, respectively. The proposed method also has the best servo success rate (SSR), 0.95, which indicates that its controller converges reliably. Its average errors along the x-, y-, and z-axes are 0.19 cm, 1.11 cm, and 0.18 cm, respectively, the lowest of all methods, showing good grasping accuracy. Its grasping posture similarity is 0.975, which shows that it reproduces the desired grasping attitude more closely than the other methods. Across all four metrics, the proposed method gives the best overall performance, demonstrating stable and efficient grasping in static cluttered scenes in transmission line inspection.
Table 1. Grasping performance comparison in static cluttered scenes
| Method | GSR | SSR | PE x, y, z (cm) | GPS |
|---|---|---|---|---|
| IBVS | 0.49 | 0.69 | 0.34, 1.87, 0.36 | 0.845 |
| DeepMPCVS | 0.73 | 0.78 | 0.34, 1.81, 0.28 | 0.829 |
| PID-IBVS | 0.78 | 0.71 | 0.32, 1.86, 0.27 | 0.856 |
| SIFT | 0.78 | 0.84 | 0.25, 1.84, 0.27 | 0.913 |
| KOVIS | 0.91 | 0.90 | 0.23, 1.49, 0.34 | 0.956 |
| Proposed method | 0.93 | 0.95 | 0.19, 1.11, 0.18 | 0.975 |
This paper proposes an improved RRT path planning algorithm for robotic arms that adapts to multiple scenes. Optimal position estimation of the transmission line inspection target is then performed with a position-based visual servo control method combined with the EK-SVSF position estimation algorithm and OWA-based distributed data fusion.
The path found by the RRT algorithm is far from the shortest route between the start point and the target point, whereas the optimized RRT-Connect path stays close to the line connecting them and is substantially shorter. In addition, the mean planning time is kept low, at 1.68 s.
Under the proposed control method, the error decreases smoothly to below the set threshold, so the system converges to the target, and the feature point trajectories of the robotic arm reach the desired positions smoothly.
The proposed control method achieves the best grasping success rate, 0.93, in static cluttered scenes in transmission line inspection. Its servo success rate of 0.95 is also the best, which indicates that the controller converges reliably. The average errors along the x-, y-, and z-axes are 0.19 cm, 1.11 cm, and 0.18 cm, respectively, showing good grasping accuracy. Finally, the grasping posture similarity is 0.975, which shows that the method reproduces the desired grasping attitude well.
