Open Access

Research on unmanned delivery path optimization strategy based on reinforcement learning in intelligent logistics system

Sep 26, 2025


Figure 1. Perception-action-reward feedback framework for reinforcement learning

Figure 2. DQN algorithm training flowchart

Figure 3. Actor-Critic algorithm framework

Figure 4. Node distribution of the scale-100 example

Figure 5. Customer time window distribution of the scale-100 example

Figure 6. Path diagram of the example

Figure 7. Node path planning

Path scheme of the scale-100 example

| Numbering | Path | Total demand | Loading rate |
| --- | --- | --- | --- |
| Truck 1 | 100→62→51→72→47→89→90→34→94→31→30→52→72→24→69→9→67→60→37→42→26→16→18→64→100 | 386 | 99.2% |
| UAV 1 | 100→0→47, 38→83→27, 65→14→69 | 12 | |
| Truck 2 | 100→56→75→72→61→82→33→45→17→24→70→87→35→26→87→42→69→93→17→40→100 | 322 | 98.3% |
| UAV 2 | 100→21→54, 64→0→32, 24→70→16, 36→80→88, 84→0→39, 76→1→99 | 69 | |
| Truck 3 | 100→50→96→54→59→40→13→98→19→18→54→48→67→35→60→59→23→97→95→76→76→100 | 352 | 99.6% |
| UAV 3 | 11→35→18, 47→57→38, 57→6→13 | 50 | |
| Truck 4 | 100→49→16→22→47→39→86→85→38→74→7→89→88→19→92→29→93→53→81→35→100 | 322 | 97.5% |
| UAV 4 | 100→91→15, 54→8→34, 78→82→61, 88→27→25, 26→5→20 | 61 | |

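Example path scheme

Scale of 50 (C=150)

| Numbering | Total demand | Loading rate | General course | Stroke rate |
| --- | --- | --- | --- | --- |
| 1-1 | 144 | 93.8% | 230 | 97.9% |
| 1-2 | 150 | 99.1% | | |
| 2-1 | 147 | 98.6% | 215 | 88.6% |
| 2-2 | 130 | 91.2% | | |
| 2-3 | 80 | 46.5% | | |
| 3-1 | 150 | 95.7% | 239 | 99.6% |
| 3-2 | 136 | 88.3% | | |
| 4-1 | 149 | 95.2% | 235 | 96.2% |
| 4-2 | 131 | 87.5% | | |
| 5-1 | 144 | 92.5% | 232 | 98.2% |
| 5-2 | 66 | 46% | | |
| 6-1 | 147 | 98.3% | 150 | 58.6% |
| 6-2 | 57 | 32.8% | | |

Scale heterogeneity

| Numbering | Total demand | Loading rate | General course | Stroke rate |
| --- | --- | --- | --- | --- |
| 1-1(200) | 191 | 97.2% | 233 | 95.5% |
| 2-1(200) | 184 | 95.8% | 230 | 75.8% |
| 3-1(200) | 199 | 99.6% | 239 | 96.5% |
| 4-1(200) | 196 | 97.3% | 229 | 99.6% |
| 5-1(200) | 198 | 85.5% | 245 | 100% |
| 6-1(200) | 187 | 98.2% | | |
| 7-1(200) | 138 | 92.2% | 222 | 93.6% |
| 7-2(200) | 138 | 97.2% | | |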

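Node information of the scale-100 example

| Numbering | Coordinate position | Demand | Attribute | Numbering | Coordinate position | Demand | Attribute |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (21,42) | 2 | DC | 51 | (34,14) | 14 | FC |
| 1 | (21,22) | 0 | DC | 52 | (25,40) | 19 | FC |
| 2 | (21,10) | 2 | DC | 53 | (58,29) | 18 | FC |
| 3 | (54,12) | 1 | DC | 54 | (30,71) | 24 | FC |
| 4 | (53,38) | 5 | DC | 55 | (64,56) | 18 | FC |
| 5 | (14,22) | 2 | DC | 56 | (43,53) | 16 | FC |
| 6 | (57,48) | 8 | DC | 57 | (54,58) | 23 | FC |
| 7 | (32,13) | 0 | DC | 58 | (3,61) | 20 | FC |
| 8 | (40,28) | 1 | DC | 59 | (43,60) | 20 | FC |
| 9 | (7,68) | 1 | DC | 60 | (19,23) | 7 | FC |
| 10 | (29,45) | 16 | FC | 61 | (54,42) | 23 | FC |
| 11 | (53,13) | 12 | FC | 62 | (46,47) | 18 | FC |
| 12 | (50,74) | 14 | FC | 63 | (19,34) | 20 | FC |
| 13 | (30,68) | 19 | FC | 64 | (31,41) | 7 | FC |
| 14 | (16,18) | 17 | FC | 65 | (19,72) | 4 | FC |
| 15 | (56,66) | 22 | FC | 66 | (19,21) | 24 | FC |
| 16 | (28,42) | 20 | FC | 67 | (35,69) | 17 | FC |
| 17 | (36,24) | 11 | FC | 68 | (46,9) | 15 | FC |
| 18 | (24,25) | 20 | FC | 69 | (46,53) | 24 | FC |
| 19 | (29,23) | 11 | FC | 70 | (17,24) | 16 | FC |
| 20 | (70,62) | 17 | FC | 71 | (14,56) | 12 | FC |
| 21 | (29,48) | 10 | FC | 72 | (32,2) | 14 | FC |
| 22 | (55,41) | 18 | FC | 73 | (24,31) | 18 | FC |
| 23 | (25,63) | 18 | FC | 74 | (11,27) | 12 | FC |
| 24 | (61,28) | 11 | FC | 75 | (20,30) | 16 | FC |
| 25 | (19,10) | 17 | FC | 76 | (44,27) | 17 | FC |
| 26 | (19,49) | 23 | FC | 77 | (60,38) | 20 | FC |
| 27 | (64,11) | 17 | FC | 78 | (23,23) | 15 | FC |
| 28 | (23,21) | 19 | FC | 79 | (40,13) | 18 | FC |
| 29 | (14,50) | 6 | FC | 80 | (43,37) | 14 | FC |
| 30 | (64,74) | 17 | FC | 81 | (45,34) | 12 | FC |
| 31 | (10,19) | 20 | FC | 82 | (3,4) | 8 | FC |
| 32 | (8,58) | 16 | FC | 83 | (3,56) | 15 | FC |
| 33 | (47,35) | 16 | FC | 84 | (42,16) | 15 | FC |
| 34 | (49,52) | 7 | FC | 85 | (5,18) | 22 | FC |
| 35 | (42,29) | 16 | FC | 86 | (17,11) | 17 | FC |
| 36 | (7,14) | 21 | FC | 87 | (18,25) | 19 | FC |
| 37 | (15,40) | 19 | FC | 88 | (49,10) | 16 | FC |
| 38 | (36,57) | 22 | FC | 89 | (6,18) | 10 | FC |
| 39 | (12,28) | 14 | FC | 90 | (5,47) | 8 | TC |
| 40 | (23,1) | 18 | FC | 91 | (53,9) | 31 | TC |
| 41 | (21,28) | 17 | FC | 92 | (19,47) | 19 | TC |
| 42 | (45,67) | 19 | FC | 93 | (65,17) | 23 | TC |
| 43 | (43,14) | 17 | FC | 94 | (68,48) | 32 | TC |
| 44 | (54,48) | 17 | FC | 95 | (54,52) | 30 | TC |
| 45 | (21,63) | 11 | FC | 96 | (41,42) | 21 | TC |
| 46 | (16,24) | 17 | FC | 97 | (69,32) | 26 | TC |
| 47 | (73,2) | 11 | FC | 98 | (31,12) | 32 | TC |
| 48 | (40,39) | 22 | FC | 99 | (60,66) | 18 | TC |
| 49 | (34,31) | 24 | FC | 100 | (32,38) | 0 | Distribution center |
| 50 | (14,41) | 2 | FC | | | | |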
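
To illustrate how the node coordinates relate to a path such as those listed in the path scheme, the following minimal sketch sums the leg lengths of a node sequence. It assumes straight-line (Euclidean) distance between nodes, which this page does not state. The function name `route_length` and the shortened route (the first three stops of Truck 1, followed by a direct return to node 100) are hypothetical and chosen only for illustration.

```python
import math

# Coordinates copied from the node information table (small subset only).
NODES = {
    100: (32, 38),  # distribution center
    62: (46, 47),
    51: (34, 14),
    72: (32, 2),
}


def route_length(route, coords):
    """Sum of straight-line leg lengths along a node sequence.

    Assumption: Euclidean distance; the source does not specify the metric.
    """
    return sum(
        math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:])
    )


# Hypothetical shortened route: first three stops of Truck 1, then back
# to the distribution center. This is not a route reported in the source.
route = [100, 62, 51, 72, 100]
print(round(route_length(route, NODES), 2))
```

The same function could be applied to a full truck path once all of its nodes are added to `NODES`; a different distance metric would only require replacing the `math.dist` call.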