Digital Intelligence Technology-Driven Transformation and Innovation Research on Choreography Design
Published online: 05 Jun 2025
Received: 09 Jan 2025
Accepted: 25 Apr 2025
DOI: https://doi.org/10.2478/amns-2025-1033
© 2025 Wei Miao, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
With the continuous innovation of contemporary art, audiences' aesthetic concepts have become increasingly diverse; to offer audiences genuine artistic enjoyment, it is crucial to strengthen choreography design. Choreography design is a unique art form that organically integrates products of time and space into a unified presentation; it places high demands on both materials and technique and exhibits diversified design characteristics [1-3]. Contemporary choreography design encompasses actors' costumes, props, decorative sets, and more. To present a beautiful stage effect, stage lighting, costumes, props, and sets must coordinate with one another, and the design must be optimized around the theme of the work and the aesthetic needs of contemporary audiences, so as to create a high-quality audio-visual environment and give the audience a more three-dimensional, richer aesthetic experience [4-8].
Choreography design is a typical expression of the combination of art and technology, and choreographers have long pursued innovation in its artistic expression. Traditional choreography design relies mainly on handmade props, physical sets, and conventional lighting; this approach is easily limited by production processes and spatial constraints, making it difficult to satisfy modern audiences' expectations for visual effects [9-12]. In this context, the emergence of digital intelligence technology has revolutionized choreography design. With its strong visual presentation, real-time interactivity, and high flexibility, digital intelligence technology opens nearly unlimited possibilities for the stage [13-16]. Using these technologies, choreographers can create more vivid and realistic stage effects and bring a more striking visual experience to the audience [17-19]. In addition, digital media technology enables real-time interaction between the stage and the audience, enhancing the audience's sense of participation and immersion and making choreography more creative and dynamic [20-22]. Exploring the application of digital intelligence technology in choreography design therefore has important theoretical significance and practical value for promoting the innovation and development of the field.
In the choreography design process, the light, sound, and electrical technologies within digital intelligence technology can also enrich the stage's expressive power and bring the audience an all-round, multi-dimensional audio-visual experience. Stage lighting, as the visual focus of choreography design, likewise depends on the support of digital intelligence technology. Gao, J. et al. developed an intelligent stage lighting control system that automatically tracks performers; based on a deep convolutional neural network tracking algorithm, it automatically identifies a performer's position and tracks the target in real time, solving the problem that manual lighting control cannot track actors accurately and promptly [23]. Hsiao, S.W. et al. introduced music emotion recognition and machine learning into a stage lighting adjustment system that automatically adjusts the lighting mode and optimizes the stage effect by detecting the intensity and emotion of a music clip [24]. Qu, S. noted that stage lighting control systems suffer from poor control performance and weak energy efficiency, and proposed an intelligent stage lighting control system based on a support vector machine with improved histogram-of-oriented-gradients feature extraction (HOG-SVM), which effectively recognizes human body contours and reduces control energy consumption while meeting the needs of stage performance [25].
Sound technology is an important part of choreography design, bringing the audience a clear and realistic auditory experience through advanced sound equipment and precise audio processing. Liu, Y. showed that the stage sound system plays an important role in enhancing the audience's auditory experience and the expressiveness of stage art, and that configuring and optimizing the sound system with digital technology according to stage design principles significantly enhances the performance atmosphere [26]. Optoelectronic technology can likewise make stage props and sets present more splendid and varied effects. Jung, H. et al. introduced the application of projection mapping in stage design: using the outer wall of a building or the surface of a specific object as a screen extends the stage presentation from a small screen into real three-dimensional space, greatly enhancing the tension of the performance [27]. Nakatsu, R. et al. developed a projection mapping system that can project images onto non-rigid moving objects such as performers, realizing real-time 3D projection mapping of multiple moving targets with the help of depth sensors and multiple projectors, thereby innovating and expanding the forms of stage performance [28]. Sun, F. et al. analyzed the form and function of movable stage set design from the perspective of intelligent control, which helps improve the presentation of theatrical art [29].
In addition, virtual interactive systems can ensure that all elements on stage are presented to the audience coherently through precise synchronized control and data interaction, creating a more polished stage effect. Samur, S.X. constructed virtual, literal modes of performance presence through novel head-mounted virtual reality technology, helping audiences achieve a digital sense of presence in stage performances [30]. Yoo, Y., addressing the incompleteness and limited diversity of visual content caused by the occupied field of vision in traditional stage performances, proposed using digital image technology to design virtual reality performances suited to the visual space of the stage and realizing cross-media stage performances through virtual reality [31]. Yan, S. et al. emphasized that social interaction is an important element of stage performance and designed a new approach to social interaction based on virtual reality technology to evoke the audience's social awareness, opening new possibilities for enhancing the viewing experience [32].
Summarizing the above studies, we find that the application of digital intelligence technology in choreography design has brought epoch-making changes to stage art: it not only significantly enhances the visual effect of the stage but also greatly enriches its expressive power through innovative means. As digital media technology continues to develop, choreography design can be expected to show an increasingly diversified and personalized artistic style.
This paper clarifies the artistic goals and visual style, develops a deep understanding of the choreography creative process, analyzes a large corpus of choreography images and related text descriptions with the Stable Diffusion model, and determines the tone and style of the generated choreography designs. The parameters of the LoRA model are then fine-tuned to adjust the style of the choreography design. The first- and second-layer CGADM architectures are designed in turn to improve the SD scene generation model. Stage performers are detected and recognized with the YOLOv5 algorithm, and the KNN and RANSAC algorithms are used to extract human key points and feature points and form matching point groups, improving the positioning and tracking of stage lighting within the choreography design. Finally, the presentation effect of the proposed choreography design is evaluated jointly through physiological measurements and the collection of subjective opinions.
The choreography design process begins with an in-depth understanding of the project's theme, defining the artistic goals and visual style. AIGC techniques such as Stable Diffusion are used to input textual cues related to the theme and initiate image generation; this step is critical in determining the tone and style of the generated images. Next, the style is fine-tuned by selecting and adjusting the large model and by fine-tuning the parameters of the LoRA model. This stage requires close collaboration between the creator and the technology to ensure that the detail and style of the images match the creative vision. Finally, video is output frame by frame using plug-ins.
Large model selection. Stable Diffusion (SD) is a deep learning model based on the latent diffusion architecture, applied specifically to the field of image generation, and is a powerful authoring tool [33]. WebUI is the web interface for SD. By analyzing a large amount of image data and the associated textual descriptions, SD learns to transform textual information into visual images, opening a whole new dimension of digital creation and making text-to-image conversion practical. The Mov2Mov plug-in is an extension for SD that converts a video file into a series of textual cues, which SD then uses to generate images corresponding to the original video frames. Its innovation is to shift the transfer of the video stream from traditional video data compression to text-cue compression, significantly reducing the required bitrate while maintaining image quality. In this way, the Mov2Mov plug-in provides a novel solution for video editing and transmission.
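As a concrete illustration of this text-to-image step, the following minimal Python sketch uses the open-source `diffusers` library rather than the WebUI; the checkpoint name and the prompt are illustrative assumptions, not the configuration used in this paper.

```python
# Minimal text-to-image sketch with the open-source `diffusers` library.
# The checkpoint name and prompt are illustrative, not from this paper.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed publicly available SD 1.5 checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # a GPU is strongly recommended

prompt = ("theatre stage set design, dramatic lighting, "
          "flowing costumes, wide shot, highly detailed")
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("stage_concept.png")
```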
LoRA is a fine-tuning technique for tuning large language models (LLMs) [34]. It significantly reduces the number of trainable parameters, the memory requirements, and the training time by introducing low-rank matrices at each layer of the Transformer architecture and training only these matrices while keeping the original model weights unchanged. LoRA is a parameter-efficient fine-tuning (PEFT) technique that adapts a model by introducing a small number of parameters while leaving the original model intact. With a LoRA model, customized tuning can be performed to fit a specific style while maintaining high efficiency when generating images with SD tools. Like ControlNet, LoRA uses a small amount of data to train a painting style or character without modifying the SD model, and it requires far fewer training resources than training the SD model itself. This is expressed in equation (1) as follows:

$$W' = W_0 + \Delta W = W_0 + BA \tag{1}$$

where $W_0$ is the frozen pre-trained weight matrix, and $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ are the trainable low-rank matrices with rank $r \ll \min(d, k)$.
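To make Eq. (1) concrete, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer; the class name, rank, and scaling are illustrative assumptions, not the actual fine-tuning code used here.

```python
# Minimal sketch of the LoRA idea in Eq. (1): the frozen weight W0 is kept,
# and only the low-rank factors B (d x r) and A (r x k) are trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze W0
            p.requires_grad = False
        d, k = base.out_features, base.in_features
        self.A = nn.Parameter(torch.randn(rank, k) * 0.01)  # r x k
        self.B = nn.Parameter(torch.zeros(d, rank))         # d x r, zero init
        self.scale = alpha / rank

    def forward(self, x):
        # y = W0 x + (B A) x, i.e. the low-rank update from Eq. (1)
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())
```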
The WebUI image generation workflow is: select a model, input prompt words, adjust parameters, and click to generate images. The actual machine pipeline is: prompt words → lexical tokens → text encoder → parameter setting → noise predictor → variational auto-encoder (VAE) → output image. The core principle of LoRA model training is to embed "repeated impressions" into the noise predictor network. In the field of stagecraft, for lighting designers the dataset can be fixture specifications and documentation; for stage designers and costume designers, it can be costume styles and the works of well-known designers across different stylistic genres.
A generative adversarial network consists of a generator and a discriminator. The generator is responsible for learning the sample distribution and generating images as realistic as possible [35], while the discriminator is responsible for distinguishing real data from generated data. Training ends when the data produced by the generator successfully deceives the discriminator. The loss function of the generative adversarial network can be expressed as equation (2):

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{2}$$

where $G$ is the generator, $D$ is the discriminator, $x$ is a real sample drawn from the data distribution $p_{data}$, and $z$ is a noise vector drawn from the prior distribution $p_z$.
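As a hedged sketch of how the objective in Eq. (2) is typically optimized in practice, the following PyTorch fragment computes the discriminator and generator losses with binary cross-entropy on logits; the function names are illustrative.

```python
# Sketch of the adversarial objective in Eq. (2) using binary cross-entropy.
import torch
import torch.nn.functional as F

def discriminator_loss(d_real, d_fake):
    # d_real: D(x) logits on real images; d_fake: D(G(z)) logits on generated images
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
            F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def generator_loss(d_fake):
    # the generator tries to make the discriminator label fakes as real
    return F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
```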
The diffusion model constructs a Markov chain that uses a diffusion process to transform a known distribution (e.g., Gaussian) into the target data distribution. The diffusion model is shown in Figure 1.

Diffusion model
The latent diffusion model (LDM) belongs to the class of conditionally controlled models and significantly reduces computational requirements compared with pixel-space diffusion models. It can not only be trained on limited computational resources but also ensures the quality and flexibility of the generated images.
The Stable Diffusion (SD) model was proposed on the basis of the latent diffusion model. The difference between the two is that the stable diffusion model uses CLIP as its text encoder and a variational auto-encoder (VAE) as its image encoder, and it was trained on the LAION-5B dataset. The two models are similar in that both use a U-Net network.
Attention mechanism. To further enhance the expressive ability of the object embedding vectors, an attention mechanism is added to the graph convolutional network: when aggregating the feature vectors output by neighboring nodes, attention coefficients are assigned to those neighbors, strengthening each object's ability to perceive its neighborhood. A shared parameter matrix is first used to linearly transform the node features before the attention coefficients are computed.

Scene layout. This paper uses a scene layout network based on a regional convolutional network, in which each object's bounding box is described by four parameters (the box center coordinates together with its width and height). A border is first predicted for each object, and the predicted boxes are then assembled into the scene layout.

CLIP text encoder. The attention mechanism here is multi-head attention, which introduces the three concepts of Query, Key, and Value, denoted $Q$, $K$, and $V$. The attention output is the weighted sum of the values:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $d_k$ is the dimension of the key vectors.
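The scaled dot-product attention above can be written compactly; the following PyTorch sketch is a generic implementation of the formula, not the CLIP source code.

```python
# Minimal scaled dot-product attention matching the formula above.
import torch

def attention(Q, K, V):
    d_k = K.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # similarity of queries and keys
    weights = torch.softmax(scores, dim=-1)        # attention coefficients
    return weights @ V                             # weighted sum of the values
```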
The text embedding vectors output by the text encoder are high-dimensional; when data are limited, the text descriptions cannot fill the hidden space and the data manifold is discontinuous, which harms training. A conditional augmentation technique is therefore introduced to obtain more hidden variables by random sampling from a Gaussian distribution, whose expression is shown in Eq. (15) [36]:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \tag{15}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation.
Residual network generation. The residual network is used to deeply model the multi-peak features of text and images, and it alleviates both the vanishing-gradient and overfitting problems. The first layer of the network downsamples the generated image through a convolutional layer with a 4 × 4 kernel window and stride 2, a batch normalization layer, and a LeakyReLU activation, yielding the image features. The image features combined with the hidden variables are fed into the residual network to learn the multimodal representation between image and hidden variables and improve image accuracy. The residual network consists of a series of residual blocks; this paper uses four, and each residual block can be expressed by equation (16):

$$x_{l+1} = h(x_l) + \mathcal{F}(x_l, W_l) \tag{16}$$
The residual block consists of two parts, the direct mapping and the residual, where $h(x_l) = x_l$ is the direct (identity) mapping, $\mathcal{F}(x_l, W_l)$ is the residual part, and $x_l$ and $x_{l+1}$ are the input and output of the $l$th block.
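A minimal sketch of one such residual block in PyTorch, assuming 3 × 3 convolutions inside the residual branch (the exact kernel sizes within the blocks are not specified in the text):

```python
# Sketch of one residual block from Eq. (16): output = identity + residual branch.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        return self.act(x + self.branch(x))  # direct mapping + residual
```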
Match-aware discrimination. The discriminator of the sub-layer network is a match-aware discriminator. During training, the discriminator divides the real image, the generated image, and the corresponding text description into positive and negative samples: only the combination of a real image with its matching text is a positive sample, and all other combinations are negative. The discriminator uses a deep convolutional neural network containing two downsampling layers, each consisting of a convolutional layer with stride 2 and a 4 × 4 kernel window, batch normalization, and a Leaky ReLU. The objective function of the sub-layer discriminator network is expressed as equation (18):

$$\mathcal{L}_D = \mathbb{E}_{(x,t)}[\log D(x, t)] + \tfrac{1}{2}\,\mathbb{E}_{(\hat{x},t)}[\log(1 - D(\hat{x}, t))] + \tfrac{1}{2}\,\mathbb{E}_{(x,\bar{t})}[\log(1 - D(x, \bar{t}))] \tag{18}$$

where $x$ is a real image, $\hat{x}$ a generated image, $t$ the matching text description, and $\bar{t}$ a mismatched text description.
To realize stage lighting localization and tracking with deep neural network technology, the target is first detected and identified with the YOLO algorithm; key points and feature points of the human body are then extracted from each image with a key point detection algorithm and formed into matching point groups for matching.
First, the Mosaic data enhancement technique significantly improves the system's adaptability to various scenes and the accuracy of target detection by randomly cropping, scaling, and rearranging images. In addition, all images are resized to a uniform size at the input side to keep the training data consistent and improve the model's processing speed. This step standardizes the data input and lays a solid foundation for model training.
The Backbone integrates three key structures: Focus, CSP, and SPP. The Focus module specializes in image slicing and generates finer-grained feature maps through convolutional operations. The CSP structure introduces a varying number of residual components to further optimize feature extraction. SPP (spatial pyramid pooling) enhances the model's adaptability to changes in image size by applying max-pooling over feature maps of different sizes.
The Neck module connects the Backbone to the Head and contains both FPN and PAN structures. FPN efficiently conveys feature information in a top-down manner, while PAN employs a bottom-up approach that enhances the detailed representation of features.
Finally, the Head module resolves the mismatch between predicted boxes and target boxes through the loss function and non-maximum suppression (NMS), and removes redundant predicted boxes through a weighting method, improving the accuracy and clarity of detection.
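For illustration, a pretrained YOLOv5 model can be loaded through `torch.hub` to detect performers in a captured stage frame; the file name is a placeholder and this is a generic usage sketch, not the system's production code.

```python
# Sketch: person detection on a stage frame with a pretrained YOLOv5 model
# loaded via torch.hub (requires internet access the first time).
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
model.classes = [0]                 # COCO class 0 = person; track performers only

results = model("stage_frame.jpg")  # path to a captured stage image (illustrative)
boxes = results.xyxy[0]             # tensor rows: [x1, y1, x2, y2, confidence, class]
for x1, y1, x2, y2, conf, cls in boxes.tolist():
    print(f"performer at ({x1:.0f},{y1:.0f})-({x2:.0f},{y2:.0f}), conf={conf:.2f}")
```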
During matching, each feature point is assigned its corresponding 3D coordinates one by one. By applying the nearest-neighbor classification matching algorithm (KNN), the distance between samples and the difference in their directions can be calculated to minimize the error in the matching process.
The calculation of the distances follows equation (19):

$$d(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2} \tag{19}$$

where $p$ and $q$ are two feature points and $(x_p, y_p, z_p)$ and $(x_q, y_q, z_q)$ are their coordinates.
To measure the difference in direction, Eq. (20) can be used to calculate the direction cosine between two feature points:

$$\cos\theta = \frac{\vec{p} \cdot \vec{q}}{\|\vec{p}\|\,\|\vec{q}\|} \tag{20}$$

where $\vec{p}$ and $\vec{q}$ are the direction vectors of the two feature points and $\theta$ is the angle between them.
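A hedged OpenCV sketch of this matching stage, combining ORB features, KNN matching with Lowe's ratio test, and RANSAC-based outlier rejection (here via the fundamental matrix); the image file names and thresholds are illustrative.

```python
# Sketch of the matching stage: ORB features, KNN matching with a ratio test,
# and RANSAC to reject outlier correspondences (OpenCV APIs).
import cv2
import numpy as np

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)  # illustrative file names
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
pairs = matcher.knnMatch(des1, des2, k=2)  # 2 nearest neighbours per descriptor
good = [p[0] for p in pairs
        if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]  # ratio test

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
# RANSAC keeps only matches consistent with a single epipolar geometry
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
```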
During calculation, the system takes the center of the stage as the origin and establishes a coordinate system with a grid spacing of 0.5 m. The 3D coordinates of each matching point group are computed from the built-in parameters and position of each camera device; the system then converts these 3D coordinates into the corresponding 2D coordinates projected onto the stage by perspective projection.
Cross-dataset training means training the localization algorithm with additionally collected data as a supplementary training set. Collecting data under different environments, lighting conditions, and actor postures effectively improves the stability and generality of the system, enhancing its adaptability to situations that may arise during improvisation.
Motion posture prediction can be divided into two cases. The first is uniform motion, i.e., the actor's route over a given time is uniform. In this case, the system can extrapolate the next coordinate point from the 2D projection coordinates of the previous feature points, as shown in Eq. (21) and Eq. (22):

$$x_{n+1} = x_n + (x_n - x_{n-1}) \tag{21}$$

$$y_{n+1} = y_n + (y_n - y_{n-1}) \tag{22}$$

where $(x_{n-1}, y_{n-1})$ and $(x_n, y_n)$ are the two most recent projected positions of the feature point and $(x_{n+1}, y_{n+1})$ is the predicted next position.
With the above formulas, the actor's next route can be effectively predicted.
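Under the uniform-motion assumption of Eqs. (21) and (22), the prediction reduces to linear extrapolation, as in this small sketch (function name illustrative):

```python
# Linear extrapolation of the next stage position under uniform motion,
# following Eqs. (21)-(22).
def predict_next(p_prev, p_curr):
    """p_prev, p_curr: (x, y) stage coordinates at the two most recent frames."""
    return (2 * p_curr[0] - p_prev[0], 2 * p_curr[1] - p_prev[1])
```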
The second case is non-uniform motion, i.e., the actor moves irregularly and the speed changes at any time on the stage. In this case, the system relies on its real-time tracking technology and high-speed computation to output images within a very short time and thus maintain real-time performance.
First, the server receives images from multiple synchronized cameras, locates human bodies, and creates detection boxes using a human detection algorithm. For each human body, multiple key points and ORB (Oriented FAST and Rotated BRIEF) feature points are extracted from the image. Combining these features, the algorithm server calculates the 3D coordinates of the body's key points and corrects reprojection errors to improve accuracy. The performer's 2D stage coordinates are then computed. Finally, these coordinates are converted into DMX commands, which the console receives and uses to direct the lights to illuminate the performer accurately.
On top of this workflow, the system's camera setup applies flexible mechanisms to meet the challenges of the stage's complex environment. Fig. 2 shows the schematic diagram of two-view triangulation for stage light tracking. Pixel depth cannot be obtained from a single picture, so two or more cameras must be aimed at the same target; the system then locates the target accurately by multi-view triangulation.

Two-view triangulation for stage light tracking
When two camera devices $C_1$ and $C_2$ observe the same target point, let $x_1$ and $x_2$ be the normalized image coordinates of the point in the two views, $s_1$ and $s_2$ the corresponding depths, and $R$, $t$ the rotation and translation between the two cameras. The projection relation between the two views is

$$s_1 x_1 = s_2 R x_2 + t$$

According to the epipolar constraint, Eq. (25) is obtained by multiplying both the left and right sides by $x_1^{\wedge}$, the antisymmetric (cross-product) matrix of $x_1$, which eliminates the left-hand term since $x_1^{\wedge} x_1 = 0$:

$$s_2\, x_1^{\wedge} R x_2 + x_1^{\wedge} t = 0 \tag{25}$$

Finally, Eq. (25) is solved for the depth $s_2$, after which $s_1$ follows from the projection relation, giving the 3D position of the target.

When more than two cameras focus on one target, the target point is written in homogeneous coordinates as $X = (X, Y, Z, 1)^{\top}$, and for each camera $i$ the projection satisfies equation (27):

$$s_i \begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix} = P_i X = \begin{pmatrix} p_{i1}^{\top} \\ p_{i2}^{\top} \\ p_{i3}^{\top} \end{pmatrix} X \tag{27}$$

where $(u_i, v_i)$ are the pixel coordinates of the target in camera $i$, $s_i$ is the depth, and $p_{i1}^{\top}, p_{i2}^{\top}, p_{i3}^{\top}$ are the rows of the projection matrix $P_i$.

Expanding equation (27) gives:

$$s_i u_i = p_{i1}^{\top} X, \qquad s_i v_i = p_{i2}^{\top} X, \qquad s_i = p_{i3}^{\top} X$$

This can be derived from the third line of the above equation:

$$s_i = p_{i3}^{\top} X$$

where $s_i$ is exactly the depth of the target in camera $i$. Substituting the result of the third line into the first two lines subsequently gives:

$$u_i\, p_{i3}^{\top} X - p_{i1}^{\top} X = 0, \qquad v_i\, p_{i3}^{\top} X - p_{i2}^{\top} X = 0$$

The above two equations are the results of one observation. However, in actual performances multiple observations are required, so after assuming that $n$ observations are collected, each observation contributes two such equations.

Let $A$ denote the $2n \times 4$ coefficient matrix obtained by stacking these equations, so that the system takes the form $AX = 0$.

Similarly, in order to account for errors due to noise, the system solves this overdetermined system by least squares to figure out the target position.
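The multi-view least-squares solution described above corresponds to the standard direct linear transform (DLT); the following NumPy sketch stacks two rows per camera and solves the homogeneous system by SVD. It is a generic illustration under the stated assumptions, not the system's implementation.

```python
# Sketch of multi-view triangulation by linear least squares (DLT), as in the
# derivation above: each camera contributes two rows to the system A X = 0.
import numpy as np

def triangulate(points_2d, proj_mats):
    """points_2d: list of (u, v) pixel coords; proj_mats: list of 3x4 matrices."""
    rows = []
    for (u, v), P in zip(points_2d, proj_mats):
        rows.append(u * P[2] - P[0])  # u * p3^T X - p1^T X = 0
        rows.append(v * P[2] - P[1])  # v * p3^T X - p2^T X = 0
    A = np.stack(rows)
    # least-squares solution: right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize to (X, Y, Z)
```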
Color theory provides a set of guiding principles on how color combinations influence the viewing experience, such as color harmony, contrast, and color symbolism. In stage design, color combinations can guide the direction of the audience's emotions: warm tones are often used to create a warm, exciting, or tense atmosphere, while cool tones express calm, sadness, or contemplation.
Lighting design is a core component of choreography design. Lighting not only provides the necessary illumination but is also a key tool for constructing the atmosphere of a scene, guiding the audience's emotions, and reinforcing visual impact. The core of lighting design lies in mastering the quality, direction, color, and intensity of light, and in how these elements interact with the stage space and the actors. Lighting design must be closely integrated with the content of the script and the director's visual intent, supporting the development of the plot and deepening emotional expression through changes in lighting.
Spatial layout concerns how the stage space is utilized, including the depth, width, and height of the stage and the configuration of the virtual and the real within the scene. Composition concerns how the visual elements of the stage are distributed in space, including the positions of the actors, the arrangement of props, and the design of the backdrop. Effective use of space can guide the audience's eyes and create a visual focus, concentrating attention on the key elements or actions on stage.
Eye movement. In previous studies, the presentation time of choreography stimulus materials was usually controlled between 7 s and 20 s depending on the number and type of materials. In this experiment, a Tobii Pro Glasses 2.0 eye-tracking device was used to collect data. Samples were presented on a screen with a resolution of 1,920 × 1,080; each choreography sample was shown for 10 s with a 2 s gap between photographs, all photographs were shown in random order, and only one subject was allowed in the experimental site at a time. The sampling rate of the eye movement data exceeded 80%, meeting the collection standard, so the data are reliable. After the eye movement experiment, subjective data were collected for the 20 samples.

Table 1 shows the analysis of sample gaze time. By dividing eye movement areas of interest (AOIs), this paper counts subjects' gaze time in each region of the samples and quantifies the eye movement indices to analyze the eye-catchingness, attention, and attractiveness of the spatial elements of each choreography design. After analyzing all samples, the spatial elements were divided into six categories: spatial interface elements, costume elements, prop elements, choreographic composition, lighting elements, and set elements, and the average gaze time of each AOI was computed and analyzed. To reduce experimental error, the region boundaries followed the original element boundaries as far as possible and did not overlap.

Among the six categories, choreographic composition received significantly weaker attention than the other five, indicating that composition elements are less eye-catching and less attractive but relatively recognizable and comprehensible, with a total gaze duration of 3.95 s. Prop elements are widely present in the choreography scenes and account for a large proportion of some samples, but in subjects' observation they are less eye-catching than the other major elements; their recognizability is high while their attractiveness is slightly weak. The spatial elements that received the most attention were the spatial interface elements and set elements, which were more visible and attractive than the other four categories, with total gaze durations of 104.624 s and 63.552 s; however, because these elements carry a large amount of information, subjects' comprehensibility and recognizability of them were lower. As the primary component of spatial form, spatial interface elements influence the perception of space; set elements, as secondary elements, are more eye-catching in the choreography but weaker in attractiveness.
Sample fixation time analysis
| Region | Count | Time before first gaze: mean (s) | Time before first gaze: total (s) | First fixation duration: mean (s) | First fixation duration: total (s) | Total fixation duration: mean (s) | Total fixation duration: total (s) |
|---|---|---|---|---|---|---|---|
| Spatial interface elements | 28 | 1.846 | 51.688 | 0.745 | 20.86 | 1.615 | 45.22 |
| Costume elements | 26 | 1.036 | 26.936 | 1.788 | 46.488 | 4.024 | 104.624 |
| Prop elements | 25 | 2.563 | 64.075 | 0.469 | 11.725 | 1.498 | 37.45 |
| Choreographic composition | 10 | 3.615 | 36.15 | 0.318 | 3.18 | 0.395 | 3.95 |
| Lighting elements | 10 | 2.036 | 20.36 | 0.428 | 4.28 | 0.495 | 4.95 |
| Set elements | 24 | 1.129 | 27.096 | 0.915 | 21.96 | 2.648 | 63.552 |
| All spatial elements | 1214 | 1.978 | 2401.292 | 1.348 | 1636.472 | 3.379 | 4102.106 |
The normality of the overall eye movement data of the scene and of each AOI was tested with SPSS 26.0. Table 2 shows the S-W (Shapiro-Wilk) test results for the overall eye movement indices: for each index the absolute kurtosis is less than 10 and the absolute skewness is less than 3, with kurtosis values of 0.02796, 1.03498, and 0.63486 and skewness values of 0.31486, 0.57964, and -0.39486, respectively. Figure 3 shows the histograms of the overall normal distribution of the eye movement indices; the distributions satisfy normality, and the mean values of the three indices are 1.97909, 1.3409, and 3.38583, respectively.
The results of the S-W test of the overall eye movement index
| Variable name | Median | Mean value | Standard deviation | Skewness | Kurtosis | S-W test |
|---|---|---|---|---|---|---|
| Time before first gaze | 2.00184 | 1.97909 | 0.56514 | 0.31486 | 0.02796 | 0.94862 |
| First fixation duration | 1.34854 | 1.3409 | 0.22196 | 0.57964 | 1.03498 | 0.98348 |
| Total fixation duration | 3.39706 | 3.38583 | 0.61046 | -0.39486 | 0.63486 | 0.98764 |

Histograms of the overall normal distribution of the eye movement indices
Heart rate variation. Because human physiological signals are complex and measuring instruments have limited precision, the values fluctuate considerably; local analysis of small stretches of data is of little use, and such signals are suitable only for analyzing the general trend of a large amount of data. In this paper, Euclidean distances are computed between the physiological signals of subjects in different color environments of the simulated choreography scene to analyze the differences in human physiological state across color environments. Specifically, Euclidean distances were calculated between 40 groups of physiological signals under simulated test environments of light blue, white, warm white, orange, red, and green, and 40 groups of physiological signals under natural conditions of the choreographic environment, as shown in Eq. (33):

$$d(x, y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2} \tag{33}$$

where $x_i$ and $y_i$ are the $i$th samples of the two physiological signal sequences and $n$ is the sequence length.
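A one-function NumPy sketch of the distance in Eq. (33), assuming two equally long heart rate sequences:

```python
# Euclidean distance of Eq. (33) between two equally long heart-rate records.
import numpy as np

def signal_distance(hr_a, hr_b):
    hr_a, hr_b = np.asarray(hr_a, float), np.asarray(hr_b, float)
    return float(np.sqrt(np.sum((hr_a - hr_b) ** 2)))
```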
Heart rate is the number of heart beats per minute and is the most common measure of cardiac activity. A normal adult's resting heart rate is about 70 beats per minute, while an athlete's may be about 50 beats per minute. In a choreographic environment, it is desirable for the audience to maintain smooth heart rate activity.
In the heart rate measurement experiment conducted while watching stage performances in the simulated test environment, a T-Sens heart rate sensor with a sampling frequency of 16 Hz was used; data were read from the SD card of a T-Log wireless data logger and processed with CAPTIV software to obtain 10-minute heart rate records for 40 subjects. Because the sample size is large, only four typical convergent acquisition results are shown. Fig. 4 shows the heart rate signals of four subjects, with panels (a)-(d) corresponding to subjects 1-4; the mean heart rates of the four subjects over 10 minutes range from 70 to 86 beats/min.

Heart rate signal acquisition results of four subjects
Color. The Euclidean distances between heart rate signals under the six simulated choreography design colors and those under the natural environment were compared; Figure 5 shows the results. Warm white has the smallest Euclidean distance among the six colors, ranging from 800 to 1,500, so the heart rate signals in the warm-white simulated environment are closest to those in the natural indoor environment.

Light and shadow and lighting. Figure 6 shows the tracking algorithm solution. A 5 m × 8 m simulated stage positioning environment was built on a simulation platform based on the KNN and RANSAC algorithms to realize light tracking of stage characters. After rotation and translation of the luminaire coordinate system, the lamp is located at L(5,4) in the positioning-point coordinate system, and positioning points in the environment are selected for simulated illumination. First, the known positioning points are solved by the stage positioning algorithm; then the stage light tracking algorithm performs light tracking calculations on the positioning points and the positioning results. With 100 positioning points tested in the simulated stage environment, the correct rate of the luminaires' rotating tracking direction reaches 97%, and the rotation error lies within [-18.72%, 22.3%], which meets the performance requirements for stage light tracking.

Spatial scene composition. By measuring in the field the aspect ratio D/H of the external space corresponding to the stage collection points, the D/H of the four choreography scenes is assigned by interval grading, identical value intervals are counted as one class, and D/H is divided into four segments, 0~1, 1~2, 2~4, and >4, with reference to the values of different spatial types. Figure 7 shows the D/H ratio of the choreographic space characteristics. After statistical sorting, most space aspect ratios in the four stage scenes are concentrated in the interval 0.5~1.5. The average D/H of the external space of scene 1 is 0.842, slightly less than 1; the scale ratio of scene 2 is uniform, all within 1 ± 0.38; in scene 3, most external-space D/H values are close to or slightly greater than 1, with a few between 0.4 and 0.8; the external space of scene 4 is very wide and open, with D/H values close to or greater than 1 and mostly greater than 2, an average D/H of about 3.093, and an appropriate proportional scale.

Test result

Tracking algorithm solution

D/H ratio of the choreographic space characteristics
The visual entropy of the designed choreography is calculated with a MATLAB program: read in the choreography image, convert it to grayscale, obtain the grayscale histogram, compute the probability of each gray level appearing in the image, and calculate the entropy by the definition of entropy. Figure 8 shows the visual entropy statistics of the choreography scenes. The visual entropy of each scene lies between 7 and 8.5, mainly concentrated between 7.5 and 8, with a few values below 7.5 or above 8, and the differences appear at the second decimal place. On the one hand, this indicates that the collected choreography image samples are very uniform in type and carry similar amounts of information, which helps concentrate the focus of the research; on the other hand, it shows that there are differences in the visual information carried by the visual elements of the external choreography space, reflected in details of color composition, material texture, and so on. The visual entropy data were standardized and segmented into three bands: low visual entropy (7.30~7.50), medium visual entropy (7.50~7.70), and high visual entropy (7.70~7.90).
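Although the paper's computation was done in MATLAB, the same grayscale-histogram entropy can be sketched in a few lines of Python (file path illustrative):

```python
# Sketch of the visual-entropy computation described above: grayscale the
# image, build the grey-level histogram, and apply the entropy definition.
import numpy as np
from PIL import Image

def visual_entropy(path: str) -> float:
    grey = np.asarray(Image.open(path).convert("L"))     # greyscale image
    hist, _ = np.histogram(grey, bins=256, range=(0, 256))
    p = hist / hist.sum()                                # grey-level probabilities
    p = p[p > 0]                                         # ignore empty bins
    return float(-(p * np.log2(p)).sum())                # H = -sum p * log2(p)
```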

Visual entropy statistics of the choreography scenes
Table 3 shows the statistics of the audience's overall perception. The analysis covers two aspects: the communication mode of the stage performance and the overall feeling it produced. The survey results indicate that evaluations of the communication mode were mostly "very good" and "good", accounting for 62.5% and 17.5% of respondents, respectively. Meanwhile, 65% of the audience felt that the overall impression of the stage performance was very good, and no audience member rated it very bad.
Statistics of the audience's overall perception
| Audience perception | Group | Number of people | Proportion |
|---|---|---|---|
| Stage performance communication mode | Very good | 25 | 0.625 |
| | Good | 7 | 0.175 |
| | Average | 3 | 0.075 |
| | Poor | 3 | 0.075 |
| | Very bad | 2 | 0.05 |
| Stage performance overall feeling | Very good | 26 | 0.65 |
| | Good | 9 | 0.225 |
| | Average | 3 | 0.075 |
| | Poor | 2 | 0.05 |
| | Very bad | 0 | 0 |
The audience’s satisfaction with the stage performance elements of the survey is divided into four aspects of the overall stage modeling, stage lighting effects, stage performance and stage scheduling form of generalization, satisfaction scores in accordance with the level of scoring, very satisfied with 5 points, more satisfied with 4 points, and so on. Table 4 for the audience on the performance of the elements of satisfaction analysis, the audience stage lighting effect satisfaction is the highest (score of 178), the rest of the satisfaction in descending order: the form of stage scheduling (score of 171), the overall stage modeling and stage performance scores are equal, are 170 points.
Audience satisfaction with the stage performance elements
| Performance element | Satisfaction | Number of people | Score | Proportion |
|---|---|---|---|---|
| Overall stage modeling | Very satisfied | 22 | 110 | 0.55 |
| | Relatively satisfied | 10 | 40 | 0.25 |
| | Normal | 5 | 15 | 0.125 |
| | Not satisfied | 2 | 4 | 0.05 |
| | Very dissatisfied | 1 | 1 | 0.025 |
| Stage lighting effect | Very satisfied | 26 | 130 | 0.65 |
| | Relatively satisfied | 9 | 36 | 0.225 |
| | Normal | 3 | 9 | 0.075 |
| | Not satisfied | 1 | 2 | 0.025 |
| | Very dissatisfied | 1 | 1 | 0.025 |
| Stage performance | Very satisfied | 23 | 115 | 0.575 |
| | Relatively satisfied | 9 | 36 | 0.225 |
| | Normal | 4 | 12 | 0.1 |
| | Not satisfied | 3 | 6 | 0.075 |
| | Very dissatisfied | 1 | 1 | 0.025 |
| Stage scheduling form | Very satisfied | 24 | 120 | 0.6 |
| | Relatively satisfied | 8 | 32 | 0.2 |
| | Normal | 5 | 15 | 0.125 |
| | Not satisfied | 1 | 2 | 0.025 |
| | Very dissatisfied | 2 | 2 | 0.05 |
Table 5 shows audience satisfaction with the performance music. Ranked from high to low: the coordination between music and the stage picture, the sense of rhythm between music and lighting, the combination of music and performance style, and the music and sound production effect, with total satisfaction scores of 184, 179, 176, and 174, respectively.
Audience satisfaction with the performance music

| Performance element | Satisfaction | Number of people | Score | Proportion |
|---|---|---|---|---|
| Music and sound production effect | Very satisfied | 23 | 115 | 0.575 |
| | Relatively satisfied | 11 | 44 | 0.275 |
| | Normal | 4 | 12 | 0.1 |
| | Not satisfied | 1 | 2 | 0.025 |
| | Very dissatisfied | 1 | 1 | 0.025 |
| Sense of rhythm between music and lighting | Very satisfied | 26 | 130 | 0.65 |
| | Relatively satisfied | 10 | 40 | 0.25 |
| | Normal | 2 | 6 | 0.05 |
| | Not satisfied | 1 | 2 | 0.025 |
| | Very dissatisfied | 1 | 1 | 0.025 |
| Coordination between music and stage picture | Very satisfied | 27 | 135 | 0.675 |
| | Relatively satisfied | 11 | 44 | 0.275 |
| | Normal | 1 | 3 | 0.025 |
| | Not satisfied | 1 | 2 | 0.025 |
| | Very dissatisfied | 0 | 0 | 0 |
| Combination of music and performance style | Very satisfied | 25 | 125 | 0.625 |
| | Relatively satisfied | 9 | 36 | 0.225 |
| | Normal | 4 | 12 | 0.1 |
| | Not satisfied | 1 | 2 | 0.025 |
| | Very dissatisfied | 1 | 1 | 0.025 |
Table 6 shows the other sensory experiences. A "perfume rain" segment was added to the stage performance, and satisfaction with this segment was surveyed. In descending order: the fit between the perfume rain and the stage atmosphere (satisfaction score 181) and the olfactory experience of the perfume rain (satisfaction score 176).
Other sensory experiences
| Performance element | Satisfaction | Number of people | Score | Proportion |
|---|---|---|---|---|
| Perfume rain olfactory experience | Very satisfied | 24 | 120 | 0.6 |
| | Relatively satisfied | 10 | 40 | 0.25 |
| | Normal | 4 | 12 | 0.1 |
| | Not satisfied | 2 | 4 | 0.05 |
| | Very dissatisfied | 0 | 0 | 0 |
| Fit between perfume rain and stage atmosphere | Very satisfied | 25 | 125 | 0.625 |
| | Relatively satisfied | 11 | 44 | 0.275 |
| | Normal | 4 | 12 | 0.1 |
| | Not satisfied | 0 | 0 | 0 |
| | Very dissatisfied | 0 | 0 | 0 |
In this paper, AIGC technology is used to input textual cues related to choreography and generate its tonality, and LoRA model parameters are fine-tuned to adjust the style. Based on SD scene generation combined with a generative adversarial network, the choreography scene generation model is improved; meanwhile, a deep neural network realizes stage lighting positioning. Simulation experiments and empirical investigation are combined to evaluate the effect of the choreography design. In the physiological tests, the spatial elements that received the most attention from subjects were the spatial interface elements and set elements, with total gaze durations of 104.624 s and 63.552 s, respectively, and the heart rates of the four subjects while enjoying the stage performances ranged from 70 to 86 beats/min. Among the simulated choreography design colors, warm white had the smallest Euclidean distance to the natural indoor environment, ranging from 800 to 1,500, so this color can serve as the main color in subsequent designs. In the audience satisfaction survey, overall perception was analyzed in terms of the communication mode and the overall feeling of the stage performance: evaluations of the communication mode were mostly "very good" and "good", accounting for 62.5% and 17.5% of respondents, respectively, and 65% of the audience felt that the overall impression of the performance was very good. Overall, the audience affirmed the stage effect presented by the design in this paper.
