Research on AIGC empowering digital cultural and creative design style transfer and diversified generation methods
Published online: 24 Mar 2025
Received: 12 Nov 2024
Accepted: 13 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0791
© 2025 Ran Jia, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
With the continuous development and popularization of digital technology, digital cultural creativity is rapidly emerging as an important part of national cultural and creative industries and as a new engine of global economic development [1-2]. Characterized by digitization, networking, and intelligence [3-4], digital cultural creativity covers a wide range of fields such as digital cultural content creation, digital cultural product development, and digital cultural service provision, including digital film and television [5], digital music [6], digital games [7], digital design [8], digital art, and so on [9]. These fields have become an important part of the sector, injecting new vitality into its development. Digital cultural creativity plays an important role in promoting cultural inheritance and innovation, driving economic growth, and satisfying people's needs for spiritual culture [10-12].
However, the development of digital cultural creativity still faces some challenges. On the one hand, it needs to combine the connotations of cultural creativity to create innovative, valuable products that are recognized by the market [13], which requires breakthroughs in design style. On the other hand, it needs to overcome technical bottlenecks and improve content quality and user experience [14-15], which requires optimizing how technology is applied in design and introducing new technologies to achieve stylistic diversification. Style transfer and diversified generation provide new ideas, innovative forms of expression, and new visual effects for digital cultural and creative design: with the help of deep learning and computer vision, the style of one image can be applied to another, creating works with novel artistic effects and making them diverse and personalized [16-17].
In the context of the rapid development of AIGC, the author introduces artificial intelligence algorithms into the style transfer of cultural and creative designs, choosing the VGG-19 model as the pre-trained model. The CycleGAN algorithm is improved through a multi-head self-attention mechanism and bilinear interpolation, so that the style transfer of cultural and creative product designs becomes more natural and the visual effect is optimized, and an AIGC cultural and creative style transfer model based on the improved CycleGAN is constructed. The improved CycleGAN model is evaluated objectively by PSNR, MSE, MS-SSIM, Per-pixel acc, and the convergence of the loss function during training. An evaluation index system for the application of cultural and creative style transfer is then constructed with the analytic hierarchy process (AHP) to evaluate the improved CycleGAN model subjectively, so as to explore its effect on cultural and creative style transfer and generation.
AIGC stands for Artificial Intelligence Generated Content and is also known as generative artificial intelligence [18]. It refers to the creation of multiple types and styles of digital works, such as text, images, sounds, and videos, individually or in combination, based on user inputs or the system's own logic, through the use of AI technology. AIGC is not only capable of digitally presenting and augmenting real-world content, but can also generate original or variant content with the help of AI's creative autonomy. AIGC is characterized by automation, high efficiency, creativity, and interactivity. Its key technologies cover three important elements.
Data: as the core pillar of AIGC technology, data includes data sources (open-domain data, domain-specific data, user data), data storage methods (centralized, distributed, cloud-native, and vector databases), data forms (structured and unstructured data), and data processing methods (filtering, annotation, manipulation, enhancement, etc.), all of which directly affect the level and quality of the generated content.

Computing power: as the hardware infrastructure of AIGC technology, computing power includes semiconductor processors (commonly CPUs, GPUs, etc.), servers, large-scale model computing clusters, and distributed training environments built on Infrastructure as a Service (IaaS) or deployed in self-built data centers. It guarantees the running speed and performance of AIGC applications by providing hardware services such as cloud computing, edge computing, and distributed computing.

Algorithms: algorithm platforms cover machine learning platforms, model training platforms, automatic modeling platforms, and so on, spanning model design, model training, model inference, and model deployment. These platforms constitute the core innovative power of AIGC technology, support actual business, and determine the capability and effect of AIGC applications in operation.

At present, the AIGC industrial ecosystem has formed a three-layer structure: an infrastructure layer built on pre-trained models, a middle layer containing verticalized, scenario-based, and personalized models and application tools, and an application layer that provides content generation services such as text, images, audio, and video to consumer-facing (C-end) users. AIGC has four basic modes: text generation, audio generation, image generation, and video generation. From these it derives further modes such as cross-modal generation between text, audio, and image, strategy generation, game AI, and virtual human generation. One major advantage of AIGC-enabled cultural and creative products is their ability to apply big data technology intelligently, collecting and analyzing relevant information comprehensively and in a timely manner to support personalized product design. AIGC technology provides artists and designers with a wealth of product information while stimulating creative inspiration, helping them rapidly and automatically generate product concepts, prototypes, styles, and other design elements based on factors such as market demand, user preferences, and industry trends. This helps create more creative and aesthetically pleasing product designs, making products more personalized and attractive.
Style transfer algorithms combine artistic creation with computer vision to achieve image style conversion through deep learning. Neural-network-based style transfer algorithms fall mainly into two categories. One is the optimization method, which extracts features with a pre-trained CNN (e.g., VGG-19 or ResNet) and adjusts the pixels of the input image to approach the features of the target style by minimizing a loss function. The other is based on generative adversarial networks (GANs), which use generators and discriminators to learn the style transformation mapping. There are also other approaches, such as autoencoder-based and variational-autoencoder-based methods. In this paper, the VGG-19 pre-trained model is used in the experiments [19].
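As a concrete illustration of the optimization method described above, the following is a minimal sketch, assuming PyTorch and torchvision: pre-trained VGG-19 features supply a content loss and a Gram-matrix style loss, and the input image's pixels are optimized directly. The layer indices and loss weights are illustrative choices, not values from this paper.

```python
# Minimal sketch of optimization-based style transfer, assuming PyTorch
# and torchvision. Layer indices and loss weights are illustrative.
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

vgg = vgg19(weights="IMAGENET1K_V1").features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def features(x, layers=(0, 5, 10, 19, 28)):
    """Collect feature maps from selected VGG-19 conv layers."""
    out = []
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in layers:
            out.append(x)
    return out

def gram(f):
    """Gram matrix of a feature map, used as the style statistic."""
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

content_img = torch.rand(1, 3, 256, 256)   # placeholder content image
style_img = torch.rand(1, 3, 256, 256)     # placeholder style image
x = content_img.clone().requires_grad_(True)
opt = torch.optim.Adam([x], lr=0.02)

for step in range(200):
    opt.zero_grad()
    fx, fc, fs = features(x), features(content_img), features(style_img)
    content_loss = F.mse_loss(fx[3], fc[3])    # match deep content features
    style_loss = sum(F.mse_loss(gram(a), gram(b)) for a, b in zip(fx, fs))
    loss = content_loss + 1e4 * style_loss     # illustrative weighting
    loss.backward()
    opt.step()
```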
Current artificial intelligence applications are mainly built on deep learning, with convolutional networks as their structural foundation. Since convolutional networks underpin many AI applications, this section introduces the common components of convolutional networks and, using the VGG network, the convolutional operations relevant to style transfer.
Convolutional layer: the convolutional layer plays the central role in a convolutional neural network, performing matrix operations for feature extraction. Convolution kernels of different sizes and shapes slide over the input with a corresponding step size (stride) to extract features such as edges, lines, and global and local structure, providing the training data for subsequent network parameter updates. The core parameters of the operation are the kernel size and the stride, which together form the basis of the convolution. Each kernel position yields a single value, so the convolution process not only extracts feature maps but also reduces the amount of data. In the computation, each kernel slides horizontally and vertically by the stride to obtain the next region and repeats the matrix inner product; the kernel size and stride therefore determine the shape of the data after convolution.

Pooling layer: the pooling layer is an important part of a convolutional neural network; its main role is to reduce the shape of the input data by downsampling, lowering the number of parameters and the subsequent computation. Similar to convolution, pooling processes regions of a fixed window shape, moving by the corresponding step size each time to obtain the next computation region. In a multilayer network, pooling continuously merges the data from shallow to deep layers, feeding local shallow information into high-level deep information and thereby achieving a global grasp of the input data.

Activation function: the activation function is a special layer in a convolutional neural network whose main role is to apply a nonlinear transformation, remapping the data to increase the nonlinear expressive power of the network and enable better training. Ideally, an activation function would map inputs directly to "0" or "1" through a threshold; however, since the forward propagation and error back-propagation of a convolutional neural network require differentiation, the activation function must be continuous and differentiable. A small example of the convolution and pooling shape arithmetic follows; the common activation functions used in network construction and in the model layers for style transfer are then described below.
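To make the shape arithmetic above concrete, here is a small sketch (assuming PyTorch) of how kernel size, stride, and padding determine the output shapes of a convolution and a pooling layer; all sizes are illustrative.

```python
# Sketch of convolution/pooling output-shape arithmetic, assuming PyTorch.
# Output size per dimension: floor((in + 2*padding - kernel) / stride) + 1.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 224, 224)            # batch, channels, height, width

conv = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)

y = conv(x)                                # (224 + 2*1 - 3)/1 + 1 = 224
z = pool(y)                                # (224 - 2)/2 + 1 = 112
print(y.shape, z.shape)                    # [1, 64, 224, 224], [1, 64, 112, 112]
```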
ReLU-type activation functions were first brought to prominence by the AlexNet network in 2012. The derivative of ReLU is constant at 1 on the positive half-axis, which largely solves the vanishing-gradient problem that Sigmoid, Tanh, and similar functions suffer in deeper networks, and its ease of differentiation speeds up training. However, its gradient on the negative half-axis is 0, so with a larger learning rate neurons may "die"; LeakyReLU mitigates this drawback of ReLU by giving the negative half-axis a small positive slope.
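A quick numeric illustration of the two activations on the negative half-axis (assuming PyTorch; the negative slope value is illustrative):

```python
# Sketch comparing ReLU and LeakyReLU, illustrating the "dying neuron"
# mitigation described above. The slope 0.01 is an illustrative choice.
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
print(F.relu(x))                             # tensor([0.0000, 0.0000, 0.0000, 1.5000])
print(F.leaky_relu(x, negative_slope=0.01))  # tensor([-0.0200, -0.0050, 0.0000, 1.5000])
```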
In 2014, the University of Oxford proposed the VGG network, which replaces large convolution kernels with stacked uniform 3×3 kernels and obtains better results with deeper network layers. VGG was subsequently developed further and widely adopted, and in 2015 the VGG network was used for the first time for feature extraction in style transfer. Two architectures are commonly used: VGG-16 and VGG-19. When VGG is used for style transfer, whether building a model from selected layers or using a pre-trained network for feature extraction, mainly the convolution, pooling, and activation layers before the fully connected part are used. The computational procedure for feature map extraction using VGG is shown in equation (5):
$$F^{l} = \sigma\left(W^{l} * F^{l-1} + b^{l}\right), \qquad F^{0} = I \tag{5}$$

where $I$ is the input image, $F^{l}$ is the feature map output by layer $l$, $W^{l}$ and $b^{l}$ are the convolution kernel weights and bias of layer $l$, $*$ denotes the convolution operation, and $\sigma(\cdot)$ is the activation function (ReLU in VGG).
VGG-19 network
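As an illustration of feature-map extraction in the spirit of equation (5), the following hedged sketch pulls an intermediate feature map from torchvision's pre-trained VGG-19; the layer index is an arbitrary example, not this paper's choice.

```python
# Hedged sketch: extracting an intermediate feature map from pre-trained
# VGG-19 (torchvision), using only the convolution/pooling/activation
# layers before the fully connected part. Layer index 21 is illustrative.
import torch
from torchvision.models import vgg19

model = vgg19(weights="IMAGENET1K_V1").features.eval()

def extract(img, layer_idx=21):
    """Return the feature map after the given layer index."""
    x = img
    with torch.no_grad():
        for i, layer in enumerate(model):
            x = layer(x)
            if i == layer_idx:
                return x
    return x

feat = extract(torch.rand(1, 3, 224, 224))
print(feat.shape)   # torch.Size([1, 512, 28, 28]) after layer 21 (conv4_2)
```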
Self-attention mechanism is a mechanism that can model global dependencies within a sequence [20]. It obtains a new feature representation of a sequence by computing the correlation between positions in a sequence. Compared to traditional RNNs, this global attention mechanism can model dependencies at arbitrary distances in long sequences more efficiently and in parallel. The self-attention formula is as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices obtained by linear projections of the input sequence, and $d_k$ is the dimension of the keys.
The flow of multi-head self-attention computation is as follows: first, the input sequence $X$ is linearly projected into $h$ subspaces to obtain per-head queries, keys, and values; each head then computes scaled dot-product self-attention independently; finally, the outputs of all heads are concatenated and fused into the output vector. A schematic of the computation is shown below.

Schematic diagram of the calculation of the multi-head self-attention

The specific computational procedure of the multi-head self-attention layer is:

1. Projection: obtain $Q_i = XW_i^Q$, $K_i = XW_i^K$, and $V_i = XW_i^V$ for each head $i = 1, \dots, h$.
2. Self-attention calculation: calculate the self-attention of each head separately, $\mathrm{head}_i = \mathrm{Attention}(Q_i, K_i, V_i)$.
3. Multi-head result fusion: the multiple self-attention results are fused to obtain the final output vector, $\mathrm{MultiHead}(X) = (\mathrm{head}_1 \oplus \mathrm{head}_2 \oplus \cdots \oplus \mathrm{head}_h)W^O$,

where $\oplus$ denotes splicing (concatenation) along the last dimension of the vectors; the dimension of the spliced vector may not be equal to the dimension of the original vector, and the output projection $W^O$ maps it back to the model dimension.
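A minimal multi-head self-attention module following the three steps above is sketched below, assuming PyTorch; the dimensions and head count are illustrative, not this paper's exact configuration.

```python
# Minimal multi-head self-attention sketch: joint projection, per-head
# scaled dot-product attention, then concatenation and output projection.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        assert d_model % n_heads == 0
        self.h, self.d_k = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # joint Q, K, V projection
        self.out = nn.Linear(d_model, d_model)       # fuse concatenated heads

    def forward(self, x):                            # x: (batch, seq, d_model)
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split into heads: (batch, heads, seq, d_k)
        q, k, v = (t.view(b, n, self.h, self.d_k).transpose(1, 2) for t in (q, k, v))
        attn = F.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d_k), dim=-1)
        y = (attn @ v).transpose(1, 2).reshape(b, n, self.h * self.d_k)
        return self.out(y)                           # final output vector

mhsa = MultiHeadSelfAttention()
print(mhsa(torch.randn(2, 64, 256)).shape)           # torch.Size([2, 64, 256])
```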
To further enhance the model's ability to generate target images, the multi-head self-attention mechanism is used: the input features are mapped to multiple subspaces, self-attention is computed in each subspace individually, and the outputs of all heads are finally concatenated. This allows the model to learn different semantic expressions of the input, such as context dependencies in color, texture, and shape, and facilitates the learning of style transfer.
The up-sampling method used in the decoder of the traditional CycleGAN generator is deconvolution, also known as transposed convolution. However, the deconvolution operation has some problems. When a large convolution kernel is used, the receptive field is wider, which improves feature utilization and the quality of the restored image, but the phenomenon of "uneven overlap" can also occur, leaving obvious stacking traces in the image. In addition, deconvolution is prone to the checkerboard effect and low-frequency artifacts when up-sampling the image resolution.
To address this problem, bilinear interpolation is used for sampling instead of deconvolution [21]. Bilinear interpolation provides smoother sampling and avoids the artifact problems of deconvolution: it linearly interpolates from the four surrounding points, making fuller use of peripheral pixel information, and its local operation is smoother than deconvolution. Bilinear interpolation produces no mismatch between the kernel size and the stride, and it can interpolate accurately at any scaling factor, effectively eliminating the checkerboard effect of deconvolution. It also has a small computational cost, improving efficiency while preserving quality. The specific principle is:
Let the image to be upsampled be $I$, and let $(x, y)$ be a target sampling point whose four nearest grid neighbors are $Q_{11} = (x_1, y_1)$, $Q_{12} = (x_1, y_2)$, $Q_{21} = (x_2, y_1)$, and $Q_{22} = (x_2, y_2)$. The pixel values are then calculated by the bilinear interpolation formula:

$$I(x, y) \approx \frac{1}{(x_2 - x_1)(y_2 - y_1)} \big[ I(Q_{11})(x_2 - x)(y_2 - y) + I(Q_{21})(x - x_1)(y_2 - y) + I(Q_{12})(x_2 - x)(y - y_1) + I(Q_{22})(x - x_1)(y - y_1) \big]$$

where $I(Q_{ij})$ denotes the pixel value at neighbor $Q_{ij}$; the result is a distance-weighted average of the four surrounding pixels, which is what produces the smooth sampling described above.
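The substitution described above can be sketched as follows (assuming PyTorch): a transposed-convolution up-sampler versus bilinear interpolation followed by an ordinary convolution; channel sizes are illustrative.

```python
# Sketch: replacing transposed-convolution upsampling with bilinear
# interpolation + ordinary convolution. Channel sizes are illustrative.
import torch
import torch.nn as nn

x = torch.randn(1, 256, 64, 64)

# Transposed convolution: kernel/stride interplay can cause checkerboard artifacts.
deconv = nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2,
                            padding=1, output_padding=1)

# Bilinear interpolation + convolution: smoother, artifact-free upsampling.
up_bilinear = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
)

print(deconv(x).shape, up_bilinear(x).shape)   # both torch.Size([1, 128, 128, 128])
```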
In this section, an improved CycleGAN modeling approach [22] is proposed for the image style transfer task. The improved network modifies the generator structure of the original CycleGAN in two ways. First, a multi-head self-attention mechanism is added between the encoder and the converter, with a new connection from the encoding layer to the converter layer so that style features are better passed to the converter. The self-attention module can model global dependencies between different regions of the image and capture its intrinsic structural information. The multi-head design can learn feature representations in different subspaces: different heads attend to different global structural information of the input image, such as shape and texture, and aggregating this information enhances the model's understanding of the image's global structure. This parallel multi-task learning enhances the effect of style transfer, producing result images with a more natural transferred style. Second, in the decoder's decoding network, deconvolution is replaced with bilinear interpolation for up-sampling. Bilinear interpolation yields more natural and smooth sampling results, effectively reducing artifacts such as the checkerboard effect easily produced by the deconvolution operation, further optimizing the visual quality of the style transfer results. By incorporating the self-attention mechanism together with bilinear interpolation, the improved CycleGAN network proposed in this paper further enhances the generation effect of style transfer while maintaining the processing efficiency of the original network. The overall network structure is shown in Fig. 3.

Diagram of the overall network structure
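Since the paper does not list its exact layer configuration, the following is only a hedged structural sketch of such a generator: an encoder, multi-head self-attention on the encoded features, residual converter blocks, and a decoder that up-samples by bilinear interpolation. All widths and block counts are assumptions.

```python
# Hedged structural sketch of the improved generator: encoder -> multi-head
# self-attention -> residual converter blocks -> bilinear-upsampling decoder.
# Block counts and channel widths are illustrative, not the paper's values.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.InstanceNorm2d(c), nn.ReLU(True),
            nn.Conv2d(c, c, 3, padding=1), nn.InstanceNorm2d(c),
        )

    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    def __init__(self, ch=64, n_res=6, n_heads=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 7, padding=3), nn.ReLU(True),
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(True),
            nn.Conv2d(ch * 2, ch * 4, 3, stride=2, padding=1), nn.ReLU(True),
        )
        self.attn = nn.MultiheadAttention(ch * 4, n_heads, batch_first=True)
        self.converter = nn.Sequential(*[ResBlock(ch * 4) for _ in range(n_res)])
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(ch * 4, ch * 2, 3, padding=1), nn.ReLU(True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(ch * 2, ch, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(ch, 3, 7, padding=3), nn.Tanh(),
        )

    def forward(self, x):
        f = self.encoder(x)                          # (b, c, h, w)
        b, c, h, w = f.shape
        seq = f.flatten(2).transpose(1, 2)           # (b, h*w, c) token sequence
        attn_out, _ = self.attn(seq, seq, seq)       # multi-head self-attention
        f = f + attn_out.transpose(1, 2).reshape(b, c, h, w)
        return self.decoder(self.converter(f))

g = Generator()
print(g(torch.randn(1, 3, 128, 128)).shape)          # torch.Size([1, 3, 128, 128])
```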
In this paper, we explore the image generation effect of the improved CycleGAN style transfer model using the cultural and creative products of the Forbidden City as an example.
To verify the effectiveness of the proposed method, MSRes-CycleGAN, three experiments are designed: conversion of original image → cartoon style, original image → oil painting style, and original image → new Chinese style. Starting from objective evaluation indexes, the differences between the improved CycleGAN and other style transfer methods are compared using peak signal-to-noise ratio (PSNR), mean squared error (MSE), the multi-scale structural similarity index (MS-SSIM), per-pixel accuracy (Per-pixel acc) from the FCN scores, and the convergence of the loss function during model training. The final evaluation results of each objective index for the three conversions are shown in Table 1.
Style transfer experiment evaluation results
| Style transfer | Method | PSNR/dB | MSE | MS-SSIM | Per-pixel acc |
|---|---|---|---|---|---|
| Original image → cartoon style | AdaIN | 20.304 | 87.851 | 0.871 | 0.040 |
| | SANet | 19.801 | 85.228 | 0.919 | 0.041 |
| | StyTr2 | 20.181 | 88.609 | 0.921 | 0.051 |
| | AdaAttN | 20.436 | 87.892 | 0.887 | 0.064 |
| | CycleGAN | 19.778 | 90.999 | 0.898 | 0.067 |
| | Ours | 21.293 | 82.678 | 0.939 | 0.071 |
| Original image → oil painting style | AdaIN | 17.841 | 84.445 | 0.836 | 0.576 |
| | SANet | 17.477 | 84.234 | 0.832 | 0.477 |
| | StyTr2 | 17.583 | 83.689 | 0.843 | 0.524 |
| | AdaAttN | 17.552 | 83.709 | 0.842 | 0.608 |
| | CycleGAN | 17.635 | 83.275 | 0.838 | 0.419 |
| | Ours | 17.857 | 81.808 | 0.854 | 0.723 |
| Original image → new Chinese style | AdaIN | 22.882 | 73.425 | 0.951 | 0.088 |
| | SANet | 23.214 | 72.737 | 0.946 | 0.106 |
| | StyTr2 | 22.833 | 72.887 | 0.959 | 0.116 |
| | AdaAttN | 22.874 | 74.443 | 0.941 | 0.155 |
| | CycleGAN | 22.956 | 74.198 | 0.955 | 0.142 |
| | Ours | 24.306 | 71.243 | 0.962 | 0.179 |
Observing Table 1, in the conversion of original image → cartoon style, the improved CycleGAN method of this paper achieves a PSNR of 21.293 dB and an MS-SSIM of 0.939, the maximum values among all methods; an MSE of 82.678, the minimum among all methods; and a Per-pixel acc of 0.071, again the maximum among all methods. The improved CycleGAN method achieves the same standing in the conversions to oil painting style and new Chinese style. The improved CycleGAN method of this paper thus obtains the best results in all three image style transfer experiments. The images generated by this model are visually more vivid, with more concrete and realistic texture, and their subjective visual effect surpasses the style conversions of the other style transfer algorithms, giving the best style transfer effect.
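For reference, a minimal sketch of how the PSNR and MSE figures above can be computed with NumPy; MS-SSIM would typically come from a dedicated library (e.g., pytorch-msssim) and is omitted here.

```python
# Minimal NumPy sketch of PSNR and MSE between a style-transfer result
# and a reference image. Images here are random placeholders.
import numpy as np

def mse(a, b):
    """Mean squared error between two images in [0, 255]."""
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * np.log10(peak ** 2 / m)

ref = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
out = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
print(f"MSE={mse(ref, out):.3f}, PSNR={psnr(ref, out):.3f} dB")
```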
The loss functions during training on the three datasets (original image → cartoon style, original image → oil painting style, and original image → new Chinese style) are shown in Fig. 4. In terms of convergence, all models eventually converge, but the improved CycleGAN converges faster and rarely rebounds, indicating that the model training strategy designed in this paper is better.

Training process loss function
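For context, the CycleGAN objective whose convergence is plotted above combines adversarial and cycle-consistency terms. The following is only a hedged sketch of the generator-side loss; `G`, `F_net`, `Dx`, and `Dy` are assumed generator/discriminator modules, and `lambda_cyc` is the customary illustrative weight.

```python
# Hedged sketch of the generator-side CycleGAN losses: least-squares
# adversarial loss for each mapping plus L1 cycle-consistency loss.
import torch
import torch.nn.functional as F_loss

def cyclegan_generator_loss(G, F_net, Dx, Dy, real_x, real_y, lambda_cyc=10.0):
    fake_y = G(real_x)                 # X -> Y mapping
    fake_x = F_net(real_y)             # Y -> X mapping
    # Adversarial terms: generators try to make discriminators output "real" (1).
    adv = F_loss.mse_loss(Dy(fake_y), torch.ones_like(Dy(fake_y))) \
        + F_loss.mse_loss(Dx(fake_x), torch.ones_like(Dx(fake_x)))
    # Cycle consistency: x -> G(x) -> F(G(x)) should reconstruct x, and vice versa.
    cyc = F_loss.l1_loss(F_net(fake_y), real_x) + F_loss.l1_loss(G(fake_x), real_y)
    return adv + lambda_cyc * cyc
```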
To make the evaluation of the application of the improved CycleGAN to AIGC cultural and creative style transfer more scientific and reasonable, the analytic hierarchy process (AHP) is applied here. AHP quantifies the application evaluation indexes, making the evaluation results more accurate and specific. Below, AHP is applied to evaluate the application of the improved CycleGAN model in cultural and creative style transfer. Table 2 displays the evaluation index system for the improved CycleGAN model's application to cultural and creative style transfer.
Application evaluation index system of improved CycleGAN model
| | Primary index | Secondary index |
|---|---|---|
| Style transfer effect evaluation system | Image quality (A) | Image fidelity (A1) |
| | | Detail accuracy (A2) |
| | | Image diversity (A3) |
| | | Image expression (A4) |
| | | Image innovation (A5) |
| | | Image art (A6) |
| | | Image practicality (A7) |
| | Style simulation (B) | Style transfer coordination (B1) |
| | | Style transfer speciality (B2) |
| | | Style transfer applicability (B3) |
| | | Style fusion rationality (B4) |
| | Connotation expression (C) | Connotation expression form (C1) |
| | | Connotation art (C2) |
| | | Cultural expression (C3) |
| | | Historical expression (C4) |
To simplify the calculation, Matlab is applied here to compute the weights: once the judgment matrix is entered, the weight of each indicator can be calculated quickly. After the judgment matrix passes the consistency test, the synthetic weight of each index, i.e., the proportion it occupies, is calculated. The results are shown in Table 3, and a sketch of this computation follows the table.
Application of evaluation index weight synthetic distribution in improved CycleGAN
| | Primary index | Weight | Secondary index | Weight | Synthetic weight |
|---|---|---|---|---|---|
| Style transfer effect evaluation system | Image quality (A) | 0.3758 | Image fidelity (A1) | 0.1274 | 0.0479 |
| | | | Detail accuracy (A2) | 0.1820 | 0.0684 |
| | | | Image diversity (A3) | 0.1136 | 0.0427 |
| | | | Image expression (A4) | 0.1538 | 0.0578 |
| | | | Image innovation (A5) | 0.1243 | 0.0467 |
| | | | Image art (A6) | 0.1548 | 0.0582 |
| | | | Image practicality (A7) | 0.1441 | 0.0542 |
| | Style simulation (B) | 0.3815 | Style transfer coordination (B1) | 0.2639 | 0.1007 |
| | | | Style transfer speciality (B2) | 0.2213 | 0.0844 |
| | | | Style transfer applicability (B3) | 0.2847 | 0.1086 |
| | | | Style fusion rationality (B4) | 0.2301 | 0.0878 |
| | Connotation expression (C) | 0.2427 | Connotation expression form (C1) | 0.1745 | 0.0423 |
| | | | Connotation art (C2) | 0.2942 | 0.0714 |
| | | | Cultural expression (C3) | 0.2736 | 0.0664 |
| | | | Historical expression (C4) | 0.2577 | 0.0625 |
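As noted above, the AHP computation can be sketched in a few lines of NumPy: principal-eigenvector weights from a pairwise judgment matrix, plus the consistency-ratio test. The judgment matrix below is illustrative only and is not the matrix used in this paper.

```python
# Hedged NumPy sketch of AHP: eigenvector weights and consistency ratio.
# The 3x3 judgment matrix is an illustrative example, not the paper's data.
import numpy as np

RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}  # random index

def ahp_weights(A):
    """Return (weights, consistency_ratio) for judgment matrix A (n >= 3)."""
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    w /= w.sum()                                  # normalized weight vector
    n = A.shape[0]
    ci = (vals[k].real - n) / (n - 1)             # consistency index
    return w, ci / RI[n]                          # consistency ratio CR

# Illustrative pairwise comparisons among the primary indexes A, B, C.
judgment = np.array([[1.0,   1.0,   1.5],
                     [1.0,   1.0,   1.6],
                     [1/1.5, 1/1.6, 1.0]])
w, cr = ahp_weights(judgment)
print(w, cr)   # weights are acceptable if CR < 0.1
```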
Using the improved CycleGAN model of this paper to carry out style transfer on the cultural and creative products of the Forbidden City, six experimenters (numbered 1-6) were invited to subjectively score the generated stylized products on a scale of 0-10. The scores of each index are weighted and converted to a 100-point basis, and the comprehensive scores for the style transfer of cultural and creative products with the improved CycleGAN model are shown in Table 4.
Weight calculation of application evaluation score in improved CycleGAN
| Primary index | Secondary index | Synthetic weight | Participant 1 | Participant 2 | Participant 3 | Participant 4 | Participant 5 | Participant 6 |
|---|---|---|---|---|---|---|---|---|
| Image quality (A) | A1 | 0.0479 | 9 | 8 | 9 | 8 | 8 | 9 |
| | A2 | 0.0684 | 9 | 8 | 9 | 9 | 8 | 8 |
| | A3 | 0.0427 | 8 | 8 | 9 | 8 | 9 | 9 |
| | A4 | 0.0578 | 9 | 9 | 9 | 9 | 9 | 8 |
| | A5 | 0.0467 | 10 | 10 | 9 | 10 | 9 | 9 |
| | A6 | 0.0582 | 9 | 9 | 10 | 10 | 9 | 8 |
| | A7 | 0.0542 | 10 | 10 | 10 | 10 | 9 | 10 |
| Style simulation (B) | B1 | 0.1007 | 10 | 9 | 10 | 8 | 8 | 8 |
| | B2 | 0.0844 | 9 | 10 | 9 | 8 | 8 | 8 |
| | B3 | 0.1086 | 9 | 9 | 9 | 9 | 8 | 9 |
| | B4 | 0.0878 | 9 | 10 | 9 | 10 | 8 | 9 |
| Connotation expression (C) | C1 | 0.0423 | 8 | 9 | 10 | 8 | 9 | 9 |
| | C2 | 0.0714 | 9 | 8 | 8 | 10 | 9 | 9 |
| | C3 | 0.0664 | 9 | 10 | 8 | 9 | 9 | 9 |
| | C4 | 0.0625 | 9 | 8 | 9 | 10 | 9 | 9 |
| Decimal score | | | 9.1166 | 9.0466 | 9.1176 | 9.0628 | 8.5022 | 8.6847 |
| Centesimal score | | | 91.17 | 90.47 | 91.18 | 90.63 | 85.02 | 86.85 |
| Overall evaluation average score | | | 89.22 | | | | | |
Table 4 shows that the evaluation of the application effect of the improved CycleGAN model on the style transfer of cultural and creative products is composed of three aspects: image quality, style simulation, and connotation expression. After the six experimenters' scores are weighted, the average comprehensive score is 89.22. According to the author's proposed rating scale (excellent: 85-100 points; good: 70-84 points; passing: 60-69 points; failing: below 60 points), this reaches the excellent level, showing that the improved CycleGAN model performs excellently in the style transfer and generation of cultural and creative products.
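The composite-scoring arithmetic behind Table 4 can be checked in a few lines; the sketch below reproduces participant 1's decimal and centesimal scores from the synthetic weights.

```python
# Sketch of the composite scoring: each participant's 0-10 ratings are
# weighted by the synthetic weights, then scaled to a 100-point score.
# Weights and ratings below are participant 1's values from Table 4.
weights = [0.0479, 0.0684, 0.0427, 0.0578, 0.0467, 0.0582, 0.0542,
           0.1007, 0.0844, 0.1086, 0.0878, 0.0423, 0.0714, 0.0664, 0.0625]
scores_p1 = [9, 9, 8, 9, 10, 9, 10, 10, 9, 9, 9, 8, 9, 9, 9]

decimal = sum(w * s for w, s in zip(weights, scores_p1))
print(round(decimal, 4), round(decimal * 10, 2))   # 9.1166 and 91.17
```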
In this paper, AIGC is used to empower cultural and creative design: the CycleGAN algorithm is improved to construct a cultural and creative style transfer model based on the improved CycleGAN. Objective and subjective evaluations of the model's style transfer and image generation effects on cultural and creative products are carried out to establish its effect on cultural and creative design.
In the conversions of original image → cartoon style, original image → oil painting style, and original image → new Chinese style, the improved CycleGAN model of this paper achieves PSNR values of 21.293 dB, 17.857 dB, and 24.306 dB, MSE values of 82.678, 81.808, and 71.243, MS-SSIM values of 0.939, 0.854, and 0.962, and Per-pixel acc values of 0.071, 0.723, and 0.179, respectively, the best style transfer performance among all comparison models. In the subjective evaluation, the improved CycleGAN model achieved a comprehensive score of 89.22 points, with individual experimenters' percentile scores ranging from 85.02 to 91.18 (two experimenters rated below 90 points), an excellent grade overall. The AIGC cultural and creative style transfer model based on the improved CycleGAN thus performs excellently in the style transfer and image generation of cultural and creative products.