Research on data-driven optimization of cross-border e-commerce copywriting and artwork
Published online: 29 Sep 2025
Received: 14 Jan 2025
Accepted: 10 May 2025
DOI: https://doi.org/10.2478/amns-2025-1133
© 2025 Shaolin Hu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
With the continuous development of the Internet, cross-border e-commerce has become a global hot topic. Governments and enterprises increasingly recognize the development potential of this market, and cross-border e-commerce has grown into a global trade model [1-3]. Copywriting and artwork are important factors in enterprise marketing that shape consumer behavior toward products, and their quality largely determines the competitiveness of cross-border e-commerce enterprises [4-5].
As an important part of cross-border e-commerce, copywriting has become an indispensable core capability of e-commerce enterprises. Successful cross-border e-commerce copy needs to be attractive. When writing copy, the needs and preferences of the target customers should be fully considered, and the features and advantages of the products should be described in simple, vivid language that customers can grasp at a glance [6-9]. Vivid adjectives and interesting metaphors can make the copy livelier, catch customers' eyes, and arouse their interest. Credibility and emotional resonance are also key factors of successful copywriting; only by integrating these elements sensibly can copy attract customers' attention and increase product sales [10-13]. Cross-border e-commerce artwork is likewise an important part of operations: it enhances the attractiveness and salability of goods through art design, product display, advertising, and other means [14-15]. Cross-border e-commerce art designers need both art-design skills and operational knowledge to present commodities effectively and boost their sales, so for cross-border e-commerce companies a professional artwork team is essential and an important guarantee of sales performance [16-19].
This paper uses the Octopus Collector to crawl commodity marketing copy and artwork-related data from a cross-border e-commerce website, and preprocesses the crawled data through image text extraction, replacement filtering of special symbols, text deduplication, and missing-value handling. To address the low quality and low efficiency of traditionally hand-written cross-border e-commerce copy, this paper draws on text generation techniques and the idea of the factorization machine model to build a keyword topic-controlled copy generation model based on a cross-term encoder, and introduces an attention mechanism so that the relative participation of the generated copy sequence and the keyword set in the generation process can be adjusted freely. On top of a semantic fusion-based generative adversarial network framework, this paper then constructs a generative adversarial network with an encoder-decoder structured discriminator for generating cross-border e-commerce artwork images. By selecting different benchmark models and evaluation indexes, we analyze the performance advantages of the proposed models in optimizing cross-border e-commerce copywriting and artwork, and further demonstrate the intrinsic synergistic optimization effect of the two models, providing technical support for the optimization and development of cross-border e-commerce enterprises.
Traditional cross-border e-commerce copy is mostly written manually by professional writers, which in principle ensures a certain fit between the copy and the products. In practice, however, uneven writer skill and deviations in understanding commodity tone often lead to low-quality copy. Moreover, for cross-border e-commerce platforms with large product catalogs, manually writing marketing copy is time-consuming and inefficient. Therefore, this study builds an automatic copy generation model for cross-border e-commerce using natural language processing, mining the commodity and copy data in cross-border e-commerce platforms to drive copy optimization. We design a keyword topic-controlled copywriting model based on a cross-term encoder: first, the input keyword set is encoded by the cross-term encoder into semantic vectors; second, these vectors are fed through an attention mechanism into the decoder to steer the generation process; finally, a generative adversarial network is used to improve model performance.
The data used in the study come from a large cross-border e-commerce platform. In the Discover Goods section of this platform, each recommended product carries a paragraph of marketing copy written by professional writers, which serves as the reference marketing copy for this study. The source data on which generation is based consist of three parts: the product title, the product attributes, and the marketing image accompanying the copy.
To study copy generation for cross-border e-commerce platforms, the data must first be collected. Octopus Collector is a general-purpose intelligent web data collection tool that can gather any public web page data. It ships with rich collection templates and human-like intelligent algorithms, requires no programming, and is easy for novices to operate. Its cloud collection technology runs on more than 5,000 servers worldwide, enabling efficient, large-scale acquisition of the required data and rapid export to, or integration with, internal systems. Octopus Collector is therefore chosen to capture data from cross-border e-commerce platforms in this study.
Since the product marketing copy and the product title and attributes are not on a single page, collecting them directly would likely misalign the result fields. This study therefore first collects the marketing copy and the product detail-page address linked from "I want to go and take a look" on the product-discovery detail page, and then collects the page address, product title, product attributes, and image information from the product detail page. Finally, the two result tables are merged on their shared field, the product detail-page address. Through this process, the study collects a total of 45,000 usable cross-border e-commerce copywriting records.
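To make the merge step concrete, here is a minimal pandas sketch of joining the two exported result tables on the shared detail-page-address field; the file and column names are hypothetical placeholders, not the study's actual export schema.

```python
import pandas as pd

# Two result tables exported from Octopus Collector (hypothetical file names).
copy_df = pd.read_csv("discover_goods_copy.csv")     # marketing copy + detail-page URL
detail_df = pd.read_csv("product_detail_pages.csv")  # URL, title, attributes, image info

# Summarize the two tables by their public field: the product detail-page address.
merged = pd.merge(copy_df, detail_df, on="detail_url", how="inner")
merged.to_csv("merged_copy_dataset.csv", index=False)
```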
In the cross-border e-commerce copy generation task, the first step after acquiring the text corpus is data preprocessing. Preprocessing transforms raw text into structured form, and the subsequent text representation, model training, and prediction all rely on the preprocessed text, so the quality of preprocessing is crucial. This section briefly describes the preprocessing steps: image text extraction, replacement filtering of special symbols, text deduplication, and missing-value handling.
Image text extraction. The collected images must have their text content extracted. Translating images into text is generally known as Optical Character Recognition (OCR), the process of converting text in images into editable, searchable, and analyzable text. Using the Tesseract recognition engine together with Python, text recognition can be implemented quickly. This paper uses Tesseract for image text recognition and tests its effect accordingly.

Replacement filtering of special symbols. The collected text and OCR output contain links, emoticons, and other special characters unrelated to the product copy to be generated. Left unfiltered, these characters inflate the vocabulary, consume large amounts of memory, and can even directly degrade the generated copy, so they must be filtered out; after filtering, the core semantics of the source text are essentially unaffected. This paper uses regular expressions to replace and filter emoticons, links, and the like.

Text deduplication. When collection URLs are generated in batches, some non-existent URLs redirect to a default page, producing duplicate page data. Such duplicates are worthless, and even when the content is useful only the first occurrence matters, so duplicate page data must be deleted. Text can be deduplicated in two ways: the drop_duplicates function in pandas or the deduplication feature in Excel. drop_duplicates takes three parameters: subset, keep, and inplace. subset names the columns to deduplicate on and defaults to None. keep takes one of first, last, or False; the default first retains only the first occurrence of each duplicate, last retains only the last, and False deletes all duplicates. inplace is a Boolean defaulting to False, which returns a deduplicated copy; True removes duplicates from the original data in place. Of the 44,892 records entering this step, 3,072 were deleted, leaving 41,820.

Missing-value handling. Page failures, collection failures, and similar problems during data collection leave some records incomplete, so missing values must be handled. A missing value means that one or more indicators of a record are incomplete. Since the experimental data are text, imputation methods are not applicable, so the experiment deletes the affected samples. This removes 1,280 records with missing values, leaving 40,540 records for this study.

Constructing the vocabulary. Building a vocabulary is a crucial step in constructing the training dataset. A vocabulary digitizes text so that computers can understand and process it. To keep the vocabulary effective, word frequency and vocabulary size usually need to be considered during construction.
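The following sketch illustrates how these preprocessing steps might be chained with pytesseract, regular expressions, and pandas; the column names and the emoticon pattern are illustrative assumptions rather than the study's exact implementation.

```python
import re
import pandas as pd
import pytesseract
from PIL import Image

def image_to_text(path: str) -> str:
    # OCR: convert the text in a product image into editable text via Tesseract.
    return pytesseract.image_to_string(Image.open(path))

URL_RE = re.compile(r"https?://\S+")                           # links
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")  # emoticons (rough range)

def clean_text(text: str) -> str:
    # Replacement filtering: strip characters unrelated to the product copy.
    return EMOJI_RE.sub("", URL_RE.sub("", text)).strip()

df = pd.read_csv("merged_copy_dataset.csv")
df["copy"] = df["copy"].astype(str).map(clean_text)

# Text deduplication: keep only the first occurrence of each duplicate page.
df = df.drop_duplicates(subset="detail_url", keep="first", inplace=False)

# Missing-value handling: text cannot be imputed, so delete incomplete samples.
df = df.dropna(subset=["copy", "title", "attributes"])
```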
An overly large vocabulary makes acquiring word vectors computationally expensive, while an overly small one forces many different words to share a single representation, losing word independence. A word-frequency-based approach is therefore usually used: a threshold selects higher-frequency words as vocabulary members, balancing computational efficiency and encoding quality. In this paper, words that survive stop-word filtering but occur fewer than 5 times are treated as low-frequency words and replaced by the <unk> symbol, and the vocabulary size is set to 8,000.

Text serialization. Text serialization converts the state information of the text into a form that can be stored or transmitted. Word embedding does not convert text directly into vectors; the text is first converted into numbers and then into vectors, which requires text serialization. The concrete procedure is: first segment all sentences into words, then add the words to a dictionary, filtering and counting them by frequency. The Counter class in Python's collections module is used here, finally realizing the mappings from text to number sequences and from number sequences back to text.
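A minimal sketch of the vocabulary construction and text serialization described above, using collections.Counter; apart from <unk>, the special-token names are assumptions.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"   # <unk> per the paper; <pad> is an assumed extra token
MIN_FREQ, VOCAB_SIZE = 5, 8000

def build_vocab(token_lists):
    # Count word frequencies over the segmented corpus.
    counter = Counter(tok for sent in token_lists for tok in sent)
    # Keep frequent words; words seen fewer than MIN_FREQ times fall back to <unk>.
    kept = [w for w, c in counter.most_common(VOCAB_SIZE - 2) if c >= MIN_FREQ]
    itos = [PAD, UNK] + kept
    stoi = {w: i for i, w in enumerate(itos)}
    return stoi, itos

def text_to_sequence(tokens, stoi):
    return [stoi.get(tok, stoi[UNK]) for tok in tokens]

def sequence_to_text(ids, itos):
    return [itos[i] for i in ids]
```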
The model in this paper contains three main parts: an encoder, a decoder, and a discriminator. The main structure of the model is shown in Fig. 1; the left frame is the copy generator based on cross-term encoding, which consists of two parts, the cross-term encoder and the decoder.

Figure 1: Structure diagram of the topic-controlled text generation model based on cross-term encoding
In this paper, we draw on the idea of the factorization machine model [20] to mine the hidden information among keywords through keyword combination, enriching the model input and reducing the model's sensitivity to the temporal position of keywords. The structure of the factorization machine model is shown in equation (1):

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle x_i x_j \tag{1}$$

where $w_0$ is the global bias, $w_i$ is the weight of feature $x_i$, and $\langle \mathbf{v}_i, \mathbf{v}_j \rangle$ is the inner product of the latent vectors of features $i$ and $j$. The first two terms form an ordinary logistic-regression-style linear model that considers each feature individually and completely ignores hidden connections among the input features; the third, cross term captures the pairwise feature interactions.
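For reference, the second-order cross term of equation (1) can be computed efficiently with the standard FM identity $\sum_{i<j}\langle \mathbf{v}_i, \mathbf{v}_j\rangle x_i x_j = \tfrac{1}{2}\sum_f \big((Vx)_f^2 - (V^2 x^2)_f\big)$. A minimal PyTorch sketch of this computation (not the paper's code):

```python
import torch

def fm_cross_term(x: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Second-order FM interaction sum_{i<j} <v_i, v_j> x_i x_j.

    x: (batch, n) feature values; v: (n, k) latent vectors per feature.
    Uses the O(kn) identity 0.5 * sum_f ((Vx)_f^2 - (V^2 x^2)_f).
    """
    xv = x @ v                    # (batch, k)
    x2v2 = (x ** 2) @ (v ** 2)    # (batch, k)
    return 0.5 * (xv.pow(2) - x2v2).sum(dim=1)

# Example: 4 features with 8-dimensional latent factors.
x = torch.randn(2, 4)
v = torch.randn(4, 8)
print(fm_cross_term(x, v))       # one interaction score per batch element
```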
The model in this paper is designed around the characteristics of natural language; its structure is shown in Figure 2. As the figure shows, the whole cross-term encoding model has three layers. The first, cross-term layer is the core of the model; its main operation is to combine the keywords pairwise to obtain combined features between keywords.

Figure 2: Structure of the cross-term encoding model based on keyword topic control
The left border of the figure shows the implementation details of the cross-term layer: the keyword indices are first fed into an embedding layer for vector representation, the preliminary feature vectors are then passed through a fully connected layer to enrich the model parameters and adjust the vector dimensions, and finally the keywords are combined pairwise to obtain the combined vectors, as formalized in equation (2).
The second layer is the fully connected layer, realized as shown in equation (3):

$$h = \sigma(W_h c + b_h) \tag{3}$$

where $c$ denotes the combined vectors output by the cross-term layer, $W_h$ and $b_h$ are the weights and bias of the fully connected layer, and $\sigma$ is the activation function.
The third layer is the softmax layer, which takes the output of the fully connected layer as the input of the softmax function for normalization and introduces nonlinearity; the normalized weights are then used to weight and sum the feature vectors, as shown in Eqs. (4) and (5):

$$\alpha_i = \frac{\exp(h_i)}{\sum_j \exp(h_j)} \tag{4}$$

$$o = \sum_i \alpha_i h_i \tag{5}$$

where $h_i$ is the $i$-th component of the fully connected layer's output, $\alpha_i$ is its normalized weight, and $o$ is the semantic vector of the keyword set output by the encoder.
This paper introduces an attention mechanism [21] to allow arbitrary adjustment of how much the generated copy sequence and the keyword set each participate in the copy generation process. A weighting factor is used to break the fixed, equal contribution that the generated copy sequence and the keyword set would otherwise make in a plainly connected attention mechanism; the influence of the two semantic vectors can be flexibly adjusted by changing this weighting factor $\lambda$:

$$\tilde{c}_t = \lambda\, c_t^{\mathrm{seq}} + (1 - \lambda)\, c_t^{\mathrm{key}} \tag{6}$$

where $c_t^{\mathrm{seq}}$ is the attention context computed over the already-generated copy sequence at decoding step $t$, $c_t^{\mathrm{key}}$ is the attention context computed over the keyword set, and $\lambda \in [0, 1]$ is the weighting factor.
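A minimal PyTorch sketch of this λ-weighted blending of the two attention contexts, assuming standard dot-product attention; tensor shapes and names are illustrative, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def attend(query, keys):
    # query: (batch, d); keys: (batch, t, d)  ->  context: (batch, d)
    scores = torch.bmm(keys, query.unsqueeze(-1)).squeeze(-1)   # (batch, t)
    alpha = F.softmax(scores, dim=-1)
    return torch.bmm(alpha.unsqueeze(1), keys).squeeze(1)

def combined_context(dec_state, gen_seq, keyword_set, lam=0.6):
    # Blend attention over the generated sequence and over the keyword set
    # with a single weighting factor lam, per equation (6).
    c_seq = attend(dec_state, gen_seq)       # context from copy generated so far
    c_key = attend(dec_state, keyword_set)   # context from the keyword set
    return lam * c_seq + (1.0 - lam) * c_key
```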
Because of its commercial nature, cross-border e-commerce artwork must deliver strong visual communication; the artwork images must also stay highly consistent with the text content, be of high quality, and be stylistically rich. The research therefore adopts text-to-image technology: the copy generation model constructed in this paper produces the commodity marketing copy, and high-quality, realistic images that match the copy and cover rich details are then generated from that description. Since images generated from the same descriptive paragraph need not be identical, the generated images are diverse. In this way, the optimization of cross-border e-commerce artwork images can be achieved.
The Semantic Fusion based Generative Adversarial Network (SF-GAN) takes two inputs: a sentence vector produced by a text encoder and a random noise vector sampled from a Gaussian distribution [22]. The Gaussian noise vector ensures the diversity of the generated images, i.e., it makes the generated images as varied as possible while keeping them consistent with the given text. The core of the SF-GAN generator consists of six up-sampling layers, six fusion modules (FMs), and one convolutional layer, where each FM is a residual structure composed of a SATM and a SJAM, as shown in equation (8):

$$F_i^{\mathrm{out}} = F_i^{\mathrm{in}} + \mathrm{SJAM}\big(\mathrm{SATM}(F_i^{\mathrm{in}}, s),\, s\big) \tag{8}$$

where $F_i^{\mathrm{in}}$ and $F_i^{\mathrm{out}}$ are the input and output feature maps of the $i$-th fusion module and $s$ is the sentence vector; the internal operations of the SATM and SJAM are given in Eqs. (9) to (11).
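A structural skeleton of the residual fusion module in equation (8); the SATM and SJAM internals are not reproduced here, so trivial pass-through placeholders stand in for them (assumptions, not the paper's modules).

```python
import torch.nn as nn

class PassThrough(nn.Module):
    # Placeholder for SATM/SJAM: the real modules condition feat on sent_vec.
    def forward(self, feat, sent_vec):
        return feat

class FusionModule(nn.Module):
    """Residual FM of SF-GAN: feat + SJAM(SATM(feat, s), s), per equation (8)."""
    def __init__(self, satm: nn.Module = None, sjam: nn.Module = None):
        super().__init__()
        self.satm = satm or PassThrough()
        self.sjam = sjam or PassThrough()

    def forward(self, feat, sent_vec):
        return feat + self.sjam(self.satm(feat, sent_vec), sent_vec)
```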
In the Generative Adversarial Network based on an Encoding-Decoding Structured Discriminator (SF-GAN-V2), the discriminator built from an encoder and a decoder can judge both the whole image and its parts. The network architecture of the proposed SF-GAN-V2 is shown schematically in Fig. 3: the main body of the network consists of a pre-trained text encoder, a generator, and a discriminator. The work in this paper focuses on improving the text encoding and image generator modules relative to SF-GAN, and proposes a high-precision image synthesis method based on the SATM and SJAM.

Figure 3: SF-GAN-V2 network architecture
In the given encoding-decoding architecture, the encoder Denc is dominated by convolutional layers and the basic component Block A, and the decoder Ddec is dominated by convolutional layers and the basic component Block B. The dimensions labeled in the figure are the output feature dimensions of each component, and "□" denotes concatenation along the channel axis. For all discriminators, the inputs are RGB images with a resolution of 3 × 256 × 256 together with sentence vectors.
An RGB image with resolution 3 × 256 × 256 first passes through a 3 × 3 convolution to obtain a feature map of size 32 × 256 × 256. This feature map is fed into Block A and downsampled to a 64 × 128 × 128 feature map, and then through five further Block A units in sequence, yielding feature maps of dimensions 128 × 64 × 64, 256 × 32 × 32, 512 × 16 × 16, 512 × 8 × 8, and 512 × 4 × 4. Since the sentence vector used for recognition is 256-dimensional, this paper spatially replicates it into a 256 × 4 × 4 feature map and concatenates it with the output of the final Block A in the encoder, obtaining a 768 × 4 × 4 feature map carrying both textual and image information. A 3 × 3 convolutional layer then extracts features from this map to obtain a 64 × 4 × 4 feature map, which is activated with the ReLU function; finally a 4 × 4 convolution produces a 1 × 1 × 1 output whose value gives the authenticity of the whole image and the probability of whether it matches the text.
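The following PyTorch sketch traces the discriminator-encoder feature sizes described above; the internal layout of Block A is an assumption (conv + LeakyReLU + 2x pooling), chosen only to reproduce the stated dimensions.

```python
import torch
import torch.nn as nn

class BlockA(nn.Module):
    # Assumed downsampling block: 3x3 conv, LeakyReLU, 2x average pooling.
    def __init__(self, cin, cout):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=1),
            nn.LeakyReLU(0.2),
            nn.AvgPool2d(2),
        )

    def forward(self, x):
        return self.net(x)

class DiscriminatorEncoder(nn.Module):
    """3x256x256 -> 32x256x256 -> six Block A -> 512x4x4,
    then fuse the replicated 256-d sentence vector and score the image."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 32, 3, padding=1)
        chans = [32, 64, 128, 256, 512, 512, 512]
        self.blocks = nn.ModuleList(
            BlockA(cin, cout) for cin, cout in zip(chans[:-1], chans[1:])
        )
        self.fuse = nn.Conv2d(512 + 256, 64, 3, padding=1)  # 768 -> 64 channels
        self.head = nn.Conv2d(64, 1, 4)                     # 4x4 conv -> 1x1x1

    def forward(self, img, sent_vec):
        x = self.stem(img)                                   # 32 x 256 x 256
        for blk in self.blocks:                              # downsample to 512 x 4 x 4
            x = blk(x)
        s = sent_vec[:, :, None, None].expand(-1, -1, 4, 4)  # 256 x 4 x 4
        x = torch.cat([x, s], dim=1)                         # 768 x 4 x 4
        x = torch.relu(self.fuse(x))                         # 64 x 4 x 4
        return self.head(x)                                  # realism / text-match score

# Smoke test with the stated input sizes.
d = DiscriminatorEncoder()
out = d(torch.randn(1, 3, 256, 256), torch.randn(1, 256))
print(out.shape)  # torch.Size([1, 1, 1, 1])
```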
This section first briefly describes the basic setup and evaluation metrics of the experiments, then tests the training behavior of the model and the quality of the generated text, comparing it with recent state-of-the-art models.
The experiments were conducted under the CentOS Linux release 7.5.1804 operating system using two NVIDIA Tesla V100S 32G graphics cards, an Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz, and 512GB of RAM, with the PyTorch deep learning framework together with the CUDA 11.2 and cuDNN 8.1 libraries. The text encoder in this study is the BART encoder, and the maximum length of the output text sequence is 128. The study uses Adam as the optimization algorithm with a learning rate of 0.00001. The specific hyperparameter settings are shown in Table 1.
Table 1: Hyperparameter details
| Hyperparameter | Value |
|---|---|
| Learning Rate | 0.00001 |
| Warmup Steps | 380 |
| Eval Period | 105 |
| Beam Size | 4 |
| Length Penalty | 1.3 |
| Optimizer | Adam |
| Num Nodes | 48 |
| Num Relations | 57 |
| Embedding | 769 |
| | 0.17 |
| | 0.48 |
| Batch Size | 45 |
In this paper, the following state-of-the-art controlled copy generation models are used in comparative experiments against our model, Cross-GRU:
CTRL: a model trained with large-scale supervision by using control codes in the pre-training phase; the experiments use the Huggingface version.

PPLM: a plug-and-play model that fine-tunes the trained model by introducing an attribute discriminator; the Huggingface version is used.

CoCon: similar in structure to the model in this paper; self-supervised training is performed by inserting a Transformer layer into the pre-trained model.

TAV-LSTM: represents topic semantics with the weighted average of all topic-word embeddings and uses a Long Short-Term Memory network (LSTM) as the encoder/decoder.

The experiments use the product marketing copy dataset collected from the selected cross-border e-commerce platform, split 8:2 into 32,432 records for training and the remaining 8,108 records for testing.
Automatic evaluation. BLEU: Bilingual Evaluation Understudy (BLEU) is an automatic evaluation metric from machine translation. Using the training set as reference, BLEU values are calculated to evaluate the generated texts. In this paper, the BLEU-2, BLEU-3, and BLEU-4 scores are compared; the higher the score, the better the accuracy (fluency) of the generated copy. Back-BLEU: using the generated copy as reference, BLEU values are calculated to evaluate the copy in the training set. The Back-BLEU-2 value, abbreviated B-BLEU below, is compared; the higher the score, the better the recall (diversity) of the generated copy.

Manual evaluation. Five professional cross-border e-commerce copywriters subjectively evaluated 150 random samples generated by each model along five dimensions: completeness (whether the generated copy is complete), accuracy (whether it is accurate), relevance (whether it is relevant to the product), fluency (whether it is grammatically and syntactically well-formed), and coherence (whether it has a thematic and logical structure). Each dimension is scored from 1 to 5 and the final score is computed.
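A minimal sketch of the forward-BLEU and Back-BLEU computations using NLTK; the study does not specify its BLEU implementation, and the smoothing choice here is an assumption.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def bleu_n(references, hypotheses, n):
    # references/hypotheses: lists of token lists; uniform n-gram weights.
    weights = tuple(1.0 / n for _ in range(n))
    smooth = SmoothingFunction().method1
    return corpus_bleu([[r] for r in references], hypotheses,
                       weights=weights, smoothing_function=smooth)

# Forward BLEU-2/3/4: training copy as reference, generated copy as hypothesis.
# Back-BLEU-2 (B-BLEU) swaps the roles to measure recall (diversity):
#   bleu_n(generated_copy, training_copy, 2)
```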
The results of the automatic evaluation of the models on the training set and the test set are shown in Fig. 4, where the number after our model's name, Cross-GRU, indicates the value of the weighting factor λ.

Figure 4: Automatic evaluation results
Fig. 5 shows the results of manual evaluation on the test set and the training set, where D1-D5 denote the five manual evaluation dimensions. As with the automatic evaluation, our Cross-GRU model outperforms the best baseline models in the manual evaluation on both the test set and the training set. On the training set, the Cross-GRU-0.6 model outperforms the best baseline model, TAV-LSTM, on the five dimensions by 1.32, 2.08, 2.04, 1.99, and 1.22, respectively. The manual evaluation scores of our model on the test and training sets also rise and then fall as the weighting coefficient increases, peaking when the weighting coefficient is 0.6. Therefore, unless otherwise specified, the weighting coefficient of the model is set to 0.6 by default in the following experiments and analyses.

Figure 5: Manual evaluation results
The training behavior of each model is compared below. Figure 6 shows the decline of the loss value during training. The TAV-LSTM model converges worst, while our Cross-GRU model converges better than the other benchmark models. Once the models stabilize, the loss value of Cross-GRU is significantly lower than that of the other models.

Figure 6: Loss value decline during training
Figure 7 compares more intuitively the accuracy of the copy generated by each model after 100 rounds of training. The accuracy is obtained from an element-by-element comparison of the generated-copy and reference-copy tensors. After 100 rounds of training, Cross-GRU generates copy with the highest accuracy and TAV-LSTM performs worst. The CoCon model, with its low loss during training, performs best among the benchmark models and is second only to our model in accuracy.

Figure 7: Comparison of accuracy after 100 rounds of training
To test the model's ability to generate controlled text from topic to text, i.e., control at the level of words and phrases, this paper selects a single topic word as the control text in a specific domain and tests whether the generated text conforms to the semantics and topic of the control text. In the experiment, iconic words related to "shirt" are selected as control texts, and common marketing phrases serve as the cue text. Each model generates test texts from the cue text and the individual control texts, and these test texts are fed into the trained model to output the corresponding accuracy and topic similarity, shown in Fig. 8.

Figure 8: Topic similarity
Combining the above experimental results, the Cross-GRU model constructed in this paper performs well on the cross-border e-commerce copywriting task: it efficiently generates rich, varied, high-quality merchandise marketing copy from the data in the cross-border e-commerce platform, and the generated copy is highly relevant to the merchandise theme. Using this model effectively removes the constraints of low-quality, low-efficiency manual copywriting and realizes the optimization of cross-border e-commerce copy.
To validate the effectiveness of the proposed model, the experiments in this section use two datasets. The first is the image data with copywriting annotations collected from the original cross-border e-commerce platform, totaling 20,457 records, denoted DS1. The second is a self-made dataset: the images in the original cross-border e-commerce dataset are extracted, their original copywriting annotations deleted, and new annotations produced by this paper's cross-border e-commerce copy generation model are attached to the corresponding images. This likewise yields 20,457 records, denoted DS2.
Assessing the performance of a generative model is difficult. Although having humans judge the quality of generated images directly is straightforward and reliable, humans are inherently subjective and different people apply different standards, which can make the results unfair. Therefore, this paper uses two widely adopted image quality evaluation metrics, the Inception Score (IS) and the Frechet Inception Distance (FID), to quantify the model and assess the quality and diversity of the images.
Inception Score. The Inception Score assesses the quality of generated images through the difference, measured by relative entropy, between the conditional class distribution and the marginal class distribution, using a pre-trained Inception-v3 network; the performance of the generative network is computed from the statistics of this network's outputs. Inception-v3 is a carefully designed convolutional network whose input is an image tensor and whose output is a 1000-dimensional vector; each dimension of the output corresponds to the probability that the image belongs to a certain category, so the whole vector can be viewed as a probability distribution. The score is calculated as follows:

$$\mathrm{IS} = \exp\Big(\mathbb{E}_{x \sim p_g}\, D_{KL}\big(p(y \mid x)\,\|\,p(y)\big)\Big)$$

where $p_g$ is the distribution of generated images, $p(y \mid x)$ is the class distribution that Inception-v3 predicts for image $x$, $p(y)$ is the marginal class distribution over the generated images, and $D_{KL}$ denotes the KL divergence.
Frechet Inception Distance. The IS measure has a fatal flaw: the generated samples are never compared with real images, so it cannot measure whether the distribution of generated images approaches the distribution of real images. The Frechet Inception Distance evaluates the quality of generated samples by computing the Frechet distance between generated and real images in Inception feature space:

$$\mathrm{FID} = \|\mu_r - \mu_g\|_2^2 + \mathrm{Tr}\Big(\Sigma_r + \Sigma_g - 2\big(\Sigma_r \Sigma_g\big)^{1/2}\Big)$$

where $\mu_r$ and $\Sigma_r$ are the mean and covariance of the Inception features of the real images, and $\mu_g$ and $\Sigma_g$ are those of the generated images. A lower FID indicates that the generated distribution is closer to the real one.
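A compact NumPy/SciPy sketch of the FID formula above, computing the score from two sets of Inception-v3 features (feature extraction itself is omitted):

```python
import numpy as np
from scipy import linalg

def fid_score(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
    # real_feats, gen_feats: (n, d) Inception feature matrices.
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_g = np.cov(gen_feats, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_g)     # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real                # drop tiny imaginary residue
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))
```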
We experimentally validate the effectiveness of the proposed SF-GAN-V2 and compare its results on the DS1 and DS2 datasets with several well-known models from recent years, including GAN-INT-CLS, GAWWN, StackGAN, StackGAN++, AttnGAN, ControlGAN, MirrorGAN, SA-AttnGAN, SegAttnGAN, DualAttn-GAN, DM-GAN, KT-GAN, and OP-GAN. The parameter settings of the comparison models are the same as ours.
The IS and FID comparison results on the DS1 dataset are shown in Fig. 9. The proposed method performs outstandingly on both evaluation metrics on DS1. Its IS value improves on the benchmark model (AttnGAN) by 0.50, reaching a high score of 4.83, a performance improvement of 11.55%, and its metrics also surpass the multi-stage models proposed in recent years. On the FID score, our method is 9.66 lower than the benchmark model (AttnGAN), reaching 15.23, a performance improvement of 38.81%. This further shows that both the quality of the cross-border e-commerce artwork images generated by our method and their degree of matching with the copy are higher.

Figure 9: Objective evaluation results of the model on DS1
To verify that our method also generalizes well to the self-made dataset, experiments are also run on DS2; the performance comparison is shown in Fig. 10. The IS value of our method on DS2 reaches a high score of 32.97, exceeding the comparison models and surpassing the benchmark model (AttnGAN) by 9.11, a performance improvement of 38.23%, while the FID value drops to 12.31. This indicates that images generated by our method are much closer to real images and that the method can control the intrinsic connection between the generated images and the marketing copy.

Figure 10: Objective evaluation results of the model on DS2
Further observation shows that the two evaluation indexes of this model differ between the DS1 and DS2 datasets. On the IS index, our SF-GAN-V2 scores 4.83 on DS1 but rises to 32.97 on DS2, an improvement of 28.14, nearly a six-fold gain in model performance. On the FID index, the model scores 15.23 on DS1 versus 12.31 on DS2, a decrease of 2.92 and a 19.17% performance improvement. Since DS2 is the self-made dataset built from copy generated by our Cross-GRU model, it can be inferred that our SF-GAN-V2 model generates higher-quality images from copy produced by the Cross-GRU model, thereby realizing the co-optimization of cross-border e-commerce copy and artwork images.
By constructing the copy generation and artwork image generation models, we overcome the drawbacks of traditional cross-border e-commerce copywriting and artwork production and realize effective synergistic optimization of the two. The designed optimization experiments show that the proposed models perform well. The copy generated by the Cross-GRU copywriting model improves on the best baseline model by 15.88, 39.09, 59.25, and 22.88 under the automatic evaluation indexes, indicating strong copy generation capability; the accuracy and topic-similarity comparisons in Figs. 7 and 8 further confirm this advantage.
