
Application of Artificial Intelligence-based Content Generation Technology in News Publishing

  

Introduction

In recent years, with the rapid development of artificial intelligence (AI) technology, its application in various fields has become increasingly widespread. The news and media industry, which plays an important role in information transmission and the guidance of public opinion, has naturally not been exempt from the integration of AI technology [1-3]. In the traditional news reporting process, journalists produce news articles through investigation, interviewing, and writing. With the development of AI technology, however, some media outlets have begun to use content generation technology to produce news in order to improve efficiency and reduce labor costs [4-6].

The application of AI technology in news content generation is of great significance. First, AI technology can automatically generate news summaries by learning from and imitating a large number of news texts. Compared with traditional summarization, AI-generated news summaries are more accurate and time-saving: they can extract the key information from a huge volume of material and provide readers with a concise overview, saving readers time and energy [7-10]. Second, AI technology can be used for automated writing, that is, using machine learning and natural language processing to enable machines to generate news reports. This approach can significantly reduce the interference of human factors and improve the objectivity and accuracy of reporting. At the same time, it can greatly increase the speed and volume of news reporting and make news content richer and more diverse [11-14]. Third, AI technology can analyze massive data and generate useful data reports. With the support of AI technology, in-depth data mining and analysis of readers' reading preferences, hot topics, and other information makes it possible to deliver content that meets readers' needs more accurately and to improve the user experience [15-18].

Literature [19] discusses the biases present in AI tools such as GPT-3.5. Comparing the performance of these tools with that of news professionals, the results highlight both the strengths of AI tools and their inherent biases, emphasizing the wide range of applications of generative AI across domains and pointing out the importance of relying on AI content generation with care. Literature [20] reveals that news business models are shifting from advertising to paid models such as subscriptions, that the identity of the customer will shift from anonymous to known so that a more loyal relationship can be built between the media company and the customer, and that the age of AI will bring new opportunities for creativity along with more misinformation and manipulation. Literature [21] created a system using natural language processing techniques such as Bi-LSTM that is capable of processing complex text data and generating high-quality news content. The system was evaluated using metrics such as BLEU, and the results verified its excellent performance. It is emphasized that an automatic news content generation system not only ensures the relevance and accuracy of the content but also improves the efficiency of news work, and thus has a wide range of applications. Literature [22] examines the changes in the news industry in the digital age, especially the paradigm shift in news production, pointing to the use of AI in news as an important aspect of this change. The impact of automated news generation and the prospects and shifting roles of news editors are systematically analyzed through case studies, and a comparative analysis corroborates the utility of AI and its integrated performance in news production. Literature [23] proposes an automatic information generation platform using AI in order to determine the impact of AI on online journalism. Based on the analysis of internet trends and other relevant data, data and information are categorized so as to produce output similar to that written by human journalists, a process made possible with the help of machine learning. Literature [24] designed NewsRobot, a prototype capable of generating news about big events. Through surveys and interviews with users who had used NewsRobot, it was found that users preferred personalized news and richer presentation elements but expressed concerns about credibility and quality; NewsRobot was found to be truthful and accurate but lacking in depth. Literature [25] suggests that AI can transform linear television by improving personalization, content production, and the viewing experience, based on a background survey of Doordarshan to understand the application of AI in the management systems of the TV industry. Literature [26] explores the importance and potential of AI in high-performance data analytics and content generation, showing that standardization and the mandatory release of data for AI products are the basis for improving transparency and building trust; otherwise, AI that has the potential to improve quality of life risks becoming a harmful tool.

This paper first introduces AIGC technology and its basic principles. To address the problem that most encoders produce a single fixed representation that cannot capture word polysemy, a Fastformer-based encoder is proposed that can capture more information and output feature vectors containing contextual information. In the decoding process, the decoding results of the network model are generated through personalized pointers embedded with user features. Finally, a reinforcement learning algorithm is used to optimize the news headline generation model to improve its degree of personalization and headline authenticity. Experiments demonstrate the excellent performance of the proposed personalized news headline generation model incorporating user features on the task of personalized news headline generation.

AIGC technology and its application in the field of public information
AIGC technology and its basic principles

AIGC technology, i.e., technology for generating content with artificial intelligence, is a new type of technical means that combines artificial intelligence and creative content. Through advanced algorithms such as deep learning and neural networks, this technology simulates human thinking patterns and creativity to autonomously generate content with practical value. AIGC is not limited to the generation of multimedia content such as text, images, audio, and video; it also includes applications in a variety of fields, such as game design and the construction of virtual reality scenarios. Its core lies in the fact that, through training on large amounts of data, the machine is able to learn the laws and techniques of content creation and then independently complete high-quality content creation without direct human intervention. The emergence of AIGC technology greatly improves the efficiency and diversity of content production and provides creators with a new source of inspiration and new means of creation. At the same time, it also provides a more personalized and intelligent content consumption experience for ordinary users. It can be said that AIGC technology is a major innovation in the field of content creation in the digital era, and it is gradually changing the way people access and enjoy cultural products.

The basic principle of AIGC technology relies mainly on deep learning and big-data processing. First, AIGC technology builds a huge database by collecting massive amounts of creative material and data; this database not only contains rich and diverse creative elements but also records the patterns and preferences of human creation. Second, AIGC technology utilizes deep learning algorithms to train on and learn from the data, so that the machine can understand and simulate the human creative process. In operation, AIGC technology extracts key features and rules by analyzing the creative elements and patterns in the database. Then, based on these features and rules, combined with specific creation needs, AIGC technology can autonomously generate new content. In this process, it not only simulates human creative skills but also incorporates its own "creativity" to generate works that are both aesthetically pleasing and novel. Additionally, AIGC technology has the ability to self-optimize and learn: by constantly analyzing and learning from new data, it can gradually improve the level and diversity of its creations, so that the content it generates comes closer to human creations and even surpasses them in some respects. This self-evolving characteristic makes AIGC technology a continuously progressing and innovating system, bringing broad possibilities to the future of content creation.

Application of AIGC in the field of journalism
News content generation

AIGC technology can assist journalists in quickly generating news reports and improve their timeliness. For example, in the case of breaking news, AIGC can quickly generate brief reports based on real-time data and the background of the event, providing timely information to the public. CCTV.com's "Artificial Intelligence Editorial Department" applies AI to news practice on a large scale and has built a "five-wisdom" full-chain communication process integrating intelligent planning, intelligent collection, intelligent production, intelligent operation, and intelligent auditing, realizing whole-process integrated intelligent production of news products. A news clip about migratory birds in CCTV's Evening News showed traces of AI: an "AI creation" watermark was displayed in the upper right corner that day, and AI-generated graphics continued to be used to explain "strong convective weather".

Speech recognition and image recognition

The application of AIGC technology in speech recognition and image recognition can help news producers quickly and accurately extract valuable information from audio and image material and present it. For example, Facebook's AI lab uses deep learning technology to monitor images on social media in real time, automatically recognizing and filtering out valuable news material.

News recommendation system

AIGC technology has the ability to suggest relevant news stories to users based on their interests and preferences. This not only improves the user’s reading experience, but also helps news media organizations better understand user needs and optimize content production.

Personalized news headline generation model incorporating user characteristics

Traditional sequence-to-sequence models generally use a recurrent neural network as the sequence encoder, which yields a single fixed representation that cannot capture word polysemy. In this paper, the Fastformer model is used to solve this problem: Fastformer can combine contextual content to obtain more information, which greatly improves the language model's ability to extract text vector features. At the same time, the user information extracted by the personalized recommendation model is injected into the decoding process, so that the generated headline content matches the reading preferences of different users. The structure of the personalized news headline generation model is shown in Figure 1.

Figure 1. Structure of the personalized news headline generation model

Fastformer-based encoder

In order to fully extract the feature information in long text content, this paper adopts the Fastformer model to analyze the input word vectors and output feature vectors containing contextual information. Fastformer is a variant of the Transformer that uses additive attention [27] to achieve effective contextual modeling with linear complexity. Compared with the Transformer model, it is much more efficient in computation, which saves computational resources, and its modeling performance on long text is also better than that of the Transformer. Since most news texts are long, in order to save computational resources and improve performance, this paper constructs a sequence encoder based on the Fastformer model combined with the Bi-LSTM model [28].

During the computation of the Fastformer model, the query sequence is first transformed into a global query vector using the additive attention mechanism. The global query vector is then combined with each key vector through an element-wise product operation, and the combined sequence is summarized into a global key vector through the additive attention mechanism. Finally, the global key vector is multiplied with the value sequence through an element-wise product operation, an attention sequence containing global information is obtained through a linear transformation, and this attention sequence is combined with the query sequence to form the final output.

First, the Fastformer model transforms the input matrix of length N into query, key, and value attention matrices, denoted Q = [q1, q2, ⋯, qN], K = [k1, k2, ⋯, kN], and V = [v1, v2, ⋯, vN], respectively, by means of separate linear transformation layers.

Second, the Transformer model uses dot-product operations when modeling the context of input sequences through the interaction between Q, K, and V. The dot product not only brings large computational complexity, but also degrades the model's performance when modeling the context of long sequences. Fastformer instead uses the additive attention mechanism when modeling the interaction between Q, K, and V: it compresses the global context information in the Q matrix into a global query vector. The attention weight α_i of the i-th query vector is computed as shown in Eq. (1), where w_q^T is a learnable parameter vector, and the global query vector q is computed as shown in Eq. (2).

$$\alpha_i = \frac{\exp\left(w_q^T q_i / \sqrt{d}\right)}{\sum_{j=1}^{N} \exp\left(w_q^T q_j / \sqrt{d}\right)} \quad (1)$$

$$q = \sum_{i=1}^{N} \alpha_i q_i \quad (2)$$

Then, the global query vector interacts with each key vector in the K matrix through the element-wise product operation, an efficient way of modeling the interaction between two vectors. The Fastformer model combines the global query vector with each key vector through this element-wise product in order to learn a global context-aware key matrix, whose i-th vector is denoted p_i; this matrix is then summarized into a global key vector containing global context information through the additive attention mechanism. The attention weight β_i of the i-th key vector is computed as shown in Eq. (3), where w_k^T is a learnable parameter vector, and the global key vector k is computed as shown in Eq. (4).

$$\beta_i = \frac{\exp\left(w_k^T p_i / \sqrt{d}\right)}{\sum_{j=1}^{N} \exp\left(w_k^T p_j / \sqrt{d}\right)} \quad (3)$$

$$k = \sum_{i=1}^{N} \beta_i p_i \quad (4)$$

Finally, the global key vector interacts with each value vector in the V matrix using the same element-wise product operation, yielding a global context-aware value matrix; the final hidden representation between keys and values is then learned through a linear transformation layer. The resulting global attention matrix is denoted R = [r1, r2, ⋯, rN]. The matrix R is summed with the matrix Q to give the final output of the Fastformer model.

After obtaining the final output of the Fastformer model, the input vectors and the final output vectors are residually connected and normalized as the input of the Bi-LSTM model; the residual connection effectively reduces overfitting. Finally, the hidden state sequence h is encoded and output by the Bi-LSTM model.
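To make the encoder computation concrete, the following is a minimal NumPy sketch of the Fastformer additive-attention step in Eqs. (1)-(4). The function names, shapes, and the final linear layer W_r are illustrative assumptions rather than the authors' released code, and the subsequent residual connection and Bi-LSTM stage are omitted.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fastformer_attention(Q, K, V, w_q, w_k, W_r):
    """Q, K, V: (N, d) query/key/value matrices; w_q, w_k: (d,) learnable
    attention vectors; W_r: (d, d) final linear transformation."""
    N, d = Q.shape
    # Eqs. (1)-(2): compress Q into a single global query vector q.
    alpha = softmax(Q @ w_q / np.sqrt(d))   # (N,)
    q = alpha @ Q                           # (d,)
    # Element-wise product models the interaction of q with every key vector.
    P = K * q                               # p_i = q * k_i, shape (N, d)
    # Eqs. (3)-(4): compress the context-aware keys into a global key vector k.
    beta = softmax(P @ w_k / np.sqrt(d))    # (N,)
    k = beta @ P                            # (d,)
    # Interact k with every value vector, then a linear layer gives R.
    R = (V * k) @ W_r                       # global attention matrix
    # The output matrix R is summed with the query matrix Q.
    return R + Q

rng = np.random.default_rng(0)
N, d = 6, 8
out = fastformer_attention(*(rng.normal(size=s) for s in
                             [(N, d), (N, d), (N, d), (d,), (d,), (d, d)]))
print(out.shape)  # (6, 8)
```

Note how both attention summarizations are over length-N sequences of dot products with a single learnable vector, which is what keeps the cost linear in N.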

Pointer Generation Network-based Decoder

The pointer generation network combines a pointer network with an attention-based sequence-to-sequence model; it not only improves the accuracy of copied information by copying words from the source text via pointers, but also retains the ability to generate new words from a vocabulary.

After the input passes through the sequence encoder to produce the hidden state sequence h, the bidirectional long short-term memory decoder receives the word vector generated at the previous step and produces the decoder state s_t. From the encoder hidden state sequence h and the decoder state s_t, the attention distribution a^t can be derived; a^t determines which parts of the input sequence should be attended to when generating the output at step t. It is computed as shown in Eqs. (5) and (6), where v, W_h, W_s, and b_att are parameters obtained by training.

$$\Gamma_\theta = \mathrm{softmax}\left(v^T \tanh\left(W_h h + W_s s_t + b_{att}\right)\right) \quad (5)$$

$$a^t = \Gamma_\theta\left(h, s_t\right) \quad (6)$$

The computed attention distribution a^t is then used to generate a context vector c_t as a weighted average of the encoder hidden states h_i. The context vector c_t is concatenated with the decoder state s_t, and two linear mappings are applied to produce the vocabulary distribution P_vocab, computed as shown in Eq. (7), where V, V′, b, and b′ are parameters learned through training.

$$P_{vocab} = \mathrm{softmax}\left(V'\left(V\left[s_t, c_t\right] + b\right) + b'\right) \quad (7)$$

After calculating the vocabulary distribution P_vocab, the model determines whether to copy words from the original text or to generate new words from the vocabulary by means of the generation probability p_gen. The generation probability p_gen at step t is calculated as shown in Eqs. (8) and (9), where w_h, w_s, w_x, and b_ptr are parameters obtained during training, σ is the sigmoid activation function, and x_t is the decoder input.

$$T_\theta = \sigma\left(w_h^T c_t + w_s^T s_t + w_x^T x_t + b_{ptr}\right) \quad (8)$$

$$p_{gen} = T_\theta\left(c_t, s_t, x_t\right) \quad (9)$$

To address the out-of-vocabulary (OOV) word problem, the pointer generation network uses p_gen as a switch during decoding to select whether the current word is copied from the original text or generated from the vocabulary; the probability calculation is shown in Eq. (10).

$$P(w) = p_{gen} P_{vocab}(w) + \left(1 - p_{gen}\right) \sum_{i: w_i = w} a_i^t \quad (10)$$
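As a concrete illustration of Eq. (10), the toy sketch below mixes the vocabulary distribution with the copy distribution over an extended vocabulary. The variable names and the extended-id convention for OOV source words are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def final_distribution(p_gen, p_vocab, attention, source_ids, num_oov):
    """p_vocab: (V,) softmax over the fixed vocabulary; attention: (L,)
    attention over source positions; source_ids: vocabulary id of each
    source token, with OOV words mapped to extended ids V, ..., V+num_oov-1."""
    V = p_vocab.shape[0]
    extended = np.zeros(V + num_oov)
    extended[:V] = p_gen * p_vocab              # generate from the vocabulary
    # Copy mass: scatter-add (1 - p_gen) * a_i onto each source token's id,
    # which is how OOV source words receive non-zero probability.
    np.add.at(extended, source_ids, (1.0 - p_gen) * attention)
    return extended                             # sums to 1 if the inputs do

p = final_distribution(0.7, np.array([0.5, 0.3, 0.2]),
                       np.array([0.6, 0.4]), np.array([1, 3]), num_oov=1)
print(p, p.sum())  # the OOV source token at id 3 gets 0.3 * 0.4 = 0.12
```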
User Feature Embedding and Headline Generation

The headline generator uses the pointer generation network, which differs from traditional text summarization models in that it can not only generate words for the summary but also point to words in the original text through the "pointer" mechanism. By introducing the pointer mechanism, the pointer generation network can effectively handle uncommon words and entities in the original text.

User interest injection is the key to personalizing the headline. The modeled user interest is added to the computation of the pointer generation network, which in turn affects whether words are generated from the vocabulary or copied from the original text during the decoding stage.

Specifically, in the encoding phase, a Transformer encoder is utilized. The sequence of word vectors v = [wv1, …, wvn] of the candidate news text is input to the two-layer positional encoder, which adds positional encoding to the embedding vectors:

$$PE_{(pos,2i)} = \sin\left(pos / 10000^{2i/d}\right)$$

$$PE_{(pos,2i+1)} = \cos\left(pos / 10000^{2i/d}\right)$$

where pos is the word position and i is the dimension index. Each word embedding with position information can then be represented as

$$e'_{pos} = \left(e_{pos} \oplus PE_{pos}\right) W_{pos}$$

where ⊕ denotes splicing (concatenation), and multi-head self-attention is used to capture interactions between words and sentences. The encoder hidden states are therefore h = [h1, h2, …, hk], with each attention head computed as

$$h_i = \mathrm{softmax}\left(\frac{E W_i^Q \left(E W_i^K\right)^T}{\sqrt{d_k}}\right) E W_i^V$$

where W_i^Q, W_i^K, and W_i^V are learnable parameter matrices and E denotes the embedded word sequence of the candidate news.
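The following is a compact sketch of the sinusoidal positional encoding and one self-attention head described above; the even embedding dimension and the matrix shapes are assumptions made for illustration.

```python
import numpy as np

def positional_encoding(seq_len, d):
    """Sinusoidal PE: sin on even dimensions, cos on odd ones (d even)."""
    pos = np.arange(seq_len)[:, None]               # (seq_len, 1)
    i = np.arange(0, d, 2)[None, :]                 # even dimension indices
    angles = pos / np.power(10000.0, i / d)
    pe = np.zeros((seq_len, d))
    pe[:, 0::2] = np.sin(angles)                    # PE(pos, 2i)
    pe[:, 1::2] = np.cos(angles)                    # PE(pos, 2i+1)
    return pe

def attention_head(E, Wq, Wk, Wv):
    """One self-attention head over E: (seq_len, d) embeddings plus PE."""
    d_k = Wk.shape[1]
    scores = (E @ Wq) @ (E @ Wk).T / np.sqrt(d_k)   # (seq_len, seq_len)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)              # row-wise softmax
    return w @ (E @ Wv)                             # one head's h_i

rng = np.random.default_rng(1)
E = rng.normal(size=(5, 8)) + positional_encoding(5, 8)
h = attention_head(E, *(rng.normal(size=(8, 4)) for _ in range(3)))
print(h.shape)  # (5, 4)
```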

During decoding, given the input x_t, the decoder hidden state s_t at step t can be derived, and the attention distribution over the encoder hidden states h is computed as

$$a^t = \mathrm{softmax}\left(v_a^T \tanh\left(W_h h + W_s s_t + b_{att}\right)\right)$$

where v_a, W_h, W_s, and b_att are trainable parameters. The decoder generates the next word based on this attention distribution.

The attention distribution is used to form a weighted sum of the encoder hidden states and to compute the distribution over the generation vocabulary:

$$P_{vocab}(w_t) = \mathrm{softmax}\left(V_p\left[s_t; c_t\right] + b_v\right)$$

$$P(w_t) = p_{gen}^t P_{vocab}(w_t) + \left(1 - p_{gen}^t\right) \sum_{j: w_j = w_t} a_{t,j}$$

where V_p and b_v are learnable parameters. The context vector c_t is a fixed-size representation read from the news body at time step t, and P_vocab(w_t) is the predicted probability distribution over all vocabulary words at time step t. The embedding of user interests enters the computation of p_gen, which is computed as

$$p_{gen} = \sigma\left(v^T \tanh\left(W_h h + W_s s_t + W_u u_t + b_{att}\right)\right)$$

where v, W_h, W_s, W_u, and b_att are trainable parameters and u_t is the filtered user interest vector. At decoding step t, the pointer p_gen selects whether to generate a word from the vocabulary with probability P_vocab(w_t) or to copy a word from the news body according to the attention distribution.
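The user-conditioned gate above can be sketched in a few lines. This is a minimal illustration assuming all vectors share one dimension d, with a sigmoid output so that p_gen is a single probability, consistent with Eq. (8); the names (h_bar for the encoder summary, u_t for the user interest vector) are assumptions, not the authors' code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def personalized_p_gen(h_bar, s_t, u_t, v, W_h, W_s, W_u, b):
    """h_bar: summary of encoder states; s_t: decoder state at step t;
    u_t: filtered user interest vector injected into the gate."""
    # p_gen near 1 -> generate from the vocabulary distribution P_vocab;
    # p_gen near 0 -> copy from the news body via the attention distribution.
    z = W_h @ h_bar + W_s @ s_t + W_u @ u_t + b
    return sigmoid(v @ np.tanh(z))

rng = np.random.default_rng(2)
d = 8
p = personalized_p_gen(*(rng.normal(size=s) for s in
                         [(d,), (d,), (d,), (d,),
                          (d, d), (d, d), (d, d), (d,)]))
print(round(float(p), 3))  # a scalar gate in (0, 1)
```

The design choice here is that personalization does not change the copy or generation mechanisms themselves; it only shifts the balance between them per user and per step.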

Reinforcement learning

In order to further improve the degree of personalization of the generated headlines as well as the authenticity of the headline content, this paper uses a reinforcement learning algorithm to optimize the model during training. Reinforcement learning algorithms learn a policy through the interaction between the agent and the environment so as to maximize the return.

Reinforcement learning differs from supervised learning in that it has no labeled data and requires no prior knowledge or experience; instead, it learns through interaction with the environment. Agents in reinforcement learning collect reward signals through interaction with the environment, and these signals indicate whether the action taken by the agent is correct or conducive to reaching a goal. By trying different actions and collecting reward signals, the agent gradually learns which actions lead to higher rewards, thus improving its decision-making and action strategies.

The basic idea of reinforcement learning is for the agent to perform various actions in its environment in order to maximize its future rewards. This process can be simulated using computer code or realized through real-world physical experiments.

A reinforcement learning setup usually consists of the following three main components:

1) Environment: the external environment in which the agent operates and learns.

2) State: information describing the environment and the current situation of the agent.

3) Reward: a signal indicating the merit of the agent performing a certain action in a given state.

The main goal of reinforcement learning algorithms is to learn the best policy by maximizing the long-term cumulative reward. Reinforcement learning algorithms can be categorized into two main groups: value-function methods and policy-gradient methods.

The value-function approach learns the optimal policy by estimating the value of each state. The value function represents the expected long-term reward for taking an action in a given state and can be used to assess how good or bad it is for an agent to take that action. Common value-function algorithms include Q-Learning and State-Action-Reward-State-Action (SARSA).

Q-Learning is a value-function-based reinforcement learning algorithm [29] for learning an action-value function Q(s, a) that represents the long-term payoff of taking action a in state s. Q-Learning uses the Bellman equation to update the Q-value function and selects the next action by maximizing it. The algorithm is widely used to solve many control problems and robotics tasks.

SARSA is another value-function-based reinforcement learning algorithm. Unlike Q-Learning, SARSA is on-policy: it updates using the value of the action actually taken and selects the next action according to the current policy. This makes it suitable for cases where the current policy must be taken into account, such as multi-agent game problems. A minimal tabular sketch of both update rules is given below.
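The sketch contrasts the Q-Learning and SARSA updates on a toy chain environment; the environment, step sizes, and ε-greedy policy are assumptions for illustration only and are not part of the headline generation model.

```python
import numpy as np

rng = np.random.default_rng(3)
n_states, n_actions = 5, 2      # toy chain: action 1 moves right, 0 moves left
alpha, gamma, eps = 0.1, 0.9, 0.2

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, 1.0 if s2 == n_states - 1 else 0.0   # reward at the right end

def eps_greedy(Q, s):
    return int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())

Q_ql = np.zeros((n_states, n_actions))   # Q-Learning table
Q_sa = np.zeros((n_states, n_actions))   # SARSA table
for _ in range(2000):
    s = int(rng.integers(n_states))      # random start for coverage
    a = eps_greedy(Q_sa, s)              # behavior action (shared for brevity)
    s2, r = step(s, a)
    # Q-Learning (off-policy): bootstrap with the max over next actions.
    Q_ql[s, a] += alpha * (r + gamma * Q_ql[s2].max() - Q_ql[s, a])
    # SARSA (on-policy): bootstrap with the action the current policy takes.
    a2 = eps_greedy(Q_sa, s2)
    Q_sa[s, a] += alpha * (r + gamma * Q_sa[s2, a2] - Q_sa[s, a])

print(Q_ql.argmax(axis=1), Q_sa.argmax(axis=1))  # both learn to move right
```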

The policy-gradient approach directly optimizes the policy function itself to find the optimal policy. The policy function is a mapping from states to actions. Common policy-gradient algorithms include Actor-Critic and its variants.

A3C is an asynchronous advantage actor-critic algorithm for solving massively parallel reinforcement learning problems. In A3C, multiple agents run concurrently, each with a separate neural network for learning the policy. The algorithm uses an actor-critic structure to learn policies and uses multiple threads for asynchronous updates.

Reinforcement learning has a wide range of applications in artificial intelligence, such as training agents to play games, training robots to perform various control tasks, and training cars to learn to drive autonomously.

Experimental results and analysis
Comparative Experimental Results and Analysis

Rouge is an important set of metrics used to evaluate machine translation and automatic text summarization. In this paper, news headlines generated by the headline generator are compared with the human-written headlines provided in the original data, and headlines are evaluated based on the co-occurrence of N-grams in the news text. Rouge is a recall-oriented N-gram metric that evaluates the quality of news headlines by counting the number of overlapping basic units (n-grams, word sequences, and word pairs) between the two. This section scores the model using the Rouge-N (Rouge-1 and Rouge-2) and Rouge-L metrics and uses these scores for comparison with previously proposed models. A minimal sketch of the Rouge-N computation is given below.
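As a minimal illustration (not the evaluation toolkit actually used), the sketch below computes ROUGE-N recall as described: the fraction of the reference headline's n-grams that also occur in the generated headline.

```python
from collections import Counter

def rouge_n_recall(generated, reference, n=1):
    """ROUGE-N recall: overlapping n-grams / n-grams in the reference."""
    def ngrams(text):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    gen, ref = ngrams(generated), ngrams(reference)
    overlap = sum(min(cnt, gen[g]) for g, cnt in ref.items())
    return overlap / max(sum(ref.values()), 1)

print(rouge_n_recall("ai model generates news headline",
                     "model generates personalized news headline"))  # 0.8
```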

In order to make the experimental results more convincing, the experiments are repeated and the average of 10 runs is taken as the final result. After several comparative experiments, the effectiveness of the proposed news headline generation model has been verified on both the LCSTS and CSTS datasets. Tables 1 and 2 show the comparison of different models on the LCSTS and CSTS datasets, respectively, and demonstrate that the model proposed in this paper performs better than the other six models on both datasets. Analyzing the generated headlines, we find that the LSTM+Point model pays more attention to learning the textual information in the original news body, which it introduces into the generated headlines through pointer insertion, while the model in this paper pays more attention to the readability and authenticity of the headlines and provides a generalized summary of the news document. On the LCSTS dataset, this paper's model improves on the Rouge-1, Rouge-2, and Rouge-L metrics by 16.43%, 6.31%, and 0.75%, respectively, compared with the LSTM+Point model, which shows that the headlines generated by the proposed model are improved in readability and authenticity.

Table 1. Comparison of different models on the LCSTS dataset

Methods Rouge-1 Rouge-2 Rouge-L
RNN 6.1 2.77 5.67
RNN-context 10.8 7.3 10.72
HG-News 22.76 7.72 21.34
ABS 28.15 11.03 25.36
LSTM+Point 29.09 14.74 27.83
NAML+HG 31.18 12.23 27.53
This method 33.87 15.67 28.04

Table 2. Comparison of different models on the CSTS dataset

Methods Rouge-1 Rouge-2 Rouge-L
RNN 19.55 8.7 17.34
RNN-context 30.18 14.06 27.36
HG-News 32.33 15.03 26.22
ABS 34.29 21.34 31.12
LSTM+Point 37.9 24.9 35.13
NAML+HG 36.31 21.85 34.89
This method 39.01 25.41 37.43

This paper also examines the performance of the model under different decoding selection strategies; Figure 2 compares the model's Rouge-1, Rouge-2, and Rouge-L scores on the CSTS dataset. A total of four decoding selection strategies are evaluated on the LCSTS dataset: the greedy algorithm, beam search, Bi-LSTM, and the joint use of Fastformer and Bi-LSTM. The comparison reveals that the Rouge metrics are best when the proposed model jointly uses Fastformer and Bi-LSTM, reaching 31.28, 12.68, and 28.31 on the Rouge-1, Rouge-2, and Rouge-L metrics on LCSTS, respectively. The comparison across these settings shows a significant improvement, which indicates that the decoding strategy combining Fastformer with Bi-LSTM designed in this paper is reasonable and efficient.

Figure 2. Comparison of different decoding strategies on the CSTS dataset

News headline generation results

Figure 3 shows the change in loss values during training when using negative sampling with word-level text processing, negative sampling with Chinese character-level text processing, and Fastformer combined with Bi-LSTM. First, the headline generation model using character-level text processing converges to a lower loss value than the model using word-level text processing, which shows that character-level processing achieves better performance. In addition, the figure shows that, in the character-level model, using Fastformer combined with Bi-LSTM makes the model converge faster: convergence begins at around 2 iterations, and the loss value finally converges to 17.5.

Figure 3. Training loss values

Figure 4 shows the PR curves of the model's headline generation on different news types. To make the results more complete, the F1-score is added to the analysis of the PR curves, and Table 3 shows the best generation results of the model on different news types for the three indicators.

Figure 4. PR curves of generated headlines for different news types

Table 3. Headline generation results for the three indicators

News category Precision Recall F1-score
Current politics 0.9666 0.6776 0.8605
International 0.9671 0.7011 0.8626
Society 0.9682 0.7505 0.8659
Culture 0.9702 0.7611 0.8724
Entertainment 0.9711 0.7745 0.8756
Health 0.9741 0.7856 0.8789

From the PR curves of news headline generation in the figure and the results in the table, it can be seen that the model achieves high headline generation performance, with high precision values and relatively low recall. This is because, in practice, a piece of news can often belong to more than one category, i.e., there is cross-category labeling. The model generates news headlines in the health category with the highest precision, recall, and F1-score, with a precision of 0.9741 and an F1-score of 0.8789.

Conclusion

This paper focuses on the research and scheme design of applying intelligent content generation technology in news publishing, constructs a personalized news headline generation model that integrates user characteristics, and evaluates the quality of the generated headlines through quantitative indicators.

On both the LCSTS and CSTS datasets, the performance of the proposed news headline generation model is improved compared to several other models. On the LCSTS dataset, the model improves by 16.43%, 6.31%, and 0.75% on the Rouge-1, Rouge-2, and Rouge-L metrics, respectively, compared with the LSTM+Point model, which indicates that the generated headlines are comparable to manually edited ones.

The decoding strategy of the model is compared with three other decoding strategies: the greedy algorithm, beam search, and Bi-LSTM alone. The results show that the model's Rouge metrics perform best under the joint Fastformer and Bi-LSTM decoding strategy, with Rouge-1, Rouge-2, and Rouge-L scores on the LCSTS dataset of 31.28, 12.68, and 28.31, respectively. This illustrates the effectiveness of the model structure designed in this paper, which provides richer contextual text features for news headline generation.

The model achieves high headline generation performance with high precision; its precision, recall, and F1 scores are highest when generating headlines for health-related news, at 0.9741, 0.7856, and 0.8789, respectively.
