Enhancing Research Support for Humanities PhD Teachers: A Novel Model Combining BERT and Reinforcement Learning
Published online: 27 Feb 2025
Received: 08 Oct 2024
Accepted: 12 Jan 2025
DOI: https://doi.org/10.2478/amns-2025-0125
© 2025 Peng Wang, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In the higher education system, the construction and development of newly established undergraduate colleges carry great strategic significance. The core mission of such colleges is to cultivate applied, innovative talent and to serve regional economic and social development[4]. However, the research difficulties faced by humanities PhD teachers at these institutions have gradually become a focus of attention for both the academic community and education administrators. Compared with their counterparts at traditional research universities, humanities PhD teachers at newly established undergraduate colleges often face challenges such as insufficient research resources, a weak research environment, underdeveloped research teams, and a lagging research support system[6][26]. These factors not only dampen teachers' enthusiasm for research but also limit the institution's overall research output and development[11]. How to help these teachers overcome such difficulties and raise their research level has therefore become a pressing issue. Past studies aimed at improving the research capabilities of humanities PhD teachers have mostly focused on optimizing the education management system and strengthening policy support[3], for example by providing more research funding, encouraging interdisciplinary cooperation, and strengthening academic team building[12][22]. While these measures can alleviate teachers' research difficulties to a certain extent, their limitations are becoming apparent as new technologies and methods continue to emerge.
In particular, in the face of an increasingly complex research environment and fierce academic competition, relying solely on policy support and supplementary resources can no longer fundamentally resolve the bottleneck that constrains research performance.
In recent years, with the rapid development of deep learning technology, data-driven problem-solving approaches in scientific research have gradually become a popular direction in the academic field. As an important branch of artificial intelligence, deep learning, with its hierarchical structure and powerful data processing capabilities, can automatically learn features from vast amounts of data and effectively analyze them[15][29]. This technology provides new tools and methods for scientific research, particularly in the era of big data, where traditional research methods struggle with complex problems. Deep learning, however, demonstrates significant potential in addressing these challenges. The advantage of deep learning lies in its ability to handle large-scale, multidimensional, and complex datasets, uncovering valuable patterns and insights[30]. For universities and research institutions, the application of deep learning has expanded beyond basic fields like image recognition and natural language processing[13]. It now plays a crucial role in research management, research output evaluation, and optimizing the allocation of academic resources. For example, in research management, deep learning can help identify bottlenecks in the research process, predict future directions and output for research teams, and optimize resource allocation to improve overall research productivity[23]. Furthermore, deep learning provides more precise tools for evaluating research output. Traditional evaluation systems often rely on simple quantitative metrics, while deep learning can automatically identify key factors influencing research outcomes, offering more accurate analysis and assessment[18]. This data-driven evaluation reduces human bias. In the context of humanities PhD faculty facing research challenges, deep learning can assist in better planning research paths, improving research efficiency and output[24]. 
By leveraging deep learning algorithms, universities can provide customized support for faculty and researchers, alleviating difficulties caused by limited resources and high research pressure, thereby promoting the healthy development of the academic ecosystem[14]. Thus, deep learning has become a powerful tool for addressing the research challenges faced by humanities PhD faculty and is poised to bring significant innovation and transformation to research management.
Based on this, we developed a research framework to explore how deep learning technology can assist humanities PhD teachers in enhancing their research capabilities and addressing existing bottlenecks. By analyzing the current challenges these teachers face and leveraging deep learning’s powerful capabilities, we devised an innovative solution. Specifically, our research focuses on several key aspects: First, we developed a research dilemma analysis system using deep learning to address difficulties in acquiring research resources and evaluating research capabilities. This system employs big data mining to identify bottlenecks and helps university management optimize resource allocation. Second, we developed a research output prediction model using deep learning algorithms to quantitatively forecast research outcomes, aiding university management in planning effective research development strategies. Finally, we validated the model’s feasibility and effectiveness through experiments, showing that it significantly improves the research capabilities of humanities PhD teachers and provides robust decision-making support for newly established universities.
Our contributions in this paper are as follows:
1) We propose a deep learning-based research dilemma analysis framework capable of analyzing the bottlenecks in teachers' research from multiple dimensions. This framework combines the powerful capabilities of data mining and deep learning, providing intelligent support for research management.
2) We developed a research output prediction model that uses deep learning technology to accurately predict the research output of humanities PhD teachers, thereby optimizing the school's research resource allocation and decision-making processes.
3) We validated the effectiveness of the model through experiments and practical applications. The experimental results show that the proposed method can significantly enhance the research capabilities of humanities PhD teachers and provide data-driven intelligent decision support for university research management.
In recent years, the application of artificial intelligence (AI) in research management has attracted increasing attention, as institutions look for more efficient ways to optimize resources and improve academic productivity[9]. AI technologies, especially deep learning and machine learning, have proven to be powerful tools in enhancing resource management by identifying key trends and patterns in large datasets that are often difficult to detect using traditional methods[25]; [31]. AI-driven approaches allow universities to make more informed decisions about resource allocation, directing funds and support to areas with the most potential for impactful research.
Additionally, AI can automate various time-consuming administrative processes, including grant application management, faculty performance evaluation, and project review, which traditionally require significant manual effort[21]. Automating these tasks frees up valuable resources and allows institutions to focus on strategic decision-making. Furthermore, AI technologies provide institutions with real-time insights into research output trends, enabling timely adjustments to resource allocation strategies as new data becomes available[16].
Several recent studies have highlighted the potential for AI to transform how universities approach long-term research planning. For example, AI can analyze research portfolios, funding trends, and collaboration networks to offer institutions predictive insights into areas of emerging scientific interest[17]. This allows for more dynamic and flexible management strategies, ultimately improving research efficiency. Moreover, AI’s ability to adapt to new patterns in the data ensures that universities remain responsive to shifting academic landscapes, thus improving their ability to compete in global academic rankings[7].
By implementing AI-driven systems, universities can not only optimize the allocation of limited resources but also improve collaboration opportunities and foster interdisciplinary projects[20]. AI models enable a better understanding of resource utilization across departments, ensuring that the institution’s strategic objectives are aligned with actual performance. This increasing integration of AI in research management signals a shift toward data-driven decision-making and a more efficient academic ecosystem.
Accurately predicting research output is crucial for effective research management, particularly in higher education where resources are limited and need to be utilized efficiently. AI technologies, such as machine learning, have been increasingly employed to build predictive models capable of forecasting academic productivity based on historical data[1]. These models allow institutions to anticipate future research outcomes, enabling them to allocate resources in ways that maximize impact. Such predictive approaches are critical for strategic planning, particularly in large institutions where resource allocation decisions need to account for varying research priorities and goals[8].
The adoption of predictive models enables universities to move beyond traditional research evaluation metrics such as citation counts, expanding to more complex indicators like collaboration effectiveness, research visibility, and interdisciplinary reach[5]. These more comprehensive assessments allow institutions to align their resource distribution strategies with long-term objectives, ensuring that funding and support are directed to areas with the greatest potential for academic excellence[2]. Additionally, AI-driven predictive models provide early warnings about potential research bottlenecks or inefficiencies, enabling institutions to address challenges before they significantly impact productivity[32].
Recent studies have demonstrated the usefulness of AI in predicting the success of research projects based on variables such as team composition, prior publication records, and access to research facilities[10]. By integrating these predictive capabilities, institutions can reduce the risks associated with funding allocation and support projects that are most likely to generate impactful results. Predictive models also help institutions manage their research portfolios more effectively, allowing them to track ongoing projects and make adjustments as needed to maximize output[28].
Moreover, AI-based predictive models have the potential to improve interdisciplinary research by identifying connections between seemingly unrelated fields. This fosters collaboration and opens new avenues for innovation[19]. As AI technologies continue to advance, predictive models are expected to play an even more central role in helping institutions manage their research activities, improve output quality, and enhance overall academic performance.
To address the challenges faced by humanities PhD teachers at newly established undergraduate institutions during their research processes, we propose a deep learning model framework based on BERT, GNN, and reinforcement learning. This network is composed of three main modules, each playing a crucial role and working together within the system. First, the BERT model is used to process the teachers’ research text data, such as papers and project reports. BERT generates semantic embeddings from the text, transforming high-dimensional textual information into context-aware feature representations, which are then combined with structured data from research activities. Next, the Graph Neural Network (GNN) processes these inputs to capture the complex dependencies between research activities, including collaboration networks and citation relationships. Through multi-layer feature extraction in the GNN, we are able to build a comprehensive research activity network, identifying the bottlenecks and challenges teachers face in their research. Finally, the reinforcement learning module optimizes the allocation of research resources to maximize research output. This module interacts with the research environment, learning the best resource allocation strategies to ensure the effectiveness of resource distribution and the maximization of research output. This overall framework not only effectively analyzes and predicts research output but also dynamically adjusts resource allocation in cases of limited resources, helping teachers overcome research difficulties and improve their research productivity. The overall structure and workflow of our proposed framework is illustrated in Figure 1 below.

Overall Workflow of the Study. This figure illustrates the overall workflow of the study, encompassing data collection, preprocessing (including data cleaning and standardization), and the use of deep learning models (BERT and GNN layers) combined with reinforcement learning for addressing research difficulties.
In our framework, the BERT (Bidirectional Encoder Representations from Transformers) model is used to process the research text data of humanities PhD teachers, such as research papers and project reports. BERT has strong contextual understanding capabilities, enabling the system to extract rich semantic information from unstructured research data. The detailed process of how BERT processes the text and generates context-aware embeddings is illustrated in Figure 2.

BERT Input Representation Structure. This figure shows the BERT model’s input representation, combining token, sentence, and positional embeddings to capture contextual and sequential information.
First, the raw text data $T = \{w_1, w_2, \ldots, w_n\}$ is tokenized into a sequence of words $w_i$, where each word is mapped to an initial embedding vector $v_{w_i}$ in a high-dimensional space using an embedding function $f_{\text{embed}}$:

$$v_{w_i} = f_{\text{embed}}(w_i)$$
Next, each Transformer layer applies self-attention. For each word, query, key, and value vectors are obtained by linear projections, $q_{w_i} = W_Q v_{w_i}$, $k_{w_i} = W_K v_{w_i}$, $u_{w_i} = W_V v_{w_i}$, and the attention weight between words $w_i$ and $w_j$ is

$$\alpha_{ij} = \operatorname{softmax}_j\!\left(\frac{q_{w_i}^{\top} k_{w_j}}{\sqrt{d_k}}\right)$$
The contextual embedding $h_{w_i}$ for word $w_i$ is computed as the attention-weighted sum of the value vectors:

$$h_{w_i} = \sum_{j=1}^{n} \alpha_{ij} u_{w_j}$$
Each Transformer layer further processes these embeddings using a feed-forward network (FFN) with a residual connection and layer normalization. The transformation is given by:

$$h_{w_i}^{(l+1)} = \operatorname{LayerNorm}\!\left(h_{w_i}^{(l)} + \operatorname{FFN}\!\left(h_{w_i}^{(l)}\right)\right)$$
After $L$ layers, the final context-aware embedding for each word $w_i$ is:

$$e_{w_i} = h_{w_i}^{(L)}$$
These embeddings, $E = \{e_{w_1}, e_{w_2}, \ldots, e_{w_n}\}$, represent the semantic relationships between words within the context of the entire text.
These context-aware embeddings are then used as input to the Graph Neural Network (GNN) module to further analyze the dependencies within the research activities. By generating these semantic embeddings, BERT enables the model to understand deeper meanings in the text, helping the system to identify potential bottlenecks in the research process. These embeddings improve the interpretability of the text data and provide a strong foundation for accurate predictions. In summary, BERT is primarily used in this network to transform complex textual data into meaningful vector representations, enabling the combination of text information with structured data (such as research collaboration networks), laying the groundwork for further analysis and prediction.
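A minimal NumPy sketch of the attention step described above, in which each word's contextual embedding is a softmax-weighted sum of value vectors. The random projection matrices and toy dimensions are illustrative stand-ins for single-head attention, not the trained BERT weights:

```python
import numpy as np

def scaled_dot_product_attention(X, W_q, W_k, W_v):
    """Compute context-aware embeddings for a token sequence X (n x d).

    Each output row is a weighted sum of value vectors, with weights
    given by softmax-normalized, scaled query-key similarities.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ V                              # contextual embeddings

rng = np.random.default_rng(0)
n, d = 5, 8                                         # 5 tokens, 8-dim embeddings
X = rng.normal(size=(n, d))                         # toy initial embeddings
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
H = scaled_dot_product_attention(X, W_q, W_k, W_v)
print(H.shape)  # one context-aware vector per token
```

In the full model this step is repeated per layer and head; the sketch shows only why each embedding ends up conditioned on every other token in the sequence.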
In our proposed framework, the Graph Neural Network (GNN) is designed to model the intricate relationships between research activities, such as collaborations, citations, and project interactions. By leveraging both the text embeddings generated by BERT and the structured data from research activities, the GNN allows for a detailed analysis of the complex connections within the academic network.
Humanities PhD teachers' research activities can be represented as a graph structure $G = (V, E)$, where $V$ denotes the set of nodes, each corresponding to a specific research activity (e.g., a paper or project), and $E$ represents the edges, which capture the relationships between these activities (such as collaborations or citations). Each node $v$ has an associated feature vector $x_v$, which includes both the BERT-generated embeddings and structured data such as the number of citations or collaborations.
The GNN updates node features by aggregating information from neighboring nodes. The node update equation is defined as:

$$h_v^{(l+1)} = \sigma\!\left(W^{(l)} \cdot \operatorname{AGG}\!\left(\left\{h_u^{(l)} : u \in \mathcal{N}(v)\right\} \cup \left\{h_v^{(l)}\right\}\right)\right)$$

where $\mathcal{N}(v)$ is the set of neighbors of node $v$, $W^{(l)}$ is a learnable weight matrix, and $\sigma$ is a nonlinear activation function.
To ensure stability and proper scaling during updates, the node features are normalized after each layer:

$$h_v^{(l+1)} \leftarrow \frac{h_v^{(l+1)}}{\left\lVert h_v^{(l+1)} \right\rVert_2}$$
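The neighbor-aggregation update and per-layer normalization described above can be sketched as follows. The mean aggregator, ReLU activation, and toy adjacency matrix are illustrative assumptions, not the exact operators used in the trained model:

```python
import numpy as np

def gnn_layer(H, adj, W):
    """One message-passing layer: mean-aggregate each node with its
    neighbors, apply a linear transform and ReLU, then L2-normalize."""
    n = H.shape[0]
    H_new = np.zeros_like(H @ W)
    for v in range(n):
        neighbors = np.flatnonzero(adj[v])
        # aggregate the node's own features together with its neighbors'
        agg = H[np.append(neighbors, v)].mean(axis=0)
        H_new[v] = np.maximum(agg @ W, 0.0)         # ReLU activation
    norms = np.linalg.norm(H_new, axis=1, keepdims=True)
    return H_new / np.maximum(norms, 1e-12)         # stabilizing normalization

# Toy graph: 4 research activities; edges stand for collaborations/citations
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]])
rng = np.random.default_rng(1)
H = rng.normal(size=(4, 6))                         # node features (e.g. BERT embeddings)
W = rng.normal(size=(6, 6))
H1 = gnn_layer(H, adj, W)
print(H1.shape)
```

Stacking several such layers lets information propagate across multi-hop collaboration and citation paths, which is what allows the model to surface structural bottlenecks.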
The architecture of the GNN within our framework is illustrated in Figure 3. This architecture allows the model to gain a comprehensive understanding of how collaborations, citations, and other relationships influence the overall research network and helps in identifying key bottlenecks in the research process.

Structure of the Graph Neural Network
To ensure optimal distribution of research resources, we integrate a Reinforcement Learning (RL) module into our framework. This module dynamically learns the best strategies for allocating limited resources, such as funding and support, to maximize research output. The RL component continuously adapts to changes in research needs and constraints, making it a crucial part of the model’s decision-making process.
The resource allocation problem is modeled as a Markov Decision Process (MDP) defined by a tuple $(S, A, P, R)$, where $S$ represents the state space, describing the current allocation of resources to various research activities; $A$ is the action space, which contains all possible resource allocation strategies (e.g., assigning different levels of funding or support to various researchers); $P(s'|s, a)$ is the state transition probability, defining the likelihood of moving from state $s$ to state $s'$ after taking action $a$; and $R(s, a)$ is the reward function, which assigns a numerical value based on how effective the action $a$ is in improving research output. At each decision point, the RL agent observes the current state $s$, selects an action $a$, and receives a reward $R(s, a)$ based on the effectiveness of the resource allocation. The objective of the RL agent is to maximize the cumulative future reward $G_t$, defined as:

$$G_t = \sum_{k=0}^{\infty} \gamma^k R(s_{t+k}, a_{t+k})$$

where $\gamma \in [0, 1)$ is the discount factor.
To achieve this, we employ the Q-learning algorithm to iteratively update the value function $Q(s, a)$, which represents the expected cumulative reward for taking action $a$ in state $s$. The Q-value update rule is given by:

$$Q(s, a) \leftarrow Q(s, a) + \alpha\left[R(s, a) + \gamma \max_{a'} Q(s', a') - Q(s, a)\right]$$

where $\alpha$ is the learning rate.
To balance exploration and exploitation, the agent follows an epsilon-greedy policy.
The action selection rule is:

$$a = \begin{cases} \arg\max_{a'} Q(s, a') & \text{with probability } 1 - \epsilon \\ \text{a random action from } A & \text{with probability } \epsilon \end{cases}$$
The policy $\pi$ aims to maximize the expected return $G(s)$, defined as:

$$G(s) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^t R(s_t, a_t) \,\middle|\, s_0 = s\right]$$
Finally, to stabilize training and ensure convergence, we include a reward normalization step:

$$\tilde{R}(s, a) = \frac{R(s, a) - \mu_R}{\sigma_R}$$

where $\mu_R$ and $\sigma_R$ are the running mean and standard deviation of the observed rewards.
By continuously updating the Q-values and refining its policy, the RL agent learns to allocate resources optimally, maximizing both short-term and long-term research productivity. The system dynamically adjusts resource allocations in response to evolving research needs and outcomes, ensuring that resources are used efficiently and that research output is maximized.
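The tabular Q-learning loop with epsilon-greedy exploration can be sketched on a toy resource-allocation MDP. The states, actions, transition rule, and reward table below are invented purely for illustration; the real environment is the research-resource setting described above:

```python
import random

# Toy MDP: 3 allocation states, 2 actions (e.g. shift funding vs. keep as-is).
# The transition rule and reward table are invented for illustration only.
N_STATES, N_ACTIONS = 3, 2
REWARD = {(s, a): float(s + a) for s in range(N_STATES) for a in range(N_ACTIONS)}

def step(s, a):
    """Deterministic toy transition; a real environment would be stochastic."""
    return (s + a + 1) % N_STATES

def q_learning(steps=500, alpha=0.01, gamma=0.9, eps=0.1, seed=42):
    rng = random.Random(seed)
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    s = 0
    for _ in range(steps):
        # epsilon-greedy action selection
        if rng.random() < eps:
            a = rng.randrange(N_ACTIONS)
        else:
            a = max(range(N_ACTIONS), key=lambda x: Q[s][x])
        r, s_next = REWARD[(s, a)], step(s, a)
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next
    return Q

Q = q_learning()
policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy)  # greedy allocation strategy learned per state
```

The greedy policy read off the learned Q-table corresponds to the resource allocation strategy the agent would deploy after training.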
In this section, we describe the experimental setup, dataset details, evaluation metrics, and results used to validate the effectiveness of our proposed framework. The experiments are designed to demonstrate the impact of the BERT + GNN + Reinforcement Learning model in improving research output prediction and resource allocation for humanities PhD teachers at newly established undergraduate institutions.
We implement our model using Python and deep learning frameworks such as PyTorch and TensorFlow. The system is trained on a server with an NVIDIA GPU, ensuring efficient computation during the model training phase. The training process is divided into two main stages: (1) the initial training of the BERT and GNN modules for research activity modeling, and (2) reinforcement learning for optimizing resource allocation strategies based on the feedback from predicted research outcomes.
The hyperparameters for BERT, the GNN, and reinforcement learning are tuned to ensure optimal performance. Specifically, BERT uses a pre-trained model with a learning rate of 2e-5, while the GNN employs three layers with ReLU activation. The reinforcement learning module uses a Q-learning algorithm with a learning rate $\alpha = 0.01$ and a discount factor $\gamma = 0.9$ to balance short-term and long-term rewards. The epsilon $\epsilon$ for the epsilon-greedy policy is set to decay from 1.0 to 0.1 over the training period to ensure sufficient exploration.
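The epsilon decay from 1.0 to 0.1 mentioned above can be implemented with a simple schedule; a linear decay is assumed here, since the exact shape of the decay is not specified:

```python
def epsilon_schedule(step, total_steps, eps_start=1.0, eps_end=0.1):
    """Linearly decay the exploration rate from eps_start to eps_end."""
    frac = min(step / max(total_steps - 1, 1), 1.0)
    return eps_start + frac * (eps_end - eps_start)

total = 1000
print(epsilon_schedule(0, total))          # 1.0 at the start of training
print(epsilon_schedule(total - 1, total))  # approximately 0.1 at the end
```

At each training step the agent would draw its epsilon-greedy threshold from this schedule, so exploration dominates early and exploitation dominates late.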
To validate the effectiveness of our model, we selected two widely-used and high-quality academic datasets: the Semantic Scholar Open Research Corpus (S2ORC)[20] and the Microsoft Academic Graph (MAG)[27]. These datasets provide a combination of textual and structured data, enabling comprehensive support for the various modules of our model.
S2ORC is a large-scale open-access academic document dataset, spanning multiple academic fields and well-suited for natural language processing and text mining tasks. The dataset contains approximately 81 million academic articles, each accompanied by metadata such as author information, publication year, publication type, citation records, and subject tags. In our research, S2ORC serves as the source of textual data for training the BERT model. By extracting text data from academic papers, abstracts, and research reports, BERT can generate context-aware semantic embeddings, which serve as a strong semantic foundation for the subsequent Graph Neural Network (GNN) module. Additionally, S2ORC’s citation information provides valuable input for modeling relationships between research activities.
MAG is a large-scale academic graph dataset that provides structured data, including information on academic papers, authors, institutions, citation relationships, and collaboration networks. MAG is ideal for analyzing academic collaboration networks and citation patterns. In our experiments, MAG is primarily used to construct research collaboration and citation networks, which are fed into the GNN and reinforcement learning modules for academic relationship modeling and resource allocation optimization. By extracting collaboration data from MAG, we can capture patterns of cooperation among researchers, and this data is further used to optimize decisions on research resource allocation.
By combining the S2ORC and MAG datasets, our framework effectively leverages both textual and structured data to analyze research activities, predict research output, and optimize resource allocation, ensuring that the model has sufficient expressiveness and generalizability across different levels of academic research tasks.
For the S2ORC dataset, we performed comprehensive text cleaning, including stop-word removal, stemming, standardization, spelling correction, and sentence segmentation. These steps remove uninformative words, reduce vocabulary diversity, and ensure a consistent text format, thereby improving the results of subsequent natural language processing and topic analysis. For the MAG dataset, we cleaned and extracted faculty collaboration network data and calculated academic impact indicators such as the H-index and citation count to better reflect faculty academic influence. We also cleaned the collaboration network data to ensure its completeness and accuracy, laying the foundation for subsequent network analysis. Together, these data cleaning steps provide reliable support for the deep learning model training and research-difficulty analysis that follow.
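The cleaning steps above (stop-word removal, stemming, standardization) can be sketched as follows; the stop-word list and the suffix-stripping rule are simplified stand-ins for a full NLP pipeline such as a Porter stemmer:

```python
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for"}

def clean_text(text):
    """Lowercase, strip punctuation, drop stop words, crude suffix stemming."""
    text = text.lower()
    tokens = re.findall(r"[a-z]+", text)            # standardize: letters only
    tokens = [t for t in tokens if t not in STOP_WORDS]
    stemmed = []
    for t in tokens:
        for suffix in ("ing", "ed", "s"):           # toy stemmer, not Porter
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed

print(clean_text("The funding challenges of newly established colleges"))
# -> ['fund', 'challenge', 'newly', 'establish', 'college']
```

A production pipeline would swap in a real stop-word list, a proper stemmer or lemmatizer, and spelling correction, but the shape of the transformation is the same.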
After data cleaning, we standardized multiple datasets to ensure data consistency and integrity. First, to integrate information from the S2ORC dataset and the MAG dataset, we used data fusion technology. We eliminated ambiguity for the same author in different datasets by unifying author identifiers (such as ORCID) to ensure consistency between the two datasets for the same author during the data fusion process. In addition, we standardized citation information to ensure uniformity in citation format and source to avoid data redundancy and conflict. During the data fusion process, we used rule-based matching and machine learning methods to handle record inconsistencies caused by issues such as spelling and format differences. These standardization measures help ensure the integrity and reliability of data sources and provide a solid data foundation for subsequent analysis. Through these standardization steps, we successfully fused data from different sources, laying a solid foundation for the training of deep learning models and in-depth analysis of faculty research output and cooperation patterns.
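The rule-based matching used above to reconcile records that differ in spelling or format can be sketched as normalization plus a fuzzy-similarity threshold; the 0.85 threshold, the normalization rules, and the example names are illustrative assumptions:

```python
import difflib
import re

def normalize(name):
    """Lowercase and strip punctuation so format differences do not block matches."""
    return re.sub(r"[^a-z ]", "", name.lower()).strip()

def same_author(a, b, threshold=0.85):
    """Rule-based match: exact after normalization, else fuzzy similarity."""
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return True
    return difflib.SequenceMatcher(None, na, nb).ratio() >= threshold

print(same_author("Peng Wang", "peng  wang"))   # format difference
print(same_author("Peng Wang", "Peng Wnag"))    # spelling difference
print(same_author("Peng Wang", "Li Hua"))       # distinct authors
```

In the actual fusion step, records that share a unified identifier such as an ORCID would be merged directly, and fuzzy matching like this would handle only the remainder.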
In this study, we employed several key evaluation metrics to assess the capability of our deep learning model in identifying and analyzing the research difficulties faced by PhD teachers in the arts. Initially, topic coherence metrics help us evaluate the quality and relevance of the themes extracted from the text by the model. Precision and recall are used to measure the model's accuracy in correctly identifying teachers facing research difficulties, while the F1 score, as the harmonic mean of precision and recall, provides a comprehensive indicator of the model's overall performance. Additionally, model stability is assessed through repeated tests on different data subsets to ensure consistency in model outputs. The effectiveness of the network structure is verified through the analysis of the teacher collaboration network, ensuring that our approach accurately reflects the patterns of collaboration and research activity within the network. These evaluation metrics collectively reflect the model's capability in academic applications and its effectiveness in practical implementation.
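Precision, recall, and the F1 score used above can be computed as follows for binary labels (here, 1 marks a teacher identified as facing research difficulties; the label lists are toy data):

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision/recall/F1 from parallel 0/1 label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

y_true = [1, 1, 0, 1, 0, 1, 0, 0]   # ground-truth difficulty labels
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]   # model predictions
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.75 0.75 0.75
```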
In Table 1, we present a comparative analysis of the performance of several deep learning models applied to the S2ORC and MAG datasets, assessed using three key metrics: precision, recall, and F1 score.
Performance metrics of different models on S2ORC and MAG datasets
| Model Name | S2ORC Precision | S2ORC Recall | S2ORC F1 | MAG Precision | MAG Recall | MAG F1 |
|---|---|---|---|---|---|---|
| Transformer + DNN | 0.82 | 0.88 | 0.85 | 0.84 | 0.87 | 0.86 |
| BERT + LSTM | 0.80 | 0.87 | 0.83 | 0.82 | 0.86 | 0.84 |
| GPT-3 | 0.83 | 0.85 | 0.84 | 0.85 | 0.88 | 0.86 |
| Ours | 0.85 | 0.90 | 0.87 | 0.87 | 0.89 | 0.88 |
The comparative analysis presented in Table 1 reflects the models' performance on two distinct academic datasets: S2ORC and MAG. The model denoted as "Ours" demonstrates the highest precision, recall, and F1 scores across both datasets, underscoring its efficacy in addressing the nuanced challenge of identifying research difficulties among PhD teachers in arts disciplines. On the S2ORC dataset, "Ours" achieved a precision of 0.85 and an F1 score of 0.87, slightly lower than its performance on the MAG dataset, where it scored 0.87 in precision and 0.88 in F1. This indicates a consistent yet slightly varied ability to generalize across different academic texts, likely due to the diverse nature and scope of articles within each dataset. The MAG dataset, known for its broader interdisciplinary coverage, may pose a more complex challenge in identifying specific research difficulties, which could explain the minor performance variation. Furthermore, the comparison with other advanced models such as Transformer + DNN, BERT + LSTM, and GPT-3 highlights the added value of our approach, especially for granular, context-specific tasks such as evaluating academic difficulties. While models like GPT-3 perform commendably, with precision scores up to 0.85 on MAG, combining BERT's contextual understanding with the GNN's relational modeling and the strategic optimization of reinforcement learning gives "Ours" a more focused analysis tailored to the specific needs of arts PhD research. These results not only validate the robustness of "Ours" in handling varied data characteristics but also illustrate the potential of specialized models in academic analytics, where understanding the context and nuances of academic challenges is crucial.
This analysis suggests that our model could be particularly useful for educational institutions aiming to better support their doctoral candidates by pinpointing and addressing specific research hurdles.
Table 2 highlights the performance of our proposed model in identifying key research challenges from both the S2ORC and MAG datasets. The theme of "Funding challenges" stands out, with high consistency (C_v) scores of 0.82 in S2ORC and 0.85 in MAG, reflecting the model's ability to effectively capture this common issue across different academic contexts. Additionally, the number of related publications, 95 in S2ORC and 120 in MAG, supports the significance of this challenge, showing that it is widely discussed in the literature. Similarly, for the theme "Resource scarcity," the model maintains consistent performance, with scores of 0.79 in S2ORC and 0.80 in MAG. This indicates that the model reliably identifies resource limitations as a key obstacle in academic research. The theme is linked to 80 publications in S2ORC and 110 in MAG, further confirming its relevance in both datasets.
| Theme ID | Extracted Keywords | S2ORC Consistency Score (C_v, C_umass, C_npmi) | S2ORC Related Publications | MAG Consistency Score (C_v, C_umass, C_npmi) | MAG Related Publications |
|---|---|---|---|---|---|
| 1 | Funding challenges | 0.82, -0.12, 0.50 | 95 | 0.85, -0.10, 0.52 | 120 |
| 2 | Resource scarcity | 0.79, -0.15, 0.48 | 80 | 0.80, -0.14, 0.49 | 110 |
| 3 | Publication bias | 0.86, -0.09, 0.53 | 65 | 0.88, -0.08, 0.55 | 90 |
| 4 | Collaboration issues | 0.77, -0.19, 0.45 | 55 | 0.78, -0.18, 0.47 | 70 |
| 5 | Methodological issues | 0.84, -0.11, 0.51 | 100 | 0.86, -0.09, 0.53 | 130 |
The model also performs well in recognizing "Publication bias," with high consistency scores of 0.86 in S2ORC and 0.88 in MAG. This suggests the model's accuracy in detecting issues related to biases in scholarly publishing, which is reflected by 65 related publications in S2ORC and 90 in MAG. Although the number of related documents is slightly lower than for other themes, the high scores show that the model captures the essence of this issue effectively. "Collaboration issues," on the other hand, have slightly lower consistency scores of 0.77 in S2ORC and 0.78 in MAG, suggesting that the model finds this theme more challenging to extract, possibly due to the more nuanced and multi-dimensional nature of collaboration problems in academic settings. The number of related publications, 55 in S2ORC and 70 in MAG, supports this interpretation, indicating that collaboration difficulties are less frequently discussed or more complex to quantify in academic literature.
Overall, our proposed model demonstrates strong consistency across both datasets in identifying key themes, with minor variations in performance that highlight the distinct characteristics of the datasets. The analysis underscores the model's ability to effectively extract and represent the research challenges faced by PhD scholars in arts disciplines.
This study combines BERT, GNN, and reinforcement learning methods to automatically extract and analyze the primary research difficulties faced by PhD teachers in new arts colleges, proposing targeted breakthrough strategies. These techniques effectively identify the correlation between five major research difficulties (Funding Challenges, Resource Scarcity, Publication Bias, Collaboration Issues, Methodological Issues) and their corresponding breakthrough strategies. BERT performs semantic analysis of a large body of relevant documents, extracting key terms related to research difficulties. The GNN models the relationships between difficulties and strategies, while reinforcement learning optimizes the strategy matching results.
In Figure 4, for Funding Challenges, the most effective strategy is Increased Funding Support. This indicates that directly increasing funding has a significant effect in alleviating the funding shortages faced by PhD teachers. Additionally, for Resource Scarcity, the primary breakthrough strategy is Building Research Networks, which effectively addresses the issue of insufficient resources by promoting cooperation and resource sharing. Regarding Publication Bias, the most effective measure is Reducing Workload, which shows that by reducing non-research tasks, teachers can focus more on academic publications, thereby mitigating the effects of publication bias. Meanwhile, Collaboration Issues can be significantly alleviated by Fostering Collaboration. This strategy highlights the importance of establishing collaborative relationships, particularly in new colleges, where interdisciplinary and inter-institutional collaboration can bring more research opportunities. Lastly, for Methodological Issues, the most effective strategy is Enhancing Methodology. Improving research methodology can help strengthen teachers' research capabilities and overcome technical difficulties in research design. These findings are further validated through the combination of BERT, RNN, and reinforcement learning. Reinforcement learning, through iterative strategy optimization, selects the optimal strategy combinations, confirming the matching relationship between various research difficulties and corresponding breakthrough strategies.

Effectiveness of breakthrough strategies in addressing the primary research difficulties faced by PhD teachers in new arts colleges.
Figure 5 presents the convergence speed of five strategies for optimizing research resource allocation: Increased Funding Support, Building Research Networks, Enhancing Methodology, Reducing Workload, and Fostering Collaboration. Each line in the graph represents the resource allocation efficiency over a series of iterations, highlighting how each strategy improves resource distribution. The graph shows that all strategies exhibit initial rapid improvement, with diminishing returns as they approach maximum efficiency. Strategies like Increased Funding Support and Enhancing Methodology converge more quickly, reaching near-optimal efficiency within the first 10 iterations, suggesting their effectiveness for short-term interventions. On the other hand, strategies such as Fostering Collaboration and Reducing Workload, while still effective, demonstrate a slower convergence. These strategies appear to require more time and iterations to reach similar levels of efficiency, indicating that they may be better suited for long-term optimization. Despite their slower convergence, the consistent upward trend suggests that these strategies continue to yield improvements, albeit at a slower pace. Overall, the graph highlights the differences in how quickly each strategy can optimize research resources, allowing for strategic decision-making depending on whether immediate or sustained improvements are desired.
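The qualitative behavior described above can be sketched with a simple diminishing-returns update, in which each strategy's resource-allocation efficiency moves a fixed fraction of the remaining gap toward the optimum at every iteration. The per-strategy rates below are hypothetical values chosen only to reproduce the ordering seen in Figure 5, not parameters estimated by the model.

```python
# Illustrative convergence curves: efficiency follows the diminishing-returns
# update eff += rate * (1 - eff). Rates are hypothetical, chosen to mirror
# the qualitative ordering in Figure 5 (fast vs. slow converging strategies).
RATES = {
    "Increased Funding Support":  0.35,
    "Enhancing Methodology":      0.30,
    "Building Research Networks": 0.20,
    "Reducing Workload":          0.12,
    "Fostering Collaboration":    0.10,
}

def efficiency_curve(rate, iterations=30, start=0.1):
    """Efficiency values after each iteration of the update rule."""
    eff, curve = start, []
    for _ in range(iterations):
        eff += rate * (1.0 - eff)   # returns diminish near the optimum
        curve.append(eff)
    return curve

def iterations_to(threshold, rate, iterations=30):
    """First iteration (1-based) at which efficiency reaches the threshold."""
    for i, e in enumerate(efficiency_curve(rate, iterations), start=1):
        if e >= threshold:
            return i
    return None

for name, rate in RATES.items():
    print(f"{name:28s} reaches 95% efficiency at iteration {iterations_to(0.95, rate)}")
```

Under these assumed rates, the fast strategies cross 95% efficiency within the first 10 iterations while the slow ones need roughly two to three times as many, matching the short-term versus long-term distinction drawn in the text.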

Convergence speed of different research resource allocation strategies.
In this study, we proposed a method combining the BERT model, Recurrent Neural Network (RNN) layers, and reinforcement learning to address the diverse challenges faced by PhD teachers in the humanities during their research processes. By integrating natural language processing techniques with a reinforcement learning framework, we developed a system that can automatically identify and analyze the research difficulties encountered by these teachers. The experimental results demonstrate that the proposed model performs well in analyzing research challenges and matching appropriate breakthrough strategies. Specifically, the model accurately identifies multiple research difficulties, such as funding challenges, resource scarcity, publication bias, collaboration issues, and methodological problems, and proposes corresponding strategies, including increased funding support, building research networks, reducing workload, fostering collaboration, and enhancing methodology. The model's performance was evaluated using a confusion matrix together with precision, recall, F1-score, and consistency score, showing high accuracy and consistency in solving the identified issues.
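For clarity, the per-class metrics referenced here can be derived directly from a confusion matrix. The sketch below uses illustrative placeholder counts (not the paper's experimental results), with rows as true difficulty classes and columns as predicted classes.

```python
# Per-class precision, recall, and F1 from a multi-class confusion matrix.
# Counts are illustrative placeholders, NOT the paper's experimental results:
# rows = true difficulty class, columns = predicted class.
LABELS = ["Funding", "Resources", "Publication", "Collaboration", "Methodology"]
CM = [
    [42,  3,  1,  2,  2],
    [ 4, 38,  2,  3,  3],
    [ 2,  2, 40,  3,  3],
    [ 3,  4,  2, 36,  5],
    [ 2,  3,  3,  4, 38],
]

def per_class_metrics(cm):
    """Return {label: (precision, recall, f1)} for each class."""
    k = len(cm)
    out = {}
    for i in range(k):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(k)) - tp   # predicted i, true != i
        fn = sum(cm[i]) - tp                        # true i, predicted != i
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        out[LABELS[i]] = (precision, recall, f1)
    return out

for label, (p, r, f) in per_class_metrics(CM).items():
    print(f"{label:13s} precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The same computation generalizes to any number of difficulty classes; macro-averaging the per-class values then yields a single summary score per metric.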
However, our model has certain limitations. First, its performance heavily relies on the quality and scale of the datasets used. Due to the diverse data sources and complex structures, data loss or information bias may occur during the data cleaning and standardization processes, potentially affecting the overall performance of the model. Second, the current model has limitations in handling multidimensional text data and multitask learning. Specifically, although the BERT model effectively captures contextual information in text, it requires significant computational resources when trained on large-scale corpora and may underperform when dealing with very complex semantics or long texts. Furthermore, the policy optimization process in the reinforcement learning module is highly uncertain and may be influenced by various factors in practical applications, resulting in instability and convergence issues.
Future research will focus on optimizing and expanding the methods proposed in this study. First, we plan to introduce more diverse datasets to further improve the generalization ability and robustness of the model. Additionally, exploring more efficient fusion and denoising techniques in data preprocessing could help minimize information loss and bias. To address computational resource constraints, future work could explore lightweight models or model compression techniques to reduce computational costs and enhance model applicability. Furthermore, to meet the demands of multitask learning, we could integrate other deep learning models, such as Graph Convolutional Networks (GCNs) and self-supervised learning methods, to better capture complex semantic relationships and multimodal information. This study provides a valuable tool for understanding the research challenges faced by PhD teachers in the humanities and lays an important theoretical and practical foundation for developing intelligent research support systems. We hope that these improvements and extensions will more comprehensively support the research work of teachers, enhance research output and collaboration efficiency, and promote the advancement of scientific research.
In conclusion, this study presents an innovative approach to analyzing and addressing the research difficulties faced by PhD teachers in the humanities and validates its effectiveness through experiments. However, we also acknowledge the current limitations of the model and outline future directions for improvement and research. With further optimization and enhancement, this approach is expected to realize greater potential and value in future research endeavors, providing more intelligent and efficient solutions for the academic community.