Multi-task learning algorithm design and implementation strategy for intelligent police academy management system

Electronic information technology is developing rapidly. With the rapid popularization of the Internet in particular, its applications have penetrated all aspects of people’s lives, and a wide variety of network services have sprung up and shown strong vitality. In the informatization of higher education institutions, student management systems have been widely adopted by colleges and universities and have achieved good results [1-2].

At present, most police colleges and universities, owing to the nature of their mission, adopt a police-style management system to regulate students’ daily study and life, which gives student management at these institutions characteristics distinct from those of other colleges and universities [3-5]. The most obvious of these is the police-style management system itself. Both the police and the military are armed disciplinary forces, and this shared nature creates a psychological identification that greatly strengthens the commitment to militarized management of student affairs in public security colleges and universities [6-8]. Similar to militarized management, police-style management regulates student life in three main ways, namely house rules, disciplinary orders, and formation orders, so that students ultimately develop a style of strictly enforcing orders and prohibitions and following commands. That is, the police-style management system differs from the military one, but it likewise requires students to obey orders and follow commands and to observe strict rules on police dress and posture, and it sets strict requirements for their study, daily life, conduct, and thinking [9-13]. While adhering to police-style management, police colleges and universities must adjust their management methods so that, while cultivating students’ police discipline, they fully safeguard the development of students’ personal abilities and overall quality [14-16]. Meeting this demand requires the introduction of a computerized information management system.

This paper studies the student management system of a police higher education institution. The overall design is based on a B/S three-tier architecture, and the detailed functional design of the system’s sub-modules is described with package diagrams, class diagrams, sequence diagrams, and activity diagrams. Using the student behavioral data collected by the management system, multi-task learning algorithms are applied to the problem of student cognitive diagnosis and assessment. Students’ skill mastery and the time weight function T(u,j) are introduced to compute the similarity between students, and the multi-task learning algorithm is used to analyze and construct a learner model of mastery over similar exercises. Finally, for exercise recommendation based on learners’ exercise mastery and the answer data of similar learners, the MTL algorithm is compared experimentally with traditional recommendation algorithms on three real datasets.

2

Design of the police academy management system

Any engineering project must be designed before production, and software engineering is no exception: software design must be carried out before coding. Requirements research and analysis are the basis of software design, and software design is a key step in development that directly affects software quality. The requirements analysis stage answers the question of “what to do”. Once the requirements are clear and have been fully set out in the software requirements specification, the software design stage answers the question of “how to do it” by designing the functional structure, data structures, and user interface of the system. The result of the design is finally expressed through design model diagrams.

In this paper, we will use B/S architecture to establish a student management system for police colleges and universities to realize rolling generalized police management; this kind of model is widely used in the development of all kinds of application systems, and it can effectively realize the mutual transfer of resources in a wide range.

2.1

System design

2.1.1

Principles of system design

The analysis and design principles of the student management information system for police higher education institutions are as follows.

1)

Principle of standards compliance. The design follows national and ministerial standards and a top-level design approach. Using analogy, and drawing on the experience of similar information systems together with the current state and needs of the existing manual process, analysis, design, and documentation are carried out in accordance with software engineering norms and regulatory requirements;

2)

Principle of simple operation. The flow of operational information is kept consistent, and operations are nested no more than three levels deep. Visual information follows ergonomic requirements, that is, it matches users’ habits of reading from top to bottom and from left to right, uses soft color contrast, and minimizes the number of mouse and keyboard actions;

3)

Principle of ease of use. The operating interface, interactive interface, prompts, and help information are clear at a glance, allowing a natural transition from the manual process to the information system without retraining;

4)

Principle of easy application integration. Data exchange interfaces are reserved for both pushing and pulling information;

5)

Principle of security. The system provides security mechanisms to prevent unauthorized operations, a strict yet flexible division of users and permissions for each subsystem, and a complete data backup and recovery mechanism for database management. Operations on critical information are logged;

6)

Principle of maintainability. To accommodate changing requirements and system maintenance, all functional modules are abstracted, componentized, and designed for reuse.

2.1.2

System architecture design

A complex system can adopt a six-layer architecture consisting of boundary services, Web services, portal services, application services, database services, and storage services, with four firewall-protected security zones, which provides a secure, reliable, and robust hardware and software operating environment for the application system.

This system is comparatively simple and adopts the general three-tier B/S architecture as its hardware and software support environment.

The B/S architecture, shown in Figure 1, mainly uses Web technology combined with the browser’s ability to interpret scripts. A common browser delivers powerful functions that previously required complex dedicated software, which saves development and maintenance costs.

In the B/S architecture, apart from the database server, the application is stored on the Web server in the form of static or dynamic web pages. The user runs an application simply by entering the corresponding URL in the browser on the client. The application on the Web server is invoked, operates on the database to complete the required data processing, and the result is finally displayed to the user through the browser. In this sense, the application in a B/S system is centralized to a certain extent.

In a software system based on the B/S architecture, installation, modification, and maintenance are all handled on the server side. Users need only a browser to run all modules, which achieves a true “zero client” and makes automatic upgrades at runtime easy. The B/S architecture also provides a practical and open foundation for bringing heterogeneous machines, heterogeneous networks, and heterogeneous application services online, networking them, and unifying their services.

Before the emergence of the B/S architecture, the functions of a management information system were mainly internal to the organization. The B/S-style “zero-client” approach makes it easy for the computers of the organization’s suppliers and customers (who may be potential ones, that is, unknown in advance) to become clients of the management information system. Within a limited range of functions, they can query organization-related information and complete the data exchange and processing for various business transactions with the organization. This extends the functional coverage of the organization’s computer application system, makes fuller use of resources on the network, and greatly reduces the application workload. In addition, combining B/S application systems with the Internet makes some new enterprise computer applications possible (e.g., e-commerce and cloud computing).

As shown in Figure 1, the system is divided into three layers: the presentation layer, the business logic layer, and the data layer. The presentation layer uses the business logic layer, and the business logic layer uses the data layer. The presentation layer mainly contains the UI display classes that render the interface. The business logic layer contains the entity classes and service classes, while the data layer contains the mapping classes and data control classes. The presentation layer relies on the Web GUI for display, and the data layer relies on ADO.NET. The three-tier structure is built on the .NET Framework.
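The one-way dependency between the three layers can be illustrated with a short sketch. It is written in Python for brevity rather than in the .NET stack the system uses, and the class names, the in-memory dictionary standing in for the database, and the duplicate-number rule are all assumptions made for illustration.

```python
class StudentRepository:
    """Data layer: owns storage access (a dict stands in for the database)."""

    def __init__(self):
        self._rows = {}

    def save(self, student_no, name):
        self._rows[student_no] = name

    def find(self, student_no):
        return self._rows.get(student_no)


class StudentService:
    """Business logic layer: applies rules and talks only to the data layer."""

    def __init__(self, repository):
        self._repository = repository

    def register(self, student_no, name):
        if self._repository.find(student_no) is not None:
            raise ValueError(f"student number {student_no} is already registered")
        self._repository.save(student_no, name)

    def lookup(self, student_no):
        return self._repository.find(student_no)


class StudentPage:
    """Presentation layer: formats output and talks only to the business layer."""

    def __init__(self, service):
        self._service = service

    def render(self, student_no):
        name = self._service.lookup(student_no)
        return f"{student_no}: {name}" if name else "not found"


service = StudentService(StudentRepository())
service.register("1001", "Li")
page = StudentPage(service)
print(page.render("1001"))
```

Because each layer only holds a reference to the layer directly below it, the storage mechanism can be replaced without touching the presentation code.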

2.1.3

System Functional Architecture Design

The functional architecture design defines the relationships between the main parts and components of the software system. As shown in Figure 2, the system consists of six modules and one subsystem. The six modules are the student variable information management module, the difficult student financial aid management module, the dormitory management module, the leave management module, the queue management module, and the discipline management module. The subsystem is the background management subsystem, which is divided into four modules: user management, log management, system role management, and user level management.

2.2

Functional Module Design

2.2.1

Basic information management subsystem for trainees

The basic information management business process mainly includes the registration of new trainees, their division into classes or formations, and the establishment of their school registration records. The process was formulated together with the management staff of the university’s Academic Affairs Section on the basis of an analysis of the basic requirements of trainee information management, and it adds a number of new processing operations, described as follows.

1)

Registration of new students

When a new trainee enrolls, his or her school registration must be recorded, including the class or formation and the school registration number. Registration numbers are assigned to new trainees by automatic batch processing, and trainees are then automatically assigned to classes or formations in order of registration number. Manual adjustment of the class or formation to which a trainee belongs is also supported.
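The batch numbering and class allocation described above could be sketched as follows. This is only an illustration under stated assumptions: the number format (enrollment year plus a four-digit sequence), the fixed `class_size`, and all function and field names are hypothetical and are not taken from the system.

```python
from dataclasses import dataclass


@dataclass
class Trainee:
    name: str
    reg_no: str = ""
    class_no: int = 0


def assign_registration(trainees, year, class_size):
    """Assign sequential registration numbers, then fill classes in number order."""
    for seq, trainee in enumerate(trainees, start=1):
        # Hypothetical format: enrollment year followed by a 4-digit sequence.
        trainee.reg_no = f"{year}{seq:04d}"
        # The first class_size trainees go to class 1, the next block to class 2, etc.
        trainee.class_no = (seq - 1) // class_size + 1
    return trainees


def move_to_class(trainee, class_no):
    """Manual adjustment of the class a trainee belongs to."""
    trainee.class_no = class_no


cohort = assign_registration([Trainee(n) for n in "ABCDE"], year=2024, class_size=2)
auto_class_of_last = cohort[4].class_no
move_to_class(cohort[4], 1)  # manual override after the automatic pass
```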

2)

Cadet registration management service

The school registration management service mainly includes information on changes to school registration, rewards, and punishments. Changes in school registration include natural and abnormal changes. Natural changes refer to the promotion of trainees to higher grades, while abnormal changes include downgrading, sick leave, dismissal, and so on.

3)

Statistical report operations

The reporting business covers the statistical table of trainee numbers and the class roster report, which include basic information such as trainee number, name, gender, and place of origin. These processing operations are analyzed and broken down further below. Figure 3 illustrates the framework of the basic trainee information management subsystem.

2.2.2

Learner registration management subsystem

The basic information management subsystem covers the business processes that must be completed for every new student after enrollment. In the school registration management subsystem, by contrast, the business processes are not the same for all students. This subsystem mainly handles students’ school registration business, such as registration, rewards and punishments, suspension and withdrawal, graduation information, and employment information. The main business flow of the student registration management subsystem is shown in Figure 4 and described as follows.

1)

Data Import/Export

Before this system was developed, trainee school registration information at our school was usually kept as Excel files in the Student Work Office, with backup copies held by the Academic Affairs Section and course teachers. To connect smoothly with existing work, this function must support data import, loading the existing scattered student registration Excel files into the system database. During import, the system checks the uniqueness of each trainee’s school number, and the teacher user specifies the mapping between the columns of the external data file and the fields of the school registration database.

At the same time, to accommodate the working habits of section staff, teachers, and class teachers, this function must also support data export, i.e., exporting the student registration information in the database to an Excel file, so that teachers and class teachers can gradually adapt to the system.
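The import step, with its uniqueness check and user-specified field mapping, might look like the sketch below. CSV text is used here as a stand-in for an Excel file so that the example needs only the standard library, and the column names, the `student_no` field, and the function name are assumptions made for illustration.

```python
import csv
import io


def import_records(source, field_map, existing_ids):
    """Map external columns to database fields and reject duplicate school numbers."""
    accepted, rejected = [], []
    seen = set(existing_ids)
    for row in csv.DictReader(source):
        # field_map is chosen by the teacher user: external column -> database field.
        record = {db_field: row[ext_col] for ext_col, db_field in field_map.items()}
        if record["student_no"] in seen:
            rejected.append(record)  # already in the database or repeated in the file
        else:
            seen.add(record["student_no"])
            accepted.append(record)
    return accepted, rejected


sheet = io.StringIO("No,Name\n1001,Li\n1002,Wang\n1001,Zhao\n")
accepted, rejected = import_records(
    sheet, {"No": "student_no", "Name": "name"}, existing_ids={"1002"}
)
```

Returning the rejected rows instead of silently dropping them lets the interface show the teacher which records conflicted.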

2)

Import of students’ photos and images

Students’ photo and image files are imported in batches by class or team. In this design, the photos and images are stored in the system database and linked to the students’ other information through the school number or student ID number.

3)

Entry, deletion and modification of student registration information

Student registration information includes student number, name, former name, gender, ethnicity, date of birth, political status, institute, department, major, class, entrance grade, entrance date, graduation date, certificate number, place of origin, home address, contact information, etc. This part needs to support conversion between the data in the system database and Excel files.

4)

Batch modification of student registration information

This function batch-modifies attribute information for a specified range of trainees, such as class or formation, enrollment date, graduation date, and political status. The range of students can be determined in several ways, for example by age, enrollment date, or grade. This function also needs to support import from and export to Excel files.

5)

Batch Deletion of Student Registration Information

This function batch-deletes the student registration records of a specified range of students, together with all other related information, such as performance records, course selection records, and award and penalty records. The range of students can be determined in several ways, for example by age, enrollment date, or grade. To guard against accidental deletion, the design provides two ways of deleting student registration information in batches.

Cached deletion: After the user batch-deletes student registration information, the information is first moved to a dedicated cache table in the database, and the system automatically clears it after a set period. Users can undo the deletion before this period expires.

Direct deletion: After the user submits a deletion request, the student registration information is deleted immediately and cannot be recovered. The interface lets users choose which of the two batch deletion methods to use.
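The cached deletion method can be sketched with an in-memory SQLite database. The table names, columns, and retention period are assumptions made for illustration, and the sketch covers only the registration record itself, not the related performance or award records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_no TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE deleted_student (student_no TEXT PRIMARY KEY, name TEXT, deleted_at REAL);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [("1001", "Li"), ("1002", "Wang")])


def cached_delete(conn, student_nos, now):
    """Move records into the cache table instead of destroying them."""
    for no in student_nos:
        conn.execute(
            "INSERT INTO deleted_student "
            "SELECT student_no, name, ? FROM student WHERE student_no = ?",
            (now, no),
        )
        conn.execute("DELETE FROM student WHERE student_no = ?", (no,))


def undo_delete(conn, student_no):
    """Restore a cached record; return False if it has already been purged."""
    row = conn.execute(
        "SELECT student_no, name FROM deleted_student WHERE student_no = ?",
        (student_no,),
    ).fetchone()
    if row is None:
        return False
    conn.execute("INSERT INTO student VALUES (?, ?)", row)
    conn.execute("DELETE FROM deleted_student WHERE student_no = ?", (student_no,))
    return True


def purge_expired(conn, now, retention_seconds):
    """Permanently clear cached records older than the retention period."""
    conn.execute(
        "DELETE FROM deleted_student WHERE deleted_at <= ?",
        (now - retention_seconds,),
    )


cached_delete(conn, ["1001", "1002"], now=0.0)
restored = undo_delete(conn, "1001")            # still inside the undo window
purge_expired(conn, now=100.0, retention_seconds=60)
late = undo_delete(conn, "1002")                # the window has passed
remaining = [r[0] for r in conn.execute("SELECT student_no FROM student")]
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the purge logic easy to test; a deployed system would presumably run the purge as a scheduled job.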

6)

Inquiry of student registration information

Records can be queried by any combination of conditions such as faculty, department, major, class, gender, ethnicity, political status, date of birth, grade, name, student number range, enrollment date, and graduation date. Query results can be displayed in pages and support batch deletion, batch modification, and other operations.

7)

Printing of student registration information

The student registration management subsystem provides all teachers and students with printing services for student registration information. Users can select the information to print by filtering on student registration attribute fields; order the printed results by sorting on fields including grade, department, major, class, student number, and name; choose whether to print by class; and have print titles generated automatically from the query results or reset them manually. In addition, this function needs to support conversion to Excel, PDF, and other formats.

3

MTL-based cognitive diagnosis of students under police academy management

The design of the police academy management system was completed in the previous chapter; this chapter relies on the system to track and assess students’ learning using intelligent methods. In student assessment, cognitive diagnostic models can model students’ cognitive states well, and the modeling results have proved useful in student counseling, dropout warning, and other education-related applications.

3.1

Multi-task learning network algorithm design

3.1.1

Multi-task learning networks

Multi-task learning (MTL) is a learning strategy in which an algorithm learns multiple tasks simultaneously. In particular, learning auxiliary tasks can improve the learning performance of the main task, and learning the knowledge shared between tasks can reduce the need for label annotation. MTL networks are usually categorized as either hard parameter sharing networks or soft parameter sharing networks. In hard parameter sharing (also known as multi-head), the network produces a separate output for each task by placing separate task layers on top of a shared encoder.

In soft parameter sharing, by contrast, all parameters are specific to individual tasks, but the networks are still able to handle cross-task learning. The hard parameter sharing approach was chosen here to keep the model simple and minimize the risk of overfitting. The structure of the MTL based on hard parameter sharing for neural networks is shown in Fig. 5 and consists of three layers: the input layer, the shared layer, and the task-related layer. The shared layer learns the representations shared by all tasks, while the task-related layer learns the specific representations of each task. The key idea of multi-task learning is to find the relationship between tasks, which can provide useful information for the primary and secondary tasks to achieve better learning results.

It can be seen that the hidden layers of the network are shared by multiple tasks, while the task-related layers perform different task learning. By sharing several hidden layers at the bottom of the network, different tasks learn common low-level features from labeled samples related to different tasks, which can effectively utilize the labeled sample information and also reduce the risk of data overfitting. It should be emphasized that in this learning paradigm, auxiliary tasks can help improve the learning performance of the main task, i.e.: (1) $\mathrm{LearningAlg}(\mathrm{MainTask} \,\|\, \mathrm{RelatedTasks}) > \mathrm{LearningAlg}(\mathrm{MainTask})$

However, different learning algorithms may achieve different results on the same task. Most machine learning methods employ single-task learning, which focuses on accomplishing one task with one specific model. Complex problems can be handled by decomposing them into independent tasks, each solved by its own model, but this ignores the intrinsic correlation between the tasks. In contrast, MTL can solve complex problems with a single model by learning the tasks in parallel. In general, training multiple models costs significantly more than training a single model, so MTL models are more acceptable in practice than single-task models unless their results are significantly worse. Moreover, MTL usually performs better when the tasks are related or similar, mainly because it can capture correlation information between tasks. By exploiting the useful information contained in multiple related tasks, the generalization performance of all tasks can be improved.

When multi-task learning was first proposed, its most important motivation was to cope with data sparsity: each individual task has only limited labeled data, which is not enough to train an accurate model on its own. MTL instead learns from all tasks at the same time, which effectively augments the labeled data available to each task and can lead to more accurate results for each of them. Multi-task learning can therefore make better use of existing knowledge and reduce the required labeling cost. With the arrival of the big data era in areas such as computer vision and natural language processing, deep MTL models were found to achieve better performance than their single-task counterparts and to reduce the risk of overfitting in each task.

MTL has attracted a lot of attention in artificial intelligence and machine learning over the last few decades. Thanks to their flexibility and efficiency, many MTL models have been designed and many MTL applications have been developed in other fields. This chapter applies the multi-task learning strategy to the cognitive diagnosis of students.

3.1.2

Multi-task learning algorithms

The multi-task learning algorithm learns the primary classification task and the secondary task simultaneously. In the final task layer, the two outputs are denoted y^fc and y^de, respectively: (2) $y^{fc} = \frac{1}{\sum_{m=1}^{M} e^{z_{m}^{fc}}} \begin{bmatrix} e^{z_{1}^{fc}} \\ e^{z_{2}^{fc}} \\ \vdots \\ e^{z_{M}^{fc}} \end{bmatrix}$ (3) $y^{de} = \begin{bmatrix} z_{1}^{de} \\ z_{2}^{de} \\ \vdots \\ z_{D}^{de} \end{bmatrix}$

where M is the number of category labels for the primary task; D is the dimension of the input data; $z_{i} = w_{i}^{T} h + b_{i}$ is the linear combination of the task-layer input h with weight vector $w_{i}$ and bias $b_{i}$; and $\{z_{m}^{fc}\}_{m=1}^{M}$ and $\{z_{d}^{de}\}_{d=1}^{D}$ represent the final linear projections of the primary and secondary tasks, respectively.
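A minimal forward pass through a hard parameter sharing network with the two outputs of Eqs. (2) and (3) might look like the sketch below. The layer sizes, the tanh activation of the shared layer, the random initialization, and all variable names are assumptions made for illustration; the paper does not specify them.

```python
import math
import random


def linear(weights, biases, h):
    """z_i = w_i^T h + b_i for every output unit i."""
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Eq. (2): exponentiate and normalise so the outputs sum to 1."""
    peak = max(z)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
D, H, M = 4, 3, 2  # input dimension, shared width, number of classes (illustrative)
w_sh, b_sh = random_matrix(H, D, rng), [0.0] * H  # shared layer
w_fc, b_fc = random_matrix(M, H, rng), [0.0] * M  # primary-task layer
w_de, b_de = random_matrix(D, H, rng), [0.0] * D  # secondary-task layer


def forward(x):
    # Shared layer: one representation h, computed once and reused by both tasks.
    h = [math.tanh(v) for v in linear(w_sh, b_sh, x)]
    y_fc = softmax(linear(w_fc, b_fc, h))  # Eq. (2): M class probabilities
    y_de = linear(w_de, b_de, h)           # Eq. (3): D linear outputs
    return y_fc, y_de


y_fc, y_de = forward([1.0, 0.0, -1.0, 0.5])
```

Only the two small task layers are task-specific; everything below them is shared, which is what keeps the parameter count low in the hard sharing design.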

The designed objective function L_MLCD is specified as (4) $L_{MLCD} = \beta_{1} L_{FC}(\hat{y}^{fc}; \omega_{sh}, \omega_{fc}) + \beta_{2} L_{DE}(\hat{y}^{de}; \omega_{sh}, \omega_{de})$

where L_FC and L_DE are the loss functions of the main and auxiliary tasks; ω_sh, ω_fc, and ω_de are the trainable parameters of the shared, main-task, and auxiliary-task layers, respectively; and β₁ and β₂ are the weighting hyperparameters of the two tasks. The parameters are updated as follows: (5) $\omega_{sh}^{t+1} = \omega_{sh}^{t} - \lambda \left( \beta_{1} \frac{\partial L_{FC}}{\partial \omega_{sh}^{t}} + \beta_{2} \frac{\partial L_{DE}}{\partial \omega_{sh}^{t}} \right)$ (6) $\omega_{fc}^{t+1} = \omega_{fc}^{t} - \lambda \frac{\partial L_{FC}}{\partial \omega_{fc}^{t}}$ (7) $\omega_{de}^{t+1} = \omega_{de}^{t} - \lambda \frac{\partial L_{DE}}{\partial \omega_{de}^{t}}$

where t is the iteration index during training and λ is the learning rate. β₁, β₂, and λ are key hyperparameters: λ controls the step size of training, while β₁ and β₂ set the relative importance of the tasks. Hyperparameters are often specified empirically, but this rarely achieves the desired algorithmic performance. For example, a λ that is too large can prevent training from converging, while a β₁ that is too small can cause learning of the main task to be neglected. Here, Bayesian optimization is used to select the key hyperparameters, using a surrogate model and Bayesian updating to reach a solution. The general procedure starts by evaluating a few randomly selected hyperparameter combinations from the search space. Then, based on the performance of these combinations, the next most promising values are determined, so that each selection depends on previous attempts. This continues until the best combination is found or the maximum number of trials is reached. Bayesian optimization generally gives better results than traditional methods such as random search and grid search.
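The update rules in Eqs. (5) to (7) can be traced on a toy model with one scalar weight per layer. The squared-error losses, the scalar model, the targets, and the hyperparameter values below are all assumptions chosen so that the gradients can be written by hand; the sketch shows the update pattern and is not the network used in the paper.

```python
def mtl_step(w, x, t_fc, t_de, lr, beta1, beta2):
    """One update of Eqs. (5)-(7) for a scalar toy model.

    w holds the shared weight "sh" and the two task weights "fc" and "de".
    """
    h = w["sh"] * x              # shared representation
    err_fc = w["fc"] * h - t_fc  # main-task residual
    err_de = w["de"] * h - t_de  # auxiliary-task residual

    # Gradients of the assumed losses L_FC = err_fc**2 and L_DE = err_de**2.
    g_fc_head = 2 * err_fc * h
    g_de_head = 2 * err_de * h
    g_fc_shared = 2 * err_fc * w["fc"] * x
    g_de_shared = 2 * err_de * w["de"] * x

    return {
        # Eq. (5): the shared weight receives the beta-weighted sum of both gradients.
        "sh": w["sh"] - lr * (beta1 * g_fc_shared + beta2 * g_de_shared),
        # Eqs. (6)-(7): each task weight follows only its own task's gradient.
        "fc": w["fc"] - lr * g_fc_head,
        "de": w["de"] - lr * g_de_head,
    }


def total_loss(w, x, t_fc, t_de, beta1, beta2):
    """Eq. (4): weighted sum of the two task losses."""
    h = w["sh"] * x
    return beta1 * (w["fc"] * h - t_fc) ** 2 + beta2 * (w["de"] * h - t_de) ** 2


weights = {"sh": 0.5, "fc": 0.5, "de": 0.5}
data = dict(x=1.0, t_fc=1.0, t_de=0.5)
first = mtl_step(weights, lr=0.05, beta1=1.0, beta2=0.5, **data)
before = total_loss(weights, beta1=1.0, beta2=0.5, **data)
for _ in range(200):
    weights = mtl_step(weights, lr=0.05, beta1=1.0, beta2=0.5, **data)
after = total_loss(weights, beta1=1.0, beta2=0.5, **data)
```

In this toy setting the weighted loss shrinks as the iterations proceed; changing `beta1` and `beta2` changes how strongly each task pulls on the shared weight, while the task weights are unaffected by the other task.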

3.2

Multi-task cognitive diagnosis of student assessment problems

3.2.1

Description of the problem

Cognitive diagnosis, one of the important research topics in educational data mining, aims to uncover students’ latent cognitive states, such as their degree of knowledge mastery, from their answer records and related learning behaviors. Accurate student assessment results are the basis of many education-related studies and applications, such as test question recommendation and early warning of dropout. Research on student cognitive diagnosis has therefore attracted the attention of both researchers and the public.

In recent years, researchers have proposed several cognitive diagnostic models for student assessment. Figure 6 shows an example of using a cognitive diagnostic model for student assessment. As can be seen in Figure 6, most existing static cognitive diagnostic models work on the information from a single exam. The assessment is usually performed on students’ answer records to obtain their cognitive states and the corresponding test question parameters (difficulty, discrimination, etc.). Some experts in pedagogy have pointed out that scores obtained on different exams cannot be compared directly. For example, if Student A and Student B take the same test and Student A scores higher, Student A is considered more capable than Student B. If they take different tests, however, Student A cannot be considered more capable even with a higher score, because the questions may differ in difficulty, which makes the two scores incomparable. Existing cognitive diagnostic models face the same problem in real-world scenarios: when a model is applied separately to two independent exams, the assessment results of students who took different exams are not comparable.

In most practical teaching scenarios, students in different classes and schools use different test questions in their daily practice and examinations. In a unified selection test (e.g., the secondary school entrance examination or the college entrance examination), however, candidates are selected by the rank order of their performance, according to selection criteria defined over a specific ranking interval or score band. It is therefore necessary to diagnose accurate and comparable cognitive states of students from independent exercises and examinations, so as to understand them more comprehensively and objectively.

3.2.2

Problem Formal Definition

Assume that there is an exam set E = {E₁, E₂, ⋯, E_T}, where each exam E_t (t = 1, 2, ⋯, T) contains a student set $U_{t} = \{U_{t1}, U_{t2}, \dots, U_{tU}\}$ and a test question set $V_{t} = \{V_{t1}, V_{t2}, \dots, V_{tV}\}$, and the student diagnosis for each exam E_t is treated as a separate task t (t = 1, 2, ⋯, T). It is worth noting that the students and test questions of different tasks do not overlap. For each task t, there is a student answer record matrix Y_t, where Y_tuv is the response of student U_tu to test question V_tv. In common scoring methods and cognitive diagnostic models, Y_tuv is 1 when student U_tu answers question V_tv correctly and 0 when the answer is incorrect, so the student answer record matrix is usually a binary matrix of 0s and 1s. In addition to the answer record matrix, the text of each test question is needed as supplementary information that connects the individual student assessment tasks. For each task t, a test question feature matrix F_t is generated from the question text, where the text features of each test question V_tv are represented by a row vector F_tv, and the feature vectors of all the test questions in task t form the test question feature matrix F_t = (F_t1, F_t2, ⋯, F_tV).
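The data that each task carries can be written down as a small container type. The class name, the identifiers, and the two-dimensional feature vectors below are hypothetical placeholders; in particular, how the question text is turned into feature rows is not specified here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExamTask:
    """One exam E_t treated as a separate assessment task t."""

    students: List[str]          # U_t
    questions: List[str]         # V_t
    responses: List[List[int]]   # Y_t: responses[u][v] is 1 if correct, else 0
    features: List[List[float]]  # F_t: one feature row per question

    def validate(self):
        """Raise ValueError if the matrices do not match the sets they describe."""
        if len(self.responses) != len(self.students):
            raise ValueError("Y_t needs one row per student")
        for row in self.responses:
            if len(row) != len(self.questions) or any(y not in (0, 1) for y in row):
                raise ValueError("each row of Y_t needs one 0/1 entry per question")
        if len(self.features) != len(self.questions):
            raise ValueError("F_t needs one row per question")


def are_disjoint(tasks):
    """Check that no student or question appears in more than one task."""
    seen_students, seen_questions = set(), set()
    for task in tasks:
        if seen_students & set(task.students) or seen_questions & set(task.questions):
            return False
        seen_students |= set(task.students)
        seen_questions |= set(task.questions)
    return True


exam_1 = ExamTask(
    students=["u11", "u12"],
    questions=["v11", "v12", "v13"],
    responses=[[1, 0, 1], [0, 0, 1]],
    features=[[0.2, 0.7], [0.9, 0.1], [0.4, 0.4]],
)
exam_2 = ExamTask(
    students=["u21"],
    questions=["v21", "v22"],
    responses=[[1, 1]],
    features=[[0.3, 0.6], [0.8, 0.2]],
)
for task in (exam_1, exam_2):
    task.validate()
```

Since the tasks share no students or questions, the feature rows in `features` are the only information that links one exam to another.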

Based on the above description of the research problem and data, the multi-task student assessment problem can be formally defined as follows:

Given the set of exams E = {E₁, E₂, ⋯, E_T}, together with the student set U_t, the test question set V_t, the student answer record matrix Y_t, and the test question feature matrix F_t corresponding to each exam E_t, the main objectives of multi-task student assessment are to (1) conduct student assessment for the T independent exams simultaneously, so as to obtain accurate and comparable student cognitive states and the corresponding test question parameters (e.g., difficulty, discrimination, etc.); and (2) predict student responses based on the student cognitive states and test question parameters obtained from the assessments.

4

Algorithm testing results and analysis

4.1

Data Set Preparation and Migration Performance Testing

In this section, two sets of experiments on knowledge tracking tasks are conducted to demonstrate the effectiveness of the MTL algorithm: one on minimizing the difference between data distributions, and one on migration between schools.

Regarding the experimental dataset, since existing knowledge tracking datasets do not contain the question text, we used datasets from two disciplines collected from police colleges across the country, i.e., Mathematics and Physics, which contain a large corpus of question text. The basic statistics are summarized in Table 1. In addition, four school datasets were selected and constructed from the mathematics dataset. The details are described below:

Table 1.

Data statistics

Dataset	Questions	Students	Answer records	Knowledge points
zx.math	70,628	5,000	387,329	8,517
zx.physics	57,720	5,000	231,826	684
School-A	602	52	3,974	29
School-B	698	61	5,011	31
School-C	654	59	4,748	23
School-D	758	56	4,398	19

zx.math: A mathematics dataset collected from many police academies in China. From this large dataset, we randomly selected 329,533 answer records from 5,000 students, covering 60,752 questions.

zx.physics: It is a physics dataset, similar to the zx.math dataset, collected from many police colleges in China. This dataset is extensive, comprising 397,263 answer records randomly chosen from 5,000 students, with 75,141 questions.

Schools A, B, C, and D: These datasets contain mathematics data from four Chinese police colleges and were selected from zx.math. They are used in a round-robin school migration experiment that evaluates the effectiveness of adaptive migration among different schools.

4.1.1

Minimizing the distribution of experimental data

The first experiment verifies the effectiveness of the MTL algorithm with respect to the optimization objective of minimizing the distributional difference between domains. The distributions of knowledge states in the source and target domains before and after hard parameter sharing are compared. Figure 7 shows the dimensionality-reduced visualization for the MTL algorithm, with the state before migration on the left and after migration on the right. The M → P task gives an interesting result. Before the distribution difference between the source and target domains is minimized, the knowledge state distributions of the two domains are inconsistent and lie far apart. After hard parameter sharing, however, the distributions of the two domains completely overlap. This indicates that hard parameter sharing is an effective way to achieve domain adaptation.
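The idea of hard parameter sharing can be sketched in a few lines of plain Python: all domains pass through the same shared encoder, and only the output head differs per domain, which also accommodates different output dimensions. This is a minimal sketch under stated assumptions, not the network used in the experiment. The class `HardSharedModel`, the layer sizes, and the `mean_gap` function (a crude stand-in for whatever distribution difference measure is actually minimized) are all hypothetical.

```python
import math
import random
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class HardSharedModel:
    """Shared encoder plus one output head per domain (illustrative only)."""

    def __init__(self, in_dim: int, hidden_dim: int,
                 out_dims: Dict[str, int], seed: int = 0) -> None:
        rng = random.Random(seed)
        # The encoder weights are shared by every domain.
        self.shared = random_matrix(hidden_dim, in_dim, rng)
        # Each domain keeps its own head, so output sizes may differ.
        self.heads = {name: random_matrix(dim, hidden_dim, rng)
                      for name, dim in out_dims.items()}

    def encode(self, x: Vector) -> Vector:
        return [math.tanh(h) for h in matvec(self.shared, x)]

    def forward(self, domain: str, x: Vector) -> Vector:
        return matvec(self.heads[domain], self.encode(x))


def mean_gap(a: List[Vector], b: List[Vector]) -> float:
    """Euclidean distance between the mean vectors of two sets of states.

    Assumed, simplified measure of how far apart two domains lie.
    """
    dim = len(a[0])
    mean_a = [sum(v[i] for v in a) / len(a) for i in range(dim)]
    mean_b = [sum(v[i] for v in b) / len(b) for i in range(dim)]
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(mean_a, mean_b)))


model = HardSharedModel(in_dim=4, hidden_dim=3,
                        out_dims={"math": 5, "physics": 2})
x = [0.1, -0.2, 0.3, 0.0]
```

Because both domains are encoded by the same weights, their hidden states live in one space, so a gap such as `mean_gap` can be computed between them and driven toward zero during training.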

4.1.2

Migration between schools

In order to fully validate the effectiveness of the proposed MTL algorithm, this paper compares several common methods. All of these methods are capable of tracking knowledge and applying it to task transfer to some extent. Specifically, these methods include:

Original: It refers to the scenario where no migration operation is performed. Its experimental results are based on the best performance of the following baselines, which are trained on the source domain and directly applied to the target domain. Replacing only the output layer solves the problem of different output dimensions in different domains without any migration process.

DKT: It is an important knowledge tracking model that applies recurrent neural networks to model the learning process of students in order to estimate their mastery level.

GKT: It is a knowledge tracking method based on graph neural networks, which uses only prerequisite relationships to construct knowledge edge structures.

DKVMN: It is able to utilize the relationships between basic concepts to directly output the learner’s proficiency level for each knowledge point.

SKVMN: It combines the strengths of recurrent neural network modeling and memory network capabilities. It is better equipped to detect the relationship between potential knowledge points and topics, and to track students’ knowledge status.

The above methods are not directly migratable because the output dimensions differ between domains, so we add a fine-tuning operation to make them migratable. DKT+F, GKT+F, DKVMN+F, and SKVMN+F are the models obtained by adding fine-tuning to DKT, GKT, DKVMN, and SKVMN, respectively. Among the variants of the MTL algorithm, MTL-QM includes only the fine-tuning process; MTL-Q includes only the process of minimizing the difference in distributions and the fine-tuning process; MTL-M includes only the problem selection and fine-tuning processes; and MTL is the complete algorithm we propose.

In order to examine the performance of the methods for inter-domain knowledge and semantic migration, we treated the datasets of the four schools as four independent domains and named them A, B, C, and D. As a result, there are 12 migration tasks (e.g., A → B). The experimental results in terms of AUC are shown in Table 2.
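The count of 12 follows from taking every ordered pair of distinct schools as a (source, target) task, which the short sketch below enumerates. The variable names are illustrative.

```python
from itertools import permutations

schools = ["A", "B", "C", "D"]

# Every ordered (source, target) pair with source != target.
tasks = list(permutations(schools, 2))

for source, target in tasks:
    print(f"{source} -> {target}")
```

Order matters here because migrating from A to B and from B to A are evaluated as separate tasks.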

Table 2.

Results of the inter-school transfer experiment (AUC)

	A→B	B→A	A→C	C→A	A→D	D→A	B→C	C→B	B→D	D→B	C→D	D→C
Original	0.678	0.676	0.686	0.651	0.641	0.630	0.627	0.673	0.662	0.691	0.650	0.638
DKT+F	0.653	0.646	0.696	0.663	0.620	0.637	0.642	0.683	0.695	0.656	0.634	0.689
SKVMN+F	0.654	0.708	0.688	0.656	0.657	0.692	0.663	0.654	0.632	0.647	0.683	0.670
GKT+F	0.700	0.692	0.659	0.635	0.666	0.638	0.692	0.643	0.676	0.639	0.644	0.669
DKVMN+F	0.707	0.622	0.655	0.682	0.653	0.645	0.698	0.625	0.674	0.654	0.685	0.664
MTL-QM	0.639	0.628	0.649	0.684	0.623	0.706	0.684	0.663	0.694	0.697	0.629	0.701
MTL-Q	0.728	0.719	0.744	0.726	0.700	0.729	0.704	0.728	0.734	0.721	0.731	0.708
MTL-M	0.724	0.728	0.736	0.724	0.703	0.739	0.738	0.730	0.722	0.726	0.742	0.706
MTL	0.754	0.752	0.751	0.759	0.765	0.758	0.762	0.759	0.760	0.761	0.762	0.754

Among all the methods, MTL performs best in every inter-school migration task, which indicates that the multi-task algorithm has superior migration capability. The table also supports several further observations. First, comparing DKT+F, SKVMN+F, GKT+F, and DKVMN+F with Original shows that fine-tuning with a small amount of target data helps to improve knowledge tracking performance. Second, MTL-QM slightly outperforms DKT+F on average; the two are structurally very similar, but MTL-QM additionally draws on word vectors of the question text. Third, MTL outperforms all of its variants, which is consistent with the intuition that each of the three migration stages is effective.
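The comparison can be checked directly from the values in Table 2. The sketch below averages the AUC of each method over the 12 tasks and finds the best method per task; the numbers are copied from the table, while the variable names are illustrative.

```python
from statistics import mean

# AUC per method, in the column order of Table 2:
# A->B, B->A, A->C, C->A, A->D, D->A, B->C, C->B, B->D, D->B, C->D, D->C
auc = {
    "Original": [0.678, 0.676, 0.686, 0.651, 0.641, 0.630,
                 0.627, 0.673, 0.662, 0.691, 0.650, 0.638],
    "DKT+F":    [0.653, 0.646, 0.696, 0.663, 0.620, 0.637,
                 0.642, 0.683, 0.695, 0.656, 0.634, 0.689],
    "SKVMN+F":  [0.654, 0.708, 0.688, 0.656, 0.657, 0.692,
                 0.663, 0.654, 0.632, 0.647, 0.683, 0.670],
    "GKT+F":    [0.700, 0.692, 0.659, 0.635, 0.666, 0.638,
                 0.692, 0.643, 0.676, 0.639, 0.644, 0.669],
    "DKVMN+F":  [0.707, 0.622, 0.655, 0.682, 0.653, 0.645,
                 0.698, 0.625, 0.674, 0.654, 0.685, 0.664],
    "MTL-QM":   [0.639, 0.628, 0.649, 0.684, 0.623, 0.706,
                 0.684, 0.663, 0.694, 0.697, 0.629, 0.701],
    "MTL-Q":    [0.728, 0.719, 0.744, 0.726, 0.700, 0.729,
                 0.704, 0.728, 0.734, 0.721, 0.731, 0.708],
    "MTL-M":    [0.724, 0.728, 0.736, 0.724, 0.703, 0.739,
                 0.738, 0.730, 0.722, 0.726, 0.742, 0.706],
    "MTL":      [0.754, 0.752, 0.751, 0.759, 0.765, 0.758,
                 0.762, 0.759, 0.760, 0.761, 0.762, 0.754],
}

# Average AUC of each method over the 12 migration tasks.
mean_auc = {method: mean(values) for method, values in auc.items()}
best_overall = max(mean_auc, key=mean_auc.get)

# Best-scoring method for each individual task.
best_per_task = [max(auc, key=lambda m: auc[m][i]) for i in range(12)]

for method, value in sorted(mean_auc.items(), key=lambda kv: -kv[1]):
    print(f"{method:9s} {value:.4f}")
```

On these values the full MTL algorithm averages about 0.758 and is the top method in all 12 tasks, while the fine-tuned baselines average between roughly 0.66 and 0.67.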

4.2

MTL Cognitive Diagnostic Algorithm Testing

In order to verify the rationality of the cognitive diagnosis model based on multi-task learning, this study also conducts MTL cognitive diagnosis algorithm testing experiments. The model is validated on three public datasets, its results on each dataset are analyzed, and it is compared with other recommendation methods.

4.2.1

Experimental results and analysis of the time weighting function

The student skill mastery and the time weighting function T(u,j) are introduced to compute student similarity, and the effect of different values of T(u,j) on the mean absolute error (MAE) is determined experimentally. Because Math1 supports dynamic tracking of student answering behavior, it was chosen as the dataset for the experiments in this subsection to determine the optimal value of T(u,j).

As can be seen from Figure 8, the MAE curves for the three sets of student information contained in Math1 follow roughly the same trend. As the time weight parameter increases, the MAE first decreases very quickly and then more slowly: the decrease slows markedly once the parameter reaches 160 days, and beyond 200 days the MAE levels off at about 0.65. The subsequent experiments therefore use the time weighting function T(u,j) in place of the fixed time values above, and the final results are very close.
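For reference, the MAE used as the evaluation metric here is the average absolute difference between predicted and observed scores. The helper below is a minimal sketch with an illustrative function name; it does not reproduce the time weighting function itself, whose form is not given in this subsection.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float],
                        observed: Sequence[float]) -> float:
    """Average of |predicted - observed| over all answer records."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) for p, o in zip(predicted, observed))
    return total / len(predicted)
```

A lower value means that the predicted scores lie closer to the students' actual responses.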

4.2.2

Validation of the effectiveness of the algorithm

In order to verify the effectiveness of the algorithm, probabilistic matrix factorization (PMF), the traditional collaborative filtering algorithm (CF), and the MTL algorithm of this paper are compared. The accuracy of each algorithm is shown in Table 3.

Table 3.

Comparison of accuracy of different recommendation algorithms

Dataset	PMF precision	CF precision	MTL precision
FrcSub	0.4385	0.2568	0.5273
Math1	0.4975	0.2956	0.5528
Math2	0.4394	0.2754	0.5181

As can be seen from Table 3, MTL, the personalized exercise recommendation algorithm based on the improved cognitive diagnostic model proposed in this paper, achieves the best accuracy. On the FrcSub dataset, the accuracy of the MTL algorithm is 8.88 and 27.05 percentage points higher than that of the PMF model and the CF algorithm, respectively. The main reason is that the MTL algorithm makes recommendations on the basis of the cognitive diagnostic model: it describes the students’ skill mastery and treats it as one of the important factors when identifying similar students, which reflects more accurately which skills a student has mastered well, which only weakly, and which not at all. Moreover, the MTL algorithm takes into account how students’ skill mastery changes over time, and models the influence of answer data from different periods on the recommendation result.

In contrast, the PMF algorithm and the traditional CF algorithm make recommendations at the level of exercises and ignore the degree of students’ skill mastery, which leads to their lower accuracy. In addition, when computing similar students, the traditional CF algorithm uses neither the students’ potential response scores on exercises nor the change of skill mastery over time, so its accuracy is much lower than that of the MTL algorithm. On the larger Math1 and Math2 datasets, the accuracy of the MTL algorithm is higher than that of the PMF model and the CF algorithm by 5.53 and 25.72 percentage points, and by 7.87 and 24.27 percentage points, respectively, which again supports the effectiveness of the MTL algorithm.
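The reported margins are absolute differences between the precision values in Table 3, expressed in percentage points. The sketch below recomputes them from the table; the function names are illustrative, and the relative gain is included only to show how it differs from a percentage-point gap.

```python
# Precision values copied from Table 3.
precision = {
    "FrcSub": {"PMF": 0.4385, "CF": 0.2568, "MTL": 0.5273},
    "Math1":  {"PMF": 0.4975, "CF": 0.2956, "MTL": 0.5528},
    "Math2":  {"PMF": 0.4394, "CF": 0.2754, "MTL": 0.5181},
}


def gap_in_points(dataset: str, baseline: str) -> float:
    """Absolute precision gap between MTL and a baseline, in percentage points."""
    row = precision[dataset]
    return round((row["MTL"] - row[baseline]) * 100, 2)


def relative_gain(dataset: str, baseline: str) -> float:
    """Gain of MTL relative to the baseline's own precision, in percent."""
    row = precision[dataset]
    return (row["MTL"] - row[baseline]) / row[baseline] * 100


for name in precision:
    print(name, gap_in_points(name, "PMF"), gap_in_points(name, "CF"))
```

Because the baselines start from low precision values, the relative gains are considerably larger than the percentage-point gaps, so the two should not be confused when reading the comparison.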

5

Conclusion

In this study, a police academy management system was constructed, and a cognitive diagnosis method that assesses students’ learning status using multi-task learning algorithms was implemented on it. The research conclusions are as follows:

1) The MTL algorithm is effective for the optimization objective of minimizing the distribution difference between domains. Hard parameter sharing is an effective method for achieving domain adaptation.

2) Among all the methods, multi-task learning performs best in the inter-school migration task. This indicates that the multi-task learning algorithm has superior migration capability and can adapt to the special learning tasks of police academies.

3) The MTL cognitive diagnosis algorithm built in this paper achieves better accuracy. On the FrcSub dataset, its accuracy is 8.88 and 27.05 percentage points higher than that of the PMF model and the CF algorithm, respectively.

This study demonstrates the value of multi-task learning algorithms in the police academy management system, as they can help build an effective practical path for intelligent management of students.

Multi-task learning algorithm design and implementation strategy for intelligent police academy management system

Zemin Qin

Published Online: Mar 24, 2025

Received: Oct 23, 2024

Accepted: Feb 03, 2025

DOI: https://doi.org/10.2478/amns-2025-0712

Keywords: Multi-task learning, Police academy management system, Student behavior, Cognitive diagnosis

© 2025 Zemin Qin, published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.