A Study of the Effectiveness of Using Deep Learning Algorithms to Analyze Legal Risk Identification in Social Work Programs
Published: 24 Mar 2025
Received: 15 Nov 2024
Accepted: 18 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0769
© 2025 Xinxin Fan, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Social work is a professional field aimed at solving social problems and helping individuals, families and communities. However, legal risks are inevitable in social work practice [1-2]. Social workers need legal awareness and must take the necessary measures to prevent and manage legal risks, both to protect their own legal rights and to provide high-quality services. Identifying legal risks is therefore of great significance for carrying out social work effectively [3-6].
Legal risk refers to the potential for loss or litigation that arises when legal provisions, rules, regulations and other factors affect the implementation of a social work program. In a market economy, practitioners must comply with the relevant laws and regulations; otherwise they face the risk of penalties, litigation or even closure [7-10]. Identifying legal risks is therefore a very important part of social work programs. Social workers need to conduct a comprehensive risk assessment of a program to identify potential legal risks and take appropriate preventive measures [11-14]. Risk assessment may include reviewing and understanding existing laws and regulations, communicating with law enforcement and regulatory agencies, and collecting and summarizing past cases. For example, social work organizations should conduct internal and external audits regularly to ensure that their activities comply with legal and regulatory requirements [15-18]. In addition, social workers should develop reasonable work processes and standard operating procedures to regulate their conduct and reduce the occurrence of legal risks, while deep learning algorithms can improve the effectiveness of legal risk identification [19-22].
The study first defines social work programs, introduces social work agencies, and describes how social work programs are implemented. It then reviews deep learning theory as the basis for legal risk identification. A DNN-based risk identification model is constructed for the legal risks of social work programs, and its performance is verified through its training results. The model is then compared with several other models to examine how effectively it identifies legal risks in social work programs. The results show that the deep-learning-based risk identification model is effective in identifying legal risks in social work programs and has practical value.
A social work project is any activity in the field of social work services in which social work agencies and social workers apply professional theories and methods, make reasonable use of internal and external resources, and deliver services in a planned and purposeful manner according to the project management process, in order to meet the personal needs of service recipients, integrate community resources, and solve social problems. Most current social work programs come from government-purchased services, and their execution depends on cooperation between government agencies and social workers.
The organizational framework of social work agencies is shown in Figure 1. Social work agencies are committed to the professional values, theories and skills of social work and serve people in need. They adhere to standardized management, project-based operation, participatory innovation and high-performance outputs, and aim at personal empowerment, community empowerment, and professional enrichment. Their services mainly cover social work for children, youth, older adults and people with disabilities, and their main source of projects is government-funded services.

Organization framework of social work agencies
Judicial administrative bodies obtain social work services by purchasing services and by commissioning projects. Prior to the promulgation and implementation of the Law, most regions purchased services from “social workers”. Most of these “social workers” did not hold the professional qualifications of social workers and lacked experience in social work practice. They mainly assisted the judicial administrative organs with administrative support, most of which had nothing to do with social work, and were in effect temporary employees. The term “social workers” was merely used by the judicial administrative organs to distinguish them from other temporary staff; they were not “social workers” in the professional sense. In addition, even professionals assigned by social work service organizations to work in judicial administrative organs are susceptible to interference by administrative forces, and their professional independence is easily limited. Project entrustment, also a kind of purchased service, is a system in which the judicial administrative organs entrust the educational and supportive work of community corrections to professional organizations as a “project package” and pay the fees, with the professional organizations participating in the whole process of project design, implementation and acceptance and delivering the services as a package. The entrusted work is usually highly specialized, requiring overall assessment, budgeting and design, and it can make up for the lack of specialization among the community corrections staff of the judicial administrative organs.
Two factors complicate the implementation of social work programs. One is the hierarchical relationship between social work organizations and the government. The other is the complexity and diversity of the programs themselves. Together they mean that the legal system is difficult to implement, legal risks remain unresolved, and the regulatory system still needs improvement. This paper therefore focuses on the identification of legal risks in social work programs.
The legal risks of social work programs can be classified into violation risk, exercise risk, contract risk, tort risk, liability risk and other risks. Violation risk refers to the legal sanctions or losses that may result from violating laws, statutes, regulations or industry provisions; it may involve administrative, civil or even criminal liability. Contract risk is the legal risk arising from problems in signing and fulfilling contracts. Exercise risk arises because a social work program involves handovers between higher and lower levels and the management of funds; it is mainly the risk that rights cannot be exercised because funding from the higher-level partner is insufficient. Tort risk refers to the risk of infringing others’ intellectual property rights or labor rights in a social work project. Liability risk refers to the risk arising from negligent behavior by individuals or groups involved in the program. All remaining legal risks are grouped in this paper as other risks.
Machine learning is a branch of artificial intelligence and lies at its core. Machine learning theory focuses on giving computers the ability to learn by designing and analyzing algorithms, simulating the main cognitive processes of human beings. Its development has gone through two waves: shallow learning and deep learning.
Deep learning builds a network structure loosely modeled on the human brain by stacking multiple hidden layers, in order to simulate human thinking and learning. Each layer takes the features extracted by the previous layer as its input and learns a feature representation of the input information, which yields a more accurate model and improves the network’s ability to classify and predict. The most common deep learning methods are stacked autoencoders, convolutional neural networks, and deep belief networks. Training methods can be classified as supervised or unsupervised depending on whether the samples are labeled.
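Because autoencoders learn without supervision, they are commonly used for training and testing on unlabeled samples. The structure assumes that the output of the network equals its input, i.e., for an input layer $x$ and an output layer $\hat{x}$, the network is trained so that $\hat{x} \approx x$.
The autoencoder algorithm runs in two phases according to function: an encoding phase and a decoding phase. The encoding phase runs from the input layer to the hidden layer, and its purpose is to reduce the dimensionality of the input sample.
Encoding phase process:
$$h = f(W x + b)$$
where $h$ is the hidden-layer representation, $W$ and $b$ are the weight matrix and bias vector of the encoder, and $f(\cdot)$ is the activation function.
The decoding phase runs from the hidden layer to the output layer, and aims to map the dimension-reduced samples back to the original feature space.
Decoding phase process:
$$\hat{x} = g(W' h + b')$$
where $\hat{x}$ is the reconstructed output, $W'$ and $b'$ are the weight matrix and bias vector of the decoder, and $g(\cdot)$ is the activation function.
Autoencoder training is usually carried out with the stochastic gradient descent (SGD) algorithm and the error back-propagation (BP) algorithm. The whole training process minimizes the reconstruction error, which can be expressed as minimizing a cost function of the form:
$$J(W, b, W', b') = \frac{1}{N} \sum_{i=1}^{N} \left\| \hat{x}^{(i)} - x^{(i)} \right\|^{2}$$
where $N$ is the number of training samples. As training proceeds, the output data $\hat{x}$ gradually approaches the input data $x$.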
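The encoding and decoding phases described above can be illustrated with a minimal sketch. It is not the configuration used in this paper: the layer sizes (4 inputs, 2 hidden units), the sigmoid activation, the random initialization and all function names are assumptions chosen only to show one forward pass and the resulting reconstruction error.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, assumed here for both encoder and decoder."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Return weights @ vec + bias, with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def encode(weights, bias, x):
    """Encoding phase: map the input x to a lower-dimensional code h."""
    return [sigmoid(z) for z in affine(weights, bias, x)]


def decode(weights, bias, h):
    """Decoding phase: map the code h back to the original feature space."""
    return [sigmoid(z) for z in affine(weights, bias, h)]


def reconstruction_error(x, x_hat):
    """Squared error between the input and its reconstruction."""
    return sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden = 4, 2  # hypothetical sizes: 4 features compressed to 2
enc_w, enc_b = random_matrix(n_hidden, n_in, rng), [0.0] * n_hidden
dec_w, dec_b = random_matrix(n_in, n_hidden, rng), [0.0] * n_in

x = [0.1, 0.9, 0.4, 0.6]
h = encode(enc_w, enc_b, x)
x_hat = decode(dec_w, dec_b, h)
print("code size:", len(h), "reconstruction error:", reconstruction_error(x, x_hat))
```

Training would adjust the four parameter sets by gradient descent until the reconstruction error stops decreasing; the sketch leaves the weights untrained.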
A convolutional neural network is likewise a feed-forward deep neural network. It uses convolution operations instead of general matrix multiplication during training, which reduces the computational cost and the risk of overfitting while improving robustness and stability. It generally consists of a stack of input, convolutional, pooling, fully connected and output layers.
The fully connected layer integrates the features extracted by convolution and pooling, filters out features of lower importance, retains those of higher importance, connects every unit of the adjacent layers, and finally outputs the labels. To improve the performance of the DNN and mitigate vanishing gradients, the fully connected layer often uses the ReLU function as its activation function. The output of the fully connected layer is used as the input to the loss function of the network model to calculate the loss error.
Training a neural network is a process of continually optimizing the loss function so that the loss value gradually decreases and finally stabilizes. Commonly used loss functions include the mean square error function, the cross-entropy loss function, and the quadratic cost function.
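Mean square error function:
$$L_{\mathrm{MSE}} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}$$
Cross-entropy loss function:
$$L_{\mathrm{CE}} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K} y_{ik} \log \hat{y}_{ik}$$
Quadratic cost function:
$$L_{\mathrm{Q}} = \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}$$
Where $n$ is the number of samples, $K$ is the number of categories, $y_i$ is the true label of sample $i$ and $\hat{y}_i$ is the corresponding network output; $y_{ik}$ and $\hat{y}_{ik}$ denote the true (one-hot) label and the predicted probability of category $k$ for sample $i$.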
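The three loss functions named above can be written in a few lines each. The sketch below uses their common textbook forms, which is an assumption about the exact variants intended here; in particular the quadratic cost is taken to be half the mean square error, and the cross-entropy is averaged over samples with one-hot targets. The function names and the clipping constant `eps` are illustrative.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_cost(y_true, y_pred):
    """Quadratic cost, assumed here to be half the mean squared error."""
    return 0.5 * mean_squared_error(y_true, y_pred)


def cross_entropy(onehot_targets, predicted_probs, eps=1e-12):
    """Multi-class cross-entropy averaged over samples.

    Probabilities are clipped at eps so that log(0) never occurs.
    """
    total = 0.0
    for target, probs in zip(onehot_targets, predicted_probs):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(target, probs))
    return total / len(onehot_targets)


y_true = [1.0, 0.0, 1.0]
y_pred = [0.9, 0.2, 0.8]
print("MSE:", mean_squared_error(y_true, y_pred))
print("quadratic cost:", quadratic_cost(y_true, y_pred))

# One sample whose true category is the second of three.
print("cross-entropy:", cross_entropy([[0, 1, 0]], [[0.2, 0.7, 0.1]]))
```

For a single one-hot sample the cross-entropy reduces to the negative log of the probability assigned to the true category, which is why confident wrong predictions are penalized heavily.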
The loss function expresses how closely the output labels match the true labels, so it can be used directly to measure the quality of the DNN training results. A suitable optimization algorithm for the loss function can improve the recognition accuracy of the network. Commonly used optimization algorithms include stochastic gradient descent, stochastic gradient descent with momentum, root mean square propagation, and adaptive moment estimation.
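Stochastic gradient descent method
The standard gradient descent method updates the network weights, biases and other parameters along the negative gradient direction of the loss function at each iteration in order to minimize the loss function. It can be expressed as:
$$\theta_{t+1} = \theta_t - \eta \nabla L(\theta_t)$$
where $\theta_t$ denotes the network parameters at iteration $t$, $\eta$ is the learning rate, and $\nabla L(\theta_t)$ is the gradient of the loss function.
Standard gradient descent needs the entire dataset for each update, whereas stochastic gradient descent (the SGD algorithm mentioned above) uses only a subset of the training data.
Stochastic gradient descent with momentum
Adding a momentum term to the gradient descent process effectively reduces fluctuations. Stochastic gradient descent with momentum (SGDM) can be expressed as:
$$v_t = \gamma v_{t-1} + \eta \nabla L(\theta_t), \qquad \theta_{t+1} = \theta_t - v_t$$
Where $v_t$ is the accumulated update (velocity) and $\gamma$ is the momentum coefficient.
Root mean square propagation method
The root mean square propagation (RMSP) method is a loss function optimization algorithm that uses a moving average of the squared gradient to adjust the learning rate dynamically. Unlike stochastic gradient descent, which uses a single learning rate, RMSP uses a different effective learning rate for each parameter and adapts it automatically during optimization, which accelerates convergence, improves robustness and helps avoid vanishing gradients. Its expression is:
$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{s_t} + \epsilon} \, g_t$$
where
$$s_t = \rho s_{t-1} + (1 - \rho) g_t^{2}$$
where $g_t = \nabla L(\theta_t)$ is the gradient, $s_t$ is the moving average of the squared gradient, $\rho$ is the decay rate, and $\epsilon$ is a small constant that prevents division by zero.
Adaptive moment estimation
Adaptive moment estimation (Adam) is similar to RMSP. The difference is that RMSP uses only an exponentially weighted moving average of the squared gradient to adjust the learning rate, whereas Adam tracks both the first moment (mean) and the second moment (uncentered variance) of the gradient, so it incorporates a momentum term while adjusting the learning rate adaptively. Its expression is:
$$\theta_{t+1} = \theta_t - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{s}_t} + \epsilon}$$
Among them:
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \qquad s_t = \beta_2 s_{t-1} + (1 - \beta_2) g_t^{2}$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^{t}}, \qquad \hat{s}_t = \frac{s_t}{1 - \beta_2^{t}}$$
where $m_t$ and $s_t$ are the moving averages of the gradient and of the squared gradient, $\beta_1$ and $\beta_2$ are their decay rates, and $\hat{m}_t$ and $\hat{s}_t$ are the bias-corrected estimates.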
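To make the differences between the four optimizers concrete, the sketch below applies each update rule to a one-dimensional toy loss $L(\theta) = (\theta - 3)^2$, whose minimum is at $\theta = 3$. The toy loss, the hyperparameter values and the function names are assumptions for illustration only; they are common defaults, not the settings used in this paper, and on a one-dimensional deterministic loss there is no mini-batch sampling.

```python
import math


def grad(theta):
    """Gradient of the toy loss L(theta) = (theta - 3)^2."""
    return 2.0 * (theta - 3.0)


def sgd(theta, steps, lr=0.1):
    """Plain gradient step: move against the gradient."""
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta


def sgd_momentum(theta, steps, lr=0.1, gamma=0.9):
    """Accumulate a velocity so that consecutive steps smooth each other."""
    velocity = 0.0
    for _ in range(steps):
        velocity = gamma * velocity + lr * grad(theta)
        theta -= velocity
    return theta


def rmsprop(theta, steps, lr=0.01, rho=0.9, eps=1e-8):
    """Scale each step by a moving average of the squared gradient."""
    sq_avg = 0.0
    for _ in range(steps):
        g = grad(theta)
        sq_avg = rho * sq_avg + (1.0 - rho) * g * g
        theta -= lr * g / (math.sqrt(sq_avg) + eps)
    return theta


def adam(theta, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Combine a momentum-like first moment with RMSP-style scaling."""
    m = 0.0
    s = 0.0
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        s = beta2 * s + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the first moment
        s_hat = s / (1.0 - beta2 ** t)  # bias correction for the second moment
        theta -= lr * m_hat / (math.sqrt(s_hat) + eps)
    return theta


for name, result in [
    ("SGD", sgd(0.0, 200)),
    ("SGDM", sgd_momentum(0.0, 300)),
    ("RMSP", rmsprop(0.0, 2000)),
    ("Adam", adam(0.0, 500)),
]:
    print(name, round(result, 3))
```

All four variants end close to 3. RMSP keeps taking steps of roughly the learning rate even near the minimum, so with a fixed learning rate it settles into a small oscillation around the optimum rather than converging exactly.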
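Feed-forward neural networks, also known as multilayer perceptrons, are among the most basic and most widely used neural network structures. A deep neural network (DNN) is a multilayer perceptron containing many (more than two) hidden layers. The DNN shown in Fig. 2 consists of one input layer, three hidden layers, and one output layer. The input layer of the DNN is denoted layer 0 and the output layer is denoted layer $L$. The operations of the neural network can be defined as:
$$v^{\ell} = f(z^{\ell}) = f(W^{\ell} v^{\ell-1} + b^{\ell}), \quad 0 < \ell < L$$
Here $z^{\ell}$, $v^{\ell}$, $W^{\ell}$ and $b^{\ell}$ are the excitation vector, activation vector, weight matrix and bias vector of layer $\ell$, $v^{0}$ is the input feature vector, and $f(\cdot)$ is the activation function applied element-wise to the excitation vector.

Deep neural network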
The sigmoid function:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
The hyperbolic tangent function:
$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
The hyperbolic tangent function is a rescaled version of the sigmoid function, and the two have the same modeling capability. The difference is that the range of the sigmoid function is (0, 1), which helps to obtain a sparser representation, whereas the range of tanh is (-1, 1), which is symmetric and makes training easier.
Rectified Linear Unit (ReLU):
$$\mathrm{ReLU}(z) = \max(0, z)$$
Its derivative is simpler, and it is much less prone to vanishing gradients as the number of layers increases.
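The three activation functions and their derivatives can be compared directly. The sketch below is illustrative (the function names are not from the paper); it also checks the identity $\tanh(z) = 2\sigma(2z) - 1$, which is the precise sense in which tanh is a rescaled sigmoid, and shows why saturating activations contribute to vanishing gradients.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh(z):
    return math.tanh(z)


def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    # The derivative at exactly 0 is undefined; 0 is used here by convention.
    return 1.0 if z > 0 else 0.0


z = 0.7
# tanh is the sigmoid stretched to the range (-1, 1).
print(abs(tanh(z) - (2.0 * sigmoid(2.0 * z) - 1.0)) < 1e-12)

# For a large input the sigmoid and tanh derivatives are close to zero,
# while the ReLU derivative stays at 1.
print(sigmoid_grad(10.0), tanh_grad(10.0), relu_grad(10.0))
```

Because back-propagation multiplies these derivatives layer by layer, values far below 1 shrink the gradient quickly in deep networks, whereas ReLU passes it through unchanged for positive inputs.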
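The choice of output layer for a deep neural network usually depends on the task. For a regression task, a linear layer is usually used:
$$v^{L} = z^{L} = W^{L} v^{L-1} + b^{L}$$
where $W^{L}$ and $b^{L}$ are the weight matrix and bias vector of the output layer and $v^{L-1}$ is the activation vector of the last hidden layer.
For multi-class classification tasks, each neuron of the output layer usually represents one category, where the value of the $i$-th output neuron $v_i^{L}$ represents the probability that the input belongs to category $i$. To make the outputs a valid probability distribution, they are normalized with the softmax function:
$$v_i^{L} = \mathrm{softmax}_i(z^{L}) = \frac{e^{z_i^{L}}}{\sum_{j=1}^{C} e^{z_j^{L}}}$$
where $C$ is the number of categories.
Given an input feature vector, the output of the DNN is determined by the model parameters $\{W, b\} = \{W^{\ell}, b^{\ell} \mid 0 < \ell \le L\}$: the activation vectors are computed layer by layer from layer 1 to layer $L-1$, and the output layer is then evaluated. This procedure is called forward computation.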
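Forward computation through a small classifier can be sketched as follows. The layer sizes (4 inputs, 16 hidden units, 3 output categories), the ReLU hidden activation, the random weights and all names are assumptions for illustration; the weights are untrained, so the output probabilities carry no meaning beyond showing that softmax yields a valid distribution.

```python
import math
import random


def affine(weights, bias, vec):
    """Excitation of one layer: weights @ vec + bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def softmax(vec):
    """Normalize excitations to probabilities (shifted by max for stability)."""
    top = max(vec)
    exps = [math.exp(v - top) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layers, features):
    """Apply hidden layers with ReLU, then a softmax output layer."""
    activation = features
    for weights, bias in layers[:-1]:
        activation = relu(affine(weights, bias, activation))
    out_weights, out_bias = layers[-1]
    return softmax(affine(out_weights, out_bias, activation))


rng = random.Random(0)
layers = [init_layer(16, 4, rng), init_layer(3, 16, rng)]  # hypothetical sizes
probs = forward(layers, [0.2, 0.5, 0.1, 0.9])
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, "predicted category index:", predicted)
```

The predicted category is simply the index of the largest output probability.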
The DNN constructed in this paper consists of an input layer, hidden layers, an output layer and a softmax function. The input layer has four neurons corresponding to the four features in the IRIS dataset, which are used as the input vector. There are two hidden layers, with five and six neurons respectively. The output layer has three neurons, corresponding to the number of categories of the target variable in the IRIS dataset. Finally, the softmax function is applied to solve the multi-class classification problem.
In this section, the PyTorch deep learning framework is used to build the DNN-based legal risk recognition model. The 1602 data records are split into a training set and a test set at a ratio of about 2:1, giving 1073 training records and 529 test records. Because the amount of data is small, the DNN built in this paper is relatively simple and has three layers: an input layer, one hidden layer with 16 neurons, and an output layer. The risk level used for legal risk assessment has three levels (high, medium, and low), so the output layer has 3 categories. A softmax classifier computes the output probability of each category, and the ReLU activation function is used to mitigate vanishing gradients, speed up computation and increase the sparsity of the network. Its expression is:
$$\mathrm{ReLU}(x) = \max(0, x)$$
The hyperparameters are then set: the learning rate is 0.001, the number of iterations is 200, the batch size is 50, the loss function is the cross-entropy loss, and the training mode is retraining. Once setup is complete, the network is trained on the GPU. During training, two quantities need to be monitored: the loss value and the accuracy. A decreasing loss indicates that the error is converging, and an increasing accuracy indicates that recognition is improving; together these trends indicate whether the resulting model is satisfactory.
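The reported dataset sizes and hyperparameters imply a fixed amount of work per training run, which the short calculation below makes explicit. It assumes that the 200 iterations are epochs (full passes over the training set) and that the final, smaller mini-batch of each epoch is kept rather than dropped; neither detail is stated in the text, and the variable names are illustrative.

```python
import math

N_TOTAL = 1602       # records in the dataset
N_TRAIN = 1073       # training records
N_TEST = 529         # test records
BATCH_SIZE = 50
EPOCHS = 200         # assumption: "iterations" means epochs
LEARNING_RATE = 0.001

# The split accounts for every record.
assert N_TRAIN + N_TEST == N_TOTAL

train_fraction = N_TRAIN / N_TOTAL
batches_per_epoch = math.ceil(N_TRAIN / BATCH_SIZE)
last_batch_size = N_TRAIN - (batches_per_epoch - 1) * BATCH_SIZE
total_updates = batches_per_epoch * EPOCHS

print("training fraction:", round(train_fraction, 3))
print("mini-batches per epoch:", batches_per_epoch)
print("size of the last mini-batch:", last_batch_size)
print("parameter updates over the run:", total_updates)
```

Under these assumptions each epoch consists of 21 full mini-batches plus one of 23 records, so the whole run performs 4400 parameter updates.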
The loss curves of the DNN risk recognition model are shown in Fig. 3. The train loss and test loss curves follow essentially the same trend: both start at a loss of about 1 and fall sharply, the decline slows after about 40 epochs, and after about 100 epochs both curves gradually stabilize, indicating that the network is converging. Training is completed at 200 epochs, by which point the loss value has dropped to 0.1.

Change in the loss curve
The accuracy curves of the DNN risk recognition model are shown in Fig. 4. The training accuracy rises sharply as the number of iterations increases, reaches about 90% at around 20 epochs, where the rise begins to slow, and converges at around 80 epochs with an accuracy of about 98%. It then remains stable until the end of training. The test accuracy follows roughly the same trend. The loss and accuracy curves show that the DNN legal risk identification model converges stably, and the final accuracy and loss values indicate that the resulting model performs well.

Change in the accuracy curve
Common model evaluation metrics include precision, recall, and the F1 value. Precision is the number of key elements correctly identified by the model (TP) as a proportion of all elements the model identified as key (TP+FP), where FP is the number of non-key elements incorrectly identified as key. Recall is the ratio of the number of key elements correctly identified by the model (TP) to the number of all true key elements (TP+FN), where FN is the number of key elements that the model failed to identify. The F1 value is the harmonic mean of precision and recall.
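The three metrics follow directly from the counts TP, FP and FN. The sketch below uses made-up counts purely to show the arithmetic; they are not results from this paper, and the function name is illustrative.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 80 key elements found, 20 false alarms, 10 missed.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

With these counts the precision is 80/100, the recall is 80/90, and the F1 value equals 2TP/(2TP+FP+FN) = 160/190, which lies between the two and closer to the smaller one.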
The previous section constructed and trained the risk identification model and verified its performance. This section validates how effectively the model identifies legal risks in social work programs. Control experiments were designed on the constructed legal risk dataset of social work programs, using HMM, CRF, BiLSTM, LSTM, BiLSTM-CRF, and BERT-CRF for comparison. The results are shown in Fig. 5. The deep-learning-based legal risk recognition model of this paper achieves the best results on all three metrics, with precision, recall, and F1 values of 89.32%, 86.23%, and 87.35%, respectively. Apart from the proposed model, the BiLSTM-CRF and BERT-CRF models have relatively high recognition rates, with precision of 87.25% and 88.12%, respectively.

Experimental comparison
The legal risk elements of social work programs are classified into six types: violation risk (S1), exercise risk (S2), contract risk (S3), tort risk (S4), liability risk (S5), and other risks (S6). The identification accuracy of each model on these elements is shown in Table 1. Averaged over the six elements, the model in this paper reaches an accuracy of 88.152%, higher than every other model. On exercise risk (S2) it is 0.03 percentage points below BERT-CRF (95.32% versus 95.35%) and also below BiLSTM (96.11%), and on violation risk (S1) it is 0.11 percentage points below BiLSTM-CRF (97.63%). On the remaining four elements (S3 to S6) it achieves the highest accuracy, with the largest gains on contract risk (71.35%) and tort risk (70.15%). These results verify the effectiveness of the model in identifying legal risks in social work programs.
Identification accuracy (%) of legal risk elements by model
| Model | S1 | S2 | S3 | S4 | S5 | S6 | Mean |
|---|---|---|---|---|---|---|---|
| HMM | 93.85 | 80.85 | 48.15 | 32.42 | 54.58 | 83.18 | 65.505 |
| CRF | 95.73 | 93.12 | 48.25 | 35.15 | 58.92 | 94.32 | 70.915 |
| BiLSTM | 96.15 | 96.11 | 50.12 | 41.53 | 91.04 | 94.38 | 78.222 |
| BiLSTM-CRF | 97.63 | 94.60 | 48.15 | 55.35 | 91.53 | 95.21 | 80.412 |
| LSTM | 96.99 | 86.15 | 50.02 | 40.15 | 92.65 | 94.21 | 76.695 |
| BERT-CRF | 97.42 | 95.35 | 51.36 | 57.82 | 93.32 | 96.15 | 81.903 |
| Ours | 97.52 | 95.32 | 71.35 | 70.15 | 96.68 | 97.89 | 88.152 |
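The Mean column and the per-element comparison can be recomputed from the table entries. The sketch below copies the accuracies from Table 1 and derives the means, the gap between the proposed model and each baseline, and the best model for each risk element; the variable names are illustrative.

```python
RISKS = ["S1", "S2", "S3", "S4", "S5", "S6"]

# Identification accuracy (%) per model, in the order S1..S6, from Table 1.
accuracy = {
    "HMM":        [93.85, 80.85, 48.15, 32.42, 54.58, 83.18],
    "CRF":        [95.73, 93.12, 48.25, 35.15, 58.92, 94.32],
    "BiLSTM":     [96.15, 96.11, 50.12, 41.53, 91.04, 94.38],
    "BiLSTM-CRF": [97.63, 94.60, 48.15, 55.35, 91.53, 95.21],
    "LSTM":       [96.99, 86.15, 50.02, 40.15, 92.65, 94.21],
    "BERT-CRF":   [97.42, 95.35, 51.36, 57.82, 93.32, 96.15],
    "Ours":       [97.52, 95.32, 71.35, 70.15, 96.68, 97.89],
}

# Mean accuracy per model, matching the Mean column of the table.
means = {model: sum(vals) / len(vals) for model, vals in accuracy.items()}

# Gap in percentage points between the proposed model and each baseline.
gaps = {model: means["Ours"] - mean
        for model, mean in means.items() if model != "Ours"}

# Model with the highest accuracy on each risk element.
best_per_risk = {
    risk: max(accuracy, key=lambda model: accuracy[model][i])
    for i, risk in enumerate(RISKS)
}

for model, mean in means.items():
    print(f"{model:<11} mean={mean:.3f}")
print("smallest gap:", round(min(gaps.values()), 3))
print("largest gap:", round(max(gaps.values()), 3))
print(best_per_risk)
```

Recomputing the means reproduces the Mean column, and the gaps between the proposed model and the baselines range from about 6.2 to about 22.6 percentage points.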
The issue of legal risk is particularly important in the implementation of social work programs, and this paper, based on deep learning, centers on the identification of legal risk in the implementation of social work programs. The full work is summarized as follows:
A legal risk identification model was constructed based on a DNN and trained. After the model stabilizes, its loss value remains below 0.1 and its accuracy stays at around 98%, which provides an initial verification of its performance. In the comparison with HMM, CRF, BiLSTM, LSTM, BiLSTM-CRF and BERT-CRF, the precision, recall, and F1 value of the proposed legal risk identification model are 89.32%, 86.23%, and 87.35%, respectively. Compared with BERT-CRF, the strongest of the other models, these values are higher by 1.2, 5.18, and 2.82 percentage points, respectively, a clear improvement in legal risk identification. Across the six types of risk (violation risk, exercise risk, contract risk, tort risk, liability risk and other risks), the model achieves the highest accuracy on four types and is within 0.8 percentage points of the best model on violation risk and exercise risk. Its average recognition accuracy is 88.152%, which is about 6.2 to 22.6 percentage points higher than the other models, further demonstrating its effectiveness for social work programs.
