Deep Learning-based User Behavior Data Mining in Precise Recommendation of E-commerce Platforms
Published online: 17 March 2025
Received: 16 Oct. 2024
Accepted: 28 Jan. 2025
DOI: https://doi.org/10.2478/amns-2025-0327
© 2025 Lanyan Yang, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In the “Internet +” environment, China’s e-commerce market is developing rapidly and penetrating deeply into business, and the network has become the backbone that drives this development. Facing fierce competition from peers and lacking experience in transformation, traditional enterprises perform poorly in e-commerce operations and product promotion, are under pressure to transform, and still lag well behind the e-commerce giants [1-3]. How to tap marketing potential and gain e-commerce market share in such a competitive environment is a problem that traditional enterprises urgently need to solve.
At present, enterprises in the e-commerce environment can use a variety of data mining techniques and tools to process massive amounts of data, find valuable latent data features, and use them to support sound decisions [4-6]. Among these data, mobile user behavior information is of great significance to both e-commerce enterprises and users because it reflects users’ preferences and purchase intentions [7-8]. On the one hand, an e-commerce recommender system can analyze large amounts of data, accurately identify users’ real needs from their browsing behavior, attract more potential consumers for merchants, and remind existing users when new products are launched, thereby expanding both the sales range and the customer base [9-12]. On the other hand, for consumers, the recommender system optimizes the purchasing interface: the home page displays products of interest and offers personalized recommendations, which saves the time users spend searching, improves their goodwill towards the platform, and makes purchases easier [13-15]. In this way, the recommender system creates a win-win situation for merchants and users and pushes the development of e-commerce to a new peak.
In e-commerce, the core problem that a personalized recommendation system must solve is how to recommend products by inferring and matching user preferences, so as to maximize merchants’ interests while providing users with better service. Literature [16] constructed a usefulness-based recommendation classification framework built around a review semantic extractor and an item recommendation generator: the extractor classifies the usefulness of user reviews, and the generator models users’ item preferences to support purchasing decisions in personalized recommendation services. Literature [17] proposed a general framework for Pareto-efficient recommendation with theoretical guarantees, applied the resulting Pareto-efficient multi-objective recommendation method to e-commerce recommendation to jointly optimize the number of item transactions and the click-through rate, and validated the Pareto efficiency of the framework experimentally. Literature [18] studied an online shopping behavior analysis and prediction system; to avoid overfitting, a fusion algorithm combined the predictions of a logistic regression model and an XGBoost model, and experiments showed that the hybrid model achieved high classification accuracy. Literature [19] showed that the cold-start problem degrades product recommendation performance for new users, and therefore built a cross-domain recommender system that shares users between the online shopping domain and the advertisement recommendation domain, using deep learning to model users so that products can be recommended more accurately to users in other domains.
Literature [20] proposed an intelligent recommendation system based on ensemble learning to address the large number of unnecessary recommendations and the unpredictability of new products on e-commerce platforms; by analyzing behavioral information such as users’ purchase records, it significantly reduces repeated and irrelevant recommendations. Literature [21] developed OntoCommerce, a semantics-driven online e-commerce product recommendation system that provides personalized product recommendations by collecting user queries, navigation records, and user profiles and computing semantic similarity with an enriched normalized pointwise mutual information measure.
In this paper, based on a deep learning model and user behavior sequences, we construct PMCA-BiLSTM, a sequence recommendation model based on a pre-trained multi-scale convolutional attention recurrent neural network, design an accurate product recommendation system for e-commerce platforms around this model, and examine the performance of both the system and the model. The PMCA-BiLSTM model uses BiLSTM to encode user behavior sequences; uses pooling, attention, and a multi-scale convolutional residual neural network to extract user interest features; and lets the user interest vectors interact with the target product vector to predict whether a user will click on the target product. To examine system and model performance, this paper stress tests the system and compares the HR and NDCG metrics of the model across different datasets, numbers of iterations, and Top-K settings.
This paper proposes PMCA-BiLSTM, a sequence recommendation model based on a pre-trained multi-scale convolutional attention recurrent neural network. By combining deep learning with user behavior sequences, the model mines user behavior sequence data to recommend products accurately to users of e-commerce platforms.
The user set
In the sequence recommendation problem, a user’s interactions with items are usually arranged by timestamp into an ordered sequence that serves as the input to the sequence recommendation model. The model learns the dependencies among these interactions and predicts the user’s most probable interaction at the next time step, as defined in Equation (1):
where
The structure of the PMCA-BiLSTM model is shown in Fig. 1. It mainly consists of an input layer, a sequence encoding layer, a feature extraction layer, a splicing layer, and an output layer. The input layer consists of multiple pre-trained models; the sequence encoding layer adopts a bidirectional recurrent neural network; the feature extraction layer uses pooling, attention, and a multi-scale convolutional residual neural network to extract user interest features from multiple perspectives; the splicing layer concatenates the user interest feature vectors; and the output layer adopts an MLP network with a residual structure, in which the user interest vectors interact with the target product vector to predict whether the user will click on the target product.

Figure 1. The structure of the PMCA-BiLSTM model
As shown in Fig. 1, given a sequence of user history behaviors
In this paper, several pre-training models are used to pre-train the user behavior sequence represented by
In recommendation tasks based on user behavioral data, Bidirectional LSTM (BiLSTM) is usually used to better extract contextual relationship information between items and capture bidirectional dependencies between items.
The LSTM network effectively alleviates the long-term dependency problem, in which the information carried by early inputs fades at later time steps. Through three gating structures, the LSTM controls how much information from previous units is retained, how much input information is stored, and what the unit outputs [22]. Equations (2) to (8) show the update process of LSTM:
The parameters used in the above update process are:
BiLSTM consists of two LSTMs running in opposite directions, which process the sequence bidirectionally to capture further sequence dependencies [23]. The BiLSTM update process is as follows:
As an improvement of LSTM, BiLSTM also has the same computation of LSTM,
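For reference, a standard textbook formulation of the LSTM and BiLSTM updates is sketched below. The symbols (input $x_t$, hidden state $h_t$, cell state $c_t$, weights $W_\ast$, biases $b_\ast$) follow common convention and are assumptions; they may not match the notation or numbering of Equations (2) to (8) in the original.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}\\
\overrightarrow{h}_t &= \mathrm{LSTM}\left(x_t, \overrightarrow{h}_{t-1}\right),\quad
\overleftarrow{h}_t = \mathrm{LSTM}\left(x_t, \overleftarrow{h}_{t+1}\right),\quad
h_t^{\mathrm{Bi}} = \left[\overrightarrow{h}_t ; \overleftarrow{h}_t\right] && \text{(BiLSTM)}
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ is element-wise multiplication; the BiLSTM output at each step concatenates the forward and backward hidden states.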
The feature extraction layer is divided into three parts: attention mechanism, multi-scale convolutional residual neural network and pooling.
In sequence recommendation, modeling user behavior with an attention mechanism makes it possible to distinguish the relevance of the items in a sequence [24]. By dynamically weighting and recombining the items, the user’s long-term preferences and true intentions are captured. The main process is to determine the attention weights by comparing the hidden state vectors of the BiLSTM. First, the hidden states are mapped nonlinearly and scored for similarity against a weight matrix; the scores are then normalized by the Softmax function into attention weights, and all hidden vectors are weighted and summed to obtain the encoded feature vector produced by the attention mechanism. Let the output vector of BiLSTM hidden layer be
where
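The attention pooling described above can be sketched in plain Python as follows. This is a minimal illustration, not the paper's implementation: the projection matrix `w`, the scoring vector `v`, and the use of `tanh` as the nonlinearity are assumptions, since the original equations are not available here.

```python
import math

def attention_pool(hidden_states, w, v):
    """Additive-style attention over a sequence of hidden vectors.

    hidden_states: list of T vectors, each of length d (BiLSTM outputs).
    w: d x d projection matrix as a list of rows (hypothetical parameter).
    v: scoring vector of length d (hypothetical parameter).
    Returns (attention weights, weighted-sum context vector).
    """
    scores = []
    for h in hidden_states:
        # Nonlinear mapping of the hidden state: u = tanh(W h)
        u = [math.tanh(sum(wj * hj for wj, hj in zip(row, h))) for row in w]
        # Similarity score between the mapped state and the scoring vector
        scores.append(sum(vi * ui for vi, ui in zip(v, u)))
    # Numerically stable softmax over the time steps
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the original hidden states
    d = len(hidden_states[0])
    context = [
        sum(weights[t] * hidden_states[t][k] for t in range(len(hidden_states)))
        for k in range(d)
    ]
    return weights, context
```

Items whose mapped hidden state aligns more closely with the scoring vector receive larger weights and therefore contribute more to the resulting interest vector.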
To better capture the multiple behavioral patterns in a user’s behavior sequence, this paper introduces the residual SENet, a more effective residual network than ResNet [25]. The traditional residual structure only adds the input to the output and cannot select among the features of different input channels. SENet therefore introduces a channel-weighted residual technique, the SE-Block, which weights channel data to emphasize useful information across channels and suppress useless information, similar to a channel-wise attention mechanism.
The SE-Block is mainly divided into two parts: the squeeze operation and the excitation operation. The squeeze operation encodes the entire spatial feature of a channel into a global feature, implemented with global average pooling to obtain a global descriptor for each channel. The excitation operation models the relationships between different channels and learns a weight for each channel. Two fully connected layers are utilized, the first FC layer acts as a dimensionality reduction with a coefficient of
where
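A minimal plain-Python sketch of the squeeze and excitation steps is given below. The weight matrices `w1` and `w2`, the omission of biases, and the ReLU/sigmoid activations follow the usual SE-Block design and are assumptions rather than details confirmed by the text above.

```python
import math

def se_block(channels, w1, w2):
    """Squeeze-and-excitation over a list of channels.

    channels: list of C channels, each a list of feature values.
    w1: (C/r) x C matrix for the reducing FC layer (hypothetical weights).
    w2: C x (C/r) matrix for the restoring FC layer (hypothetical weights).
    Returns (per-channel gates, rescaled channels).
    """
    # Squeeze: global average pooling gives one descriptor per channel
    z = [sum(ch) / len(ch) for ch in channels]
    # Excitation, step 1: dimensionality-reducing FC layer followed by ReLU
    s = [max(0.0, sum(wi * zi for wi, zi in zip(row, z))) for row in w1]
    # Excitation, step 2: restoring FC layer followed by sigmoid gates in (0, 1)
    gates = [
        1.0 / (1.0 + math.exp(-sum(wi * si for wi, si in zip(row, s))))
        for row in w2
    ]
    # Reweight every channel by its learned gate
    scaled = [[g * x for x in ch] for g, ch in zip(gates, channels)]
    return gates, scaled
```

In a residual variant, the rescaled channels would then be added back to the block input, so that informative channels are emphasized without discarding the original signal.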
In this paper, we choose two pooling operations, maximum pooling and average pooling, whose formulas are shown below:
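As an illustration, the two pooling operations can be read as element-wise reductions over the time dimension of the encoded sequence. The sketch below assumes the input is a list of equally sized hidden vectors; the function names are hypothetical.

```python
def max_pool(hidden_states):
    """Element-wise maximum over all time steps."""
    return [max(col) for col in zip(*hidden_states)]

def avg_pool(hidden_states):
    """Element-wise mean over all time steps."""
    return [sum(col) / len(col) for col in zip(*hidden_states)]
```

Maximum pooling keeps the strongest response in each dimension, while average pooling summarizes the overall tendency of the sequence.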
The purpose of the output layer is to predict the probability that the target user
In this paper, an MLP network with a residual structure is used as the output of the model. The residual structure not only helps preserve valid information in the model but also further reduces overfitting caused by network depth. The MLP network dimensions are set as 512 → 256 → 128 → 1. The model is used for the CTR task, and CTR prediction is usually defined as a binary classification task. The objective of the model is to learn the prediction function
where
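Binary classification objectives for CTR prediction are commonly trained with the cross-entropy (log) loss. The sketch below assumes that choice; the loss actually used in the original is not recoverable from the text above, so this is illustrative only.

```python
import math

def log_loss(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 click labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip probabilities to avoid log(0)
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)
```

A confident correct prediction yields a loss close to 0, while a confident wrong prediction is penalized heavily.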
To further validate the effectiveness and feasibility of the model in real product recommendation, this paper combines the proposed recommendation model with the user behavior datasets to design and implement a product recommendation system for e-commerce platforms, which visually demonstrates the functions of the recommendation system.
A common product recommendation system usually consists of three parts: user online shopping, administrator back-end management, and personalized product recommendation. The users of the website are divided into registered members and unregistered visitors. Registered members can log in with their account and password, face recognition, or a cell phone verification code. Unregistered visitors can only browse and view the product information on the web page and cannot receive personalized recommendations based on their history. Based on this analysis, the user and administrator requirements of the whole system are summarized as follows:
User functional requirements: user login and registration, product category browsing, product search, personalized recommendations, product collection (favorites), adding products to the cart, product purchases, and product comments and ratings. Administrator functional requirements: administrator login; adding, deleting, and modifying products; user rights management; and website traffic statistics.
For visitors and new users entering the recommendation system, only a ranking of popular products is displayed, since no historical information is available. For registered users with history, the system recommends popular products and also makes personalized recommendations based on their historical behavior records; at the same time, the user’s new behavior logs are recorded in the database to provide data support for the user’s next visit.
In this paper, the accurate product recommendation system for e-commerce platforms is designed with a B/S (browser/server) architecture. The overall structure of the system is shown in Figure 2; the system is mainly composed of an interface layer, a business logic layer, and a data access layer.

Figure 2. System architecture
The main function of the interface layer is to display the user interface. After logging in, the user’s operations, such as clicking on pages, adding products to the cart, and purchasing, generate behavior logs, and the resulting data is transmitted to the back end through HTTP requests. The system separates front end and back end: the front end is built with Vue3 and is responsible for displaying product data, while the back end implements the business logic. This avoids conflicts between front-end and back-end development and makes the code easier to maintain.
The business logic layer is mainly responsible for the interaction between the front end and the back end. Requests generated by the interface layer are transmitted to the business logic layer over HTTP; after processing, they are passed on to the back end to operate on the database, and the results are fed back to the front end for display. The business logic layer thus opens the interaction channel between the interface layer and the data access layer, which makes communication across the whole system smoother.
The data access layer uses a MySQL database to store the user behavior data generated by the system. The recommendation model, built with TensorFlow, is trained on the stored data, and the results are fed back to the front end through the business logic layer to generate a recommendation list for the user.
The system checks the user’s status at login. New users must register an account, while existing users enter their account and password to access the system. Users are required to provide a cell phone number or e-mail address when they first log in, so that they can verify their identity and retrieve their password if they forget it. The flow of the user registration and login module is shown in Figure 3.

Figure 3. User registration and login module
When a user enters the system, their identity is determined first. A new user is taken to the new-user product recommendation page, which provides recommendations of currently popular products and product search. An existing user receives recommendations based on their historical behavior. The overall process of the accurate recommendation module is shown in Figure 4.

Figure 4. Accurate recommendation module for goods
After a new user enters the system, the page provides a list of popular product recommendations. If the user is interested, the system displays the recommendation accuracy and records the user’s behavior data; if not, the list is replaced with a new list of popular recommendations until the user views a product. If the user instead searches directly by product keyword, the behavior data is recorded, and the results computed by the recommendation algorithm are pushed to the user.
When an existing user enters the system, the system displays a recommendation list based on their historical shopping behavior; data such as product collections, ratings, cart additions, and purchase history become the basis for recommendations. If the user has purchased consumable goods, these continue to be recommended, while the recommendation weight of non-consumable goods is reduced accordingly. If the user is not interested in a recommended product, its recommendation weight is also reduced. The user’s real-time operations drive dynamic recommendation until the user goes offline and the recommendation session ends.
To cope with the stress scenarios that may occur in real applications, this paper uses the JMeter tool to conduct performance and stress tests on the system. In this test, the concurrency is set to 4 threads per second, each thread simulates one user, and the thread group runs for 100 seconds, covering a variety of user requests for product interactions as well as a variety of recommendation scenarios. The system performance and stress test results are shown in Table 1.
Table 1. Performance and stress test results of the system
| Request type | Samples | Average response time (ms) | Median response time (ms) | 90th percentile response time (ms) | 99th percentile response time (ms) | Minimum response time (ms) | Maximum response time (ms) | Error rate |
|---|---|---|---|---|---|---|---|---|
| Log-in | 1600 | 128 | 84 | 159 | 347 | 58 | 482 | 0.00% |
| Click | 3200 | 56 | 41 | 68 | 239 | 52 | 354 | 0.00% |
| Collect | 2000 | 65 | 49 | 75 | 226 | 35 | 417 | 0.00% |
| Purchase | 1600 | 291 | 173 | 329 | 648 | 121 | 2213 | 0.00% |
| Evaluation | 1200 | 535 | 378 | 691 | 2384 | 171 | 2796 | 0.00% |
| Personalized recommendation | 3200 | 296 | 281 | 397 | 715 | 178 | 969 | 0.00% |
In the test environment, the back end runs on the Windows operating system, and the front end uses the Chrome and Edge browsers. The results in Table 1 show that the longest response time for a user request is 2796 ms, which is within 3 seconds, and the error rate of all test requests is 0, indicating that all simulated requests are processed correctly by the system. These results suggest that the system can provide users with a good experience.
Two publicly available, representative e-commerce datasets were chosen for training and testing the recommendation model in this paper: Yoochoose and Diginetica. The Yoochoose dataset is taken from the RecSys Challenge 2015 website and contains six months of user clickstream data from an e-commerce website. The Diginetica dataset comes from the CIKM Cup 2016 website; only its transaction data is used. Yoochoose is divided into two datasets, Yoochoose1/64 and Yoochoose1/4, formed from the 1/64 and 1/4 of the Yoochoose sessions that are closest in time to the test sessions, respectively. In this paper, Yoochoose1/64 and Diginetica, one smaller and one larger dataset, are selected for the experiments, and the statistics of the processed datasets are shown in Table 2.
Table 2. Statistics of the datasets used in the experiments
| Statistic | Yoochoose1/64 | Diginetica |
|---|---|---|
| Number of clicks | 605134 | 993528 |
| Training sessions | 372496 | 725681 |
| Test sessions | 59759 | 61543 |
| Number of items | 17355 | 44262 |
| Average session length | 6.22 | 5.34 |
The datasets need to be preprocessed before the experiments so that the data is used more effectively and the training task is clearly defined. First, both the Yoochoose and Diginetica datasets are filtered: items with fewer than 5 occurrences are removed, as are all sessions of length 1. After filtering, the Yoochoose dataset contains 7632469 sessions and 35847 items, while the Diginetica dataset contains 205374 sessions and 42875 items.
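The filtering step could look roughly like the sketch below. The function name, the representation of a session as a list of item IDs, and the order of the two filters (items first, then short sessions) are assumptions made for illustration.

```python
from collections import Counter

def filter_sessions(sessions, min_item_count=5, min_session_len=2):
    """Drop rare items, then drop sessions that became too short.

    sessions: list of sessions, each a list of item IDs in click order.
    """
    # Count how often every item occurs across all sessions
    counts = Counter(item for session in sessions for item in session)
    kept = []
    for session in sessions:
        # Keep only items that occur at least `min_item_count` times
        filtered = [item for item in session if counts[item] >= min_item_count]
        # Discard sessions of length 1 (or empty) after item filtering
        if len(filtered) >= min_session_len:
            kept.append(filtered)
    return kept
```

Sessions that shrink to a single item after rare items are removed are discarded as well, since they no longer provide an input and label pair.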
Next, data augmentation is performed on each session: the input sequence is split into prefixes to generate sequences and their corresponding labels. For Yoochoose, the sessions of the last day form the test set; for Diginetica, the sessions of the last week form the test set. The remaining data in each dataset constitutes the training set.
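One common way to realize this splitting, assumed here for illustration, is to treat every proper prefix of a session as an input and the item that follows it as the label. The function name is hypothetical.

```python
def augment_session(session):
    """Split one session into (prefix, next-item label) training pairs."""
    return [(session[:i], session[i]) for i in range(1, len(session))]
```

For example, the session `[1, 2, 3]` yields the pairs `([1], 2)` and `([1, 2], 3)`.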
Finally, the Yoochoose dataset is processed further: the training sessions are sorted by time, and the 1/64 and 1/4 of the sessions that are closest in time to the test set are selected to form two training sets. To maintain consistency in the test set, it is ensured that commodities do not appear in both the test set and the training set at the same time.
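Selecting the most recent fraction of the training sessions can be sketched as follows, assuming the sessions are already sorted from oldest to newest. The function name and the rounding down of the subset size are assumptions.

```python
def most_recent_fraction(sessions_sorted_by_time, fraction):
    """Return the newest `fraction` of sessions (input sorted oldest first)."""
    n = len(sessions_sorted_by_time)
    keep = int(n * fraction)  # size of the subset, rounded down
    return sessions_sorted_by_time[n - keep:]
```

Calling it with `fraction=1/64` or `fraction=1/4` would produce the two Yoochoose training subsets described above.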
Because the full Yoochoose dataset is very large, and in order to obtain better experimental results, this paper uses Yoochoose1/64 rather than the entire Yoochoose dataset for the experiments.
The evaluation metrics used in this paper are Hit Rate (HR) and Normalized Discounted Cumulative Gain (NDCG). HR is a commonly used recall measure that checks whether the test item appears in the Top-K list, while NDCG is a position-sensitive metric that assigns higher scores to hits at higher positions.
The hit rate reflects whether a recommendation list of length K contains items that users actually interact with. Its calculation formula is as follows:
Where
Normalized discounted cumulative gain is a position-aware evaluation metric: it gives a higher score when relevant items are ranked nearer the top of the recommendation list. If
To compare recommendation lists across users, each score needs to be normalized. For this purpose the ideal DCG (IDCG) is introduced, which is the DCG of the optimal recommendation result returned for a particular user, i.e., the result in which items are sorted by relevance with the most relevant ranked first. The normalized evaluation score of a specific user’s recommendation list is then:
Then for all users in the user set, the evaluation score of the recommendation list is:
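For next-item prediction with a single ground-truth item per test case, HR@K and NDCG@K are often computed as in the sketch below. It assumes binary relevance, so the ideal DCG equals 1 and NDCG reduces to 1/log2(rank + 1) for a hit; the function name and input layout are hypothetical.

```python
import math

def hr_ndcg_at_k(ranked_lists, targets, k):
    """Compute HR@K and NDCG@K with one relevant item per test case.

    ranked_lists: list of recommendation lists, best item first.
    targets: the ground-truth next item for each test case.
    """
    hits = 0
    ndcg = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            rank = top_k.index(target) + 1  # 1-based position of the hit
            # With one relevant item IDCG = 1, so NDCG = 1 / log2(rank + 1)
            ndcg += 1.0 / math.log2(rank + 1)
    n = len(targets)
    return hits / n, ndcg / n
```

A hit at rank 1 contributes 1.0 to NDCG, while a hit at rank 2 contributes about 0.63, which is how the metric rewards placing the target nearer the top.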
The experiments in this paper are implemented in Python with the PyTorch deep learning framework. A mini-batch Adam optimizer is used to learn the parameters; the initial learning rate is set to 0.001 and decays by a factor of 0.1 every 3 epochs, with batch_size=256, hidden_size=256, epoch=10, and an L2 penalty of 1e-05 to avoid overfitting.
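The learning rate schedule can be illustrated with a small helper. It assumes a step decay in which the rate is multiplied by 0.1 after every 3 completed epochs and that epochs are counted from 0; the function name is hypothetical.

```python
def step_decay_lr(epoch, base_lr=0.001, gamma=0.1, step=3):
    """Learning rate for a 0-indexed epoch under step decay."""
    return base_lr * gamma ** (epoch // step)
```

Under these assumptions, epochs 0 to 2 use 0.001, epochs 3 to 5 use 0.0001, and so on through the 10 training epochs.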
Many experiments have shown that traditional sequence recommendation algorithms, such as Markov chain methods, perform much worse than deep learning-based sequence recommendation. Therefore, to evaluate the performance of the proposed PMCA-BiLSTM model, this paper compares it with the following deep learning-based sequence recommendation models:
GRU: the gated recurrent unit network, an improved version of the RNN model that follows the same basic principle as the LSTM model but has a simpler structure, and that likewise alleviates the vanishing-gradient problem during training.
SR-GNN: a session sequence recommendation model that models session sequences through graph neural networks, which is more efficient than conventional sequence methods.
SR-GNN variant: to assess the role of the attention mechanism in recommendation, this variant removes the attention layer from the SR-GNN model and retains only the main structure of the graph neural network.
Comparison of evaluation metrics
The results of each model on the evaluation metrics HR@20 and NDCG@20 are compared in Figure 5. As can be seen from Fig. 5, the PMCA-BiLSTM model achieves the best results on both datasets and under both metrics, which verifies the effectiveness of the proposed method. Among the baselines, SR-GNN performs better than GRU and the SR-GNN variant. Compared with the SR-GNN model, the HR@20 and NDCG@20 of the proposed model increase by 2.62% and 5.54%, respectively, on the Yoochoose1/64 dataset, and by 0.79% and 1.75%, respectively, on the Diginetica dataset. The advantage of the proposed model is therefore pronounced on the Yoochoose1/64 dataset but modest on the Diginetica dataset.

Figure 5. Comparison of evaluation metric results
Changes of assessment metrics with the number of iterations
The changes of HR@20 and NDCG@20 with the number of iterations for each model on the two datasets are shown in Fig. 6, where (a) to (d) denote the HR@20 comparison of Yoochoose1/64, the NDCG@20 comparison of Yoochoose1/64, the HR@20 comparison of Diginetica, and the Diginetica NDCG@20 comparison.

Figure 6. Changes of the evaluation metrics with the number of iterations
As can be seen in Figure 6, all four models change sharply during the first three iterations; after three iterations the metrics tend to stabilize, and the proposed model ultimately exceeds the other models on every metric. Once training has converged, the models rank from best to worst as PMCA-BiLSTM, SR-GNN, GRU, and the SR-GNN variant. Both the proposed model and SR-GNN use an attention mechanism, and both outperform GRU and the SR-GNN variant, which lack one. A likely reason is that attention captures the user’s preference weights and thus selects the most important items effectively, making the final predictions more interpretable and accurate. Among the models with attention, the proposed method outperforms SR-GNN, which indicates that introducing residual convolutional neural networks is a feasible way to improve recommendation results.
Comparison of assessment metrics under different Top-Ks
The changes of HR and NDCG for K = 10, 20, 30, 40, and 50 on the two datasets are compared in Figure 7, where (a) to (d) show HR on the Yoochoose1/64 dataset, NDCG on Yoochoose1/64, HR on the Diginetica dataset, and NDCG on Diginetica, respectively.

Figure 7. Comparison of the evaluation metrics under different Top-K
As can be seen from Fig. 7, the PMCA-BiLSTM model outperforms the other models on both datasets and under both evaluation metrics, regardless of the length of the recommendation list. As the recommendation list grows longer, the performance of all four models improves; the models with an attention mechanism are always better than the models without attention, and the PMCA-BiLSTM model, which additionally introduces the residual convolutional neural network, is better than the SR-GNN model, which relies on the attention mechanism alone.
In this paper, based on a deep learning model and user behavior sequences, we constructed PMCA-BiLSTM, an accurate recommendation model for e-commerce platform products, designed a recommendation system based on it, and evaluated the performance of both the system and the model.
The system was tested with the back end running on the Windows operating system and the front end using the Chrome and Edge browsers. The maximum measured response time of user requests is 2796 ms, which is less than 3 s, and the error rate of all tested requests is 0. This indicates that all simulated requests are processed correctly by the system, i.e., the system performs well and can give users a good experience.
In this paper, two datasets, Yoochoose1/64 and Diginetica, are constructed using relevant data from the RecSys Challenge and CIKM Cup websites, respectively, and the recommendation performance of the models is evaluated with two metrics, Hit Rate (HR) and Normalized Discounted Cumulative Gain (NDCG). In the experiments, SR-GNN outperforms GRU and the SR-GNN variant, while the PMCA-BiLSTM model is superior to all compared models. Compared with the SR-GNN model, the HR@20 and NDCG@20 of the proposed model increase by 2.62% and 5.54%, respectively, on the Yoochoose1/64 dataset, and by 0.79% and 1.75%, respectively, on the Diginetica dataset. Across iterations, all four models change sharply during the first three iterations and then tend to stabilize, and the PMCA-BiLSTM model finally exceeds the other models on every metric. Under different Top-K settings, the proposed model also outperforms the other models. These results demonstrate the effectiveness of the proposed model for product recommendation on e-commerce platforms and indicate its practical value.
