FoodS and FoodIM: Food-Testing Item Recommendation Models for Two Types of Users with Different Usage Abilities

Recommendation systems should adopt different strategies for users with different abilities. For the question of which testing items a food requires, we design two food-testing item recommendation models, called Food Similarity recommendation (FoodS) and Food Testing Item Matching recommendation (FoodIM). FoodS is suitable for unprofessional users who are not aware of testing items: it processes different attributes with different techniques to calculate the similarity between foods, and directly recommends the testing items of the most similar food. FoodIM is suitable for professional users who are aware of testing items: it calculates the degree of matching between a food and each testing item through a two-tower structure, and recommends the testing items that match the food. We use macBERT for embedding in FoodIM and named entity recognition (NER) to enhance the representation of food. To improve inference speed, we also use GPT-3 for data augmentation and obtain embeddings by contrastive learning instead of macBERT. Our experiments on the food-testing item dataset show that both of our recommendation models outperform state-of-the-art methods.


INTRODUCTION
Food safety is a global issue that directly affects people's lives and health. The main purpose of food testing is to conduct comprehensive testing and evaluation of the food production process to ensure food quality. Food testing items include chemical testing, microbiological testing, physicochemical testing, genetic testing, and so on. The selection of testing items directly affects the accuracy and effectiveness of food testing.
Due to varying abilities and experience, users may encounter problems such as low efficiency and errors when choosing testing items, which pose potential risks to food safety. Therefore, we use food attributes such as Testing Type, Food Name, and Production Process to recommend testing items, thereby improving the efficiency and safety of food testing.
Existing recommendation models can be divided into four stages. Early models mainly use collaborative filtering [1,2,6,7], a method based on rating data between users and items, in which a user's rating of an item is predicted from similar users' ratings of similar items. The earliest collaborative filtering models are neighborhood-based methods, such as user-based and item-based collaborative filtering. Content-based recommendation models [3] use the content features of items (such as text, images, or sounds) to calculate the similarity between items and recommend items based on user historical behavior [4,5]. Matrix factorization techniques, such as singular value decomposition (SVD) [8] and non-negative matrix factorization (NMF), can be used to learn latent representations of users and items. Deep learning methods, such as convolutional neural networks (CNNs) [19] and recurrent neural networks (RNNs) [20], can be used to learn high-level content features of items and further improve recommendation accuracy. Hybrid recommendation [9,12] combines different types of recommendation methods into a hybrid system, achieving higher recommendation accuracy and coverage. However, these techniques cannot be applied directly to food-testing item recommendation, because food and testing item data have the following characteristics: a) Sparse data. When testing food, the number of testing items for a food is far smaller than the total number of testing items, and the number of foods corresponding to a testing item is far smaller than the total number of foods. In addition, human factors cause some data to be missing, so food-testing data is very sparse. b) Highly professional domain knowledge. Food testing involves a great deal of highly specialized domain knowledge, such as Standard Number, Testing Method, and Testing Limit, which must be understood deeply to design food-testing item recommendation. c) Diversified needs. The recommended testing items vary with different needs, so we need to provide different recommendations for different needs.
In the field of food testing, the user group is very broad. Based on their knowledge, we divide users into professional and unprofessional users. We propose two recommendation models based on the characteristics of food-testing item data, namely Food Similarity (FoodS) recommendation and Food Testing Item Matching (FoodIM) recommendation, as shown in Figure 1. FoodS is aimed at unprofessional users, while FoodIM is aimed at professional users. FoodS calculates the similarity between the food to be tested and foods with known testing items through different similarity modules to provide reasonable testing items. For non-semantic attributes of food, we use full-text matching, exponential fitting, and longest-common-subsequence (LCS) techniques to calculate similarity; for semantic attributes, we use SimCSE [10] to obtain embeddings and calculate similarity. FoodS is based on the assumption that similar foods share the same testing items; it does not require knowing which testing items exist in advance, and is therefore suitable for unprofessional users. FoodIM calculates the matching degree between the food to be tested and each testing item, and recommends from all testing items. FoodIM uses a two-tower structure to calculate the similarity between food and testing items. It is worth noting that although FoodIM calculates this similarity, it ultimately performs binary classification and outputs whether the pair matches. FoodIM outputs all matching testing items but does not rank them: because FoodIM is aimed at professional users, we want users to choose testing items freely based on their own expertise rather than being influenced by our ranking. We add a named entity recognition (NER) module to enhance the representation of food. To improve recommendation speed, we use GPT-3 [11] for data augmentation and then apply contrastive learning [13] to obtain embeddings of foods and testing items, calculating similarity on these embeddings. In this way, we improve the recommendation speed of FoodIM. The recommendation results of FoodIM are more flexible and suitable for professional users, who can choose freely from them.
In summary, our contributions in this paper are as follows: a) We design two recommendation models tailored to the characteristics of food-testing item data to recommend testing items for food. To our knowledge, these are the first food-testing item recommendation models. b) To meet the requirements of different users, we propose two different models, namely the Food Similarity (FoodS) model and the Food Testing Item Matching (FoodIM) model. FoodS calculates the similarity between the food to be tested and foods with known testing items, indirectly recommending testing items through similar foods, and is suitable for unprofessional users. FoodIM calculates the degree of matching between the food to be tested and the testing items while also considering recommendation speed, and is suitable for professional users. c) We conduct extensive experiments to evaluate our models. The experimental results show that both FoodS and FoodIM perform excellently.

PRELIMINARY
In this section, we describe the characteristics and target users of the two food-testing item recommendation models, and formally define the two recommendation problems. In addition, we introduce the food-testing item dataset used in this paper.

Food similarity model
When users are unprofessional, they do not know which testing items are related to a food. In this case, it is wise for the recommendation model to recommend similar foods, whose testing items can be directly transferred to the food to be tested. This is the basic assumption of the food similarity model: if two foods are similar, their testing items are the same. We therefore define food-testing item recommendation as the problem of calculating food similarity. The formal definition is as follows: given the food q to be tested and the historical food-testing item database D = {f_1, f_2, ..., f_n}, where each food f_i has attributes A = {a_1, a_2, ..., a_m}, the task of the food similarity model is to use the attributes A to calculate a similarity score between q and every f_i in D, and return a ranked similarity list. The testing items of the top-ranked foods are recommended.
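The ranking task above can be sketched in a few lines of Python. The attribute-overlap similarity below is only a toy stand-in for the per-attribute modules FoodS actually uses:

```python
def attr_overlap(a, b):
    """Toy similarity: fraction of attributes with identical values."""
    return sum(1 for k in a if b.get(k) == a[k]) / max(len(a), 1)

def rank_similar_foods(query, database, similarity, k=5):
    """FoodS task sketch: score the query food against every food in the
    historical database and return the Top-k most similar ones; the
    testing items of these foods are then recommended."""
    scored = [(similarity(query, food), food) for food in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [food for _, food in scored[:k]]
```

Any callable that maps two attribute records to a score can be plugged in as `similarity`, which is how the per-attribute modules described later compose into one model.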

Food testing item matching model
Although the food similarity model has strong interpretability, its recommendation results are not flexible, and it is difficult for it to capture the deep relationship between foods and testing items. Therefore, for professional users, who know the various testing items, the food similarity model cannot meet requirements well. The food testing item matching model calculates the matching degree between the food to be tested and each testing item, and recommends more flexible results. We only focus on the recall results, from which professional users can choose freely. Specifically, we define food-testing item recommendation as a binary classification problem: deciding whether a food and a testing item match. The formal definition is as follows: given the food q to be tested and the testing item set T = {t_1, t_2, ..., t_k}, the task of the food testing item matching model is to determine whether food q and each item t_i in T match. If they match, the label is 1; otherwise, the label is 0.
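The matching formulation can be sketched as below; `match` is a hypothetical stand-in for the trained binary classifier described in the Methodology section:

```python
def recommend_matching_items(food, items, match):
    """FoodIM task sketch: run the binary classifier on every
    (food, testing-item) pair and return all matching items,
    deliberately unranked so professional users can choose freely."""
    return [item for item in items if match(food, item)]
```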

Food-testing item dataset
The food-testing item dataset we use includes 636,543 structured records. Table 1 shows the attributes of one record. Since the food similarity model does not involve testing items, it uses only food-related attributes; after deduplication, 162,235 records remain. The food testing item matching model requires information about both foods and testing items, so it uses all attributes of all 636,543 records. We uniformly mark the missing values in the data and standardize the expression of each attribute.

METHODOLOGY
In this section, we introduce the two recommendation models, each suited to personalized recommendation for a different group of users. The key techniques are detailed in the following subsections.

FoodS
FoodS is suitable for unprofessional users who do not know which testing items exist. FoodS therefore uses the attributes of a food to find the most similar foods in the historical food-testing item database and recommends the testing items of the most similar food as results. Since information about testing items is not needed, we use only the records whose type is food in Table 1.
Because attribute semantics differ, we handle different attributes with different techniques: a) The Testing Type, Food Classification, Quality Level, Production Process, and Food Form attributes have fixed content. We use full-text matching, where a complete match is similar and an incomplete match is not. b) For the Production Date attribute, we consulted relevant domain experts and, based on their opinions, set the similarity calculation as follows: a difference of up to 30 days is considered identical, with a similarity of 100%; a difference greater than 30 days but at most 180 days gives a similarity of 80%; beyond 180 days, the similarity gradually decreases. We fit the similarity beyond 180 days with an exponential decay (Eq. 1), s = 0.8 · e^(−λ(d−180)), where d is the number of days of difference and λ controls the decay rate. The framework of FoodS is shown in Figure 2.
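The Production Date rule of Eq. 1 can be sketched as below; the decay coefficient is an assumed value, since the fitted constant is not given in the text:

```python
import math

def date_similarity(d, decay=180.0):
    """Similarity for a production-date difference of d days (Eq. 1 sketch).
    Up to 30 days: identical; 31-180 days: 0.8; beyond 180 days: exponential
    decay from 0.8 (the decay rate `decay` is an assumption)."""
    if d <= 30:
        return 1.0
    if d <= 180:
        return 0.8
    return 0.8 * math.exp(-(d - 180) / decay)
```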

FoodIM
FoodIM is suitable for professional users who are familiar with testing items. FoodIM calculates a matching score between the food and each testing item to recommend suitable testing items. We use a two-tower structure [14] as the recall layer of FoodIM, with the food and the testing item as inputs to the two towers. We use binary cross-entropy as the loss function of the two-tower model: L = −(1/N) Σ_{i=1}^{N} [y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i)], where y_i is the true label, ŷ_i is the probability predicted by the model, and N is the batch size.
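A minimal plain-Python sketch of the scoring and loss; the real recall layer uses learned tower networks in PyTorch, and the dot-product-plus-sigmoid head is an assumption:

```python
import math

def two_tower_score(food_vec, item_vec):
    """Matching probability: dot product of the two tower outputs
    squashed by a sigmoid."""
    dot = sum(f * i for f, i in zip(food_vec, item_vec))
    return 1.0 / (1.0 + math.exp(-dot))

def bce_loss(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy averaged over a batch, the loss used
    to train the two-tower model."""
    n = len(y_true)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred)) / n
```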
To represent the semantic information of each attribute, we fine-tune macBERT [15], the best Chinese pre-trained model, as our pre-trained model. In addition, we apply named entity recognition (NER) to the Food Name attribute to enrich the input feature space. We extract entities [e_1, e_2, ..., e_k] from the Food Name x_name with HanLP, the NER tool we use, and then use macBERT to obtain their embeddings [h_1, h_2, ..., h_k]. Finally, we concatenate the embedding of the Food Name with the entity embeddings to obtain the total embedding h_name: h_name = [macBERT(x_name); h_1; h_2; ...; h_k] (6). Although macBERT embeddings are highly accurate, their high computational complexity and slow inference speed make it difficult to meet practical recommendation requirements. To solve this problem, we propose a fast recommendation technique and use it in FoodIM. We focus on the embedding computation: the fast technique replaces macBERT with a simpler structure. It combines GPT-3 data augmentation and contrastive learning, significantly reducing training and inference time at the expense of a small amount of accuracy. This ensures both the accuracy and the speed of FoodIM and is suitable for practical scenarios.
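Eq. 6 can be sketched as below; `embed` and `extract_entities` are hypothetical stand-ins for macBERT and HanLP:

```python
def food_embedding(food_name, embed, extract_entities):
    """Eq. 6 sketch: embed the Food Name, embed each NER entity,
    and concatenate everything into one food representation.
    `embed` and `extract_entities` stand in for macBERT and HanLP."""
    vectors = [embed(food_name)]
    vectors += [embed(entity) for entity in extract_entities(food_name)]
    return [x for vec in vectors for x in vec]  # concatenation
```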
Our data augmentation strategy targets two attributes: Food Name and Testing Item Name. We input a Food Name into GPT-3 and ask the model to generate food names similar to the input; in this way, we obtain more food names with similar attributes. We treat Testing Item Name in the same way, which gives us more data similar to the Testing Item Names. The prompt we use for GPT-3 is shown in Figure 3.
We use the augmented data for contrastive learning, training two fully connected layers with the Triplet Loss: L(a, p, n) = max(0, D(a, p) − D(a, n) + α) (7), where a is the anchor sample, p is the positive sample, n is the negative sample, D is a distance function, and α is the margin. We use the embedding of the original Food Name as the anchor, the embedding of the augmented data as the positive, and the embeddings of other Food Names in the same batch as negatives. We use Word2Vec to initialize the embedding layer.
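Eq. 7 in plain Python; the squared Euclidean distance and the margin value are assumptions, as the paper does not state them:

```python
def triplet_loss(anchor, positive, negative, margin=0.5):
    """Triplet loss (Eq. 7): pull the augmented positive toward the
    anchor and push same-batch negatives at least `margin` further away.
    Squared Euclidean distance is an assumed choice of D."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)
```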
Our complete pipeline is shown in Figure 4. We use the model trained by contrastive learning instead of macBERT to construct embeddings, which greatly improves computational speed. In total, we construct two FoodIM variants: the first uses fine-tuned macBERT for embedding and adds NER to enhance the representation of food; the second uses GPT-3 for data augmentation and trains the embedding network by contrastive learning. The framework of FoodIM is shown in Figure 5.

EXPERIMENTS
In this section, we evaluate the two recommendation models with different metrics and compare them against different baselines to demonstrate the superiority of FoodS and FoodIM.

FoodS experiments
We use the food-testing item dataset from the previous section to generate test data. Specifically, we generate a large amount of test data to simulate the food information submitted by users, and use it to evaluate the effectiveness of FoodS.
For different attributes, our generation method is as follows: a) For attributes with fixed content, such as Food Form, Quality Level, Testing Type, and Food Classification, users are unlikely to make mistakes, but they sometimes leave the field NULL because they do not know what to fill in. For these attributes, we either retain the original data or set NULL when generating the test set. b) For attributes with flexible content, such as User Name, Company Name, and Food Name, we randomly add words, delete words, reverse word order, or replace words with synonyms when generating the test set. This strategy is more flexible and better matches real scenarios.
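Strategy b) can be sketched as a word-level perturbation; the specific operations and how they are chosen here are illustrative:

```python
import random

def perturb_name(name, rng):
    """Test-set generation sketch for flexible attributes: randomly
    delete a word, insert a duplicate word, or reverse the word order
    (synonym replacement would need an external lexicon and is omitted)."""
    words = name.split()
    op = rng.choice(["delete", "insert", "reverse"])
    if op == "delete" and len(words) > 1:
        words.pop(rng.randrange(len(words)))
    elif op == "insert":
        words.insert(rng.randrange(len(words) + 1), rng.choice(words))
    else:
        words.reverse()
    return " ".join(words)
```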
The two strategies above increase the diversity of the test set and thus better verify the robustness of the model against the various kinds of data it may encounter in real scenarios. Finally, we randomly select 5,000 records from the dataset and generate 160,000 test records with these two strategies. We compare our model with other baselines, including fastText [16], Word2Vec [17], LCS, and SimCSE. Note that the baselines are used to calculate the similarity of the Standard Number, User Name, Company Name, and Food Name attributes, while for the other attributes we use full-text matching or exponential fitting. Each baseline computes a per-attribute similarity, and a weighted sum gives the total similarity. In addition, we also test a combination that uses LCS for Standard Number and Word2Vec for the User Name, Company Name, and Food Name attributes. We evaluate the models with two different metrics; the experimental results are shown in Table 2 and Table 3.
We find that using LCS for these attributes loses the semantic information in the text, leading to poor results. Using Word2Vec or SimCSE, the models not only fail to learn a good representation for Standard Number but also tend to generate sparse vectors. In short, in practical recommendation we need to choose appropriate methods according to the attributes of the data.

FoodIM experiments
We also use the dataset from the previous section to generate test data. We construct negative samples with a bootstrap hard-negative mining method at a ratio of 1:10, resulting in 636,543 positive samples and 6,000,000 negative samples. Each record consists of one food and one testing item, representing a testing item applied to that food; the label is 0 or 1, representing mismatch and match, respectively. As shown in Table 4 and Table 5, some feature values appear only a few times, and these sparse features can hurt model performance. We therefore apply a feature filtering strategy: feature values that appear only a few times are discarded and set to NULL. This prevents the model from overfitting sparse features and improves its generalization ability.
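The feature filtering strategy can be sketched as follows; the minimum count threshold is illustrative, since the paper does not state it:

```python
from collections import Counter

def filter_sparse_features(values, min_count=5):
    """Replace feature values that occur fewer than `min_count` times
    with None (NULL), so the model does not overfit sparse features."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else None for v in values]
```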
We compare our models with other baselines, including matrix factorization with a two-tower structure (MF), Word2Vec-based vectorization models (Word2Vec+MF, Word2Vec+deepFM), BERT-base [18], and macBERT. We implement the two-tower recommendation model in PyTorch, with 256-dimensional embedding vectors for foods and testing items and, on top of the two towers, a fully connected layer with output dimension 1. We optimize with Adam, a learning rate of 0.001, and a batch size of 32, training for 5 epochs on 4 GeForce RTX 3090 GPUs. The experimental results are shown in Table 6, where the thousand-inference time is the time the model needs to answer 1,000 queries.
We find that our model performs best after fine-tuning macBERT and adding the NER module, which shows that the proposed method recommends testing items more accurately. In addition, although our GPT-3-CL model shows a slight decrease in the metrics compared with the best model, it still outperforms the other traditional recommendation methods. Crucially, GPT-3-CL beats macBERT-finetuned-NER in inference speed, which makes it significantly more practical to deploy. In practical applications, we need to select appropriate methods according to the attributes of the data.

Results and analysis of FoodIM. Our experiments on FoodIM show that our two embedding methods excel in different respects. We fine-tune macBERT to embed the food information, and add the NER module to enrich the food representation and improve the effectiveness of similarity calculation; the macBERT-finetuned-NER method achieves the best accuracy. In addition, we use GPT-3 for data augmentation and then train the embedding network with contrastive learning; this greatly reduces the complexity of the embedding network and, compared with macBERT, improves the speed of embedding computation. The GPT-3-CL method achieves excellent computational speed while maintaining high accuracy.

RELATED WORK
Content-based filtering [3,4] relies on item information for recommendation. However, content-based filtering models only recommend items related to those the user has evaluated in the past [21], without incorporating similarity information between individual preferences [22]. Such models are mainly used to recommend items based on user-item information [5], for example recommendations based on music characteristics [23], recommendations based on movie characteristics [21], e-commerce recommendations [24], and educational material recommendations [25]. Collaborative filtering [6] relies on user rating data. Current research on collaborative filtering mainly focuses on improving its performance [26,27,29,32]. However, content-based filtering relies on user-item metadata, and collaborative filtering relies on user-item rating data. Hybrid recommendation models [9] address the limitations of both and improve recommendation performance. In summary, in practical applications we need to choose appropriate recommendation methods for the specific field to ensure that the recommendation system provides users with high-quality results.

CONCLUSION
In this paper, we propose two recommendation models (FoodS and FoodIM) that recommend testing items for different kinds of users. FoodS is suitable for unprofessional users who are unaware of testing items: it finds the most similar foods in the food database based on their attributes and recommends the testing items of the most similar foods, using different similarity methods for different attributes. Experiments show that FoodS outperforms the baselines. FoodIM is suitable for professional users who know the testing items: it calculates the degree of matching between foods and testing items based on their attributes and recommends the testing items that match the food, from which professional users can choose freely. FoodIM uses a two-tower structure with embeddings from macBERT-finetuned-NER or GPT-3-CL. Experiments show that macBERT-finetuned-NER achieves the highest F1 score, while GPT-3-CL greatly improves prediction speed while maintaining a high F1 score.

Figure 2 :
Figure 2: FoodS. We design different modules to calculate the similarity of different attributes. FoodS recommends the testing items of the Top-k similar foods as the results.

Figure 4 :
Figure 4: Data augmentation and contrastive learning. We use GPT-3 for data augmentation and contrastive learning to train the embedding layer and fully connected layers.

Figure 5 :
Figure 5: FoodIM. In Figure (a), we use fine-tuned macBERT for embedding and NER to enhance the representation of food. In Figure (b), we use GPT-3 for data augmentation and then apply contrastive learning to train the embedding network.

Table 1: One record in the food-testing item dataset.

c) The Standard Number attribute is a code composed of letters and digits and contains no semantic information. At the same time, users easily make minor input errors, such as mistyping a letter or a digit, so we need a robust technique to calculate its similarity. We use the longest common subsequence (LCS), as shown in Eq. 2, where s_1 is the user-input Standard Number and s_2 is the Standard Number in the database. The User Name, Company Name, and Food Name attributes contain rich semantic information; full-text matching and LCS cannot exploit it, so we need a technique that represents the semantics of the text. SimCSE constructs text embeddings by contrastive learning, and we use it to embed these attributes. In summary, FoodS uses different techniques to calculate the similarity scores of the different attributes of a food, then weights and sums the scores to obtain the final similarity score. FoodS returns the Top-k similar foods with their testing items as recommendation results.
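The LCS similarity for Standard Number can be sketched as below; the normalization by the longer string's length is an assumption, since Eq. 2 itself is not reproduced in the text:

```python
from functools import lru_cache

def lcs_length(s1, s2):
    """Length of the longest common subsequence of two strings."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        if i == len(s1) or j == len(s2):
            return 0
        if s1[i] == s2[j]:
            return 1 + rec(i + 1, j + 1)
        return max(rec(i + 1, j), rec(i, j + 1))
    return rec(0, 0)

def standard_number_similarity(s1, s2):
    """A plausible normalization of LCS into [0, 1] for Eq. 2; robust to
    minor input errors such as one mistyped letter or digit."""
    return lcs_length(s1, s2) / max(len(s1), len(s2), 1)
```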

Table 2 :
Top-k accuracy of different methods

Table 4 :
Sparse features of food in the data

Table 5 :
Sparse features of testing items in the data

Table 6: Comparison of our methods with other methods.

Results and analysis of FoodS. Our experiments on FoodS show that the methods using LCS and SimCSE achieve the best performance. In principle, we could use LCS, Word2Vec, or SimCSE to calculate similarity without distinguishing between letters, digits, and text, but the experimental results show that this is not effective. Because the Standard Number attribute is composed of letters and digits and contains no semantic information, it easily produces sparse vectors when embedded; text attributes, by contrast, contain semantic information and therefore benefit from embedding.