Tender Document Analyzer with a Combination of Supervised Learning and an LLM-based Improver

Bidders often spend a long time reading and understanding tender documents, because these documents require specialized knowledge and are generally long. Bidders first check specific items, such as payment and warranty, in a tender document and then review the overall document. Therefore, a function that can extract specific items (an item extractor) and a function that can highlight words or phrases related to specific items (a word-phrase highlighter) are in great demand. To develop these two functions, we need to solve two problems. The first concerns the annotated dataset. The second concerns the BERT NER-based prediction approach in a small-training-dataset setting. To solve the first problem, we created two types of sequence labeling datasets, for the Item Extractor and the Word-Phrase Highlighter. To solve the second problem, we propose an Information Extraction (IE) method that combines (1) a supervised learning approach using Bidirectional Encoder Representations from Transformers (BERT) and (2) a large language model (LLM)-based improver. We then developed a web application system called Tender Document Analyzer (TDDA), which includes the "Item Extractor" and the "Word-Phrase Highlighter." Experimental evaluation shows that our approach is practical. First, the evaluation of extraction ability shows that our proposed method substantially outperforms a baseline approach using GPT-3.5 and demonstrates that the proposed LLM-based improver improves IE ability. In addition, the usability evaluation shows that bidders can solve tasks in less time using our system.


INTRODUCTION

Motivation
A tender, such as a United Nations procurement, refers to an offer for a project such as constructing a railroad or a power plant. In 2021, total UN procurement reached 29.6 billion USD, indicating its high importance in society. To participate in a bidding process, the initial step involves comprehending the contract terms, quantities, and other factors described in the request for proposal. Bidders must then clarify requirements and establish a negotiation plan. Proposal documents are often long and can exceed 100 pages. Additionally, in-depth knowledge of the traded field or materials is often necessary for a proper grasp of the subject matter. Thus, interpreting the contents of tender documents requires a considerable investment of time and effort, and there is high demand for support systems that help users comprehend tender document contents.
Our interviews with several users revealed that bidders typically follow two steps to read and comprehend a tender document. First, they review specific details such as payment and warranty. Second, they review the document to verify the correctness of the specified items. Consequently, there is high demand for an "Item extractor" that can automatically extract specific items during the first step and a "Word-phrase highlighter" that can highlight relevant words and phrases during the second step.
Several studies have addressed extracting information from specialized documents such as legal documents [5], financial documents [8], and scientific papers in the chemical domain [10]. Although these techniques have proven successful in their respective domains, their effectiveness on tender documents is limited by domain differences, and there has been little research on, and there are few tools for, extracting information from tender documents.

Challenge & Our Approach
There are two significant issues in the IE task for tender documents. First, no annotated dataset is available, because tender documents are generally not public and annotators must have specialized knowledge of law and tendering. Second, the approach of using a language model for sequence labeling is not well suited to token labeling tasks involving numerical information such as weight, period, and price [4].
To solve the first issue, we developed two annotated datasets, for the "Item extractor" and the "Word-phrase highlighter." To solve the second issue, we introduce an IE method that combines (1) a supervised learning approach using transfer learning with a pre-trained language model (PLM) and (2) a rule-based improver supported by a large language model (LLM). Our approach first extracts information for each label using the supervised learning approach. It then extracts "Warranty Period," "Bid Bond Period," and "Delivery Allowance" from the first-stage extraction results labeled "Warranty," "Bid Bond," and "Quantity Variation," respectively. We conduct this second extraction by detecting terms related to "Period" and "Delivery Allowance" using a text-to-text model. Furthermore, system development is also a crucial topic in this area. We therefore developed a web application called TDDA, which includes the "Item Extractor" and the "Word-phrase highlighter," as shown in Fig. 1.

Contribution
The main contributions of this research are as follows: (1) We designed a "tender document item extraction" task and constructed a sequence labeling dataset for this task by collecting users' opinions and annotating labels. Part of this dataset will be released if this paper is accepted.
(2) We propose an IE method that combines (a) a supervised learning approach using BERT and (b) a rule-based improver supported by a generative model.
(3) We developed a web application system called Tender Document Analyzer (TDDA), with which users can easily read and understand a tender document using the "Item Extractor" and "Word-phrase highlighter" functions, and we experimentally demonstrate the practicality of our proposed TDDA.
(1) The Item extractor extracts 31 types of items, such as "Payment Terms" and "Delivery," and displays the results in the right column. Clicking on an item in the right column shifts the PDF viewer to the source location, allowing users to check the results quickly. This function is expected to help users obtain an overview of a document.
(2) The Word-phrase highlighter assigns 17 labels, such as "EQC" and "Delivery," to words within a tender document. When labels are selected, the corresponding words and phrases are displayed on the left side and highlighted in the PDF viewer. This function is expected to help users check the overall document manually.

Dataset
As described in Section 1.2, domain-specific dataset creation is an important issue. Therefore, we first created two types of sequence labeling datasets: the Information Extraction (IE) dataset and the Important Word-Phrase dataset.
(1) The IE Dataset includes a training set with a total of 671,871 words from 20 tender documents, of which 78,492 words are labeled, and a validation set with a total of 32,520 words, of which 5,733 words are labeled. Table 1 describes the details of this dataset.
(2) The Important Word-Phrase Dataset includes a training set with a total of 73,213 words from 5 tender documents, of which 6,400 words are labeled, and a validation set with a total of 16,724 words from 4 tender documents, of which 2,355 words are labeled. Table 2 describes the details of this dataset.

Information Extraction
This section introduces the IE method for the Item extractor and the Word-phrase highlighter. Our method combines (1) supervised learning and (2) an LLM-based improver.
2.3.1 Supervised learning. First, a BERT-based sequence labeling method extracts the items for each label in Tables 1 and 2, such as Bond and Pre-Bid Meeting. In this step, each BERT-based token classification model was developed by fine-tuning a pre-trained language model (PLM) on the corresponding training dataset. We trained for up to 10 epochs with early stopping. At this stage, "Delivery Allowance," "Bid Bond Period," and "Warranty Period" are labeled as "Quantity Variation," "Bid Bond," and "Warranty," respectively.
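The token-level predictions from such a model must be merged into labeled spans before they can be displayed as extracted items. As a minimal illustrative sketch (not the authors' code), and assuming a standard BIO tagging scheme, which the paper does not specify, the decoding step might look like:

```python
def decode_bio(tokens, labels):
    """Collect (label, phrase) spans from token-level BIO predictions."""
    spans, current_label, current_tokens = [], None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new entity begins: flush any open span first.
            if current_label:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_label == label[2:]:
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag closes the open span.
            if current_label:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label:
        spans.append((current_label, " ".join(current_tokens)))
    return spans
```

For example, decoding the tokens of "The bid bond is valid for 90 days ." with labels `["O", "B-BidBond"]` followed by six `"I-BidBond"` tags and a final `"O"` yields the single span `("BidBond", "bid bond is valid for 90 days")`.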

2.3.2 LLM-based improver. This step extracts the contents of Bid Bond Period, Warranty Period, and Delivery Allowance from the contents labeled Bid Bond, Warranty, and Quantity Variation in the first step, respectively. To conduct this extraction, we first extract terms related to "Period" and "Delivery Allowance" from the extracted contents using a text-to-text model, as shown in the following examples.
If a "Period" term is extracted, we convert the "Bid Bond" or "Warranty" label to "Bid Bond Period" or "Warranty Period," respectively. If a "Delivery Allowance" term is extracted, we convert the "Quantity Variation" label to "Delivery Allowance."
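The relabeling rule above can be sketched as follows. This is an illustrative approximation only: the paper detects "Period" terms with a text-to-text model, which we stand in for here with a simple regular expression; the pattern, mapping, and function names are hypothetical.

```python
import re

# Hypothetical stand-in for the text-to-text model: a regex that detects
# period expressions such as "90 days" or "12 months" in an extracted span.
PERIOD_PATTERN = re.compile(r"\b\d+\s*(?:days?|months?|years?)\b", re.IGNORECASE)

# Mapping from first-step labels to the refined second-step labels.
RELABEL = {"Bid Bond": "Bid Bond Period", "Warranty": "Warranty Period"}

def refine_label(label, text):
    """Convert a first-step label to its refined form when a period term is found."""
    if label in RELABEL and PERIOD_PATTERN.search(text):
        return RELABEL[label]
    return label
```

The "Quantity Variation" to "Delivery Allowance" conversion would be handled analogously, with a detector for delivery-allowance terms in place of `PERIOD_PATTERN`.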

EXPERIMENTAL EVALUATION
This section evaluates our method from two perspectives: (1) extraction ability and (2) system usability.

Evaluation for Extraction Ability
3.1.1 Evaluation Setting. To evaluate the extraction ability of our method, we compared the extraction performance of the following three methods based on macro F1 score, using the validation datasets.
(1) gpt-3.5-turbo (baseline): This method extracts information using a GPT model in a zero-shot setting on the validation dataset. We used the "gpt-3.5-turbo-16k" model.
(2) LEX-LM: This method uses a BERT model fine-tuned on each training dataset. We utilized the Legal RoBERTa (LexLM) Large model [2] as the pre-trained language model.
(3) LEX-LM + Rule by LLM: This is the proposed method described in Section 2.3. In the LLM-based improver, we used in-context learning (ICL) with four examples, with "Jurassic-2 Ultra," a model released by AI21 Labs, as the LLM. Table 3 presents the results, showing that the LEX-LM methods perform far better than the GPT-3.5 baseline. Here, "Total" denotes the overall evaluation results over all 48 Item Extractor and Word-Phrase Highlighter labels. The Table 3 results also show that our "LLM-based improver" improves extraction performance.
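The macro F1 score used above averages the per-label F1 scores so that frequent and rare labels count equally. As a small sketch of token-level macro F1, assuming "O" marks unlabeled tokens (the paper does not state whether scoring is token- or span-level):

```python
from collections import defaultdict

def macro_f1(gold, pred):
    """Macro F1 over labels, given parallel lists of gold and predicted tags."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        if g == p:
            if g != "O":
                tp[g] += 1
        else:
            if p != "O":
                fp[p] += 1  # predicted label where gold disagrees
            if g != "O":
                fn[g] += 1  # gold label that was missed
    labels = set(tp) | set(fp) | set(fn)
    f1s = []
    for label in labels:
        prec = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        rec = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s) if f1s else 0.0
```

For instance, with gold tags `["Bond", "O", "Delivery", "O"]` and predictions `["Bond", "O", "O", "Delivery"]`, "Bond" scores F1 = 1.0 and "Delivery" scores 0.0, so the macro F1 is 0.5.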

Evaluation for System Usability
3.2.1 Evaluation Setting. We evaluate our system by comparing completion times and success rates with and without its use. Participants read a tender document from the validation data and solve the following case-study tasks.
Case 1: Summarize the information related to quantity, changes in contract quantity, and delivery allowance.
Case 2: Look for all the information related to Bond. Two experts who are typically involved in the tender process and three beginners who do not usually participate in it (a total of eight users) participated in this evaluation. Table 4 presents the results, indicating that users can solve the tasks more quickly with our system than without it. Moreover, the effects of our system on success rate and completion time are more pronounced for beginners.

RELATED WORKS
One related study on IE from tender documents is [9]. It reports that a deep learning model combining BERT and a CNN can extract general information such as dates, numbers, addresses, company names, and departments from a tender summary with high performance. However, it does not consider the extraction of "quality assurance" and "payment terms," which are crucial labels for reading tender documents in real use cases. Other related research includes IE from legal documents [2, 3, 5, 7], financial documents [6, 8], and scientific papers in the chemical domain [10, 11]. While these methods are effective in their respective domains, their effectiveness on tender documents is limited by domain differences. In a similar area, several legal review products exist. They can be effective for general legal document analysis, but they do not work for tender documents, whereas our TDDA does.

CONCLUSION
This paper proposes an IE method that combines (1) a supervised learning approach using BERT and our original dataset and (2) an LLM-based improver, and then develops a web application called TDDA that includes an "Item extractor" and a "Word-phrase highlighter" to support reading and understanding tender documents. Experimental evaluation demonstrates that our approach is valuable. In the future, we will improve our method by increasing the dataset size. A demo of TDDA is available on our project page: https://dev.azure.com/TomokIto/Tender%20Document%20Analyzer/_git/Tender%20Document%20Analyzer.

Figure 1 :
Figure 1: Screenshot of TDDA. The original contents are written in Japanese; the Google Translate result is shown above.

Table 1 :
Overview of IE Dataset

Table 2 :
Overview of Important Word-Phrase Dataset

Table 3 :
Result of Extraction Ability Evaluation

Table 4 :
Result of System Usability Evaluation