This paper presents a deep neural architecture for Natural Language Sentence Matching (NLSM), the task of identifying the semantic similarity of input text pairs. An NLSM model predicts a category or scale value for a pair of input texts that indicates their similarity or relationship. NLSM models generally involve two main steps: designing a model that produces a representation of each text from which semantic features can be extracted, and developing a matching mechanism that uses those representations to capture complex interactions between the two texts. In this study, the authors collected a Persian religious question matching dataset of 18,000 samples, each containing two questions, the appropriate answers, and a match or not-match label for the question pair. The dataset, which was used for designing a chatbot, was built by crawling questions from religious question-answering websites, annotating two similar questions for each crawled question, and generating dissimilar questions automatically. The authors implemented the BERT with Deep Recursive Encoder (BERT-DRE) model by adding a recursive encoder module to BERT (Devlin et al., 2019). The recursive encoder consists of three Bi-LSTM layers with residual connections, with an attention module on top. A pooling layer that combines average and maximum pooling produces the final vector. The authors evaluated BERT-DRE against related models on the new religious dataset and on the benchmark datasets SNLI, FarsTail, MultiNLI, and SciTail. BERT-DRE achieved an F1-score of 90.27% on the test data when trained and evaluated on the annotated religious dataset, making it the strongest of the models studied. The authors also evaluated BERT-DRE on SNLI, MultiNLI, FarsTail, and SciTail, where they report that it achieved good F1-scores.
The authors surveyed related NLSM models and found that deep neural networks are the preferred approach. To capture semantic features and relationships among sentences, researchers use convolutional and recurrent neural networks. The structure of convolutional neural networks (CNNs) makes them well suited to extracting local features, while recurrent neural networks (RNNs) extract temporal features by treating natural language text as a sequence of words. Long Short-Term Memory (LSTM), a variant of the RNN, is capable of capturing long-term dependencies. In short, the paper focuses on improving BERT's results on the NLSM task by adding a deep recursive encoder, yielding BERT with Deep Recursive Encoder (BERT-DRE). The authors evaluate their model on four benchmarks (SNLI, FarsTail, MultiNLI, and SciTail) as well as on the novel Persian religious questions dataset. Comparisons show that BERT-DRE outperforms BERT in all cases; on the same dataset, BERT alone reaches an accuracy of 89.70%, versus 90.29% with the proposed architecture.
- The paper presents a deep neural architecture for Natural Language Sentence Matching (NLSM) to identify semantic similarity in input text pairs.
- NLSM models predict a category or scale value for a pair of input texts to indicate their similarity or relationship.
- Two main steps are used: designing a model to obtain a proper representation of the text, and developing a matching mechanism to extract complex interactions.
- The authors collected a Persian religious question matching dataset of 18,000 samples, each containing two questions, the appropriate answers, and a match or not-match label for the question pair.
- The BERT-DRE model was implemented by adding a recursive encoder module to BERT. It achieved an F1-score of 90.27% on the test data when trained and evaluated on the annotated religious dataset, making it the strongest of the models studied.
- Deep neural networks are the preferred approach in the NLSM field, and researchers use convolutional and recurrent neural networks to capture semantic features and relationships among sentences.
- BERT-DRE outperforms its predecessor, reaching an accuracy of 90.29% with the proposed architecture versus 89.70% for BERT alone on the same dataset.
This paper talks about a computer program that can tell if two sentences mean the same thing. The program uses special math called neural networks to do this. First, the program looks at the words in each sentence and makes them into a kind of picture. Then, it compares the pictures to see if they look similar. The people who made the program tested it using questions about religion in the Persian language, and it worked really well! They made a new version of the program that works even better than before.
Definitions
- Neural networks: A type of math used by computers to learn things on their own.
- Semantic similarity: When two sentences mean almost the same thing.
- Representation: A way of showing something so that you can understand it better.
- Dataset: A collection of information or data used for testing or studying something.
- F1-score: A way of measuring how well a computer program is doing at its job.
Natural Language Sentence Matching (NLSM) with Deep Neural Architecture
Natural language sentence matching (NLSM) is a task that aims to identify the semantic similarity of input text pairs. An NLSM model predicts a category or scale value for a pair of input texts that indicates their similarity or relationship. Two main steps are generally involved: designing a model that produces an appropriate representation of each text, from which semantic features can be extracted; and developing a matching mechanism that uses those representations to capture complex interactions between the two texts.
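The two steps can be illustrated with a deliberately simple sketch. It is not the model from the paper: the representation here is a bag-of-words count vector and the matching mechanism is cosine similarity, both chosen only to make the two-step structure concrete. The function names and the 0.5 threshold are assumptions made for illustration.

```python
import math
from collections import Counter


def represent(text: str) -> Counter:
    """Step 1 (representation): map a sentence to a bag-of-words count vector."""
    return Counter(text.lower().split())


def match(a: Counter, b: Counter) -> float:
    """Step 2 (matching): cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence matches nothing
    return dot / (norm_a * norm_b)


def predict(q1: str, q2: str, threshold: float = 0.5) -> str:
    """Turn the similarity score into a match / not-match label."""
    score = match(represent(q1), represent(q2))
    return "match" if score >= threshold else "not-match"
```

A neural model replaces both pieces with learned components, but the interface stays the same: two texts go in, and a category or score comes out.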
In this research paper, the authors present an approach for improving NLSM by introducing BERT with Deep Recursive Encoder (BERT-DRE). The model was tested on four benchmark datasets as well as on a novel Persian religious questions dataset. BERT-DRE outperformed its predecessor in all cases; on the same dataset, it reached an accuracy of 90.29% with the proposed architecture, versus 89.70% for BERT alone.
Dataset Collection and Preprocessing
The authors collected a Persian religious question matching dataset of 18,000 samples, each containing two questions, the appropriate answers, and a match or not-match label for the question pair. The dataset, which was used for designing a chatbot, was built by crawling questions from religious question-answering websites, annotating two similar questions for each crawled question, and generating dissimilar questions automatically.
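The pairing procedure might look roughly like the sketch below. The summary does not say how the dissimilar questions were generated, so drawing them at random from other question groups is an assumption, as are the function name `build_pairs`, the `negatives_per_question` parameter, and the tuple layout.

```python
import random


def build_pairs(groups, negatives_per_question=2, seed=0):
    """Build labeled question pairs from groups of similar questions.

    groups: list of (question, [similar questions]) tuples.
    Returns a list of (question_a, question_b, label) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    pairs = []
    for i, (question, similar) in enumerate(groups):
        # Positive pairs: the question with each human-annotated similar question.
        for s in similar:
            pairs.append((question, s, "match"))
        # Negative pairs (assumed strategy): questions drawn from other groups.
        others = [q for j, (q, _) in enumerate(groups) if j != i]
        k = min(negatives_per_question, len(others))
        for q in rng.sample(others, k):
            pairs.append((question, q, "not-match"))
    return pairs
```

Random negatives tend to be easy to tell apart from the original question; a real pipeline might prefer harder negatives, but nothing in the summary indicates which strategy the authors used.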
BERT with Deep Recursive Encoder Model
The authors implemented the BERT with Deep Recursive Encoder (BERT-DRE) model by adding a recursive encoder module to BERT (Devlin et al., 2019). The recursive encoder is built from three Bi-LSTM layers with residual connections, and an attention module is placed on top of the encoder to better capture long-distance dependencies between words within sentences. Finally, a pooling layer that combines average and maximum pooling produces the final vector representation of the input sentence pair. This vector is fed into a classification layer that predicts whether the two sentences are semantically related, trained against the match or not-match labels assigned during dataset construction.
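The pooling step can be sketched in plain Python, under the assumption that "average and maximum pooling" means pooling each hidden dimension over the token axis and concatenating the two results. The summary does not spell out how the two pooled vectors are combined, so the concatenation and the function name `pool` are illustrative only.

```python
def pool(token_vectors):
    """Concatenate average pooling and max pooling over the token axis.

    token_vectors: non-empty list of equal-length lists of floats, one per token.
    Returns a single vector of length 2 * hidden_size.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    columns = list(zip(*token_vectors))  # one tuple per hidden dimension
    avg = [sum(col) / len(col) for col in columns]
    mx = [max(col) for col in columns]
    return avg + mx  # assumed combination: simple concatenation
```

For example, two token vectors `[1.0, 4.0]` and `[3.0, 2.0]` pool to `[2.0, 3.0, 3.0, 4.0]`: the first half holds the per-dimension averages, the second half the per-dimension maxima. Whatever the sentence length, the output has a fixed size, which is what lets it feed a classification layer.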
Evaluation Results
The authors evaluated BERT-DRE against related models on the new religious dataset and on benchmark datasets such as SNLI, FarsTail, MultiNLI, and SciTail. BERT-DRE achieved an F1-score of 90.27% on the test data when trained and evaluated on the annotated religious dataset, making it the strongest of the models studied, which included related models such as CNNs, RNNs, and LSTMs. The authors also found that the proposed architecture achieves good results on the other datasets: SNLI, MultiNLI, FarsTail, and SciTail.
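For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes it for a single positive class; whether the reported 90.27% is a binary, macro-averaged, or weighted F1 is not stated in the summary, so treating "match" as the positive class is an assumption.

```python
def f1_score(y_true, y_pred, positive="match"):
    """Binary F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because F1 ignores true negatives, it can differ from accuracy on the same predictions, so the F1 and accuracy figures quoted for the religious dataset measure different things and are not directly comparable.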
Conclusion
In conclusion, this paper focuses on improving BERT's results on the NLSM task by adding a deep recursive encoder module to the existing architecture, yielding BERT with Deep Recursive Encoder (BERT-DRE). The experiments show promising results: the proposed method outperforms its predecessor across the benchmarks studied, including the novel Persian religious questions dataset.