Spam Review Detection Using Deep Learning

AI-generated keywords: Spam Review Detection, Machine Learning, Deep Learning, Natural Language Processing, Psycholinguistic Features

AI-generated Key Points

Online shopping is popular due to its convenience, but fake reviews can mislead customers.
A reliable system for detecting spam reviews is needed.
Machine learning techniques have been introduced to solve the problem of spam review detection.
Traditional machine learning classifiers such as Naive Bayes (NB), K Nearest Neighbor (KNN), and Support Vector Machine (SVM) have been applied to detect spam reviews.
Deep learning methods such as Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) have also been proposed for spam review detection.
Previous studies attempted to mine and summarize all customer reviews of a product using natural language processing methods.
Some researchers incorporated sentiment analysis or added psycholinguistic features in their models to improve performance in detecting fake or spam reviews.
A hybrid approach was proposed that detected duplicate reviews first before creating a hybrid dataset with the help of active learning.
Researchers evaluated various CNN architectures on classification datasets covering topic categorization and sentiment analysis tasks and reported very good performance.
In the first phase of the proposed model, a dataset of gold-standard deceptive opinion spam was produced using crowdsourcing through Amazon Mechanical Turk.
Detecting spam reviews remains a critical issue in making online reviews reliable.

Authors: G. M. Shahariar, Swapnil Biswas, Faiza Omar, Faisal Muhammad Shah, Samiha Binte Hassan

2019 IEEE 10th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). IEEE, 2019

arXiv: 2211.01675v1 (cs.CL)

License: CC BY-NC-SA 4.0

Abstract: A robust and reliable system of detecting spam reviews is a crying need in today's world in order to purchase products without being cheated from online sites. In many online sites, there are options for posting reviews, and thus creating scopes for fake paid reviews or untruthful reviews. These concocted reviews can mislead the general public and put them in a perplexity whether to believe the review or not. Prominent machine learning techniques have been introduced to solve the problem of spam review detection. The majority of current research has concentrated on supervised learning methods, which require labeled data - an inadequacy when it comes to online review. Our focus in this article is to detect any deceptive text reviews. In order to achieve that we have worked with both labeled and unlabeled data and proposed deep learning methods for spam review detection which includes Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN) and a variant of Recurrent Neural Network (RNN) that is Long Short-Term Memory (LSTM). We have also applied some traditional machine learning classifiers such as Naive Bayes (NB), K Nearest Neighbor (KNN) and Support Vector Machine (SVM) to detect spam reviews and finally, we have shown the performance comparison for both traditional and deep learning classifiers.

Submitted to arXiv on 03 Nov. 2022

Results of the summarizing process for the arXiv paper: 2211.01675v1

Comprehensive Summary

Online shopping has become routine for most people because it saves time and effort. One drawback, however, is the prevalence of fake or spam reviews that can mislead customers into making wrong decisions. To address this issue, a robust and reliable system for detecting spam reviews is needed. Prominent machine learning techniques have been introduced to solve the problem of spam review detection. The majority of current research has concentrated on supervised learning methods, which require labeled data, and labeled data is scarce for online reviews. In this article, the focus is on detecting deceptive text reviews using both labeled and unlabeled data. To achieve this goal, deep learning methods such as Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and a variant of Recurrent Neural Network (RNN) called Long Short-Term Memory (LSTM) have been proposed for spam review detection. Traditional machine learning classifiers such as Naive Bayes (NB), K Nearest Neighbor (KNN), and Support Vector Machine (SVM) have also been applied to detect spam reviews. Finally, a performance comparison of the traditional and deep learning classifiers is presented.

Previous studies have attempted to mine and summarize all customer reviews of a product using natural language processing methods. Some authors classified spam reviews into three categories (non-reviews, brand-only reviews, and untruthful reviews), while others used supervised learning and manually labeled reviews crawled from Epinions to detect product review spam. In addition to these approaches, some researchers incorporated sentiment analysis or added psycholinguistic features to their models to improve performance in detecting fake or spam reviews. A hybrid approach was also proposed that detected duplicate reviews first before creating a hybrid dataset with the help of active learning.

Researchers evaluated various CNN architectures on classification datasets covering topic categorization and sentiment analysis tasks and reported very good performance. Semantic clustering was introduced by adding an additional layer to the CNN architecture, and an efficient bag-of-words representation of the input data was used to reduce the number of parameters in the network. In the first phase of the proposed model, a dataset of gold-standard deceptive opinion spam was produced using crowdsourcing through Amazon Mechanical Turk. Although part-of-speech n-gram features give a fairly good prediction of whether an individual review is fake, the classifier performed slightly better when psycholinguistic features were added to the model. Overall, detecting spam reviews remains a critical issue in making online reviews reliable.
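To make the traditional baseline mentioned above more concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over bag-of-words counts, written in plain Python. It is an illustration only and not the authors' implementation: the class name `NaiveBayesReviewClassifier`, the `tokenize` helper, the add-one smoothing, and the four toy training reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a review and keep only alphabetic word tokens."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


class NaiveBayesReviewClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # reviews per class
        self.word_counts = defaultdict(Counter)    # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # words never seen in training are skipped
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented purely for illustration.
TRAIN = [
    ("best hotel ever amazing amazing must buy now", "spam"),
    ("amazing deal buy now best product ever", "spam"),
    ("the room was clean but the wifi was slow", "genuine"),
    ("battery lasts two days and the screen is sharp", "genuine"),
]

clf = NaiveBayesReviewClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(clf.predict("amazing best buy now"))
print(clf.predict("the wifi was slow"))
```

A real experiment would use a much larger labeled corpus and a held-out test set; the toy data here only shows how word counts per class drive the prediction.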

Layman's Summary

Online shopping is when you buy things on the internet. Sometimes people write fake reviews to trick others into buying something that isn't good. Scientists are trying to make a computer program that can tell if a review is real or fake. They use different types of computer programs like Naive Bayes, K Nearest Neighbor, and Support Vector Machine to help them. They also use more advanced programs called Multi-Layer Perceptron, Convolutional Neural Network, and Long Short-Term Memory. Some scientists try to read all the reviews for a product and figure out if they are good or bad using computers. Other scientists look at how people talk in their reviews to see if they are lying or not. One group of scientists made a new way of finding fake reviews by looking for ones that were copied from other reviews first. Many scientists are still working on this problem so that online shopping can be safer for everyone.

Definitions
- Online shopping: buying things on the internet
- Fake reviews: when someone writes something untrue about a product or service
- Reliable system: a computer program that works well and can be trusted
- Machine learning techniques: ways for computers to learn how to do things without being told exactly what to do
- Sentiment analysis: figuring out if someone's words have positive or negative feelings behind them

Introduction to Spam Review Detection

Previous Studies

Previous studies have attempted to mine and summarize all customer reviews of a product using natural language processing methods. Some authors classified spam reviews into three categories (non-reviews, brand-only reviews, and untruthful reviews), while others used supervised learning and manually labeled reviews crawled from Epinions to detect product review spam. In addition to these approaches, some researchers incorporated sentiment analysis or added psycholinguistic features to their models to improve performance in detecting fake or spam reviews. A hybrid approach was also proposed that detected duplicate reviews first before creating a hybrid dataset with the help of active learning. Researchers evaluated various CNN architectures on classification datasets covering topic categorization and sentiment analysis tasks and reported very good performance. Semantic clustering was introduced by adding an additional layer to the CNN architecture, and an efficient bag-of-words representation of the input data was used to reduce the number of parameters in the network.
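The duplicate-detection step of the hybrid approach is not specified in this summary. One common way to flag near-duplicate reviews is Jaccard similarity over word shingles, sketched below. The shingle size `k`, the `threshold` of 0.6, and all function names are assumptions chosen for illustration, not details from the cited work.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_near_duplicates(reviews, threshold=0.6, k=3):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold."""
    sets = [shingles(review, k) for review in reviews]
    pairs = []
    for i in range(len(reviews)):
        for j in range(i + 1, len(reviews)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Invented example reviews: the first two differ only in the last word.
REVIEWS = [
    "great phone with a long battery life and a bright screen",
    "great phone with a long battery life and a bright display",
    "the delivery was late and the box was damaged",
]

print(find_near_duplicates(REVIEWS))  # [(0, 1)]
```

The pairwise loop is quadratic in the number of reviews, so a large corpus would need an approximate method such as MinHash; that choice is outside what the summary describes.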

Dataset Creation

To create effective models capable of detecting deceptive opinion spam, a dataset of gold-standard deceptive opinion spam must be created first. This dataset was produced using crowdsourcing through Amazon Mechanical Turk. Part-of-speech n-gram features gave a fairly good prediction of whether an individual review is fake, but slightly better results were obtained when psycholinguistic features were added to the model.
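As a rough sketch of how the two feature families mentioned above could be combined into one feature vector, the example below counts part-of-speech bigrams and adds word-category ratios. Everything specific here is an assumption: the input is taken to be already tagged (word, tag) pairs, the tag names are generic, and `CATEGORY_LEXICON` is a tiny hypothetical stand-in for a real psycholinguistic lexicon.

```python
from collections import Counter

# Hypothetical mini-lexicon; real psycholinguistic lexicons are far larger.
CATEGORY_LEXICON = {
    "first_person": {"i", "me", "my", "we", "our"},
    "positive_emotion": {"great", "love", "amazing", "wonderful"},
}


def pos_ngrams(tagged_tokens, n=2):
    """Count n-grams over the part-of-speech tags of a tagged review."""
    tags = [tag for _, tag in tagged_tokens]
    return Counter("_".join(tags[i:i + n]) for i in range(len(tags) - n + 1))


def category_ratios(tagged_tokens):
    """Share of words that fall in each lexicon category."""
    words = [word.lower() for word, _ in tagged_tokens]
    total = len(words) or 1  # avoid division by zero on empty input
    return {
        name: sum(word in vocab for word in words) / total
        for name, vocab in CATEGORY_LEXICON.items()
    }


def build_features(tagged_tokens, n=2):
    """Merge POS n-gram counts and category ratios into one feature dict."""
    features = {f"pos:{gram}": count for gram, count in pos_ngrams(tagged_tokens, n).items()}
    features.update({f"cat:{name}": ratio for name, ratio in category_ratios(tagged_tokens).items()})
    return features


# Hand-tagged toy review, invented for illustration.
TAGGED = [("I", "PRON"), ("love", "VERB"), ("my", "PRON"), ("amazing", "ADJ"), ("room", "NOUN")]

features = build_features(TAGGED)
print(features)
```

The resulting dictionary can be fed to any classifier that accepts sparse features; prefixing the keys with `pos:` and `cat:` keeps the two families from colliding.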

Conclusion

Detecting spam reviews remains a critical issue in making online purchases reliable. With advances in deep learning, more accurate models are being developed that can not only distinguish genuine reviews from fake ones but also identify different categories within each type. Traditional machine learning algorithms such as Naive Bayes, K Nearest Neighbors, and Support Vector Machines remain popular because of their simplicity and low computational cost; however, they are outperformed by deep learning algorithms such as MLP, CNN, and LSTM when given enough training data and resources.
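Comparing classifiers as described above requires a shared metric. This summary does not say which metrics the paper reports, so the sketch below assumes precision, recall, and F1 on the spam class, which are common choices for this kind of task. The labels and predictions are made up for illustration and are not results from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up gold labels and predictions from two hypothetical classifiers.
Y_TRUE = ["spam", "spam", "genuine", "genuine", "spam"]
PREDICTIONS = {
    "classifier_a": ["spam", "genuine", "genuine", "spam", "spam"],
    "classifier_b": ["spam", "spam", "genuine", "genuine", "spam"],
}

for name, y_pred in PREDICTIONS.items():
    p, r, f1 = precision_recall_f1(Y_TRUE, y_pred)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Evaluating every model on the same held-out split with the same function keeps the comparison between traditional and deep learning classifiers consistent.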

Created on 17 Apr. 2023
