Abstract:
Merchants selling products on the Web often ask their customers to review the products they have purchased and the associated services. As e-commerce becomes more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can run into the hundreds or even thousands. This makes it difficult for a potential customer to read them all and make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. Detection of reviews using sentiment analysis deals with finding positive reviews among thousands of reviews. As the number of customers grows, so does the volume of reviews that products receive. Thus, mining opinions from product reviews is an important research topic. Considerable academic research has been done on it in the past decade. However, existing research focuses mainly on the categorization and summarization of such online opinions.
INTRODUCTION
Online shopping has become a popular and convenient way to buy everyday goods such as electronic appliances, furniture, cosmetics, and many more. With the rapid expansion of e-commerce, more and more products are sold on the Web, and more and more people buy products online. To enhance customer satisfaction, merchants and product manufacturers allow customers to review or express their opinions on products and services, and customers can now post product reviews at merchant sites. These online customer reviews become a source of information that is very useful for both potential customers and product manufacturers. As more ordinary users become comfortable with the Web, an increasing number of people write reviews. As a result, the number of reviews that a product receives grows rapidly. Some popular products can get hundreds of reviews at large merchant sites. Furthermore, many reviews are long and have only a few sentences containing opinions on the product. This makes it hard for a potential customer to read them and make an informed decision on whether to purchase the product. A customer who reads only a few reviews may get a biased view.
Product reviews exist in a variety of forms on the Web. From the product manufacturer's perspective, understanding the preferences of customers is highly valuable for product development, marketing, and consumer relationship management. However, the practice of asking customers for their reviews creates opportunities for “review spam”, since anyone can write anything on the Web. Review spam refers to fraudulent reviews written by spammers to hype a product's features or to defame them. Although reviews are an important source of information, there is no quality control on this user-generated data. This leads to many low-quality reviews and, worse still, to review spam that misleads customers and affects their buying decisions. Despite this, interest in mining opinions from reviews has grown in both academia and industry over the past few years, and detecting spam reviews is a critical task for opinion mining.
Textual information in the world can be broadly categorized into two main types: facts and opinions. Facts are objective expressions about entities, events, and their properties. Opinions are usually subjective expressions that describe people's sentiments, appraisals, or feelings toward entities, events, and their properties. The concept of opinion is very broad. In this paper, we focus on opinion expressions that convey people's positive or negative sentiments, and on classifying reviews as spam or non-spam.
SENTIMENT ANALYSIS TECHNIQUES
Sentiment analysis, or opinion mining, is the computational study of opinions, sentiments, and emotions expressed in text. We use the following review segment on an iPhone to introduce the problem (a number is associated with each sentence for easy reference): “(1) I bought an iPhone a few days ago. (2) It was such a nice phone. (3) The touch screen was really cool. (4) The voice quality was clear too. (5) Although the battery life was not long, that is ok for me. (6) However, my mother was mad with me as I did not tell her before I bought it. (7) She also thought the phone was too expensive, and wanted me to return it to the shop. … ”
Sentences (2), (3) and (4) express positive opinions, while sentences (5), (6) and (7) express negative opinions or emotions. The opinion in sentence (2) is on the iPhone as a whole, and the opinions in sentences (3), (4) and (5) are on the “touch screen”, “voice quality” and “battery life” of the iPhone respectively. In general, opinions can be expressed on anything, e.g., a product, a service, an individual, an organization, an event, or a topic.
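As a rough illustration of how the sentence-level polarities described above might be assigned automatically, the following Python sketch scores each sentence against a small word list and flips the polarity of an opinion word that directly follows a negator. The lexicon, the negator list, and the function name sentence_polarity are assumptions made for this example only: they are hand-picked to cover the review segment and are not the method proposed in this paper.

```python
import re

# Hypothetical, hand-picked lexicon covering only the example review.
# A real system would use a much larger lexicon or a trained classifier.
POSITIVE = {"nice", "cool", "clear", "long"}
NEGATIVE = {"mad", "expensive"}
NEGATORS = {"not", "no", "never"}

REVIEW = [
    "I bought an iPhone a few days ago.",
    "It was such a nice phone.",
    "The touch screen was really cool.",
    "The voice quality was clear too.",
    "Although the battery life was not long, that is ok for me.",
    "However, my mother was mad with me as I did not tell her before I bought it.",
    "She also thought the phone was too expensive, and wanted me to return it to the shop.",
]


def sentence_polarity(sentence):
    """Return 'positive', 'negative', or 'neutral' for one sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    score = 0
    for i, token in enumerate(tokens):
        if token in POSITIVE:
            value = 1
        elif token in NEGATIVE:
            value = -1
        else:
            continue
        # Flip the polarity when the opinion word directly follows a negator,
        # e.g. "not long" counts as negative.
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for number, sentence in enumerate(REVIEW, start=1):
        print(number, sentence_polarity(sentence))
```

A word list of this kind captures polarity only. It does not identify the feature that each opinion targets, such as the “touch screen” or the “battery life”.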
The opinions in the review above can be classified as positive or negative using the following approaches:
1. Document Sentiment Classification
2. Feature-Based Sentiment Analysis
3. Sentiment Analysis of Comparative Sentences