14-07-2014, 04:01 PM
Literature Work
Introduction
Nowadays, online video sharing sites give their users more freedom by allowing them to generate and distribute their own content. On sites such as YouTube, users can not only watch a video but also express their views about it by posting a response to that video. Users can also create their own content and upload it to these sites. This is why these systems are experiencing dramatic growth in popularity. In particular, video content is becoming a predominant part of users' daily lives on the Web. A number of Web services offer video-based functions as alternatives to text-based ones, such as video reviews for products, video ads and video responses.
CONTENT-BASED SPAM FILTERING ON VIDEO SHARING SOCIAL NETWORKS
This system focuses on spam content rather than on the individuals who pollute the content, that is, the spammers. It works on features of the videos themselves: it extracts local features from each video, analyzes them to identify the semantics of that video, and then decides whether the video is spam. For this purpose, it uses SIFT and bag-of-visual-features (BoVF) for feature extraction, followed by Latent Semantic Analysis (LSA) and an SVM classifier. The problem is that capturing the exact semantics of a video is extremely difficult.
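The bag-of-visual-features step mentioned above can be illustrated with a minimal sketch. It assumes the local descriptors have already been extracted and that a codebook of visual words already exists; the function name `bovf_histogram` and the toy 2-D data are hypothetical and only serve the illustration. It is not the implementation used by the system described here.

```python
from math import dist

def bovf_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook and return a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the nearest visual word by Euclidean distance.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # With no descriptors, return an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy 2-D "descriptors" (assumption: real SIFT descriptors are 128-dimensional).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bovf_histogram(descriptors, codebook)
```

The resulting fixed-length histogram is the kind of vector that could then be passed on to LSA and a classifier, regardless of how many local features a video produced.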
Problem Statement
The proposed system identifies spammers and promoters. Users are first classified manually into three categories: spammers, promoters and legitimate users. The system then crawls YouTube to collect three types of attributes (user attributes, video attributes and social attributes), analyzes them, and classifies users using two classification algorithms: Support Vector Machine (SVM) and Lazy Associative Classification (LAC).
System Overview
The proposed system identifies spammers and content promoters on the video sharing site YouTube. Because YouTube allows its users to respond to videos and to generate and upload their own content, spam is becoming a serious problem on such sites. Spammers are individuals who post at least one video response that is unrelated to the responded video. Content promoters are individuals who post a large number of video responses to a responded video in order to promote that video artificially. The system identifies spammers and content promoters using a five-step approach. The first step is to manually classify users into three categories: spammers, content promoters and legitimate users. The second step is to crawl YouTube to collect three types of attributes (user, video and social attributes) along with other information. In the third step, the user test collection is built from the collected information. The fourth step is to analyze the attributes to determine how well they distinguish between the three categories. The last step is to identify spammers and content promoters using Support Vector Machine (SVM) and Lazy Associative Classification (LAC). Supervised classification algorithms require labeled data, and the cost of labeling is high for video. The system therefore uses Active Lazy Associative Classification (ALAC) to reduce the labeling effort.
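The idea of reducing labeling effort through active learning can be sketched as follows. This is not ALAC: as a stand-in, it uses generic pool-based uncertainty sampling with a simple nearest-centroid classifier. The feature vectors, the `oracle` function (a placeholder for the human who labels a user) and all names are assumptions made for illustration.

```python
from math import dist

def centroids(labeled):
    """Mean feature vector per class from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def margin(x, cents):
    """Gap between the two smallest centroid distances; a small gap means uncertain."""
    d = sorted(dist(x, c) for c in cents.values())
    return d[1] - d[0]

def predict(x, cents):
    """Label of the closest class centroid."""
    return min(cents, key=lambda y: dist(x, cents[y]))

def active_learning(seed, pool, oracle, budget):
    """Spend the labeling budget on the most uncertain unlabeled users.

    Assumes the seed set contains at least two different classes.
    """
    labeled, pool = list(seed), list(pool)
    for _ in range(min(budget, len(pool))):
        cents = centroids(labeled)
        query = min(pool, key=lambda x: margin(x, cents))
        pool.remove(query)
        labeled.append((query, oracle(query)))  # the only point where a human label is needed
    return centroids(labeled), labeled

# Hypothetical features: (video responses posted, fraction judged unrelated).
seed = [((1.0, 0.0), "legitimate"), ((9.0, 1.0), "spammer")]
pool = [(2.0, 0.1), (8.0, 0.9), (5.0, 0.4), (4.5, 0.6)]
oracle = lambda x: "spammer" if x[1] >= 0.5 else "legitimate"  # stand-in for manual labeling
model, labeled = active_learning(seed, pool, oracle, budget=2)
```

With a budget of two, only the two users closest to the decision boundary are sent for manual labeling; the clear-cut cases in the pool are never labeled by hand, which is the saving active learning aims for.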
Analyzing User Behavior Attributes
Users in different categories behave differently, so their attributes must be analyzed in order to classify them as spammers, promoters or legitimate users. Three types of attributes are considered: user attributes, video attributes and social networking (SN) attributes.
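One way to gauge how well a single attribute separates the user categories is to measure the information gain of a threshold split on that attribute. The text here does not say which measure the system uses, so information gain is an assumption, and the attribute values, labels and threshold below are made-up toy data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting users at `threshold` on one attribute."""
    low = [lab for v, lab in zip(values, labels) if v <= threshold]
    high = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    # Weighted entropy of the two sides of the split; empty sides contribute nothing.
    remainder = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - remainder

# Toy data: four manually labeled users and two hypothetical attributes.
labels = ["spammer", "spammer", "legitimate", "legitimate"]
responses_posted = [40, 35, 2, 3]   # separates the two classes perfectly
account_age_days = [5, 20, 5, 20]   # carries no class information here

gain_responses = information_gain(responses_posted, labels, threshold=10)
gain_age = information_gain(account_age_days, labels, threshold=10)
```

An attribute with a higher gain is a better candidate for telling the categories apart; in this toy data the first attribute separates the classes completely and the second not at all.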
Video attributes give information about the videos uploaded by the user. They include duration, number of views and of comments received, ratings, number of times the video was selected as a favorite, and numbers of honors and of external links. Videos are divided into three groups. The first group contains all the videos uploaded by the user; it is useful for determining how other users see that user's contributions. The second group contains only video responses, which may be polluted content. The last group contains the responded videos, also known as target videos, to which the user has posted video responses. The attributes of each group are then analyzed.
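The three video groups described above can be sketched as follows. The record layout (fields such as `uploader` and `responds_to`) and the sample values are hypothetical; the system's actual crawled data format is not given here. Averaging is only one possible way to summarize an attribute per group.

```python
from statistics import mean

# Hypothetical crawled records; field names are assumptions for illustration.
videos = [
    {"id": "v1", "uploader": "alice", "responds_to": None, "duration": 120, "views": 900},
    {"id": "v2", "uploader": "bob", "responds_to": "v1", "duration": 30, "views": 15},
    {"id": "v3", "uploader": "bob", "responds_to": "v1", "duration": 45, "views": 25},
    {"id": "v4", "uploader": "bob", "responds_to": None, "duration": 200, "views": 60},
]

def video_groups(user, videos):
    """Split the videos relevant to one user into the three groups."""
    by_id = {v["id"]: v for v in videos}
    uploaded = [v for v in videos if v["uploader"] == user]              # group 1: all uploads
    responses = [v for v in uploaded if v["responds_to"] is not None]   # group 2: video responses
    target_ids = {v["responds_to"] for v in responses}
    responded = [by_id[i] for i in sorted(target_ids) if i in by_id]    # group 3: target videos
    return {"uploaded": uploaded, "responses": responses, "responded": responded}

def group_averages(groups, attribute):
    """Average one attribute per group; None marks an empty group."""
    return {
        name: (mean(v[attribute] for v in group) if group else None)
        for name, group in groups.items()
    }

groups = video_groups("bob", videos)
avg_views = group_averages(groups, "views")
avg_duration = group_averages(groups, "duration")
```

Per-group summaries like these give one value per attribute and group for each user, which is a convenient form for a classifier's input vector.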
Conclusion
In summary, the system uses supervised classification and active learning to detect spammers and content promoters. It first crawls YouTube using a crawling algorithm and then builds a user test collection of manually pre-classified users. After building the user test collection, it analyzes three types of attributes: user attributes, which describe the individual characteristics of the user; video attributes, which describe the videos uploaded and the video responses; and social network attributes, which capture the social relationships established through video response interactions. After the attributes are analyzed, spammers and content promoters are detected using the supervised classification algorithms Support Vector Machine and Lazy Associative Classification. However, supervised approaches require labeled data, and labeling video is costly, so the Active Lazy Associative Classification algorithm is used to reduce the labeling effort.