LEARNING SIMILARITY METRICS FOR EVENT IDENTIFICATION IN SOCIAL MEDIA REPORT

**project girl** · 16-01-2013, 12:47 PM

LEARNING SIMILARITY METRICS FOR EVENT IDENTIFICATION IN SOCIAL MEDIA

INTRODUCTION

The ease of publishing content on social media sites brings to the Web an ever-increasing amount of content captured during, and associated with, real-world events. Sites like Flickr, YouTube, Facebook and others host user-contributed content for a wide variety of events. These range from widely known events, such as presidential inaugurations, to smaller, community-specific events, such as annual conventions and local gatherings. By automatically identifying these events and their associated user-contributed social media documents, which is the focus of this paper, we can enable powerful local event browsing and search, to complement and improve the local search tools that Web search engines provide. In this paper, we address the problem of how to identify events and their associated user-contributed documents over social media sites.

In one scenario, consider a person who is thinking of attending "All Points West," an annual music festival that takes place in early August in Liberty State Park, New Jersey. Prior to purchasing a ticket, this person could search the Web for relevant information, to make an informed decision. Unfortunately, Web search results are far from revealing for this relatively minor event: the event's website contains marketing materials, and traditional news coverage is low. Overall, these Web search results do not convey what this person should expect to experience at this event. In contrast, user-contributed content may provide a better representation of prior instances of the event from an attendee's perspective. A user-centric perspective, as well as coverage of a wide span of events of varying type and scale, makes social media sites a valuable source of event information.

Identifying events and their associated documents over social media sites is a challenging problem, as social media data is inherently noisy and heterogeneous. In our "All Points West" example, some photographs might contain the event's name in the title, description, or tag fields, while many others might not be as clearly linked, with titles such as "Radiohead" or "Metric" and descriptions such as "my favorite band." Photographs geo-tagged with the coordinates of Liberty State Park, and taken on August 8, 2008, are likely to be related to this event, regardless of their textual description, but not every photograph taken on August 8, 2008, or titled "Radiohead," necessarily corresponds to this event. Overall, social media documents generally include information that is useful for identifying the associated events, if any, but this information is far from uniform in quality and might often be misleading or ambiguous.
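To make the different signals in this example concrete, here is a minimal Python sketch of one similarity function per feature type (text, time, location). Everything in it is an assumption made for illustration: the `Photo` class, the Jaccard word overlap, the 24-hour window, the 5 km radius and the linear decay are not taken from the report, which does not specify its similarity functions in this excerpt.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass
class Photo:
    """Hypothetical document record; field names are illustrative only."""
    title: str
    taken: datetime
    geo: Optional[Tuple[float, float]] = None  # (latitude, longitude) in degrees


def text_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (an assumed, simple choice)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def time_similarity(a: datetime, b: datetime, window_hours: float = 24.0) -> float:
    """1.0 for identical timestamps, falling linearly to 0.0 at window_hours."""
    gap_hours = abs((a - b).total_seconds()) / 3600.0
    return max(0.0, 1.0 - gap_hours / window_hours)


def geo_similarity(a: Optional[Tuple[float, float]],
                   b: Optional[Tuple[float, float]],
                   radius_km: float = 5.0) -> float:
    """Haversine distance mapped linearly to [0, 1]; 0.0 if a location is missing."""
    if a is None or b is None:
        return 0.0
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    distance_km = 2 * 6371.0 * math.asin(math.sqrt(h))  # mean Earth radius in km
    return max(0.0, 1.0 - distance_km / radius_km)
```

With two photographs taken a few hours apart at nearly the same coordinates, one titled "All Points West 2008" and the other "Radiohead", the text similarity is zero while the time and location similarities are high. This is the situation the paragraph above describes: no single feature is reliable on its own.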

PROBLEM DEFINITION

Given a set of social media documents associated with events, the problem that we address in this paper is how to identify the events that are reflected in the documents (e.g., President Obama's inauguration, or Madonna's October 6, 2008 concert in Madison Square Garden), and to correctly assign the documents that correspond to each event. We cast our problem as a clustering problem over social media documents (e.g., photographs, videos, social network group pages), where each document includes a variety of "context features" with information about the document. Some of these features (e.g., title, description, tags) are manually provided by users, while other features (e.g., upload or content creation time) are automatically generated.
Problem Definition. Consider a set of social media documents where each document is associated with an (unknown) event. Our goal is to partition this set of documents into clusters such that each cluster corresponds to all documents that are associated with one event.
As the formal definition of "event," we adopt the version used for the Topic Detection and Tracking (TDT) event detection task over broadcast news.
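The partitioning goal stated above can be illustrated with a short Python sketch. It uses a single-pass threshold clustering, picked here only because it is simple; this excerpt does not say which clustering algorithm the report uses. The function names, the tag-set representation of documents and the threshold value are all assumptions.

```python
from typing import Callable, FrozenSet, List, Sequence, TypeVar

T = TypeVar("T")


def single_pass_cluster(docs: Sequence[T],
                        similarity: Callable[[T, T], float],
                        threshold: float) -> List[List[T]]:
    """Assign each document to the most similar existing cluster, or start a new one.

    A cluster's score for a document is the average similarity to its members.
    This is an illustrative stand-in, not the method described in the report.
    """
    clusters: List[List[T]] = []
    for doc in docs:
        best_index, best_score = -1, threshold
        for i, members in enumerate(clusters):
            score = sum(similarity(doc, m) for m in members) / len(members)
            if score >= best_score:
                best_index, best_score = i, score
        if best_index < 0:
            clusters.append([doc])  # no cluster is similar enough: new event
        else:
            clusters[best_index].append(doc)
    return clusters


def tag_jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Toy similarity over tag sets (assumed; stands in for a learned metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical documents, each reduced to a set of user-provided tags.
docs = [
    frozenset({"obama", "inauguration"}),
    frozenset({"madonna", "msg"}),
    frozenset({"obama", "inauguration", "dc"}),
    frozenset({"madonna", "msg", "concert"}),
]
clusters = single_pass_cluster(docs, tag_jaccard, threshold=0.5)
```

Each resulting cluster is meant to correspond to one event. The quality of the partition depends entirely on the `similarity` function passed in, which is why the report's title puts the emphasis on learning that metric.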
