Improving Summarization Performance by Sentence Compression –
A Pilot Study
Abstract
In this paper we study the effectiveness of applying sentence compression in an extraction-based multi-document summarization system. Our results show that purely syntactic compression does not improve system performance, and topic-signature-based reranking of compressed sentences does not help much either. However, reranking using an oracle shows that a significant improvement remains possible.
Introduction
The majority of systems participating in the past Document Understanding Conferences (DUC, 2002), a large-scale summarization evaluation effort sponsored by the United States government, and the Text Summarization Challenge (Fukusima and Okumura, 2001), sponsored by the Japanese government, are extraction based. Extraction-based automatic text summarization systems extract parts of the original documents and output the results as summaries (Chen et al., 2003; Edmundson, 1969; Goldstein et al., 1999; Hovy and Lin, 1999; Kupiec et al., 1995; Luhn, 1969). Other systems based on information extraction (McKeown et al., 2002; Radev and McKeown, 1998; White et al., 2001) and discourse analysis (Marcu, 1999; Strzalkowski et al., 1999) also exist, but they are not yet usable for general-domain summarization. Our study focuses on the effectiveness of applying sentence compression techniques to improve the performance of extraction-based automatic text summarization systems.
Sentence compression aims to retain the most salient information of a sentence, rewritten in a shorter form (Knight and Marcu, 2000). It can be used to deliver compressed content to portable devices (Buyukkokten et al., 2001; Corston-Oliver, 2001), or as a reading aid for aphasic readers (Carroll et al., 1998) or the blind (Grefenstette, 1998). Earlier research in sentence compression focused on compressing single sentences and was evaluated on a sentence-by-sentence basis.
Unigram Co-Occurrence Metric
In a recent study (Lin and Hovy, 2003a), we showed that the recall-based unigram co-occurrence automatic scoring metric correlates highly with human evaluation and has high recall and precision in predicting the statistical significance of results, compared with its human counterpart. The idea is to measure the content similarity between a system extract and a manual summary using simple n-gram overlap. A similar idea, the IBM BLEU score, has proved successful in automatic machine translation evaluation (NIST, 2002; Papineni et al., 2001).
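The recall-based unigram overlap described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, whitespace tokenization, and lowercasing are our own simplifications (real evaluations typically also handle punctuation, stemming, and stopwords).

```python
from collections import Counter

def unigram_cooccurrence_recall(system_extract: str, manual_summary: str) -> float:
    """Recall-based unigram overlap: the fraction of the manual summary's
    unigrams (counted with multiplicity) that also appear in the system extract."""
    # Naive lowercase whitespace tokenization (a simplification).
    sys_counts = Counter(system_extract.lower().split())
    ref_counts = Counter(manual_summary.lower().split())
    if not ref_counts:
        return 0.0
    # Clipped overlap: each reference unigram is matched at most as many
    # times as it occurs in the system extract.
    overlap = sum(min(count, sys_counts[tok]) for tok, count in ref_counts.items())
    return overlap / sum(ref_counts.values())

score = unigram_cooccurrence_recall("the cat sat on the mat", "the cat sat on a mat")
# 5 of the 6 reference unigrams are covered, giving a recall of 5/6
```

Being recall-based, the metric rewards coverage of the reference content rather than brevity of the system output, which is why summarization evaluations pair it with a length limit.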
Conclusions
In this paper we presented an empirical study of the effectiveness of applying sentence compression to improve summarization performance. We used a good sentence compression algorithm, compared the performance of five different ranking algorithms, and found that purely syntactic or shallow-semantic reranking, applied one sentence at a time, was not enough to boost system performance. However, the significant difference between the ORACLE run and the original run (ORG) indicates that there is potential in sentence compression, but we need to find a better compression selection function that takes into account global cross-sentence optimization. This indicates that local optimization at the sentence level, such as Knight and Marcu's (2000) noisy-channel model, is not enough when our goal is to find the best compressed summaries, not the best compressed sentences.
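The gap between sentence-level and summary-level selection can be illustrated with a toy sketch. This is our own simplification, not the paper's method: a "local" oracle picks each sentence's best compression candidate independently, while a "global" oracle exhaustively scores whole summaries under a word budget (the budget and the recall metric are assumptions standing in for the real evaluation setup).

```python
from collections import Counter
from itertools import product

def unigram_recall(candidate: str, reference: str) -> float:
    # Recall-based unigram overlap against the reference summary.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(c, cand[t]) for t, c in ref.items())
    return overlap / max(sum(ref.values()), 1)

def local_oracle(candidates, reference):
    # One sentence at a time: pick each sentence's best-scoring
    # compression independently, ignoring the overall length limit.
    return [max(cands, key=lambda s: unigram_recall(s, reference))
            for cands in candidates]

def global_oracle(candidates, reference, budget):
    # Whole-summary search: try every combination of candidates that
    # fits the word budget and keep the best-scoring summary.
    # Exponential in the number of sentences; a sketch for tiny inputs only.
    best, best_score = None, -1.0
    for combo in product(*candidates):
        text = " ".join(combo)
        if len(text.split()) > budget:
            continue
        score = unigram_recall(text, reference)
        if score > best_score:
            best, best_score = list(combo), score
    return best

candidates = [["the big cat", "the cat"], ["sat on the mat", "sat"]]
reference = "the cat sat on the mat"
local = local_oracle(candidates, reference)            # may exceed the budget
best = global_oracle(candidates, reference, budget=6)  # fits and covers the reference
```

In this toy example the local oracle keeps the longest candidate of each sentence and overshoots a six-word budget, while the global search trades a word in one sentence for better coverage overall, which is exactly the cross-sentence effect the conclusion points to.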