The Information Bottleneck: Theory and Applications

**seminar projects maker** · 25-08-2017, 09:32 PM

The Information Bottleneck: Theory and Applications

.pdf

Information Bottleneck:.pdf (Size: 1.34 MB / Downloads: 99)

Abstract

This thesis introduces the first comprehensive review of the Information Bottleneck (IB) method along
with its recent extension, the multivariate IB. The IB method was originally suggested in [82] as a new
information-theoretic approach for data analysis. The basic idea is surprisingly simple: Given a joint distri-
μ, find a compressed representation of , denoted by , that is as informative as possible about
bution ́
. This idea can be formulated as a variational principle of minimizing the mutual information, ́
μ
(which controls the compactness of the representation ), under some constraint on the minimal level of
μ . Hence, the fundamental trade-off between
mutual information that preserves about , given by ́
the complexity of the model and its precision is expressed here in an entirely symmetric form, where the
exact same concept of information controls both its sides. Indeed, an equivalent posing of the IB principle
μ
would be to maximize the information maintains about , where the (compression) information ́
is constrained to some maximal level.
As further shown in [82], this constrained optimization problem can be considered analogous to rate distor-
tion theory, but with an important distinction: the distortion measure does not need to be defined in advance,
μ. Moreover, it leads to a tractable mathematical
but rather naturally emerges from the joint statistics, ́
analysis which provides a formal characterization of the optimal solution to this problem. As an immedi-
ate implication, the IB method formulates a well defined information-theoretic framework for unsupervised
clustering problems, which is the main focus of this thesis. Nonetheless, it is important to keep in mind
that the same underlying principle of a trade-off between information terms may have further implications
in other related fields, as recently suggested in [37].

Introduction

In this chapter we provide a basic introduction to the remaining chapters. In the first section we present high
level descriptions of the fundamental trade-off between precision and complexity. One important variant of
this trade-off is formulated as the problem of unsupervised clustering, which is the main problem we address
in this thesis. In the next section we present the necessary preliminaries for our analysis. We conclude this
chapter by presenting a simple example in order to elucidate the central ideas that will be discussed later on.

Preliminaries

In this section we introduce the basic concepts required for the next chapters. We start with some nota-
divergence and
tions, and further state the definitions of entropy, mutual and multi information,
divergence. Most of this section is based on Chapter 3⁄4 in [20], Chapter ¿ in [5] (which provides a friendly
introduction to the concept of entropy), and a work in progress by Nemenman and Tishby [55], which
introduces a new axiomatic derivative of mutual and multi-information.

Entropy and related concepts

Consider the following situation. We are given a finite collection of documents, denoted by
A person chooses to read a single document out of this collection, and our task is to guess which document
was chosen. Without any prior knowledge, all guesses are equally likely. We now further assume that we
3⁄4 ,
have access to a definite set of (exhaustive and mutually exclusive) probabilities, denoted by ́ μ
for all the possible choices. For example, let us assume that longer documents are more probable than shorter
ones. More specifically, that the probability of choosing each document is proportional to the (known) num-
ber of words that occur in it. If all the documents consist of exactly the same number of words, ́ μ is
uniform and obviously we are back at the starting point where no guess is preferable. However, if one docu-
ment is much longer than all the others, ́ μ will have a clear peak for this document, hence our chances of
providing the correct answer will improve. How can we quantify the difference between these two scenarios
in a well defined way?

Mutual information and multi-information

Let us reconsider our previous example of trying to guess what document was chosen. However, we now
assume that we have access not only to the prior distribution ́ μ, but rather to a joint distribution of
with some other random variable, . For concreteness, if values correspond to all the possible document
values correspond to all the distinct words occurring in this document
identities, let us assume that
μ
collection. Thus, more formally stated, we assume that we have access to the joint distribution ́
which indicates the probability that a random word position in the corpus is equal to 3⁄4
while the
document identity is 3⁄4 . 3

**jaseela123** · 26-08-2017, 01:32 PM

The bottleneck literally refers to the upper narrow part of a bottle. In engineering, it refers to a phenomenon in which the performance or capacity of an entire system is limited by a single or small number of components or resources.

Reference:
http://www.cs.huji.ac.il/labs/learning/T...im_PhD.pdf

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Embedded system and its applications Report	seminar post	0	1,026	10-05-2014, 04:54 PM Last Post: seminar post
	30 Years of Successes and Failures in Geopolymer Applications. Market Trends	seminar post	0	619	11-04-2014, 12:41 PM Last Post: seminar post
	Linear and Nonlinear Static and Dynamic Analysis and Design of Three-Dimensional	seminar projects maker	0	637	01-04-2014, 11:28 AM Last Post: seminar projects maker
	JEE (Advanced) -2013 Information Brochure	seminar projects maker	0	488	26-03-2014, 04:35 PM Last Post: seminar projects maker
	Nanotechnologies in Food & Agriculture: Potential Applications and Implications	seminar projects maker	0	487	10-03-2014, 04:50 PM Last Post: seminar projects maker
	DIMENSIONAL STONE RESOURCES IN INDIA AND ENVIRONMENT FRIENDLY MINING AND PROCESSING	seminar projects maker	0	815	13-09-2013, 12:58 PM Last Post: seminar projects maker
	Pushpagiri Medical College Central Library as an Information Center in digital	seminar projects maker	0	674	12-09-2013, 03:07 PM Last Post: seminar projects maker
	Magnetic Nanocomposite Materials for High Temperature Applications pdf	study tips	0	963	29-08-2013, 02:56 PM Last Post: study tips
	Profiling intelligent systems applications in fraud detection and prevention: survey	study tips	0	692	22-08-2013, 03:54 PM Last Post: study tips
	Aerobic and Anaerobic Digestion and Types of Decomposition	study tips	0	656	20-08-2013, 04:26 PM Last Post: study tips

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.