- Introduction
Information theory is a relatively new branch of mathematics, made mathematically rigorous only in the 1940s. The term 'information theory' does not possess a unique definition. Broadly speaking, information theory deals with the study of problems concerning any system; this includes information processing, information storage, information retrieval and decision making. In a narrow sense, information theory studies all theoretical problems connected with the transmission of information over communication channels. This includes the study of uncertainty (information) measures and of various practical and economical methods of coding information for transmission.
The first studies in this direction were undertaken by Nyquist in 1924 [75] and 1928 [76] and by Hartley in 1928 [44], who recognized the logarithmic nature of the measure of information. In 1948, Shannon [86] published a remarkable paper on the properties of information sources and of the communication channels used to transmit the outputs of these sources. Around the same time Wiener (1948) [120] also considered the communication situation and came up, independently, with results similar to those of Shannon.
Both Shannon and Wiener considered the communication situation as one in which a signal, chosen from a specified class, is to be transmitted through a channel. The output of the channel is described statistically for each permissible input. The basic problem of communication is to reconstruct the input signal as closely as possible after observing the received signal at the channel output.
However, the approach used by Shannon differs from that of Wiener in the nature of the transmitted signal and in the type of decision made at the receiver. In the Shannon model messages are first encoded and then transmitted, whereas in the Wiener model the signal is communicated directly through the channel without being encoded.
In the past fifty years the literature on information theory has grown quite voluminous, and apart from communication theory it has found deep applications in many social, physical and biological sciences, for example, economics, statistics, accounting, language, psychology, ecology, pattern recognition, computer science and fuzzy sets.
A key feature of Shannon's information theory is that the term "information" can often be given a mathematical meaning as a numerically measurable quantity, on the basis of a probabilistic model, in such a way that the solutions of many important problems of information storage and transmission can be formulated in terms of this measure of the amount of information. This important measure has a very concrete operational interpretation: it roughly equals the minimum number of binary digits needed, on the average, to encode the message in question. The coding theorems of information theory provide such overwhelming evidence for the adequacy of the Shannon information measure that to look for essentially different measures of information might appear to make no sense at all. Moreover, it has been shown by several authors, starting with Shannon (1948) [86], that the measure of the amount of information is uniquely determined by some rather natural postulates. Still, all the evidence that the Shannon information measure is the only possible one is valid only within the restricted scope of the coding problems considered by Shannon. As pointed out by Rényi (1961) [82] in his fundamental paper on generalized information measures, in other sorts of problems other quantities may serve just as well, or even better, as measures of information. This should be supported either by their operational significance or by a set of natural postulates characterizing them, or, preferably, by both. Thus the idea of generalized entropies arose in the literature. It started with Rényi (1961) [82], who characterized a one-scalar-parameter entropy, the entropy of order α, which includes Shannon's entropy as a limiting case.
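For orientation, in one common notation (which need not coincide with the notation adopted later in the book), for a complete finite probability distribution $P=(p_1,\ldots,p_n)$ these two entropies are usually written as
\[
H(P) = -\sum_{i=1}^{n} p_i \log p_i,
\qquad
H_\alpha(P) = \frac{1}{1-\alpha}\,\log\Bigl(\sum_{i=1}^{n} p_i^{\alpha}\Bigr),
\quad \alpha>0,\ \alpha\neq 1,
\]
with $\lim_{\alpha\to 1} H_\alpha(P) = H(P)$, which is the limiting case referred to above.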
On the other side, Kullback and Leibler in 1951 [65] studied a measure of information from a statistical point of view, involving two probability distributions associated with the same experiment, calling it the discrimination function; later, different authors have named it cross-entropy, relative information, etc. At the same time Kullback and Leibler also studied a divergence measure, calling it simply divergence, a measure already studied by Jeffreys in 1946 [49]. Kerridge in 1961 [63] studied a different kind of measure, calling it the inaccuracy measure, again involving two probability distributions. Sibson in 1969 [95] studied another divergence measure involving two probability distributions, using mainly the concavity property of Shannon's entropy, and called it the information radius. Later, Burbea and Rao in 1982 [18], [19] studied extensively the information radius and its parametric generalization, calling this measure the Jensen difference divergence measure. Thus, Shannon's entropy, the Kullback-Leibler relative information, Kerridge's inaccuracy, the Jeffreys invariant (or J-divergence) and Sibson's information radius are the five classical measures of information associated with one and two probability distributions. These five classical measures have found deep applications in the areas of information theory and statistics. Over the past years various measures generalizing them have been introduced in the literature; these generalizations involve one and two scalar parameters. Taneja in 1995 [108] studied a new measure of divergence, and its two parametric generalizations, involving two probability distributions and based on the arithmetic and geometric mean inequality.
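For later reference, with $P=(p_1,\ldots,p_n)$ and $Q=(q_1,\ldots,q_n)$ two probability distributions over the same finite set (and $q_i>0$ wherever required), the classical two-distribution measures listed above are commonly written, in one standard notation (the symbols used here are for orientation only), as
\[
D(P\|Q) = \sum_{i=1}^{n} p_i \log\frac{p_i}{q_i}
\quad \text{(Kullback-Leibler relative information)},
\]
\[
H(P;Q) = -\sum_{i=1}^{n} p_i \log q_i = H(P) + D(P\|Q)
\quad \text{(Kerridge's inaccuracy)},
\]
\[
J(P,Q) = D(P\|Q) + D(Q\|P) = \sum_{i=1}^{n} (p_i - q_i)\log\frac{p_i}{q_i}
\quad \text{(Jeffreys invariant, or J-divergence)},
\]
\[
R(P,Q) = H\Bigl(\tfrac{P+Q}{2}\Bigr) - \tfrac{1}{2}\bigl(H(P)+H(Q)\bigr)
= \tfrac{1}{2}\,D\Bigl(P\,\Big\|\,\tfrac{P+Q}{2}\Bigr) + \tfrac{1}{2}\,D\Bigl(Q\,\Big\|\,\tfrac{P+Q}{2}\Bigr)
\quad \text{(Sibson's information radius)},
\]
the last expression showing the role played by the concavity of Shannon's entropy $H$.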
The aim of this book is to study these generalized information and divergence measures in unified forms, and then to apply them to the transmission of information and to statistical concepts. As to the content, the book is divided into eleven chapters. The first part covers the fundamental concepts of unified information and divergence measures, including entropy-type measures and Shannon's entropy. The applications part covers the applications of generalized information measures to the transmission of information, dealing with important areas of information theory, viz., noiseless coding, the Huffman procedure, redundancy, channel capacity and coding theorems, as well as statistical aspects of generalized information measures: Fisher's information measure, comparison of experiments, and bounds on the Bayesian probability of error in feature selection problems. Connections of generalized information measures with several probability distributions are also made. It is also planned to include in this part the applications of generalized information measures in fuzzy set theory.