# The Gallager Converse


The Gallager Converse

Abbas El Gamal

Director, Information Systems Laboratory

Department of Electrical Engineering

Stanford University

Gallager's 75th Birthday


Information Theoretic Limits

Establishing information theoretic limits, e.g., channel capacity, requires:

Finding a single-letter expression for the limit

Proving achievability

Proving a converse

While achievability tells us how to improve system design, a converse is necessary to prove optimality

Proving a converse is typically harder, and there are very few tools available, e.g., Fano's inequality, the data processing inequality, convexity

Gallager's identification of the auxiliary random variable:

Crucial step in proof of the degraded broadcast channel converse

Has been used in the proofs of almost all subsequent converses in multi-user information theory

I used it many times in my papers


Broadcast Channel

T. M. Cover, "Broadcast Channels," IEEE Trans. Info. Theory, vol. IT-18, pp. 2-14, Jan. 1972

Discrete memoryless (DM) broadcast channel (X, p(y1, y2|x), Y1, Y2):

[Figure: encoder maps (W1, W2) to X^n; the channel p(y1, y2|x) delivers Y1^n to decoder 1, which outputs Ŵ1, and Y2^n to decoder 2, which outputs Ŵ2]

Send independent message W_j ∈ [1 : 2^{nR_j}] to receiver j = 1, 2

Average probability of error: P_e^{(n)} = P{Ŵ1 ≠ W1 or Ŵ2 ≠ W2}

(R1, R2) achievable if there exists a sequence of codes with P_e^{(n)} → 0

The capacity region is the closure of the set of achievable rate pairs


Example 1: Binary Symmetric BC

Assume that p1 ≤ p2 < 1/2; H(a), a ∈ [0, 1], is the binary entropy function

[Figure: Y1 = X ⊕ Z1 with Z1 ~ Bern(p1); Y2 = X ⊕ Z2 with Z2 ~ Bern(p2). Capacity region plot in the (R1, R2) plane with axis intercepts 1 − H(p1) and 1 − H(p2), compared with the time-sharing line]
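As a quick numerical companion (not part of the slides), a short sketch that evaluates the two axis intercepts 1 − H(p1) and 1 − H(p2) of the plotted region, for assumed crossover probabilities p1 ≤ p2:

```python
import math

def h2(p):
    """Binary entropy function H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Hypothetical crossover probabilities satisfying p1 <= p2 < 1/2.
p1, p2 = 0.1, 0.2

# Point-to-point BSC capacities: the axis intercepts of the region.
C1 = 1 - h2(p1)   # best rate to the stronger receiver 1
C2 = 1 - h2(p2)   # best rate to the weaker receiver 2
print(f"C1 = {C1:.4f}, C2 = {C2:.4f}")
```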


Example 1: Binary Symmetric BC

Assume that p1 ≤ p2 < 1/2

[Figure: codeword space {0, 1}^n partitioned into clouds, each with a cloud center U^n and satellite codewords X^n]

Superposition coding: Use random coding. Let 0 ≤ α ≤ 1/2; U ~ Bern(1/2) and X′ ~ Bern(α) independent, X = U ⊕ X′

Generate 2^{nR2} i.i.d. U^n(w2), w2 ∈ [1 : 2^{nR2}] (cloud centers)

Generate 2^{nR1} i.i.d. X′^n(w1), w1 ∈ [1 : 2^{nR1}]

Send X^n(w1, w2) = U^n(w2) ⊕ X′^n(w1) (satellite codeword)


Example 1: Binary Symmetric BC

Assume that p1 ≤ p2 < 1/2

[Figure: the same binary symmetric BC; achievable region in the (R1, R2) plane with intercepts C1 and C2, lying above the time-sharing line]

Decoding:

Decoder 2 decodes U^n(W2) ⟹ R2 < 1 − H(α ∗ p2)

Decoder 1 first decodes U^n(W2), subtracts it off, then decodes X′^n(W1) ⟹ R1 < H(α ∗ p1) − H(p1)

(Here α ∗ p denotes the binary convolution α(1 − p) + (1 − α)p)
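The two decoding constraints can be checked numerically. A minimal sketch (the values of α, p1, p2 are assumed for illustration) that sweeps α and prints the resulting corner points of the superposition region:

```python
import math

def h2(p):
    """Binary entropy function H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bconv(a, p):
    """Binary convolution a * p = a(1 - p) + (1 - a)p."""
    return a * (1 - p) + (1 - a) * p

p1, p2 = 0.1, 0.2   # assumed crossover probabilities, p1 <= p2 < 1/2
for alpha in (0.0, 0.1, 0.25, 0.5):
    R2 = 1 - h2(bconv(alpha, p2))        # decoder 2: cloud center
    R1 = h2(bconv(alpha, p1)) - h2(p1)   # decoder 1: satellite codeword
    print(f"alpha = {alpha:.2f}: R1 < {R1:.3f}, R2 < {R2:.3f}")
```

At α = 0 only receiver 2 gets a positive rate (R2 < 1 − H(p2)); at α = 1/2 the region collapses to the point-to-point capacity of receiver 1.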


Example 2: AWGN BC

Assume N1 ≤ N2, average power constraint P on X

[Figure: Y1 = X + Z1 with Z1 ~ N(0, N1); Y2 = X + Z2 with Z2 ~ N(0, N2)]

Superposition coding: Let α ∈ [0, 1], U ~ N(0, (1 − α)P), X′ ~ N(0, αP) independent, X = U + X′

Decoder 2 decodes the cloud center ⟹ R2 < C((1 − α)P/(αP + N2))

Decoder 1 decodes the cloud center, subtracts it off, then decodes the satellite codeword ⟹ R1 < C(αP/N1)

(Here C(x) = (1/2) log(1 + x) is the Gaussian capacity function)
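A minimal numeric sketch of these rate expressions (P, N1, N2, and the sweep over α are assumed values; C(x) = (1/2) log2(1 + x) in bits):

```python
import math

def C(x):
    """Gaussian capacity function C(x) = (1/2) log2(1 + x), in bits per use."""
    return 0.5 * math.log2(1 + x)

P, N1, N2 = 10.0, 1.0, 4.0   # assumed power and noise levels, N1 <= N2
for alpha in (0.0, 0.25, 0.5, 1.0):
    R2 = C((1 - alpha) * P / (alpha * P + N2))   # cloud center, seen at receiver 2
    R1 = C(alpha * P / N1)                       # satellite codeword at receiver 1
    print(f"alpha = {alpha:.2f}: R1 < {R1:.3f}, R2 < {R2:.3f}")
```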



Degraded Broadcast Channel

Stochastically degraded BC [Cover 1972]: There exists p(y2|y1) such that

p(y2|x) = Σ_{y1} p(y1|x) p(y2|y1)

Special case of channel inclusion in:

C. E. Shannon, "A Note on a Partial Ordering for Communication Channels," Information and Control, vol. 1, pp. 390-397, 1958

Physically degraded version [Bergmans 1973]:

X → p(y1|x) → Y1 → p(y2|y1) → Y2

Since the capacity region of any BC depends only on the marginals p(y1|x), p(y2|x), the capacity region of a stochastically degraded BC is the same as that of its physically degraded version
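To make the definition concrete, a small sketch (BSC parameters assumed for illustration) verifying that a BSC(p2) is a stochastically degraded version of a BSC(p1): the degrading channel p(y2|y1) is itself a BSC(q) with p2 = p1 ∗ q, i.e., q = (p2 − p1)/(1 − 2p1):

```python
# Verify p(y2|x) = sum_{y1} p(y1|x) p(y2|y1) for two BSCs (assumed parameters).
p1, p2 = 0.1, 0.2                  # component channels BSC(p1), BSC(p2), p1 <= p2
q = (p2 - p1) / (1 - 2 * p1)       # crossover of the degrading BSC p(y2|y1)

def bsc(p):
    """Transition matrix of a BSC(p): rows indexed by input, columns by output."""
    return [[1 - p, p], [p, 1 - p]]

W1, D = bsc(p1), bsc(q)
# Compose: p(y2|x) = sum_{y1} p(y1|x) p(y2|y1)
W2 = [[sum(W1[x][y1] * D[y1][y2] for y1 in (0, 1)) for y2 in (0, 1)]
      for x in (0, 1)]
print(W2)   # should match bsc(p2)
```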


Degraded Broadcast Channel

Cover conjectured that the capacity region is the set of all (R1, R2) such that

R2 ≤ I(U; Y2),
R1 ≤ I(X; Y1|U),

for some p(u, x)

First time an auxiliary random variable was used in characterizing an information theoretic limit


Achievability

P. P. Bergmans, "Random Coding Theorem for Broadcast Channels with Degraded Components," IEEE Trans. Info. Theory, vol. IT-19, pp. 197-207, Mar. 1973

Use superposition coding: Fix p(u)p(x|u)

[Figure: cloud centers U^n and satellite codewords X^n]

Decoder 2 decodes the cloud center U^n ⟹ R2 < I(U; Y2)

Decoder 1 first decodes the cloud center, then the satellite codeword X^n ⟹ R1 < I(X; Y1|U)
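Both bounds can be evaluated numerically for any concrete choice of p(u)p(x|u). As a sketch (assumed: the binary symmetric setting with U ~ Bern(1/2), X = U ⊕ V, V ~ Bern(α)), the following computes I(U; Y2) from the joint pmf and checks it against the closed form 1 − H(α ∗ p2):

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mi(joint):
    """Mutual information I(A;B) in bits from a joint pmf {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

alpha, p2 = 0.25, 0.2   # assumed values
# Joint pmf of (U, Y2): U ~ Bern(1/2), X = U xor V with V ~ Bern(alpha),
# Y2 = X xor Z2 with Z2 ~ Bern(p2).
joint = {}
for u in (0, 1):
    for v in (0, 1):
        for z2 in (0, 1):
            pr = 0.5 * (alpha if v else 1 - alpha) * (p2 if z2 else 1 - p2)
            joint[(u, u ^ v ^ z2)] = joint.get((u, u ^ v ^ z2), 0.0) + pr

closed_form = 1 - h2(alpha * (1 - p2) + (1 - alpha) * p2)   # 1 - H(alpha * p2)
print(mi(joint), closed_form)   # the two agree
```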



The Converse

A. D. Wyner, "A Theorem on the Entropy of Certain Binary Sequences and Applications: Part II," IEEE Trans. Info. Theory, vol. IT-19, pp. 772-777, Nov. 1973

Proved the weak converse for the binary symmetric broadcast channel (used Mrs. Gerber's Lemma)

P. P. Bergmans, "A Simple Converse for Broadcast Channels with Additive White Gaussian Noise (Corresp.)," IEEE Trans. Info. Theory, vol. IT-20, pp. 279-280, Mar. 1974

Proved the converse for the AWGN BC (used the Entropy Power Inequality, in a proof very similar to Wyner's for the binary symmetric case)

R. Gallager, "Capacity and Coding for Degraded Broadcast Channels," Problemy Peredachi Informatsii, vol. 10, no. 3, pp. 3-14, July-Sept. 1974

Proved the weak converse for the discrete memoryless degraded BC ⟹ identification of the auxiliary random variable

Established a bound on the cardinality of the auxiliary random variable


Weak Converse for Shannon Channel Capacity

Shannon capacity: C = max_{p(x)} I(X; Y) (only uses channel variables)

Weak converse: Show that for any sequence of (n, R) codes with P_e^{(n)} → 0, R ≤ C

For each code form the empirical joint pmf:

(W, X^n, Y^n) ~ p(w) p(x^n|w) ∏_{i=1}^n p(y_i|x_i)

Fano's inequality: H(W|Y^n) ≤ nR P_e^{(n)} + H(P_e^{(n)}) = nε_n

Now consider

nR = I(W; Y^n) + H(W|Y^n)
   ≤ I(W; Y^n) + nε_n
   ≤ I(X^n; Y^n) + nε_n          (data processing inequality)
   = H(Y^n) − H(Y^n|X^n) + nε_n
   ≤ Σ_{i=1}^n I(X_i; Y_i) + nε_n   (convexity, memorylessness)
   ≤ n(C + ε_n)
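The chain above bounds R by C = max_{p(x)} I(X; Y). As a side note (not from the slides), that maximization can be carried out numerically with the Blahut-Arimoto algorithm; a minimal sketch, sanity-checked on a BSC where C = 1 − H(p) is known in closed form:

```python
import math

def blahut_arimoto(W, iters=200):
    """Approximate C = max_{p(x)} I(X;Y) in bits for channel matrix W[x][y] = p(y|x)."""
    nx, ny = len(W), len(W[0])
    p = [1.0 / nx] * nx
    for _ in range(iters):
        # Output distribution induced by the current input distribution p.
        q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # Multiplicative update: p(x) proportional to p(x) exp(D(W(.|x) || q)).
        r = [p[x] * math.exp(sum(W[x][y] * math.log(W[x][y] / q[y])
                                 for y in range(ny) if W[x][y] > 0))
             for x in range(nx)]
        s = sum(r)
        p = [ri / s for ri in r]
    # Mutual information (in bits) at the final input distribution.
    q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return sum(p[x] * W[x][y] * math.log2(W[x][y] / q[y])
               for x in range(nx) for y in range(ny) if W[x][y] > 0)

# BSC(0.1): capacity should be 1 - H(0.1), about 0.531 bits per use.
print(blahut_arimoto([[0.9, 0.1], [0.1, 0.9]]))
```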


Gallager Converse

Show that given a sequence of (n, R1, R2) codes with P_e^{(n)} → 0, R1 ≤ I(X; Y1|U), R2 ≤ I(U; Y2) for some p(u, x) such that U → X → (Y1, Y2)

Key is identifying U

The converses for the binary symmetric and AWGN BCs are direct and do not explicitly identify U

As before, by Fano's inequality

H(W1|Y1^n) ≤ nR1 P_e^{(n)} + 1 = nε_{1n},
H(W2|Y2^n) ≤ nR2 P_e^{(n)} + 1 = nε_{2n}

So, we have

nR1 ≤ I(W1; Y1^n) + nε_{1n},
nR2 ≤ I(W2; Y2^n) + nε_{2n}



A first attempt: U = W2 (satisfies U → X_i → (Y_{1i}, Y_{2i}))

I(W1; Y1^n) ≤ I(W1; Y1^n | W2)
           = I(W1; Y1^n | U)
           = Σ_{i=1}^n I(W1; Y_{1i} | U, Y1^{i−1})              (chain rule)
           ≤ Σ_{i=1}^n ( H(Y_{1i}|U) − H(Y_{1i}|U, Y1^{i−1}, W1) )
           ≤ Σ_{i=1}^n ( H(Y_{1i}|U) − H(Y_{1i}|U, Y1^{i−1}, W1, X_i) )
           = Σ_{i=1}^n I(X_i; Y_{1i} | U)

So far so good. Now let's try the second inequality

I(W2; Y2^n) = Σ_{i=1}^n I(W2; Y_{2i} | Y2^{i−1}) = Σ_{i=1}^n I(U; Y_{2i} | Y2^{i−1})

But I(U; Y_{2i} | Y2^{i−1}) is not necessarily ≤ I(U; Y_{2i}), so U = W2 does not work
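The obstruction is that conditioning can increase mutual information. A standard toy example (not from the slides): with A, C i.i.d. Bern(1/2) and B = A ⊕ C, we have I(A; B) = 0 yet I(A; B|C) = 1. A minimal sketch:

```python
import math

def mi(joint):
    """Mutual information I(A;B) in bits from a joint pmf {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# A, C i.i.d. Bern(1/2); B = A xor C.
triples = {(a, c, a ^ c): 0.25 for a in (0, 1) for c in (0, 1)}

# I(A;B): marginalize out C.
ab = {}
for (a, c, b), p in triples.items():
    ab[(a, b)] = ab.get((a, b), 0.0) + p
i_ab = mi(ab)

# I(A;B|C) = sum_c p(c) I(A;B | C = c), with p(c) = 1/2.
i_ab_c = sum(0.5 * mi({(a, b): p / 0.5
                       for (a, cc, b), p in triples.items() if cc == c})
             for c in (0, 1))
print(i_ab, i_ab_c)   # 0.0 and 1.0
```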