Capacity of Continuous Channels with Memory via Directed Information Neural Estimator



  1. Capacity of Continuous Channels with Memory via Directed Information Neural Estimator. Ziv Aharoni^1, Dor Tsur^1, Ziv Goldfeld^2, Haim H. Permuter^1. ^1 Ben-Gurion University of the Negev, ^2 Cornell University. International Symposium on Information Theory, June 21st, 2020.

  2. Communication Channel. [Block diagram: message $M$ → Encoder → $X_i$ → Channel → $Y_i$ → Decoder → $\hat{M}$, with a delay element feeding back $Y_{i-1}$.] Continuous alphabet; time-invariant channel with memory; the channel is unknown.

  3. Capacity. Feedback is not present:
     $C_{FF} = \lim_{n \to \infty} \sup_{P_{X^n}} \frac{1}{n} I(X^n; Y^n)$
     Feedback is present:
     $C_{FB} = \lim_{n \to \infty} \sup_{P_{X^n \| Y^{n-1}}} \frac{1}{n} I(X^n \to Y^n)$
     where $I(X^n \to Y^n)$ is the directed information (DI).

  4. Capacity. Feedback is not present:
     $C_{FF} = \lim_{n \to \infty} \sup_{P_{X^n}} \frac{1}{n} I(X^n \to Y^n)$
     Feedback is present:
     $C_{FB} = \lim_{n \to \infty} \sup_{P_{X^n \| Y^{n-1}}} \frac{1}{n} I(X^n \to Y^n)$
     where $I(X^n \to Y^n)$ is the directed information (DI). DI is a unifying measure for feed-forward (FF) and feedback (FB) capacity.
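For reference, the DI appearing above is Massey's directed information; the following standard identities (not spelled out on the slides) make the "unifying" claim of slide 4 precise:

```latex
% Massey's directed information:
I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1})
% Without feedback the input does not depend on past outputs, and
% I(X^n \to Y^n) = I(X^n; Y^n), so a single DI expression covers
% both the feed-forward and the feedback capacity.
```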

  5. Talk Outline. Directed Information Neural Estimator (DINE). [Block diagram of the communication system, with a gradient signal and delayed feedback $Y_{i-1}$.]

  6. Talk Outline. Directed Information Neural Estimator (DINE). Neural Distribution Transformer (NDT). [Same block diagram.]

  7. Talk Outline. Directed Information Neural Estimator (DINE). Neural Distribution Transformer (NDT). Capacity estimation. [Same block diagram.]

  8. Preliminaries - Donsker-Varadhan. Theorem (Donsker-Varadhan representation): the KL divergence between probability measures $P$ and $Q$ can be represented as
     $D_{KL}(P \| Q) = \sup_{T: \Omega \to \mathbb{R}} E_P[T] - \log E_Q\big[e^T\big]$
     where $T$ is measurable and both expectations are finite. For mutual information:
     $I(X; Y) = \sup_{T: \Omega \to \mathbb{R}} E_{P_{XY}}[T] - \log E_{P_X \otimes P_Y}\big[e^T\big]$
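As a concrete illustration (a minimal sketch with an assumed Gaussian pair, not from the talk): for any fixed critic $T$, plugging empirical means into the DV representation yields a lower bound on $D_{KL}(P \| Q)$, tight at $T^* = \log \frac{dP}{dQ}$.

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.normal(1.0, 1.0, 100_000)   # samples from P = N(1, 1)
q = rng.normal(0.0, 1.0, 100_000)   # samples from Q = N(0, 1)

def dv_bound(T, p_samples, q_samples):
    # Empirical Donsker-Varadhan bound: E_P[T] - log E_Q[e^T]
    return T(p_samples).mean() - np.log(np.exp(T(q_samples)).mean())

# The optimal critic here is T*(x) = log dP/dQ(x) = x - 1/2, for which
# the bound is tight: D_KL(N(1,1) || N(0,1)) = 1/2.
print(dv_bound(lambda x: x - 0.5, p, q))   # ~0.5
```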

  9. MINE (Y. Bengio keynote, ISIT '19). Mutual Information Neural Estimator: given samples $\{x_i, y_i\}_{i=1}^{n}$.
     Approximation:
     $\hat{I}(X; Y) = \sup_{\theta \in \Theta} E_{P_{XY}}[T_\theta] - \log E_{P_X \otimes P_Y}\big[e^{T_\theta}\big]$
     Estimation:
     $\hat{I}_n(X, Y) = \sup_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} T_\theta(x_i, y_i) - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_\theta(x_i, \tilde{y}_i)}$
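A minimal PyTorch sketch of a MINE training step (illustrative; the two-layer critic and the shuffling scheme are assumptions, not the authors' implementation). Samples from the product measure $P_X \otimes P_Y$ are obtained by shuffling the $y$ batch:

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """T_theta(x, y): a small MLP on the concatenated pair."""
    def __init__(self, dim_x, dim_y, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def mine_step(critic, opt, x, y):
    n = x.shape[0]
    y_tilde = y[torch.randperm(n)]   # pair x_i with shuffled y: ~ P_X (x) P_Y
    # log (1/n) sum_i e^{T(x_i, y~_i)}, computed stably via logsumexp
    log_mean_exp = torch.logsumexp(critic(x, y_tilde), dim=0) - math.log(n)
    dv = critic(x, y).mean() - log_mean_exp
    (-dv).backward()                 # ascend the DV lower bound
    opt.step(); opt.zero_grad()
    return dv.item()                 # current estimate of I(X;Y)
```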

  10. Estimator Derivation. DI as a difference of entropies:
      $I(X^n \to Y^n) = h(Y^n) - h(Y^n \| X^n)$, where $h(Y^n \| X^n) = \sum_{i=1}^{n} h(Y_i \mid X^i, Y^{i-1})$ is the causally conditioned entropy.
      Using a reference measure:
      $I(X^n \to Y^n) = I(X^{n-1} \to Y^{n-1}) + \underbrace{D_{KL}\big(P_{Y^n \| X^n} \,\big\|\, P_{Y^{n-1} \| X^{n-1}} \otimes P_{\tilde{Y}} \,\big|\, P_{X^n}\big)}_{D^{(n)}_{Y \| X}} - \underbrace{D_{KL}\big(P_{Y^n} \,\big\|\, P_{Y^{n-1}} \otimes P_{\tilde{Y}}\big)}_{D^{(n)}_{Y}}$
      where $P_{\tilde{Y}}$ is a uniform i.i.d. reference measure over the dataset.

  11. Estimator Derivation. DI rate as a difference of KL divergences:
      $I(X^n \to Y^n) = I(X^{n-1} \to Y^{n-1}) + \underbrace{D^{(n)}_{Y \| X} - D^{(n)}_{Y}}_{\text{increment of information at step } n}$

  12. Estimator Derivation. DI rate as a difference of KL divergences:
      $D^{(n)}_{Y \| X} - D^{(n)}_{Y} \xrightarrow{\; n \to \infty \;} I(X \to Y)$
      The limit exists for stationary and ergodic processes.
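Why the difference of these two KL terms is exactly the per-step information increment (a step the slides use implicitly): expanding each divergence against the common reference measure, the cross-entropy term cancels and the increment from slide 11 remains:

```latex
\begin{align*}
D^{(n)}_{Y\|X} &= -\,h(Y_n \mid X^n, Y^{n-1}) - \mathbb{E}\big[\log P_{\tilde{Y}}(Y_n)\big],\\
D^{(n)}_{Y}    &= -\,h(Y_n \mid Y^{n-1})      - \mathbb{E}\big[\log P_{\tilde{Y}}(Y_n)\big],\\
D^{(n)}_{Y\|X} - D^{(n)}_{Y}
  &= h(Y_n \mid Y^{n-1}) - h(Y_n \mid X^n, Y^{n-1})
   = I(X^n; Y_n \mid Y^{n-1}).
\end{align*}
% For stationary ergodic processes these increments converge,
% and their limit is the DI rate I(X -> Y).
```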

  13. Estimator Derivation. DI rate as a difference of KL divergences:
      $D^{(n)}_{Y \| X} - D^{(n)}_{Y} \xrightarrow{\; n \to \infty \;} I(X \to Y)$
      The goal: estimate $D^{(n)}_{Y \| X}$ and $D^{(n)}_{Y}$.

  14. Directed Information Neural Estimator. Apply the DV formula to $D^{(n)}_{Y \| X}$ and $D^{(n)}_{Y}$:
      $D^{(n)}_{Y} = \sup_{T: \Omega \to \mathbb{R}} E_{P_{Y^n}}\big[T(Y^n)\big] - \log E_{P_{Y^{n-1}} \otimes P_{\tilde{Y}}}\big[\exp\big(T(Y^{n-1}, \tilde{Y})\big)\big]$
      where the optimal solution is $T^* = \log \frac{P_{Y_n \mid Y^{n-1}}}{P_{\tilde{Y}}}$.

  15. Directed Information Neural Estimator. Approximate $T$ with a recurrent neural network (RNN):
      $D^{(n)}_{Y} = \sup_{\theta_Y} E_{P_{Y^n}}\big[T_{\theta_Y}(Y^n)\big] - \log E_{P_{Y^{n-1}} \otimes P_{\tilde{Y}}}\big[\exp\big(T_{\theta_Y}(Y^{n-1}, \tilde{Y})\big)\big]$

  16. Directed Information Neural Estimator. Estimate expectations with empirical means:
      $\hat{D}^{(n)}_{Y} = \sup_{\theta_Y} \frac{1}{n} \sum_{i=1}^{n} T_{\theta_Y}(y_i \mid y^{i-1}) - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_{\theta_Y}(\tilde{y}_i \mid y^{i-1})}$

  17. Directed Information Neural Estimator. Estimate expectations with empirical means:
      $\hat{D}^{(n)}_{Y} = \sup_{\theta_Y} \frac{1}{n} \sum_{i=1}^{n} T_{\theta_Y}(y_i \mid y^{i-1}) - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_{\theta_Y}(\tilde{y}_i \mid y^{i-1})}$
      Finally, $\hat{I}^{(n)}(X \to Y) = \hat{D}^{(n)}_{Y \| X} - \hat{D}^{(n)}_{Y}$.

  18. Consistency. Theorem (DINE consistency): let $\{X_i, Y_i\}_{i=1}^{\infty} \sim P$ be jointly stationary ergodic stochastic processes. Then there exist RNNs $F_1 \in \mathrm{RNN}_{d_y, 1}$, $F_2 \in \mathrm{RNN}_{d_{xy}, 1}$ such that the DINE $\hat{I}_n(F_1, F_2)$ is a strongly consistent estimator of $I(X \to Y)$, i.e.,
      $\lim_{n \to \infty} \hat{I}_n(F_1, F_2) \overset{a.s.}{=} I(X \to Y)$

  19. Consistency. Theorem (DINE consistency): let $\{X_i, Y_i\}_{i=1}^{\infty} \sim P$ be jointly stationary ergodic stochastic processes. Then there exist RNNs $F_1 \in \mathrm{RNN}_{d_y, 1}$, $F_2 \in \mathrm{RNN}_{d_{xy}, 1}$ such that the DINE $\hat{I}_n(F_1, F_2)$ is a strongly consistent estimator of $I(X \to Y)$, i.e.,
      $\lim_{n \to \infty} \hat{I}_n(F_1, F_2) \overset{a.s.}{=} I(X \to Y)$
      Sketch of proof: represent the optimal solution $T^*$ by a dynamical system; use universal approximation of dynamical systems by RNNs; estimate expectations with empirical means.

  20. Implementation.
      $\hat{D}^{(n)}_{Y} = \sup_{\theta_Y} \frac{1}{n} \sum_{i=1}^{n} T_{\theta_Y}(y_i \mid y^{i-1}) - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_{\theta_Y}(\tilde{y}_i \mid y^{i-1})}$

  21. Implementation.
      $\hat{D}^{(n)}_{Y} = \sup_{\theta_Y} \frac{1}{n} \sum_{i=1}^{n} T_{\theta_Y}(y_i \mid y^{i-1}) - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_{\theta_Y}(\tilde{y}_i \mid y^{i-1})}$
      Adjust the RNN to process both inputs while carrying only the state generated by the true samples.

  22. Implementation.
      $\hat{D}^{(n)}_{Y} = \sup_{\theta_Y} \frac{1}{n} \sum_{i=1}^{n} T_{\theta_Y}(y_i \mid y^{i-1}) - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_{\theta_Y}(\tilde{y}_i \mid y^{i-1})}$
      Adjust the RNN to process both inputs while carrying only the state generated by the true samples.
      [Diagram: an unrolled RNN cell $F$ fed the pairs $(Y_t, \tilde{Y}_t)$, with hidden states $S_0, S_1, \ldots, S_T$ propagated only from the true samples.]

  23. Implementation. Complete system layout for the calculation of $\hat{D}^{(n)}_{Y}$:
      [Block diagram: the input $Y_i$ and a reference generator's $\tilde{Y}_i$ feed a modified LSTM; its states $S_i$ pass through shared dense layers to produce $T_{\theta_Y}(Y_i \mid Y^{i-1})$ and $T_{\theta_Y}(\tilde{Y}_i \mid Y^{i-1})$, which a DV layer combines into $\hat{D}_Y(\theta_Y, D_n)$.]
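A minimal PyTorch sketch of this layout (an illustrative reconstruction; layer sizes are assumptions): one LSTM cell scores both the true sample $y_i$ and the reference sample $\tilde{y}_i$ at each step, but only the true sample advances the recurrent state, as slide 21 requires:

```python
import math
import torch
import torch.nn as nn

class DYEstimator(nn.Module):
    """Empirical DV objective for D_Y^(n); a sketch, not the authors' code."""
    def __init__(self, dim_y, hidden=32):
        super().__init__()
        self.cell = nn.LSTMCell(dim_y, hidden)
        self.head = nn.Linear(hidden, 1)   # dense layer: state -> T value

    def forward(self, y, y_ref):
        # y, y_ref: (T, dim_y) true samples and uniform i.i.d. references
        h = torch.zeros(1, self.cell.hidden_size)
        c = torch.zeros_like(h)
        t_true, t_ref = [], []
        for y_i, yref_i in zip(y, y_ref):
            # Score the reference sample from the current (true) state ...
            h_ref, _ = self.cell(yref_i[None], (h, c))
            t_ref.append(self.head(h_ref))
            # ... but advance the state only with the true sample.
            h, c = self.cell(y_i[None], (h, c))
            t_true.append(self.head(h))
        t_true, t_ref = torch.cat(t_true), torch.cat(t_ref)
        # DV objective: (1/n) sum T(y_i|y^{i-1}) - log (1/n) sum e^{T(y~_i|y^{i-1})}
        return t_true.mean() - (torch.logsumexp(t_ref, dim=0) - math.log(len(t_ref)))
```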

  24. NDT. Neural Distribution Transformer (NDT). [Block diagram of the communication system, with the gradient signal driving the NDT and delayed feedback $Y_{i-1}$.]

  25. NDT. Model the message $M$ as i.i.d. Gaussian noise $\{N_i\}_{i \in \mathbb{Z}}$. The NDT is a mapping:
      w/o feedback: $\mathrm{NDT}: N_i \longmapsto X_i$
      w/ feedback: $\mathrm{NDT}: (N_i, Y_{i-1}) \longmapsto X_i$

  26. NDT. Model the message $M$ as i.i.d. Gaussian noise $\{N_i\}_{i \in \mathbb{Z}}$. The NDT is a mapping:
      w/o feedback: $\mathrm{NDT}: N_i \longmapsto X_i$
      w/ feedback: $\mathrm{NDT}: (N_i, Y_{i-1}) \longmapsto X_i$
      The NDT is modeled by an RNN. [Block diagram: inputs $N_i$ and $Y_{i-1}$ enter an LSTM, followed by dense layers and a power-constraint layer, producing $X_i$ for the channel.]
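A minimal PyTorch sketch of such an NDT (illustrative; the layer sizes and the batch normalization used to enforce the power constraint are assumptions):

```python
import torch
import torch.nn as nn

class NDT(nn.Module):
    """Maps i.i.d. noise (plus delayed feedback) to channel inputs X_i."""
    def __init__(self, dim_n, dim_y, power, hidden=32, feedback=True):
        super().__init__()
        self.feedback = feedback
        in_dim = dim_n + (dim_y if feedback else 0)
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.dense = nn.Sequential(nn.Linear(hidden, hidden), nn.ELU(),
                                   nn.Linear(hidden, 1))
        self.power = power

    def forward(self, noise, y_prev=None):
        # noise: (B, T, dim_n); y_prev: (B, T, dim_y) one-step-delayed outputs.
        # In a true feedback loop X_i must be generated step by step,
        # interleaved with the channel; batched tensors keep this sketch short.
        z = torch.cat([noise, y_prev], dim=-1) if self.feedback else noise
        h, _ = self.lstm(z)
        x = self.dense(h)
        # Power-constraint layer (one simple choice): rescale the batch so
        # that the empirical power satisfies E[X^2] <= P.
        return x * torch.sqrt(self.power / x.pow(2).mean().clamp_min(1e-8))
```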

  27. Capacity Estimation. Iterate between DINE and NDT:
      [Block diagram: noise $N_i$ and delayed feedback $Y_{i-1}$ enter the NDT (RNN), which outputs $X_i$; the channel $P_{Y_i \mid X^i, Y^{i-1}}$ produces $Y_i$; DINE (RNN) outputs $\hat{I}_n(X \to Y)$, whose gradient updates the NDT.]
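A sketch of the alternating optimization under assumed helper names (dine_estimate, channel, noise, and the two optimizers are hypothetical stand-ins for the components above): each iteration fits the DINE critics to samples from the current NDT, then ascends the estimated DI through a differentiable channel simulator, such as the MA(1) example that follows:

```python
# Alternating optimization sketch; all helper names are hypothetical.
for step in range(num_steps):
    # (1) DINE step: improve the critics on samples from the current NDT.
    dine_opt.zero_grad()
    x = ndt(noise()).detach()                 # freeze the NDT here
    (-dine_estimate(x, channel(x))).backward()
    dine_opt.step()

    # (2) NDT step: ascend the estimated DI through the channel simulator.
    ndt_opt.zero_grad()
    x = ndt(noise())
    (-dine_estimate(x, channel(x))).backward()
    ndt_opt.step()                            # updates NDT parameters only
```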

  28. Results. Channel - MA(1) additive Gaussian noise (AGN):
      $Z_i = \alpha U_{i-1} + U_i$
      $Y_i = X_i + Z_i$
      where $U_i \overset{i.i.d.}{\sim} \mathcal{N}(0, 1)$, $X_i$ is the channel input sequence subject to the power constraint $E[X_i^2] \le P$, and $Y_i$ is the channel output.
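A short NumPy sketch of this MA(1) AGN channel for generating data ($\alpha$, $P$, and the sequence length are free parameters):

```python
import numpy as np

def ma1_agn_channel(x, alpha, rng):
    """Y_i = X_i + Z_i with MA(1) Gaussian noise Z_i = alpha*U_{i-1} + U_i."""
    u = rng.standard_normal(len(x) + 1)   # U_{-1}, U_0, ..., U_{n-1}
    z = alpha * u[:-1] + u[1:]            # z[i] = alpha*U_{i-1} + U_i
    return x + z

# Example: i.i.d. Gaussian inputs at power P (a simple, generally
# suboptimal input choice, used here only to exercise the simulator).
rng = np.random.default_rng(1)
P, alpha = 1.0, 0.5
x = np.sqrt(P) * rng.standard_normal(10_000)
y = ma1_agn_channel(x, alpha, rng)
```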

  29. MA(1) AGN Results. Estimation performance:
      [Two plots, x-axis from -20 to 15 (presumably SNR in dB), y-axis from 0 to about 2: (a) Feed-forward Capacity, (b) Feedback Capacity.]

  30. Conclusion and Future Work. Conclusions: an estimation method for both FF and FB capacity. Pros: mild assumptions on the channel. Cons: lack of provable bounds.

  31. Conclusion and Future Work. Conclusions: an estimation method for both FF and FB capacity. Pros: mild assumptions on the channel. Cons: lack of provable bounds. Future work: generalize to more complex scenarios (e.g., multi-user); obtain provable bounds on fundamental limits.

  32. Conclusion and Future Work. Conclusions: an estimation method for both FF and FB capacity. Pros: mild assumptions on the channel. Cons: lack of provable bounds. Future work: generalize to more complex scenarios (e.g., multi-user); obtain provable bounds on fundamental limits. Thank You!
