Efficient Online Portfolio with Logarithmic Regret
Haipeng Luo (USC), Chen-Yu Wei (USC), Kai Zheng (Peking University)
Online Portfolio
$x_t$ (decision): split $\text{Wealth}_t$ across the stocks, e.g. $0.5\,\text{Wealth}_t$, $0.3\,\text{Wealth}_t$, $0.2\,\text{Wealth}_t$
$r_t$ (price relative): each holding is scaled by $\times 1.4$, $\times 1.0$, $\times 0.5$, giving $0.7\,\text{Wealth}_t$, $0.3\,\text{Wealth}_t$, $0.1\,\text{Wealth}_t$
$\text{Wealth}_{t+1} = (0.7 + 0.3 + 0.1)\,\text{Wealth}_t = 1.1\,\text{Wealth}_t = \langle x_t, r_t \rangle\,\text{Wealth}_t$
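The one-round update on this slide can be checked numerically. The following is a minimal Python sketch using the slide's own numbers; the function name `wealth_update` is hypothetical and chosen only for illustration.

```python
import math


def wealth_update(wealth, x, r):
    """Return next-round wealth <x, r> * wealth.

    x: fractions of wealth placed on each stock (the decision)
    r: price relatives of each stock for this round
    """
    if len(x) != len(r):
        raise ValueError("x and r must have the same length")
    growth = sum(xi * ri for xi, ri in zip(x, r))  # inner product <x, r>
    return growth * wealth


# Numbers from the slide: weights 0.5 / 0.3 / 0.2, price relatives 1.4 / 1.0 / 0.5.
x_t = [0.5, 0.3, 0.2]
r_t = [1.4, 1.0, 0.5]
print(wealth_update(1.0, x_t, r_t))  # approximately 1.1
```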
Online Portfolio
Wealth $W_t$ over $T$ periods:
$W_1$
$W_2 = \langle x_1, r_1 \rangle W_1$
$W_3 = \langle x_2, r_2 \rangle W_2$
Final wealth / Initial wealth: $\frac{W_{T+1}}{W_1} = \prod_{t=1}^{T} \langle x_t, r_t \rangle$
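The ratio of final to initial wealth is the product of the per-round growth factors. A small Python sketch of that product follows; the two-stock market data and the name `wealth_ratio` are made up for illustration.

```python
import math


def wealth_ratio(decisions, price_relatives):
    """Return final wealth / initial wealth = prod_t <x_t, r_t>."""
    ratio = 1.0
    for x, r in zip(decisions, price_relatives):
        ratio *= sum(xi * ri for xi, ri in zip(x, r))
    return ratio


# Hypothetical market: two stocks, three periods, equal weights every round.
decisions = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
price_relatives = [[2.0, 0.5], [0.5, 2.0], [1.0, 1.0]]
print(wealth_ratio(decisions, price_relatives))
```

Taking the logarithm of this product turns it into a sum over rounds, which is why the gain is usually measured in log-wealth.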
Online Portfolio
Gain:
Benchmark:
Minimize (Regret)
Online Convex Optimization [Zinkevich'03], but with possibly unbounded gradient:
$\|\nabla f_t(x)\|_\infty \le G \triangleq \max_{i,j} \frac{r_{t,i}}{r_{t,j}}$ (Maximum Relative Ratio)
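The sketch below illustrates the regret and the gradient issue. The slide's own formulas for gain and benchmark did not survive extraction, so it assumes the standard formulation: the gain is the log-wealth $\sum_t \ln \langle x_t, r_t \rangle$ and the per-round loss is $-\ln \langle x, r_t \rangle$, whose gradient is $-r_t / \langle x, r_t \rangle$. All helper names are hypothetical, and regret is measured against one given fixed portfolio rather than the best one.

```python
import math


def inner(x, r):
    """Inner product <x, r>."""
    return sum(xi * ri for xi, ri in zip(x, r))


def log_gain(decisions, price_relatives):
    """Assumed gain: sum_t ln <x_t, r_t> (log of final over initial wealth)."""
    return sum(math.log(inner(x, r)) for x, r in zip(decisions, price_relatives))


def regret_against(u, decisions, price_relatives):
    """Log-gain of the fixed portfolio u minus the player's log-gain."""
    fixed = [u] * len(price_relatives)
    return log_gain(fixed, price_relatives) - log_gain(decisions, price_relatives)


def gradient_sup_norm(x, r):
    """Sup-norm of the gradient of -ln <x, r>, i.e. max_i r_i / <x, r>."""
    return max(r) / inner(x, r)


# The player holds almost nothing in stock 2, which does well while stock 1 drops.
x = [0.999, 0.001]
for eps in (0.1, 0.01, 0.001):
    r = [eps, 1.0]
    ratio_bound = max(r) / min(r)  # maximum relative ratio for this round
    print(eps, gradient_sup_norm(x, r), ratio_bound)
```

The gradient norm stays below the maximum relative ratio for the round, but that ratio itself grows without bound as the worst stock's price relative approaches zero.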
Previous Results and Our Results
$N$: number of stocks, $T$: number of rounds, $G$: maximum relative ratio
- Lower bound: $\Omega(N \log T)$
- Upper bounds:

Algorithm                                            Regret              Time (/round)
Universal Portfolio (Cover 1991, Kalai et al. 2002)  $N \log T$          $T^{14} N^4$
ONS (Hazan et al. 2007)                              $G N \log T$        $N^{3.5}$
Soft-Bayes (Orseau et al. 2017)                      $\sqrt{NT}$         $N$
?                                                    $\approx N \log T$  $\approx N$
BarrONS (this work)                                  $N^2 (\log T)^4$    $T N^{2.5}$
Key Components of Our Algorithm
Main Challenge: a stock is bad, bad, then suddenly good, but the player puts little weight on it.
BarrONS (Barrier-Regularized-ONS) compared to ONS:
1. Additional regularizer (to avoid too extreme a distribution over stocks)
2. Increase the learning rate for worse stocks (faster recovery)
3. Restarting (adapting to the maximum relative ratio)
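To illustrate the first component, the sketch below uses a log-barrier penalty $-\sum_i \ln x_i$ as the additional regularizer. This specific form is an assumption suggested by the name Barrier-Regularized-ONS and is not stated on the slide; the function name is hypothetical, and the learning-rate and restarting components are not modeled.

```python
import math


def log_barrier(x):
    """Assumed barrier regularizer: -sum_i ln(x_i).

    The penalty grows without bound as any weight approaches zero,
    which discourages extreme distributions over stocks.
    """
    if any(xi <= 0 for xi in x):
        raise ValueError("weights must be strictly positive")
    return -sum(math.log(xi) for xi in x)


# The penalty increases as the portfolio concentrates on one stock.
for x in ([0.5, 0.5], [0.9, 0.1], [0.999, 0.001]):
    print(x, log_barrier(x))
```

Keeping every weight away from zero is one way to address the main challenge: a stock that suddenly turns good still has enough weight on it for the player to benefit.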
Poster #157