Adaptive Signal Recovery by Convex Optimization
Dmitrii Ostrovskii CWI, Amsterdam 19 April 2018
Signal denoising problem

Recover a complex signal x = (x_τ), τ = −n, ..., n, from noisy observations y_τ = x_τ + σξ_τ, τ = −n, ..., n, where the ξ_τ are i.i.d. standard complex Gaussian random variables.
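As a quick illustration, a minimal numpy sketch of this observation model (the two-harmonic test signal is a hypothetical choice; a standard complex Gaussian has unit variance, so its real and imaginary parts each have variance 1/2):

import numpy as np

rng = np.random.default_rng(0)
n, sigma = 50, 0.5                         # window half-width and noise level
tau = np.arange(-n, n + 1)                 # tau = -n, ..., n
x = np.exp(1j * 0.3 * tau) + 0.5 * np.exp(1j * 1.1 * tau)   # hypothetical test signal
# standard complex Gaussian noise: real and imaginary parts are N(0, 1/2)
xi = (rng.standard_normal(2 * n + 1) + 1j * rng.standard_normal(2 * n + 1)) / np.sqrt(2)
y = x + sigma * xi                         # observations y_tau = x_tau + sigma * xi_tau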
[Figure: an example signal (top) and its noisy observations (bottom).]
Notation:
• C_n(Z) = {x = (x_τ)_{τ∈Z} : x_τ = 0 whenever |τ| > n};
• ℓ_p-norms restricted to C_n(Z): ‖x‖_p = (Σ_{|τ|≤n} |x_τ|^p)^{1/p};
• scaled ℓ_p-norms: ‖x‖_{n,p} = (2n+1)^{−1/p} ‖x‖_p.
Losses and risks:
• ℓ(x̂, x) = |x̂_0 − x_0| – pointwise loss; ℓ(x̂, x) = ‖x̂ − x‖_{n,2} – ℓ2-loss;
• expected risk: R(x̂, x) = [E ℓ(x̂, x)²]^{1/2};
• (1−δ)-reliable risk: R_δ(x̂, x) = min {r ≥ 0 : ℓ(x̂, x) ≤ r with probability ≥ 1 − δ}.
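In code these quantities are straightforward; a small sketch (array index n corresponds to τ = 0, and the risks can be estimated by Monte Carlo over noise draws):

import numpy as np

def scaled_norm(v, p):
    # ||v||_{n,p} = (2n+1)^{-1/p} * ||v||_p, with 2n+1 = len(v)
    return np.linalg.norm(v, ord=p) / len(v) ** (1.0 / p)

def pointwise_loss(x_hat, x):
    n = (len(x) - 1) // 2                  # position of tau = 0
    return abs(x_hat[n] - x[n])

def l2_loss(x_hat, x):
    return scaled_norm(x_hat - x, 2)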
Classical approach. Given a set X containing x, look for an estimator that is near-minimax over X. For many classes X, such an estimator x^o is linear in y (e.g. for the pointwise loss)*. If X is unknown, x^o becomes an unavailable linear oracle. Mimic it!

Oracle approach. Knowing that there exists a linear oracle x^o with small risk R(x^o, x), construct an adaptive estimator x̂ = x̂(y) satisfying an oracle inequality:

R(x̂, x) ≤ P · R(x^o, x) + Rem,   Rem ≪ R(x^o, x).

The pair (x^o, x) can change, but P and Rem must be uniformly bounded over (x^o, x).
*[Ibragimov and Khasminskii, 1984; Donoho et al., 1990], *[Tsybakov, 2008]
Let x be a regularly sampled function: x_t = f(t/N), t = −N, ..., N, where f : [−1, 1] → R has a weak derivative D^s f of order s ≥ 1 on [−1, 1] and belongs to a Sobolev (q = 2) or Hölder (q = ∞) ball

F_{s,L} = {f(·) : ‖D^s f‖_{L_q} ≤ L}.
A classical recipe is the kernel estimator

x̂_t = (2hN+1)^{−1} Σ_{|τ|≤hN} K(τ/(hN)) y_{t−τ},   |t| ≤ N − hN,

with kernel K and bandwidth h.
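A sketch of this estimator, with a parabolic (Epanechnikov-type) kernel as a hypothetical choice:

import numpy as np

def kernel_estimate(y, h, N, K=lambda u: 1.0 - u ** 2):
    # x_hat_t = (2hN+1)^{-1} sum_{|tau| <= hN} K(tau/(hN)) y_{t-tau}, |t| <= N - hN
    hN = int(h * N)                                   # assumes hN >= 1
    w = K(np.arange(-hN, hN + 1) / hN) / (2 * hN + 1)
    return np.convolve(y, w, mode="valid")            # estimates on |t| <= N - hN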
...
*[Adams and Fournier, 2003; Brown et al., 1996; Watson, 1964; Nadaraya, 1964; Tsybakov, 2008; Johnstone, 2011], *[Lepski, 1991; Lepski et al., 1997, 2015; Goldenshluger et al., 2011]
Consider linear estimators of the form x̂_t = [ϕ ∗ y]_t = Σ_{|τ|≤n} ϕ_τ y_{t−τ}, where ∗ is discrete convolution, and ϕ ∈ C_n(Z) is called a filter.

Definition.* A signal x is (n, ρ)-recoverable if there exists φ^o ∈ C_n(Z) which satisfies

|x_t − [φ^o ∗ x]_t| ≤ σρ/√(2n+1) for |t| ≤ 3n,   and   ‖φ^o‖_2 ≤ ρ/√(2n+1).

Consequently, the oracle risk is O(σρ/√(2n+1)): [E‖x − φ^o ∗ y‖²_{3n,2}]^{1/2} ≤ √2·σρ/√(2n+1).
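The smallest ρ certified by a given candidate filter can be computed directly from the two conditions; a sketch (for instance, a constant signal is (n, 1)-recoverable via the moving-average filter):

import numpy as np

def certified_rho(x, phi, n, sigma):
    # smallest rho for which phi in C_n(Z) witnesses (n, rho)-recoverability of x;
    # x must be given on |tau| <= 4n so that [phi * x]_t is defined for |t| <= 3n
    conv = np.convolve(x, phi, mode="valid")           # [phi * x]_t for |t| <= 3n
    bias = np.max(np.abs(x[n:-n] - conv))              # max_{|t| <= 3n} |x_t - [phi * x]_t|
    return np.sqrt(2 * n + 1) * max(bias / sigma, np.linalg.norm(phi))

n_, sigma_ = 10, 1.0
x_const = np.ones(8 * n_ + 1)                          # constant signal on |tau| <= 4n
phi_avg = np.ones(2 * n_ + 1) / (2 * n_ + 1)           # moving average: ||phi||_2 = (2n+1)^{-1/2}
print(certified_rho(x_const, phi_avg, n_, sigma_))     # -> 1.0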
[Diagram: nested observation windows of half-widths n, 2n, 3n, 4n.]
*[Juditsky and Nemirovski, 2009; Nemirovski, 1991; Goldenshluger and Nemirovski, 1997]
Goal. Assuming that x is (n, ρ)-recoverable, construct an adaptive filter ϕ̂ = ϕ̂(y) such that the pointwise or ℓ2-risk of x̂ = ϕ̂ ∗ y is close to σρ/√(2n+1).
Main questions:
• Is adaptation possible at all? Yes, but we must pay a price polynomial in ρ;
• Can ϕ̂ be efficiently computed? Yes, by solving a well-structured convex optimization problem;
• Are there natural classes of signals that are (n, ρ)-recoverable with small ρ? Yes: when the signal belongs to a shift-invariant subspace S ⊂ C(Z), dim(S) = s, we have “nice” bounds on ρ = ρ(s).
Decompose the oracle error:

x_t − [φ^o ∗ y]_t = (x_t − [φ^o ∗ x]_t) − σ[φ^o ∗ ξ]_t,

and note that both terms are controlled by the definition:

|x_t − [φ^o ∗ x]_t| ≤ σρ/√(2n+1) for |t| ≤ 3n,   and   ‖φ^o‖_2 ≤ ρ/√(2n+1).
Idea: look at the Fourier transforms. Estimate x via x̂ = ϕ̂ ∗ y, where ϕ̂ = ϕ̂(y) ∈ C_{2n}(Z) minimizes the Fourier-domain residual ‖F_{2n}[y − ϕ ∗ y]‖_p while keeping ‖F_{2n}[ϕ]‖_1 small.
Oracle with small ℓ1-norm of DFT.* If x is (n, ρ)-recoverable, then there exists ϕ^o ∈ C_{2n}(Z) such that, with R = 2ρ²,

|x_t − [ϕ^o ∗ x]_t| ≤ CσR/√(4n+1) for |t| ≤ 2n,   and   ‖F_{2n}[ϕ^o]‖_1 ≤ R/√(4n+1).
Proof: take ϕ^o = φ^o ∗ φ^o, where φ^o is the oracle from the definition. By the triangle inequality,

|x_t − [ϕ^o ∗ x]_t| ≤ |x_t − [φ^o ∗ x]_t| + |[φ^o ∗ (x − φ^o ∗ x)]_t| ≤ (1 + ‖φ^o‖_1) max_{|τ|≤3n} |x_τ − [φ^o ∗ x]_τ| ≤ σρ(1 + ρ)/√(2n+1),

since ‖φ^o‖_1 ≤ √(2n+1)·‖φ^o‖_2 ≤ ρ.
Moreover, since the linear convolution φ^o ∗ φ^o coincides with the cyclic one on 4n+1 points,

‖F_{2n}[ϕ^o]‖_1 = √(4n+1)·‖F_{2n}[φ^o]‖_2² = √(4n+1)·‖φ^o‖_2²   [Parseval]
≤ √(4n+1)·ρ²/(2n+1) ≤ 2ρ²/√(4n+1).
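The middle identity is easy to check numerically: for φ ∈ C_n(Z), the linear convolution φ ∗ φ has length 4n+1, so it equals the cyclic convolution on 4n+1 points, which the unitary DFT diagonalizes. A sketch:

import numpy as np

n_ = 7
m = 4 * n_ + 1
rng = np.random.default_rng(1)
phi = rng.standard_normal(2 * n_ + 1) + 1j * rng.standard_normal(2 * n_ + 1)  # filter in C_n(Z)
F = lambda v: np.fft.fft(v, n=m) / np.sqrt(m)       # unitary DFT on 4n+1 points
lhs = np.sum(np.abs(F(np.convolve(phi, phi))))      # ||F_{2n}[phi * phi]||_1
rhs = np.sqrt(m) * np.linalg.norm(phi) ** 2         # sqrt(4n+1) * ||phi||_2^2
assert np.allclose(lhs, rhs)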
Uniform-fit estimators. Constrained form:

(CUF):  ϕ̂ ∈ Argmin_{ϕ∈C_n(Z)} { ‖F_n[y − ϕ ∗ y]‖_∞ : ‖F_n[ϕ]‖_1 ≤ R/√(2n+1) };

penalized form:

ϕ̂ ∈ Argmin_{ϕ∈C_n(Z)} { ‖F_n[y − ϕ ∗ y]‖_∞ + λ√(2n+1)·‖F_n[ϕ]‖_1 }.

(PUF) Pointwise upper bound for uniform-fit estimators. Let x be (⌈n/2⌉, ρ)-recoverable. Let R = 2ρ² for the constrained estimator, and λ = 2 for the penalized one. Then, w.p. ≥ 1 − δ,

|x_0 − [ϕ̂ ∗ y]_0| ≤ Cσρ⁴√(log[(2n+1)/δ]) / √(2n+1).

High price of adaptation: O(ρ³√(log n)).
*[Juditsky and Nemirovski, 2009]
Proof of (PUF), constrained case. Let ϕ̂ be an optimal solution to (CUF) with R = 2ρ², and let Θ_n(ξ) := ‖F_n[ξ]‖_∞, so that Θ_n(ξ) = O(√(log[(2n+1)/δ])) w.h.p. Then

|[x − ϕ̂ ∗ y]_0| ≤ σ|[ϕ̂ ∗ ξ]_0| + |[x − ϕ̂ ∗ x]_0|
≤ σ‖F_n[ϕ̂]‖_1 ‖F_n[ξ]‖_∞ + |[x − ϕ̂ ∗ x]_0|   [Young’s ineq.]
≤ σΘ_n(ξ)·R/√(2n+1) + |[x − ϕ̂ ∗ x]_0|.   [feasibility of ϕ̂]

To bound |x_0 − [ϕ̂ ∗ x]_0|, we can add and subtract the convolution with ϕ^o:

|x_0 − [ϕ̂ ∗ x]_0| ≤ |[ϕ^o ∗ (x − ϕ̂ ∗ x)]_0| + |[(1 − ϕ̂) ∗ (x − ϕ^o ∗ x)]_0|
≤ ‖F_n[ϕ^o]‖_1 ‖F_n[x − ϕ̂ ∗ x]‖_∞ + (1 + ‖ϕ̂‖_1) ‖x − ϕ^o ∗ x‖_∞
≤ (R/√(2n+1))·‖F_n[x − ϕ̂ ∗ x]‖_∞ + CσR(1 + R)/√(2n+1).
It remains to bound ‖F_n[x − ϕ̂ ∗ x]‖_∞, which can be done as follows:

‖F_n[x − ϕ̂ ∗ x]‖_∞ ≤ ‖F_n[y − ϕ̂ ∗ y]‖_∞ + σ‖F_n[ξ − ϕ̂ ∗ ξ]‖_∞
≤ ‖F_n[y − ϕ̂ ∗ y]‖_∞ + σ(1 + ‖ϕ̂‖_1)Θ_n(ξ)
≤ ‖F_n[y − ϕ^o ∗ y]‖_∞ + σ(1 + ‖ϕ̂‖_1)Θ_n(ξ)   [optimality of ϕ̂, feasibility of ϕ^o]
≤ ‖F_n[x − ϕ^o ∗ x]‖_∞ + 2σ(1 + R)Θ_n(ξ).

Finally,

‖F_n[x − ϕ^o ∗ x]‖_∞ ≤ ‖F_n[x − ϕ^o ∗ x]‖_2 = ‖x − ϕ^o ∗ x‖_2   [Parseval’s identity]
≤ √(2n+1)·‖x − ϕ^o ∗ x‖_∞ ≤ CσR.

Collecting the above, we obtain a bound dominated by CσR(1 + R)Θ_n(ξ)/√(2n+1).
Proposition: pointwise lower bound. For any integer n ≥ 2, α < 1/4, and ρ satisfying 1 ≤ ρ ≤ n^α, one can point out a family X_{n,ρ} ⊂ C_{2n}(Z) of (n, ρ)-recoverable signals such that for any estimator x̂_0 of x_0 from observations y ∈ C_{2n}(Z), one can find x ∈ X_{n,ρ} satisfying, with probability bounded away from zero,

|x̂_0 − x_0| ≥ cσρ²√((1 − 4α) log n) / √(2n+1).

Conclusion: there is a gap of order ρ² between the upper and lower bounds.
Least-squares estimator. Constrained form:

(CLS):  ϕ̂ ∈ Argmin_{ϕ∈C_n(Z)} { ‖F_n[y − ϕ ∗ y]‖_2² : ‖F_n[ϕ]‖_1 ≤ R/√(2n+1) }.

For the analysis, we have to restrict the set of signals, introducing shift-invariant subspaces (s.-i.s.): S ⊂ C(Z) is an invariant subspace of the unit lag operator [∆x]_t = x_{t−1}.
Theorem: sharp ℓ2-oracle inequality for least-squares estimators. Suppose that x belongs to some s.-i.s. S, and let ϕ^o be feasible in (CLS): ‖F_n[ϕ^o]‖_1 ≤ R/√(2n+1). Then for any δ ∈ (0, 1], an optimal solution ϕ̂ to (CLS) satisfies, w.p. ≥ 1 − δ,

‖x − ϕ̂ ∗ y‖_{n,2} ≤ ‖x − ϕ^o ∗ y‖_{n,2} + (Cσ/√(2n+1))·√(⋯ log[(2n+1)/δ]).

Corollary. Let x be (⌈n/2⌉, ρ)-recoverable, and let R = 2ρ². Then ϕ^o = φ^o ∗ φ^o satisfies ‖x − ϕ^o ∗ y‖_{n,2} = O(σρ²/√(2n+1)), whence w.h.p. ‖x − ϕ̂ ∗ y‖_{n,2} = O(σ[ρ² + ρ√(log n) + ⋯]/√(2n+1)).
Proof of the oracle inequality: recall that

ϕ̂ ∈ Argmin_{ϕ∈C_n(Z)} { ‖F_n[y − ϕ ∗ y]‖_2² : ‖F_n[ϕ]‖_1 ≤ R/√(2n+1) },

so by optimality, ‖y − ϕ̂ ∗ y‖_2² ≤ ‖y − ϕ^o ∗ y‖_2². Expanding the squares,

‖x − ϕ̂ ∗ y‖_2² = ‖x − ϕ^o ∗ y‖_2² + 2σ²Re⟨ξ, ϕ̂ ∗ ξ⟩ + [...]

Replace the linear convolution ϕ̂ ∗ ξ with the cyclic one ⊛:

⟨ξ, ϕ̂ ⊛ ξ⟩ = ⟨F_n[ξ], F_n[ϕ̂ ⊛ ξ]⟩   [Parseval]
= √(2n+1)·⟨F_n[ξ], F_n[ϕ̂] ⊙ F_n[ξ]⟩   [diagonalization]
≤ √(2n+1)·‖F_n[ξ]‖_∞² ‖F_n[ϕ̂]‖_1   [Young]
≤ CR·log[(2n+1)/δ] w.h.p.

Caveat: ϕ̂ depends on ξ, so we must treat ⟨ξ, ϕ ⊛ ξ⟩ as a random process indexed by ϕ, and control its maximum on the ℓ1-ball.
Continuing the proof, we arrive at

‖x − ϕ̂ ∗ y‖_2² ≤ ‖x − ϕ^o ∗ y‖_2² + 2σRe⟨ξ, x − ϕ^o ∗ y⟩ − 2σRe⟨ξ, x − ϕ̂ ∗ y⟩.

Decompose the last inner product as

⟨ξ, x − ϕ̂ ∗ y⟩ = ⟨Π_S ξ, x − ϕ̂ ∗ y⟩ − σ⟨Π_S^⊥ ξ, ϕ̂ ∗ ξ⟩ + ⟨ξ, Π_S^⊥[x − ϕ̂ ∗ x]⟩,

where Π_S is the projector onto S. Now:
• Re⟨Π_S ξ, x − ϕ̂ ∗ y⟩ ≤ ‖x − ϕ̂ ∗ y‖_2 · ‖Π_S ξ‖_2, and ‖Π_S ξ‖_2² is a χ²-type variable with s degrees of freedom, hence O(s + log(1/δ)) w.h.p.;
• σ⟨Π_S^⊥ ξ, ϕ̂ ∗ ξ⟩ is bounded similarly to ⟨ξ, ϕ̂ ∗ ξ⟩;
• Π_S^⊥[x − ϕ̂ ∗ x] ≡ [Π_S^⊥ x] − ϕ̂ ∗ [Π_S^⊥ x] ≡ 0, since x ∈ S and convolution commutes with Π_S (S is shift-invariant).
We summarize the risk multipliers for σ/√(2n+1) (up to a constant factor):

                         Pointwise loss     ℓ2-loss
Oracle                   ρ                  ρ
(Adaptive) lower bound   ρ²√(log n)         ρ√(log n)*
(Adaptive) upper bound   ρ⁴√(log n)         ρ² + ρ√(log n) + ⋯

In fact, one can also control the pointwise loss for least-squares estimators, so that ρ⁴√(log n) can be replaced with ρ³ + ρ²√(log n) + ρ.

*Obtained via a simple argument from the corresponding pointwise bound.
Assume that x ∈ S ⊂ C(Z), a shift-invariant subspace with dim(S) = s. Equivalent formulations:
• [P(∆)x]_t ≡ 0, t ∈ Z, where ∆ : [∆x]_t = x_{t−1} is the lag operator, and P(z) is a polynomial with deg(P) = s;
• x_t = Σ_{k=1}^{r} q_k(t) e^{λ_k t}, λ_k ∈ C, where deg(q_k) = m_k − 1 and m_k is the multiplicity of the root z_k = e^{λ_k} of P(z).

The unknown shift-invariant structure of x is encoded by S or, equivalently, by P. A small generator for such signals is sketched below.
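A sketch generating such a signal (the polynomials q_k and exponents λ_k are arbitrary illustrations; here dim(S) = Σ_k (deg q_k + 1) = 3):

import numpy as np

def exp_poly_signal(tau, polys, lambdas):
    # x_t = sum_k q_k(t) * exp(lambda_k * t); polys hold numpy polynomial coefficients
    x = np.zeros(len(tau), dtype=complex)
    for q, lam in zip(polys, lambdas):
        x += np.polyval(q, tau) * np.exp(lam * tau)
    return x

tau = np.arange(-50, 51)
x = exp_poly_signal(tau,
                    polys=[[0.1, 1.0], [2.0]],       # q_1(t) = 0.1 t + 1, q_2(t) = 2
                    lambdas=[0.4j, -0.02 + 1.3j])    # pure oscillation; damped oscillation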
Signals from shift-invariant subspaces admit oracle filters with ρ = ρ(s).

Theorem. Let x ∈ S, where S is a shift-invariant subspace with dim(S) = s. Then, for any n ≥ s, there exists a filter φ^o ∈ C_n(Z) which satisfies x_t − [φ^o ∗ x]_t ≡ 0 and ‖φ^o‖_2 ≤ √s/√(2n+1), i.e. ρ(s) = √s.

This setting encompasses, in particular, signals satisfying general differential inequalities*: ‖P(D)f‖_{L_p} ≤ L, deg(P) ≤ s.
*[Juditsky and Nemirovski, 2010]
One-sided filters: φ^o ∈ C_n^+(Z) = {ϕ ∈ C_n(Z) : ϕ_τ = 0 for τ < 0}. Here we consider generalized harmonic oscillations

x_t = Σ_{k=1}^{r} q_k(t) e^{iω_k t},   r ≤ s,   ω_k ∈ [0, 2π).

We improve over the state-of-the-art bound* on the oracle norm, ‖φ^o‖_2 ≤ ⋯/√(n+1):

Theorem. Under the premise of the previous theorem, there exists φ^o ∈ C_n^+(Z) such that x_t − [φ^o ∗ x]_t ≡ 0 and ‖φ^o‖_2 ≤ ⋯/√(n+1).
*[Juditsky and Nemirovski, 2013]
Goal: recover an ordinary harmonic oscillation on the whole range [−n, n]:

x_τ = Σ_{k=1}^{s} C_k e^{iω_k τ},   ω_k ∈ [0, 2π).

Frequencies are called separated if they are pairwise at least 2π/(2n+1) apart on the unit circle. Two-zone recovery uses a two-sided oracle in the center plus one-sided oracles in the border zones of size n/(s log n). Risk bounds, up to constant factors:

                     Arbitrary frequencies     Separated frequencies
AST*                 O(n^{−1/4}) – slow rate   (σ/√n)·(s log n)^{1/2} – optimal
One-sided recovery   (σ/√n)·s² log n           (σ/√n)·[s + (s log n)^{1/2}]
Two-zone recovery    (σ/√n)·s^{3/2} log n      (σ/√n)·[s + (s log n)^{1/2}]

*[Bhaskar et al., 2013; Tang et al., 2013]
Both estimators require solving a problem of the form

min_{ϕ∈Φ(r)} { F(ϕ) + Pen(ϕ) },

where F(ϕ) = ‖F_n[y − y ∗ ϕ]‖_∞ for uniform-fit recovery and F(ϕ) = ‖F_n[y − y ∗ ϕ]‖_2² for least-squares recovery, Pen(ϕ) := μ‖F_n[ϕ]‖_1, and Φ(r) := {ϕ ∈ C_n(Z) : ‖F_n[ϕ]‖_1 ≤ r}.
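For moderate n these are conic programs that can be prototyped with a generic modeling tool; a minimal cvxpy sketch of the constrained least-squares variant (Pen ≡ 0, constraint radius r; the Toeplitz matrix A materializes ϕ ↦ ϕ ∗ y, and F is the unitary DFT; this is an illustrative sketch, not the implementation used in the experiments):

import numpy as np
import cvxpy as cp
from scipy.linalg import toeplitz

def cls_filter(y_obs, n, r):
    # y_obs holds y_tau for tau = -2n, ..., 2n (length 4n+1)
    m = 2 * n + 1
    F = np.fft.fft(np.eye(m)) / np.sqrt(m)             # unitary DFT on C_n(Z)
    A = toeplitz(y_obs[2 * n:], y_obs[2 * n::-1])      # A @ phi = [phi * y]_t for |t| <= n
    y_ctr = y_obs[n:3 * n + 1]                         # y_t for |t| <= n
    phi = cp.Variable(m, complex=True)
    problem = cp.Problem(
        cp.Minimize(cp.norm(F @ (y_ctr - A @ phi), 2)),  # ||F_n[y - phi * y]||_2
        [cp.norm(F @ phi, 1) <= r])                      # phi in Phi(r)
    problem.solve()
    return phi.value

# uniform fit: replace the objective norm with cp.norm(..., 'inf');
# penalized forms: add mu * cp.norm(F @ phi, 1) to the objective

Generic interior-point or splitting solvers do not exploit the convolution structure and scale poorly with n, which is what motivates the dedicated first-order methods below.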
First-order proximal methods
Least-squares recovery:

min_{ϕ∈Φ(r)} { ‖F_n[y − y ∗ ϕ]‖_2² + Pen(ϕ) }.

Uniform-fit recovery:

min_{ϕ∈Φ(r)} { ‖F_n[y − y ∗ ϕ]‖_∞ + Pen(ϕ) } = min_{ϕ∈Φ(r)} max_{ψ∈Φ(1)} { Re⟨ψ, y − y ∗ ϕ⟩ + Pen(ϕ) },

a bilinear saddle-point problem.
[Nesterov and Nemirovski, 2013; Juditsky and Nemirovski, 2011a,b; Nemirovski et al., 2010]
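In both algorithms the structure-exploiting ingredients are the same: matrix-vector products reduce to FFTs, and the proximal step is a projection onto the (complex) ℓ1-ball Φ(r) in the Fourier domain. A sketch of that projection together with a plain FISTA-type fast gradient loop for the least-squares objective; M and b would come, e.g., from the Toeplitz construction above (M = F @ A @ F.conj().T, b = F @ y_ctr), and everything here is a simplified stand-in for the actual CMP/FGM implementations:

import numpy as np

def project_l1_ball(u, r):
    # Euclidean projection of a complex vector onto {v : ||v||_1 <= r}:
    # project the magnitudes onto the real l1-ball (Duchi et al.), keep the phases
    a = np.abs(u)
    if a.sum() <= r:
        return u
    s = np.sort(a)[::-1]
    cumsum = np.cumsum(s)
    k = np.nonzero(s * np.arange(1, len(a) + 1) > cumsum - r)[0][-1]
    theta = (cumsum[k] - r) / (k + 1.0)
    return np.maximum(a - theta, 0) * np.exp(1j * np.angle(u))

def fgm(M, b, r, n_iter):
    # accelerated projected gradient on g(u) = ||b - M u||_2^2 over ||u||_1 <= r,
    # where u = F_n[phi]; step 1/L with L the gradient Lipschitz constant
    L = 2 * np.linalg.norm(M, 2) ** 2
    u = w = np.zeros(M.shape[1], dtype=complex)
    t = 1.0
    for _ in range(n_iter):
        grad = 2 * M.conj().T @ (M @ w - b)
        u_next = project_l1_ball(w - grad / L, r)
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        w = u_next + ((t - 1) / t_next) * (u_next - u)
        u, t = u_next, t_next
    return u

The composite Mirror Prox for the uniform-fit saddle point relies on the same FFT-based oracle, with extragradient updates in the pair (ϕ, ψ).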
[Figure: convergence of the residual (95% upper confidence bound) for a harmonic oscillation. Left: constrained uniform-fit via composite Mirror Prox (CMP-ℓ2, CMP-ℓ2-Gap); right: constrained least-squares via the Fast Gradient Method (FGM-ℓ2, FGM-ℓ2-Gap). Dashed: online accuracy bounds via the accuracy certificate technique.]
Theorem. Approximate solutions ϕ̃ with objective accuracy ε∗ = σρ² for the uniform fit (and the corresponding threshold for the least squares) enjoy the same statistical guarantees as the exact solutions (up to a constant factor).

Corollary. To reach the threshold accuracy ε∗, in each case it is sufficient to perform T∗ = O(‖F_n[y]‖_∞/σ) iterations of the suitable first-order algorithm (CMP or FGM).
[Figure: ℓ2-error (top row) and CPU time in seconds (bottom row), across SNR levels 0.06–4, for the σρ²-accurate solution (Coarse), the 0.01σρ²-accurate solution (Fine), and the oversampled Lasso estimator [Bhaskar et al., 2013]. Two signal-generation scenarios are compared: 4 random frequencies on [0, 2π] (left) and 2 random pairs of 0.2π/n-close frequencies (right).]
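A sketch of the two signal-generation scenarios (the unit-variance random amplitudes are an illustrative assumption):

import numpy as np

def random_oscillation(n, s, close_pairs=False):
    # scenario 1: s frequencies drawn uniformly at random on [0, 2pi)
    # scenario 2: s/2 random pairs of 0.2*pi/n-close frequencies
    rng = np.random.default_rng()
    if close_pairs:
        base = 2 * np.pi * rng.random(s // 2)
        omega = np.concatenate([base, base + 0.2 * np.pi / n])
    else:
        omega = 2 * np.pi * rng.random(s)
    c = (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
    tau = np.arange(-n, n + 1)
    return np.exp(1j * np.outer(tau, omega)) @ c     # x_tau = sum_k c_k e^{i omega_k tau}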
[Figure: the number of iterations T∗ needed to reach the threshold accuracy, as a function of the SNR, for (CUF) solved by CMP (left) and (CLS) solved by FGM (right); signal with 4 random frequencies.]
Constrained least-squares can be recast as (non-squared) ℓ2-minimization:

min_{ϕ∈Φ(r)} Res₂(ϕ) := ‖F_n[y − y ∗ ϕ]‖_2.

The objective is non-smooth, but its square is smooth and is minimized at the rate O(1/k²) by FGM:

Res₂²(ϕ̃_k) − Res₂²(ϕ∗) ≤ Q/k²,

where ϕ∗ is any minimizer of Res₂²(·) on Φ(r), and Q is a constant. Hence

Res₂(ϕ̃_k) − Res₂(ϕ∗) = (Res₂²(ϕ̃_k) − Res₂²(ϕ∗)) / (Res₂(ϕ̃_k) + Res₂(ϕ∗)) ≤ Q/(2·Res₂(ϕ∗)·k²).

(Note that this requires the “non-ideal” fit: Res₂(ϕ∗) > 0.)
On the other hand,

Res₂(ϕ̃_k) ≤ √(Res₂²(ϕ∗) + Q/k²) ≤ Res₂(ϕ∗) + √Q/k.

Combining the two bounds,

Res₂(ϕ̃_k) − Res₂(ϕ∗) ≤ min{ √Q/k, Q/(2·Res₂(ϕ∗)·k²) },

i.e. there is an “elbow” at k ≈ √Q/(2·Res₂(ϕ∗)). Confirmed empirically:
[Figure: convergence of Res₂ for the same problem solved with Mirror Prox (CMP-ℓ2) and FGM (FGM-ℓ2); 2 pairs of close frequencies, SNR = 4.]
Summary:
• adaptive recovery, by convex optimization, of signals with unknown shift-invariant structure;
• oracle inequalities for uniform-fit and least-squares estimators, and comparison with lower bounds;
• recovery of signals from a shift-invariant subspace without frequency separation assumptions.
Extension: deconvolution, y_τ = [a ∗ x]_τ + σξ_τ, where a ∈ C_m(Z) is a known filter. Applications: inverse PDEs¹, fluorescence microscopy², exoplanet detection³, ...

Extension: signals on graphs⁴. Applications: social network analysis, sensor networks, ... Challenge: no FFT, so it is difficult to work in the Fourier domain.
¹[Cavalier et al., 2002], ²[Waters, 2009; Bissantz et al., 2015], ³[Fischer et al., 2015; Kim et al., 2017], ⁴[Sandryhaila and Moura, 2013]
Publications and preprints:
Adaptive Signal Recovery by Convex Optimization. COLT 2015.
Structure-Blind Signal Recovery. NIPS 2016. Extended version: arXiv:1607.05712.
Efficient First-Order Algorithms for Adaptive Signal Denoising. Submitted to ICML 2018.
Adaptive Signal Recovery: an Overview. In preparation.
Adaptive Signal Deconvolution by Convex Optimization. In preparation.
References

Adams, R. A. and Fournier, J. J. (2003). Sobolev Spaces, volume 140. Academic Press.
Bhaskar, B., Tang, G., and Recht, B. (2013). Atomic norm denoising with applications to line spectral estimation. IEEE Trans. Signal Processing, 61(23):5987–5999.
Bickel, P., Ritov, Y., and Tsybakov, A. (2009). Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist., 37(4):1705–1732.
Bissantz, K., Bissantz, N., and Proksch, K. (2015). Monitoring of significant changes over time in fluorescence microscopy imaging of living cells. Universitätsbibliothek Dortmund.
Brown, L. D., Low, M. G., et al. (1996). Asymptotic equivalence of nonparametric regression and white noise. The Annals of Statistics, 24(6):2384–2398.
Bühlmann, P. and Van De Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Science & Business Media.
Cavalier, L., Golubev, G., Picard, D., Tsybakov, A., et al. (2002). Oracle inequalities for inverse problems. The Annals of Statistics, 30(3):843–874.
Donoho, D. L., Liu, R. C., and MacGibbon, B. (1990). Minimax risk over hyperrectangles, and implications. The Annals of Statistics, pages 1416–1437.
Fischer, D. A., Howard, A. W., Laughlin, G. P., Macintosh, B., Mahadevan, S., Sahlmann, J., and Yee, J. C. (2015). Exoplanet detection techniques. arXiv preprint arXiv:1505.06869.
Goldenshluger, A., Lepski, O., et al. (2011). Bandwidth selection in kernel density estimation: oracle inequalities and adaptive minimax optimality. The Annals of Statistics, 39(3):1608–1632.
Goldenshluger, A. and Nemirovski, A. (1997). Adaptive de-noising of signals satisfying differential inequalities. IEEE Transactions on Information Theory, 43(3):872–889.
Ibragimov, I. and Khasminskii, R. (1984). Nonparametric estimation of the value of a linear functional in Gaussian white noise. Theory of Probability & Its Applications, 29:1–32.
Johnstone, I. (2011). Gaussian Estimation: Sequence and Multiresolution Models.
Juditsky, A. and Nemirovski, A. (2009). Nonparametric denoising of signals with unknown local structure, I: Oracle inequalities. Appl. & Comput. Harmon. Anal., 27(2):157–179.
Juditsky, A. and Nemirovski, A. (2010). Nonparametric denoising signals of unknown local structure, II: Nonparametric function recovery. Appl. & Comput. Harmon. Anal.
Juditsky, A. and Nemirovski, A. (2011a). First-order methods for nonsmooth convex large-scale optimization, I: general purpose methods. Optimization for Machine Learning, pages 121–148.
Juditsky, A. and Nemirovski, A. (2011b). First-order methods for nonsmooth convex large-scale optimization, II: utilizing problem structure. Optimization for Machine Learning, pages 149–183.
Juditsky, A. and Nemirovski, A. (2013). On detecting harmonic oscillations. Bernoulli, 23(2):1134–1165.
Juditsky, A. and Nemirovski, A. (2017). Near-optimality of linear recovery from indirect observations. arXiv preprint arXiv:1704.00835.
Kim, T. H., Lee, K. M., Schölkopf, B., and Hirsch, M. (2017). Online video deblurring via dynamic temporal blending network. In IEEE International Conference on Computer Vision (ICCV 2017).
Laurent, B. and Massart, P. (2000). Adaptive estimation of a quadratic functional by model selection. Ann. Statist., 28(5):1302–1338.
Lepski, O. (1991). On a problem of adaptive estimation in Gaussian white noise. Theory of Probability & Its Applications, 35(3):454–466.
Lepski, O. et al. (2015). Adaptive estimation over anisotropic functional classes via oracle approach. The Annals of Statistics, 43(3):1178–1242.
Lepski, O., Mammen, E., and Spokoiny, V. (1997). Optimal spatial adaptation to inhomogeneous smoothness: an approach based on kernel estimates with variable bandwidth selectors. The Annals of Statistics, pages 929–947.
Nadaraya, E. A. (1964). On estimating regression. Theory of Probability & Its Applications, 9(1):141–142.
Nemirovski, A. (1991). On non-parametric estimation of functions satisfying differential inequalities.
Nemirovski, A., Onn, S., and Rothblum, U. (2010). Accuracy certificates for computational problems with convex structure. Mathematics of Operations Research, 35(1):52–78.
Nesterov, Y. and Nemirovski, A. (2013). On first-order algorithms for ℓ1/nuclear norm minimization. Acta Numerica, 22:509–575.
Sandryhaila, A. and Moura, J. M. (2013). Discrete signal processing on graphs. IEEE Transactions on Signal Processing, 61(7):1644–1656.
Tang, G., Bhaskar, B., and Recht, B. (2013). Near minimax line spectral estimation. In Information Sciences and Systems (CISS), 2013 47th Annual Conference on, pages 1–6. IEEE.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. R. Statist. Soc. B, 58(1):267–288.
Tsybakov, A. (2008). Introduction to Nonparametric Estimation. Springer.
Waters, J. C. (2009). Accuracy and precision in quantitative fluorescence microscopy. The Journal of Cell Biology, 185(7):1135–1148.
Watson, G. S. (1964). Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A, 26(4):359–372.