\newcommand{\Om}{\boldsymbol{\Omega}}
\newcommand{\G}{\boldsymbol{\Gamma}}
% \newcommand{\inv}[1]{{}^{#1}}
$$
</div>

Generate samples `\(\pb{ \q_1, \q_2, \dots, \q_n } \sim \pi\)` from a target distribution

.content-box-blue[
$$
\pi(\q) \propto e^{- U(\q) }
$$
]

--

### .orange[Methods]

1. .large[Rejection sampling ❌ ]<br>
2. .large[Random-walk Metropolis ]✅<br>

--

3. .large[Metropolis-adjusted Langevin algorithm ]😮<br>
4. .large[Hamiltonian Monte Carlo ]😲<br>

--

5. .large[Stein variational gradient descent ]🤯<br>
6. .large[Wasserstein gradient flows ]🤯<br>
7. .large[Normalizing flows ]🤯<br>
8. ...

---
class: left
count: false

# .left[Objective]

<br><br><br>

.Large.content-box-orange[<body>
Can we design a .bolder.dblue.text-shadow[sampler] which:<br><br>
1. Can efficiently sample from .bolder.dblue.text-shadow[multimodal distributions]?<br><br>
2. .bolder.dblue.text-shadow[Preserves all the nice properties] of existing mainstays?
</body>]

---
class: inverse, center, middle
count: false

# Background
## (⚡️ Speed)

<!-- --- class: left ### .dblue[Robust TDA] .large[ 1. Robust Persistent Diagrams using Reproducing Kernels] .large[ 2. Efficient and Robust Topological Inference] ### .dblue[Statistical Inference using TDA] .large[ 3. Statistical Invariance of Betti Numbers] ### .dblue[Differential Privacy under TDA Lens] .large[ 4. The Shape of Edge Differential Privacy] ### .dblue[Geometry in Multimodal Sampling and Non-convex Optimization] .large[ 5.
Repelling-Attracting Hamiltonian Monte Carlo] -->

<style type="text/css">
.panelset {
  --panel-tab-foreground: currentColor;
  --panel-tab-background: unset;
  --panel-tab-active-foreground: #0148A4;
  --panel-tab-active-background: unset;
  --panel-tab-active-border-color: currentColor;
  --panel-tab-hover-foreground: currentColor;
  --panel-tab-hover-background: unset;
  --panel-tab-hover-border-color: currentColor;
  --panel-tab-inactive-opacity: 0.3;
  --panel-tabs-border-bottom: #ddd;
}
</style>

---
class: left

# The Problem with Diffusions

.pull-left[[Random Walk Metropolis]
<body><br>
1) Given the current state \(\q_n\), propose a new state
$$
\q_{n+1} \sim \kappa(\q|\q_n)
$$
2) Accept / Reject with probability
$$
\alpha(\q_{n+1} | \q_n) = 1 \wedge \f{\kappa(\q_{n}|\q_{n+1})\pi(\q_{n+1})}{\kappa(\q_{n+1}|\q_{n})\pi(\q_{n})}
$$
</body>]

--

.pull-right[
<img src=images/gifs/rw2.gif height="400"/>
]

--

[ Wastes too much time in high dimensions. ]

---
class: left

# More intelligent diffusions

.pull-left[[Langevin Monte Carlo]
<body><br>
1) Given the current state \(\q_n\), propose a new state
$$
\q_{n+1} \sim \q_n + {h}\D\log\pi(\q_n) + \sqrt{2h}\, \N(\zerov, \Sigma)
$$
2) Accept / Reject with probability
$$
\alpha(\q_{n+1} | \q_n) = 1 \wedge \f{\kappa(\q_{n}|\q_{n+1})\pi(\q_{n+1})}{\kappa(\q_{n+1}|\q_{n})\pi(\q_{n})}
$$
</body>]

--

.pull-right[
<img src=images/gifs/l2.gif width="400"/>
]

--

[ Only works for small step sizes! ]

---
class: inverse, center, middle
count: false

# Hamiltonian Monte Carlo

---
class: center, middle

.center[
<img src= width=70%>
]

---
class: left

# Why walk when you can flow?

1. Augment state space with auxiliary variables `\(\p \sim \N(\boldsymbol{0}, \S)\)`

2. Joint distribution `\((\q,\p) \sim \exp(-H(\q,\p))\)` where `\(H(\q,\p)\)` is given by
`\begin{align}
H(\q,\p) = U(\q) + \f{1}{2} \p^{\top}\S^{-1}\p
\end{align}`

3.
Treat `\(H(\q,\p)\)` as the Hamiltonian of a system and generate trajectories using Hamiltonian dynamics

<body>$$\dd{t}\begin{bmatrix}\q_t\\ \p_t\end{bmatrix} = \begin{bmatrix}\Ov & \I\\ -\I & \Ov\end{bmatrix}\begin{bmatrix}\D_{\!\q}H(\q_t, \p_t)\\ \D_{\!\p}H(\q_t, \p_t)\end{bmatrix} = \begin{bmatrix}\S\inv\p_t\\ - \D U(\q_t)\end{bmatrix}.$$
</body>

4. Accept/reject state `\((\q_t, \p_t)\)` with probability

.center.content-box-blue[<body>$$\alpha(\q_t, \p_t | \q_0, \p_0) = \min\pb{1, \f{\kappa(\q_0, \p_0 | \q_t, \p_t) e^{-H(\q_t,\p_t)}}{\kappa(\q_t, \p_t | \q_0, \p_0) e^{-H(\q_0,\p_0)}} \Bigg| \ddd{(\q_0, \p_0)}{(\q_t, \p_t)} \Bigg| }.$$
</body>]

---
class: center
count: false

# .left[Why walk when you can flow?]

<img src=images/gifs/h1.gif height="500"/>

---
class: center
count: false

# .left[Why walk when you can flow?]

<img src=images/gifs/h2.gif height="500"/>

---
class: center
count: false

# .left[Why walk when you can flow?]

<img src=images/gifs/h3.gif height="500"/>

---
class: center
count: false

# .left[Why walk when you can flow?]

<img src=images/gifs/h4.gif height="500"/>

---

# The Four Pillars of Hamiltonian Monte Carlo

.panelset[
.panel[.panel-name[Energy Conservation]
.pull-left[
<br><br>
.theorem[<body>
For every trajectory \(\pb{(\q_t, \p_t) : t > 0}\) satisfying the Hamiltonian dynamics,
$$
\dd{t}H(\q_t, \p_t) = 0.
$$
</body>]
]
.pull-right[
<img src=images/gifs/wo-friction2.gif height="270" />
]

<br>[<body>
\(H(\q_t, \p_t) = H(\q_0, \p_0)\)
</body>]
]

.panel[.panel-name[Reversibility]
.pull-left[
<br>
.theorem[<body>Let:<br><br>
1. \(\flip: (\q, \p) \mapsto (\q, -\p)\) denote a .bolder.text-shadow[momentum flip]<br>
2. \(\F_t: (\q_0, \p_0) \mapsto (\q_t, \p_t)\) be the .bolder.text-shadow[flow operator]<br><br>
Then
$$
\flip \circ \F_t \circ \flip \circ \F_t = \id.
$$
</body>]
]
.pull-right[
<img src=images/gifs/reversible.gif height="270"/>
]

<br>[<body>
The detailed balance condition is satisfied, i.e., \(\kappa(\q_t, \p_t | \q_0, \p_0) = \mathbf{1}.\)
</body>]
]

.panel[.panel-name[Volume Preservation]
.pull-left[
<br>
.theorem[<body>
For every trajectory \(\pb{(\q_t, \p_t) : t > 0}\)$$\Bigg| \ddd{(\q_0, \p_0)}{(\q_t, \p_t)} \Bigg| = \Bigg| \begin{bmatrix} \ddd{\q_0}{\q_t} & \ddd{\p_0}{\q_t} \\ \ddd{\q_0}{\p_t} & \ddd{\p_0}{\p_t} \end{bmatrix} \Bigg| = 1.$$
</body>]
]
.pull-right[  ]

<br>[<body>$$\alpha(\q_t, \p_t | \q_0, \p_0) = \min\pb{1, \f{\kappa(\q_0, \p_0 | \q_t, \p_t) e^{-H(\q_t,\p_t)}}{\kappa(\q_t, \p_t | \q_0, \p_0) e^{-H(\q_0,\p_0)}} \Bigg| \ddd{(\q_0, \p_0)}{(\q_t, \p_t)} \Bigg| } = 1.$$
</body>]
]

.panel[.panel-name[Symplecticity]
.pull-left[
<br>
.theorem[<body>Let \(\omega(\cdot, \cdot)\) be the .bolder.text-shadow[symplectic 2-form]
$$
\omega(\q,\p) = \sum_{i=1}^d dq_i \wedge dp_i.
$$
Then \(\omega(\q_t, \p_t) = \omega(\q_0, \p_0)\).
</body>]
]
.pull-right[  ][<body>
Enables efficient numerical approximations.
</body>]
]
]

---
class: left

# Symplectic Integration

* The solution `\(\F_t: (\q_0, \p_0) \mapsto (\q_t, \p_t)\)` to the system of .bolder.text-shadow.dblue[differential equations]
$$
\ddd{t}{\q_t} = \S\inv\p_t, \quad\quad \ddd{t}{\p_t} = - \D U(\q_t)
$$
is rarely available in closed form. It can be solved numerically using the [leapfrog scheme].<br>

--

* Let `\(\e \approx dt\)` be a small [step-size], then `\(\F_{dt} \approx \F_\e: (\q_t, \p_t) \mapsto (\q_{t+\e}, \p_{t+\e})\)` is given by

[<body>$$\begin{cases}
\p_{t+\f \e 2} = \p_t - \f \e 2 \D U(\q_t) \\[5pt]
\q_{t+\e} = \q_t + \e \S\inv \p_{t+\f \e 2}\\[5pt]
\p_{t+\e} = \p_{t+\f \e 2} - \f \e 2 \D U(\q_{t+\e})
\end{cases}
$$
</body>]

--

<br>[<body>For time \(T\) and \(L = \lfloor{T/\e}\rfloor\), we have \(\F_T \approx \F_{\e, L} = (\F_\e)^{\otimes L}\).
</body>]

---
class: center

# .left[Why symplectic integration?]
<img src=images/intro/symp2.png width="70%">

--

.content-box-orange[<body>
`\(|H\left( \F_{T}(\q,\p) \right) - H\left( \F_{\e, L}(\q,\p) \right)| \approx O(\e^3)\)`. Therefore, `\(\alpha(\q_T, \p_T | \q_0, \p_0) \approx e^{O(\e^3)}.\)`</body>]

---
class: left

# .left[Is HMC the panacea?]

.theorem-blue[<body>For \(b \in \R\) and \(\muv = b \mathbf{1}_d\) consider the multimodal target
$$
\pi(\q) = \half \mathcal{N}(\q | -\muv, \I_d) + \half \mathcal{N}(\q | +\muv, \I_d),
$$
and let \(\mathcal{E}(b, d)\) denote the event that, for initial state \(\q_0=-\muv\),
$$
\mathcal{E}(b, d) = \pb{ \norm{\P_T(\q_0) - \muv} \le \norm{\P_T(\q_0) + \muv}}.
$$
Then for all \(b > \sqrt{2/d}\)
$$
\pr\pa{\mathcal{E}(b, d)} \le \exp\bigg(-\f{1}{8} \pa{\f{b^2d - 2}{b}}^2\bigg).
$$
</body>]

.center[<body>Markov chain mixes well only when `\(U(\q)\)` is strongly convex. Terrible for multimodal distributions.</body>]

---
class: center
count: false

# .left[Is HMC the panacea?]

<img src=images/gifs/hm1.gif height="400" width="400"/>

Markov chain mixes well only when `\(U(\q)\)` is strongly convex. Terrible for multimodal distributions.

---
class: center
count: false

# .left[Is HMC the panacea?]

<img src=images/gifs/hm2.gif height="400" width="400"/>

Markov chain mixes well only when `\(U(\q)\)` is strongly convex. Terrible for multimodal distributions.
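---
class: left

# Leapfrog: a code sketch

The leapfrog scheme from the Symplectic Integration slide can be sketched in a few lines of Julia. This is a minimal illustration, not the talk's implementation: it assumes an identity mass matrix (`\(\S = \I\)`) and a standard Gaussian target, and the names `U`, `gradU`, `H`, `leapfrog_step` and `leapfrog` are hypothetical.

```julia
using LinearAlgebra

# Illustrative target (assumption): standard Gaussian, U(q) = |q|^2 / 2.
U(q) = 0.5 * dot(q, q)
gradU(q) = q

# Hamiltonian with identity mass matrix (assumption: Σ = I).
H(q, p) = U(q) + 0.5 * dot(p, p)

# One leapfrog step: half-step in p, full step in q, half-step in p.
function leapfrog_step(q, p, ϵ)
    p_half = p - (ϵ / 2) * gradU(q)
    q_new = q + ϵ * p_half
    p_new = p_half - (ϵ / 2) * gradU(q_new)
    return q_new, p_new
end

# Compose L leapfrog steps to approximate the flow over time T ≈ L * ϵ.
function leapfrog(q, p, ϵ, L)
    for _ in 1:L
        q, p = leapfrog_step(q, p, ϵ)
    end
    return q, p
end

q0, p0 = [1.0, -0.5], [0.3, 0.8]
ϵ, L = 0.01, 100                      # T ≈ 1
qT, pT = leapfrog(q0, p0, ϵ, L)

# Energy drift along the numerical trajectory (small, but not exactly zero).
energy_error = abs(H(qT, pT) - H(q0, p0))

# Reversibility: flip the momentum, integrate again, flip back.
qb, pb = leapfrog(qT, -pT, ϵ, L)
roundtrip_error = norm(qb - q0) + norm(-pb - p0)
```

Here `energy_error` stays small because the integrator is symplectic, and `roundtrip_error` is at the level of floating-point round-off because each leapfrog step is time-reversible.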
---
class: inverse, center, middle
count: false

# HaRAM
## Hamiltonian Repelling-Attracting Metropolis

---
class: left

# Observation #1: .orange.text-shadow[Friction dissipates energy]

Consider the trajectory of a particle `\((\q_t, \p_t)\)` on a rough surface with friction `\(\gamma > 0\)`
$$
\ddd{t}{\q_t} = \S\inv\p_t, \quad\quad \ddd{t}{\p_t} = - \D U(\q_t) - \gamma\p_t
$$
This can be rewritten as

<body>$$\dd{t}\begin{bmatrix}\q_t\\ \p_t\end{bmatrix} = \underbrace{\begin{bmatrix}\Ov & \I\\ -\I & \Ov\end{bmatrix}}_{=\Om}\underbrace{\begin{bmatrix}\D_{\!\q}H(\q_t, \p_t)\\ \D_{\!\p}H(\q_t, \p_t)\end{bmatrix}}_{=\D H(\q_t, \p_t)} - \underbrace{\begin{bmatrix}\Ov & \Ov\\ \Ov & \gamma\I\end{bmatrix}}_{=\G}\begin{bmatrix}\q_t\\ \p_t\end{bmatrix}.$$
</body>

--

Therefore,

<body>$$\dd{t}(\q_t, \p_t) = \Om \D H(\q_t, \p_t) - \G (\q_t, \p_t).\tag{A}$$
</body>

--

<br>[<body>Eq. (A) is called a [conformal Hamiltonian system].
</body>]

---
class: left
count: false

# Observation #1: .orange.text-shadow[Friction dissipates energy]

.center[
<img src=images/gifs/wo-friction.gif height="400"/>
]

---
class: left
count: false

# Observation #1: .orange.text-shadow[Friction dissipates energy]

.center[
<img src=images/gifs/w-friction.gif height="400"/>
]

---
class: left

# Observation #2: .orange.text-shadow[Negative friction accumulates energy]

If we just flip the sign of the friction parameter `\(\gamma\)` we get

--

<br><br>
<body>$$\dd{t}\begin{bmatrix}\q_t\\ \p_t\end{bmatrix} = \underbrace{\begin{bmatrix}\Ov & \I\\ -\I & \Ov\end{bmatrix}}_{=\Om}\underbrace{\begin{bmatrix}\D_{\!\q}H(\q_t, \p_t)\\ \D_{\!\p}H(\q_t, \p_t)\end{bmatrix}}_{=\D H(\q_t, \p_t)} + \underbrace{\begin{bmatrix}\Ov & \Ov\\ \Ov & \gamma\I\end{bmatrix}}_{=\G}\begin{bmatrix}\q_t\\ \p_t\end{bmatrix}.$$
</body>

i.e.,

<body>$$\dd{t}(\q_t, \p_t) = \Om \D H(\q_t, \p_t) + \G (\q_t, \p_t).\tag{B}$$
</body>

--

<br>[<body>
Eq. (B) is also a [conformal Hamiltonian system.]
</body>]

---
class: left
count: false

# Observation #2: .orange.text-shadow[Negative friction accumulates energy]

.center[
<img src=images/gifs/wo-friction2.gif height="400"/>
]

---
class: left
count: false

# Observation #2: .orange.text-shadow[Negative friction accumulates energy]

.center[
<img src=images/gifs/neg-friction.gif height="400"/>
][<body>But as `\(t \uparrow\)`, the particle magically gains energy (हराम)
</body>]

---
class: center, middle
count: false

# HaRAM
## / हराम /

.Large[Proscribed or forbidden by law.]

---

# Hamiltonian Repelling-Attracting Metropolis

1. Choose a hypothetical friction parameter `\(\gamma \in [0, \infty)\)`, and integration time `\(T\)`

2. Let `\(\Om = \begin{bmatrix}\boldsymbol{0} & I\\ -I & \boldsymbol{0}\end{bmatrix}\)` and `\(\G = \begin{bmatrix} \boldsymbol{0} & \boldsymbol{0} \\ \boldsymbol{0} & \gamma I \end{bmatrix}\)`

3. For time `\(t \in [0, T/2]\)` generate .dblue[conformal Hamiltonian dynamics] using .red[negative friction]
`\begin{align}
\f{d}{d t} (\q_t, \p_t) = \Om \D H(\q_t, \p_t) + \G (\q_t, \p_t)
\end{align}`

--

4. For time `\(t \in [T/2, T]\)` generate .dblue[conformal Hamiltonian dynamics] using .green[positive friction]
`\begin{align}
\f{d}{d t} (\q_t, \p_t) = \Om \D H(\q_t, \p_t) - \G (\q_t, \p_t)
\end{align}`

--

5.
Accept/reject state `\((\q_T, \p_T)\)` with MH probability

<body>$$\alpha(\q_T, \p_T | \q_0, \p_0) = \min\pb{1, \f{\kappa(\q_0, \p_0 | \q_T, \p_T) e^{-H(\q_T,\p_T)}}{\kappa(\q_T, \p_T | \q_0, \p_0) e^{-H(\q_0,\p_0)}} \Bigg| \ddd{(\q_0, \p_0)}{(\q_T, \p_T)} \Bigg| }$$
</body>

---
count: false

# Hamiltonian Repelling-Attracting Metropolis

.center[
<img src=images/gifs/downup.gif height="400"/>
]

---
count: false

# Hamiltonian Repelling-Attracting Metropolis

.center[
<img src=images/gifs/downup2.gif height="400"/>
]

---
count: false

# Hamiltonian Repelling-Attracting Metropolis

.center[
<img src=images/gifs/downup3.gif height="400"/>
]

---
class: left

# Properties of HaRAM

.panelset[
.panel[.panel-name[Energy]
.left.content-box-purple[<body>
For the HaRAM trajectory \(\pb{(\q_t, \p_t) : t > 0}\)$$\dd{t}H(\q_t, \p_t) \neq 0,$$but$$\int\limits_0^T\dd{t}H(\q_t, \p_t) dt = 0.$$
</body>]

<br>[<body>
This implies that \(H(\q_t, \p_t) = H(\q_0, \p_0)\) only when \(t=T\).</body>]
]

.panel[.panel-name[Reversibility]
.pull-left[
.left.content-box-purple[<body>
Let .red[<body>\(\F_t\)</body>] and .green[<body>\(\P_t\)</body>] be the conformal .bolder.text-shadow[Hamiltonian flow operators] w.r.t. the .red[repelling] and .green[attracting] dynamics, and let \(\flip\) be the .bold[momentum-flip] operator. Then for all \(t \le T/2\)$$\flip \circ \P_t \circ \flip = \F_{-t}$$
In particular,$$\flip \circ \pa{\P_T \circ \F_T} \circ \flip \circ \pa{\P_T \circ \F_T} = \id.$$
</body>]

<br>[<body>
This implies that \(\kappa(\q_T, \p_T | \q_0, \p_0) = \onev\).</body>]
]
.pull-right[
<img src="images/gifs/reversible.gif" height="270"/>
]
]

.panel[.panel-name[Volume]
.left.content-box-purple[<body>
.red[<body>\(\F_t\) expands volume</body>] and .green[<body>\(\P_t\) shrinks volume</body>].
But, over the entire time interval \([0,T]\):$$\Bigg|\ddd{(\q_0, \p_0)}{(\q_T, \p_T)}\Bigg| = 1.$$
</body>]

<br>[<body>
This implies that the Jacobian correction term \(\Bigg|\ddd{(\q_0, \p_0)}{(\q_T, \p_T)}\Bigg|\) is not needed.</body>]
]

.panel[.panel-name[Symplecticity]
.left.content-box-purple[<body>The deformation of the .bolder.text-shadow[symplectic 2-form] is given by
$$
\omega(\q_t, \p_t) = \begin{cases}
e^{\gamma t}\omega(\q_0, \p_0), & 0 \le t \le T/2\\ \\
e^{\gamma (T-t)}\omega(\q_0, \p_0), & T/2 \le t \le T
\end{cases}.
$$
In particular, HaRAM preserves the .bolder.text-shadow[symplectic 2-form], i.e.,$$\omega(\q_T, \p_T) = \omega(\q_0, \p_0).$$</body>]
]

.panel[.panel-name[Numerical Integration]
For `\(\e \approx dt\)` and the .red[repelling] conformal Hamiltonian flow `\(\F_\e \approx \F_{dt}\)` the .red[conformal symplectic leapfrog algorithm] is
$$
`\begin{cases}
\tp_{t+\f \e 2} = e^{\gamma\e/2}\p_t \\[5pt]
\p_{t+\f \e 2} = \tp_{t+\f \e 2} - \f \e 2 \D U(\q_t) \\[5pt]
\q_{t+\e} = \q_t + \e\S\inv \p_{t+\f \e 2}\\[5pt]
\tp_{t+\e} = \p_{t+\f \e 2} - \f \e 2 \D U(\q_{t+\e})\\[5pt]
\p_{t+\e} = e^{\gamma\e/2}\tp_{t+\e}
\end{cases}`
$$
and for `\(L = \lfloor T/\e \rfloor\)`, `\(\F_T \approx \F_{\e, L} = (\F_\e)^{\otimes L}\)`.
Similarly, `\(\P_T \approx \P_{\e, L} = (\P_\e)^{\otimes L}\)` by changing the sign of `\(\gamma\)`.

[<body>
`\(|H\left( \P_{T} \circ \F_{T}(\q,\p) \right) - H\left( \P_{\e, L} \circ \F_{\e, L}(\q,\p) \right)| \approx O(\e^3)\)`</body>]
]
]

---
class: inverse, center, middle
count: false

# Experiments

---
class: left

# Bimodal Gaussian

.panelset[
.panel[.panel-name[Setup]
.pull-left[
For `\(\muv = 5 \onev_d \in \R^d\)` the target is
$$
\pi(\q) \sim \half \mathcal{N}(\muv, \Sigma_1) + \half \mathcal{N}(-\muv, \Sigma_2)
$$
where

* `\(\Sigma_1(i, j) = 0.5^{|i-j|}\)` <br>
* `\(\Sigma_2 = R \Sigma_1 R^\top\)` for `\(R \in SO(d)\)` <br>

.content-box-blue[<body>
`\(d \in \pb{2, 10, 50}\)` for HaRAM, HMC, RAM & PEHMC</body>]
][
<img src="images/plots/gaussian3/d/plt.svg">
]

.panel[.panel-name[Results]

| Method | Time/L (d=2) | Time/L (d=10) | Time/L (d=50) | W2 metric (d=2) | W2 metric (d=10) | W2 metric (d=50) |
|-------:|:------------:|:-------------:|:-------------:|:---------------:|:----------------:|:----------------:|
| HaRAM  | 0.2697 s     | 0.7125 s      | 2.9816 s      | 0.51            | 7.93             | 7.78             |
| HMC    | 0.2555 s     | 0.6602 s      | 2.9375 s      | 17.16           | 45.93            | 48.04            |
| RAM    | 0.5364 s     | 0.5847 s      | 33.7778 s     | 0.9             | 33.88            | 51.52            |
| MHMC   | 0.5211 s     | 1.3530 s      | ---           | 18.41           | 89.50            | 47.84            |

]

.panel[.panel-name[Mixing]
.pull-left[.center[
<img src="images/plots/gaussian3/d/plt2_acf.svg">
d=2
]]
.pull-right[.center[
<img src="images/plots/gaussian3/d/plt10_acf.svg">
d=10
]]
]

.panel[.panel-name[Trace][
<img src="images/plots/gaussian3/d/plt2_tr.svg">
][
<img src="images/plots/gaussian3/d/plt10_tr.svg">
]
]

.panel[.panel-name[Scatter][
<img src="images/plots/gaussian3/d/scatter2.svg">
d=2
][
<img src="images/plots/gaussian3/d/scatter10.svg">
d=10
]
]

.panel[.panel-name[Parallel Tempering (d=2)]
<img src="images/plots/gaussian3/d/plt2_tr_all.svg" height="60%">
]

.panel[.panel-name[Parallel Tempering (d=10)]
<img src="images/plots/gaussian3/d/plt10_tr_all.svg" height="60%">
]
]
]

---
class: left

#
Benchmark Dataset

.panelset[
.panel[.panel-name[Setup]
.center[
<img src="figures/sims/gaussian3/d2.svg">

.content-box-blue[<body>
20-component Gaussian mixture given in Kou et al. (2006)</body>]
]
]

.panel[.panel-name[Mixing][
<br><br>

| Method | Time/L   | W2 metric |
|-------:|:--------:|:---------:|
| HaRAM  | 0.2697 s | 133.733   |
| HMC    | 0.2555 s | 2989.539  |
| RAM    | 0.5364 s | 309.421   |
| MHMC   | 0.5211 s | 1916.740  |

][
<img src="figures/sims/gaussian3/acf-d2.svg">
]
]

.panel[.panel-name[Trace]
.center[
<img src="figures/sims/gaussian3/trace-d2.svg">
]
]

.panel[.panel-name[Scatter]
.center[
<img src="figures/sims/gaussian3/scatter-d2.svg">
]
]
]

---
class: left

# Vanilla Gaussian Distribution

[
<img src="figures/sims/gaussian1/d2.svg">
]

--

[
<img src="figures/sims/gaussian1/acf-d2.svg">
]

---
class: left

# Autotuning

.panelset[
.panel[.panel-name[Setup]
For `\(\muv = \frac{5}{\sqrt{d}} \onev_d \in \R^d\)` the target is
$$
\pi(\q) \sim \half \mathcal{N}(\muv, I) + \half \mathcal{N}(-\muv, I)
$$

.content-box-blue[<body>
Nesterov dual averaging for `\((\epsilon, \gamma, L)\)`
</body>]

.panel[.panel-name[d=3]
.center[
<img src="images/plots/autotune/d3.svg" height="350">
]
]

.panel[.panel-name[d=10]
.center[
<img src="images/plots/autotune/d10.svg" height="350">
]
]

.panel[.panel-name[d=50]
.center[
<img src="images/plots/autotune/d50.svg" height="350">
]
]

.panel[.panel-name[d=100]
.center[
<img src="images/plots/autotune/d3.svg" height="350">
]
]
]
]

---
class: left

# Bayesian Neural Network

.panelset[
.panel[.panel-name[Setup]
.pull-left[
<img src="images/plots/nn/scatter.svg">
][
<img src="images/plots/nn/nn.png">
]

.panel[.panel-name[Posterior prediction surface]
<img src="images/plots/nn/contourf.svg">
]

.panel[.panel-name[Posterior prediction quantiles]
.center[
<img src="images/plots/nn/all.svg">
]
]
]
]

---
class: left

# Time Delay Estimation

.panelset[
.panel[.panel-name[Gravitational Lensing]
.pull-left[
<img
src="images/plots/astro/lensing.jpg">
]
.pull-right[
<img src="images/plots/astro/ts-original.svg" height="250">
$$
`\begin{align}
dX(t)&=-\frac{1}{\tau}(X(t)-\mu)dt+\sigma dB(t)\\
Y(t)&=X(t-\Delta)+\beta
\end{align}`
$$
]

.panel[.panel-name[Model]
.pull-left[
$$
`\begin{align}
x_i\mid X(t_i) &\sim N(X(t_i),~ \sigma^2_{x, i})\\
y_i\mid Y(t_i) &\sim N(Y(t_i),~ \sigma^2_{y, i})~~\textrm{and}\\
y_i\mid X(t_i-\Delta), \Delta, \beta &\sim N(X(t_i-\Delta) + \beta,~ \sigma^2_{y, i}),
\end{align}`
$$
$$
`\begin{align}
\Delta &\sim \textrm{Unif}(-1178.939,~ 1179.939)\\
\beta &\sim \textrm{Unif}(-60,~ 60)\\
\mu &\sim \textrm{Unif}(-30,~ 30)\\
\sigma^2 &\sim \textrm{inverse-Gamma}(1, 2\times 10^{-7})\\
\tau &\sim \textrm{inverse-Gamma}(1, 1).
\end{align}`
$$
]
.pull-right[<img src="images/plots/astro/hist.svg">]
]

.panel[.panel-name[Aligned Light Curves]
.center[
<img src="images/plots/astro/match.svg">
]
]
]
]

---
class: left

# Code

.pull-left[
```julia
using main  # the talk's package (assumed to provide mcmc, DualAverage, HaRAM)
using Distributions, DynamicPPL, Plots
using LinearAlgebra  # for the identity `I`

x = randn(1000)
y = x .+ randn(1000) .* e  # `e`: noise scale, not defined on this slide

@model function lr(x, y)
    σ2 ~ Truncated(Normal(), 1e-6, Inf)
    b0 ~ Cauchy(2.0)
    b1 ~ Normal(0.0, 1.0)
    y ~ MvNormal(b0 .+ b1 .* x, σ2 * I)
end

samples, accepts = mcmc(
    DualAverage(λ=10, δ=0.8),
    HaRAM(),
    lr(x, y);
    n=1e4, n_burn=1e4
);
```
]
.pull-right[
<img src="images/plots/qr.png">[<body> </body>]
]

---

# References

1. Arnold, V. I. (2013). *Mathematical methods of classical mechanics*
2. Betancourt, M. (2017). *A Conceptual Introduction to Hamiltonian Monte Carlo*
3. Betancourt, M., Byrne, S., Livingstone, S. & Girolami, M. (2017). *The geometric foundations of Hamiltonian Monte Carlo*
4. Duane, S., Kennedy, A. D., Pendleton, B. J. & Roweth, D. (1987). *Hybrid Monte Carlo*
5. Hairer, E., Lubich, C. & Wanner, G. (2006). *Geometric Numerical Integration*
6. McLachlan, R. & Perlmutter, M. (2001). *Conformal Hamiltonian systems*
7. Neal, R. (2011). *MCMC Using Hamiltonian Dynamics*
8. Tak, H., Meng, X.-L. & van Dyk, D. A. (2018).
*A repelling-attracting Metropolis algorithm for multimodality*
9. Tierney, L. (1994). *Markov chains for exploring posterior distributions*
10. Tripuraneni, N., Rowland, M., Ghahramani, Z. & Turner, R. (2017). *Magnetic Hamiltonian Monte Carlo*

---
class: inverse, center, middle

# Thanks for listening!
## Questions?
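
---
class: left

# Backup: a HaRAM proposal sketch

The HaRAM steps (repel with negative friction, attract with positive friction, then accept/reject) can be sketched as follows. This is an illustration under stated assumptions, not the implementation behind the `mcmc` call on the Code slide: it takes `\(\S = \I\)`, uses `L` conformal leapfrog steps in each phase, and targets an equal-weight mixture of `\(\mathcal{N}(\pm\muv, \I)\)`, for which `\(U(\q) = \tfrac12\|\q\|^2 - \log 2\cosh(\muv^\top \q)\)` up to a constant. All function names and parameter values are hypothetical.

```julia
using LinearAlgebra

const μ = [2.0, 2.0]   # hypothetical mode location

# Numerically stable log(2 cosh(a)).
logtwocosh(a) = abs(a) + log1p(exp(-2 * abs(a)))

# Potential and gradient for the mixture 0.5 N(-μ, I) + 0.5 N(μ, I).
U(q) = 0.5 * dot(q, q) - logtwocosh(dot(μ, q))
gradU(q) = q - μ * tanh(dot(μ, q))
H(q, p) = U(q) + 0.5 * dot(p, p)

# One conformal leapfrog step; γ > 0 repels, γ < 0 attracts.
function conformal_step(q, p, ϵ, γ)
    p = exp(γ * ϵ / 2) * p            # scale momentum
    p = p - (ϵ / 2) * gradU(q)        # half-step in p
    q = q + ϵ * p                     # full step in q
    p = p - (ϵ / 2) * gradU(q)        # half-step in p
    p = exp(γ * ϵ / 2) * p            # scale momentum
    return q, p
end

# Repel for L steps, then attract for L steps.
function haram_proposal(q, p, ϵ, γ, L)
    for _ in 1:L
        q, p = conformal_step(q, p, ϵ, γ)
    end
    for _ in 1:L
        q, p = conformal_step(q, p, ϵ, -γ)
    end
    return q, p
end

# MH probability when the proposal ratio and the Jacobian are both 1.
accept_prob(q0, p0, qT, pT) = min(1.0, exp(H(q0, p0) - H(qT, pT)))

q0, p0 = [-2.0, -2.0], [0.4, -0.1]
ϵ, γ, L = 0.05, 0.5, 20
qT, pT = haram_proposal(q0, p0, ϵ, γ, L)
α = accept_prob(q0, p0, qT, pT)

# Reversibility check: flip, propose again, flip back.
qb, pb = haram_proposal(qT, -pT, ϵ, γ, L)
roundtrip_error = norm(qb - q0) + norm(-pb - p0)
```

The round trip returns to the starting state up to floating-point error, which is the discrete analogue of the reversibility property on the Properties of HaRAM slide.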