
Adaptive Smoothing Path Integral Control
In Path Integral control problems, a representation of an optimally controlled dynamical system can be formally computed and can serve as a guidepost for learning a parametrized policy. The Path Integral Cross-Entropy (PICE) method tries to exploit this, but is hampered by poor sample efficiency. We propose a model-free algorithm called ASPIC (Adaptive Smoothing of Path Integral Control) that applies an inf-convolution to the cost function to speed up convergence of policy optimization. We identify PICE as the infinite-smoothing limit of this technique and show that the sample efficiency problems that PICE suffers from disappear for finite levels of smoothing. For zero smoothing, this method becomes a greedy optimization of the cost, which is the standard approach in current reinforcement learning. We show analytically and empirically that intermediate levels of smoothing are optimal, which renders the new method superior to both PICE and direct cost optimization.
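To make the idea of smoothing a cost by inf-convolution concrete, here is a minimal generic sketch. It is not the ASPIC algorithm: it uses the textbook Moreau envelope (inf-convolution with a quadratic penalty) on a one-dimensional grid, whereas the method described above smooths a cost over policies. The function names, the toy cost `rugged_cost`, and the smoothing parameter `lam` are all assumptions made for illustration.

```python
import math


def inf_convolution(f, xs, lam):
    """Grid approximation of min_y f(y) + (x - y)^2 / (2 * lam) for each x in xs.

    lam is a hypothetical smoothing strength: small lam leaves f almost
    unchanged, large lam flattens it towards its global minimum.
    """
    fy = [f(y) for y in xs]
    return [
        min(v + (x - y) ** 2 / (2.0 * lam) for y, v in zip(xs, fy))
        for x in xs
    ]


def rugged_cost(x):
    # Hypothetical non-convex cost: a shallow bowl with ripples on top.
    return 0.1 * x * x + math.sin(5.0 * x)


# Uniform grid on [-3, 3] with spacing 0.05.
xs = [i * 0.05 - 3.0 for i in range(121)]
raw = [rugged_cost(x) for x in xs]

light = inf_convolution(rugged_cost, xs, 1e-4)  # near-zero smoothing
mid = inf_convolution(rugged_cost, xs, 0.5)     # intermediate smoothing
heavy = inf_convolution(rugged_cost, xs, 1e4)   # near-infinite smoothing

for name, vals in [("raw", raw), ("light", light), ("mid", mid), ("heavy", heavy)]:
    print(f"{name:>5}: min={min(vals):.3f} max={max(vals):.3f}")
```

In this toy setting the two extremes behave differently: with almost no smoothing the envelope coincides with the original rugged cost, and with very strong smoothing it becomes nearly constant at the global minimum. Intermediate values of `lam` sit between the two. How these regimes map onto the smoothing parameter used in ASPIC is not specified here and should be taken from the full text.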