K-DSN
Deep Stacking Network
Random Features for Large-Scale Kernel Machines
To accelerate the training of kernel machines, we propose to map the input data
to a randomized low-dimensional feature space and then apply existing fast linear
methods. Our randomized features are designed so that the inner products of the
transformed data are approximately equal to those in the feature space of a user
specified shift-invariant kernel. We explore two sets of random features, provide
convergence bounds on their ability to approximate various radial basis kernels,
and show that in large-scale classification and regression tasks linear machine
learning algorithms that use these features outperform state-of-the-art large-scale
kernel machines.
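
The mapping described in this abstract can be illustrated with a minimal NumPy sketch for the Gaussian kernel. It is not the authors' code: the function name `rff_map`, the bandwidth `sigma`, the feature count `D`, and the synthetic data are all assumptions chosen for illustration. The sketch uses the cosine-with-random-offset form of the features and compares the resulting inner products with the exact kernel matrix.

```python
import numpy as np

def rff_map(X, W, b):
    """Map rows of X to random Fourier features (hypothetical helper).

    W holds frequencies sampled from the kernel's spectral density,
    b holds phases sampled uniformly from [0, 2*pi).
    """
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
d, D, sigma = 5, 5000, 1.0           # input dim, number of features, bandwidth (assumed)
X = rng.normal(size=(20, d))         # synthetic data, for illustration only

# For the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)), the spectral
# density is itself Gaussian with standard deviation 1 / sigma.
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

Z = rff_map(X, W, b)
K_approx = Z @ Z.T                   # inner products in the random feature space

sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K_exact = np.exp(-sq_dists / (2.0 * sigma ** 2))

max_err = np.max(np.abs(K_approx - K_exact))
print("max abs error:", max_err)
```

Once `Z` is computed, any linear method (for example ridge regression or a linear SVM) can be trained on it directly, which is where the speedup over working with the full kernel matrix comes from.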
On the Error of Random Fourier Features
https://www.cs.cmu.edu/~dsutherl/papers/rff_uai15.pdf
Kernel methods give powerful, flexible, and theoretically grounded approaches to
solving many problems in machine learning. The standard approach, however,
requires pairwise evaluations of a kernel function, which can lead to scalability
issues for very large datasets. Rahimi and Recht (2007) suggested a popular
approach to handling this problem, known as random Fourier features. The quality
of this approximation, however, is not well understood. We improve the uniform
error bound of that paper, as well as giving novel understandings of the
embedding’s variance, approximation error, and use in some machine learning
methods. We also point out that surprisingly, of the two main variants of those
features, the more widely used is strictly higher-variance for the Gaussian kernel
and has worse bounds.
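
The variance claim can be checked empirically with a small Monte Carlo sketch. It assumes the two variants are (a) paired sine and cosine features over D/2 sampled frequencies and (b) D cosine features with a random phase offset. The abstract does not spell these out, so treat that reading, along with the chosen points, bandwidth, and sample sizes, as assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma, trials = 3, 100, 1.0, 2000   # all values assumed for illustration

x = np.zeros(d)
y = np.zeros(d)
y[0] = 1.0                                 # two points at distance 1
k_exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

# Variant (a): paired sin/cos features over D/2 frequencies.
# cos(w.x)cos(w.y) + sin(w.x)sin(w.y) = cos(w.(x - y)), so the kernel
# estimate reduces to an average of cos(w.(x - y)).
W_a = rng.normal(scale=1.0 / sigma, size=(trials, D // 2, d))
est_paired = (2.0 / D) * np.sum(np.cos(W_a @ (x - y)), axis=1)

# Variant (b): D cosine features, each with a uniform random phase offset.
W_b = rng.normal(scale=1.0 / sigma, size=(trials, D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=(trials, D))
est_offset = (2.0 / D) * np.sum(np.cos(W_b @ x + b) * np.cos(W_b @ y + b), axis=1)

var_paired = est_paired.var()
var_offset = est_offset.var()
print("exact kernel value:", k_exact)
print("mean estimate (paired, offset):", est_paired.mean(), est_offset.mean())
print("variance (paired, offset):", var_paired, var_offset)
```

Both estimators are unbiased for the kernel value, so their means should agree with `k_exact`. The quantity of interest is the spread across trials at the same total feature dimension D.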