Kernel learning example¶

Here, we imagine that we want to use linear regression with a Gaussian kernel to fit a noisy sine wave.

In [1]:
using PyPlot
using Random
using LinearAlgebra
import Statistics.mean
In [2]:
Random.seed!(123456);
N = 10000; # size of data set
x = 6 * rand(N);
y = sin.(x.^2) .+ 0.2 * randn(N);
In [3]:
scatter(x,y; s=1);
plot(collect(0:0.01:6), sin.(collect(0:0.01:6).^2), "--k"; linewidth=1);
xlabel("x");
ylabel("y");
title("Training Set");
No description has been provided for this image
In [4]:
gamma = 100;
function K(x1, x2)
    return exp(-gamma*(x1 - x2)^2/2);
end
Out[4]:
K (generic function with 1 method)
In [14]:
# compute the Gram matrix
@time begin
    G = zeros(N, N);
    for i = 1:N
        for j = 1:N
            G[i,j] = K(x[i], x[j]);
        end
    end
end
 27.807765 seconds (889.82 M allocations: 15.495 GiB, 1.48% gc time)
In [6]:
# actually solve the regularized linear regression problem
@time w = (G + 0.001 * LinearAlgebra.I) \ y;
  2.247741 seconds (1.02 M allocations: 1.538 GiB, 0.29% gc time, 7.79% compilation time)
In [15]:
function predict(xt :: Float64)
    rv = 0.0;
    for i = 1:N
        rv += w[i] * K(x[i], xt);
    end
    return rv;
end
Out[15]:
predict (generic function with 1 method)
In [39]:
scatter(x,y; s=1);
x_test = collect(0:0.01:6);
y_test = sin.(x_test.^2);
@time y_pred = predict.(x_test);
plot(x_test, y_pred, "k"; linewidth=1);
xlabel("x");
ylabel("y");
title("Learned model: Gram matrix");

println("average test error: $(sqrt(mean((y_test - y_pred).^2)))")
  1.278735 seconds (71.51 M allocations: 1.155 GiB, 1.71% gc time)
average test error: 0.01936837588802565
No description has been provided for this image

Now, what if we do it with random Fourier features?¶

In [33]:
D = 50;
U = randn(D) * sqrt(gamma);
b = 2 * pi * rand(D);
In [34]:
# compute the feature map
@time begin
    phi = zeros(N, D);
    for i = 1:N
        for j = 1:D
            phi[i,j] = sqrt(2)*cos(U[j] * x[i] + b[j]);
        end
    end
end
  0.167623 seconds (4.54 M allocations: 81.044 MiB, 1.00% gc time)
In [35]:
# actually solve the (now not regularized) linear regression problem
@time w_rff = (phi' * phi + 0.001 * LinearAlgebra.I) \ (phi' * y);
  0.001141 seconds (62 allocations: 63.797 KiB)
In [36]:
function predict_rff(xt :: Float64)
    phi_xt = sqrt(2)*cos.(U .* xt .+ b);
    return dot(phi_xt, w_rff);
end
Out[36]:
predict_rff (generic function with 1 method)
In [38]:
scatter(x,y; s=1);
x_test = collect(0:0.01:6);
y_test = sin.(x_test.^2);
@time y_pred = predict_rff.(x_test);
plot(x_test, y_pred, "k"; linewidth=1);
xlabel("x");
ylabel("y");
title("Learned model: Random Fourier features");

println("average test error: $(sqrt(mean((y_test - y_pred).^2)))")
  0.000419 seconds (5.47 k allocations: 646.469 KiB)
average test error: 0.01330773438094877
No description has been provided for this image
In [ ]:

In [ ]:

In [ ]: