Sparse Autoencoder in Keras

by allenlu2007

Reference: https://blog.keras.io/building-autoencoders-in-keras.html

 

The reference contains only a short passage on this topic, with no complete source code.

 

Adding a sparsity constraint on the encoded representations

In the previous example, the representations were only constrained by the size of the hidden layer (32). In such a situation, what typically happens is that the hidden layer is learning an approximation of PCA (principal component analysis). But another way to constrain the representations to be compact is to add a sparsity constraint on the activity of the hidden representations, so fewer units would “fire” at a given time. In Keras, this can be done by adding an activity_regularizer to our Dense layer:

from keras.layers import Input, Dense
from keras.models import Model
from keras import regularizers

encoding_dim = 32

input_img = Input(shape=(784,))
# add a Dense layer with a L1 activity regularizer
encoded = Dense(encoding_dim, activation='relu',
                activity_regularizer=regularizers.l1(10e-5))(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

autoencoder = Model(input_img, decoded)

Let’s train this model for 100 epochs (with the added regularization the model is less likely to overfit and can be trained longer). The model ends with a train loss of 0.11 and test loss of 0.10. The difference between the two is mostly due to the regularization term being added to the loss during training (worth about 0.01).

Here’s a visualization of our new results:

[Figure: sparse autoencoder reconstructions]

They look pretty similar to the previous model, the only significant difference being the sparsity of the encoded representations. encoded_imgs.mean() yields a value of 3.33 (over our 10,000 test images), whereas with the previous model the same quantity was 7.30. So our new model yields encoded representations that are roughly twice as sparse.
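The mean-activation comparison above can be sketched numerically. The arrays below are simulated stand-ins for encoder outputs (random data, not the actual MNIST encodings), just to illustrate what encoded_imgs.mean() measures:

```python
import numpy as np

# Simulated stand-ins for encoded test-set representations
# (10,000 images x 32 units); illustrative only, not real MNIST codes.
rng = np.random.default_rng(0)
dense_code = rng.uniform(0.0, 15.0, size=(10000, 32))  # every unit fires
mask = rng.uniform(size=(10000, 32)) < 0.5
sparse_code = dense_code * mask                        # ~half the units silenced

# The blog compares models via the mean activation over all test codes.
print(round(dense_code.mean(), 2))   # ~7.5: dense code, all units active
print(round(sparse_code.mean(), 2))  # ~3.75: roughly half the total activity
```

A lower mean activation over the same test set indicates a sparser code, which is exactly the comparison the blog draws between 7.30 and 3.33.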

 

I modified the simplest-autoencoder Keras code template into a sparse autoencoder as follows:

Adding a Sparsity Constraint on the Encoder

from keras.layers import Input, Dense
from keras.models import Model

from keras import regularizers

encoding_dim = 32

input_img = Input(shape=(784,))
# add a Dense layer with a L1 activity regularizer
encoded = Dense(encoding_dim, activation='relu',
                activity_regularizer=regularizers.l1(10e-9))(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

autoencoder = Model(input_img, decoded)
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))
#autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
from keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)
(60000, 784)
(10000, 784)
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))
Train on 60000 samples, validate on 10000 samples
Epoch 1/50
60000/60000 [==============================] - 8s - loss: 0.0933 - val_loss: 0.0922
Epoch 2/50
60000/60000 [==============================] - 8s - loss: 0.0933 - val_loss: 0.0922
...
Epoch 50/50
60000/60000 [==============================] - 8s - loss: 0.0929 - val_loss: 0.0919
<keras.callbacks.History at 0x7fcbd81c8450>
# encode and decode some digits
# note that we take them from the *test* set
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
# use Matplotlib (don't ask)
import matplotlib.pyplot as plt

n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

However, if this one parameter is changed, the result goes completely wrong:

encoded = Dense(encoding_dim, activation='relu',
                activity_regularizer=regularizers.l1(10e-5))(input_img)

   
Changing 10e-9 to 10e-5 breaks the result, and it stays broken even with epochs=100. Note that the optimizer used here is adam rather than adadelta.

Changing 10e-9 to 1e-6 still works fine:
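A rough back-of-the-envelope estimate (my own, not from the reference) suggests why the coefficient matters so much: the L1 activity regularizer adds coeff * sum(|activations|) to the per-sample loss, and with 32 units at a typical activation of a few units each, that sum is on the order of 100:

```python
# Back-of-the-envelope estimate of the L1 activity penalty per sample.
# Assumed typical magnitude: 32 units with mean activation ~3, so
# sum(|activations|) ~ 96 per sample (illustrative numbers only).
act_l1_per_sample = 32 * 3.0

for coeff in (10e-5, 1e-6, 10e-9):
    penalty = coeff * act_l1_per_sample
    print(f"coeff={coeff:g}: L1 penalty per sample ~ {penalty:.1e}")
```

With coeff=10e-5 (i.e. 1e-4) the penalty is around 1e-2, no longer negligible next to the ~0.09 reconstruction loss, so the optimizer can reduce the total loss simply by driving all activations toward zero. With 1e-6 or 10e-9 the penalty is only a gentle nudge toward sparsity.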

[Figure: reconstructed digits with the l1(1e-6) activity regularizer]

 
