Deep Learning

Getting Started with Keras

If you are new to Artificial Intelligence and its related fields (Machine Learning and Deep Learning), you may be unsure where to start. Deep Learning has many popular frameworks and libraries, such as TensorFlow, Caffe2, CNTK and Theano. I always recommend starting with Keras because it is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK or Theano.

In this blog post, you will learn how to use Keras to create a neural network that classifies the MNIST dataset with 97% accuracy. You can clone this project here:

If you are using Anaconda, create a new environment and install Keras.

conda create -n py36keras python=3.6 keras matplotlib nb_conda -y

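After creating the environment, activate it. The command below is the Windows form; on Linux or macOS with older conda releases use "source activate py36keras" instead, and newer conda releases accept "conda activate py36keras" on every platform.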

activate py36keras

Then start Jupyter Notebook.

jupyter notebook

Jupyter opens in your browser automatically.
Click the New button and create a new “Python [conda env:py36keras]” notebook.

Now we are ready to code!
First of all, import the Keras modules we need, plus matplotlib for plotting.

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
import matplotlib.pyplot as plt

As you can see, we import the MNIST dataset from Keras.
MNIST is a dataset of images of handwritten digits (0-9), each with a label.
Next, load the dataset.

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

You can inspect the shape of the tensors.
The dataset has 60,000 images (28×28) with labels for training and 10,000 images (28×28) with labels for testing your model.

print(train_images.shape, train_labels.shape)
print(test_images.shape, test_labels.shape)

Let’s see an image.

plt.imshow(train_images[20], cmap='binary')

Preprocess the data by reshaping it into the shape the network expects and scaling it so that all values are in the [0, 1] interval.

train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32') / 255
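
To make the reshape and scaling step concrete, here is a minimal sketch that applies the same two operations to a tiny stand-in array instead of the real dataset. The array fake_images and its pixel values are assumptions made up for illustration; only reshape, astype and the division by 255 come from the code above.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 2 grayscale images of 28x28 pixels (values 0..255).
fake_images = np.zeros((2, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255    # brightest possible pixel
fake_images[1, 27, 27] = 51   # a dim pixel (51 / 255 = 0.2)

# Flatten each 28x28 image into a vector of 784 values.
flat = fake_images.reshape((fake_images.shape[0], 28 * 28))

# Convert to float32 and scale from [0, 255] to [0, 1].
scaled = flat.astype('float32') / 255

print(scaled.shape)                 # one row per image, 784 columns
print(scaled.min(), scaled.max())   # smallest and largest pixel value after scaling
```

The last pixel of a 28×28 image ends up at index 783 of its flattened row, so no pixel information is lost; only the 2D layout is dropped, which is what a Dense layer expects.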

Vectorize the labels using one-hot encoding.

num_classes = 10
train_labels = to_categorical(train_labels, num_classes)
test_labels = to_categorical(test_labels, num_classes)
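
One-hot encoding turns each integer label into a vector of length num_classes with a single 1 at the label's index. The helper one_hot below is a hypothetical pure-Python stand-in written only to show the idea; it is not how Keras implements to_categorical, which returns a NumPy array.

```python
def one_hot(labels, num_classes):
    """Illustrative helper (not part of Keras): encode integer labels as one-hot rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes   # start with all zeros
        row[label] = 1.0            # mark the position of the true class
        encoded.append(row)
    return encoded

# Three example digit labels, chosen arbitrarily for the demonstration.
example = one_hot([5, 0, 4], 10)
for row in example:
    print(row)
```

Each row sums to 1, which matches the shape of the softmax output the network produces for every image.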

Create a simple network with two fully connected (Dense) layers.
The first Dense layer has 32 units and uses the ReLU activation.
The second Dense layer has 10 units, one per digit class, and uses the softmax activation so that its outputs form a probability distribution over the classes.

model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(28*28,)))
model.add(Dense(num_classes, activation='softmax'))
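
It can help to know how many trainable parameters this small network has. The sketch below counts them by hand, assuming the standard Dense layout of one weight per input-unit pair plus one bias per unit; the helper dense_params is a name introduced here for illustration. Calling model.summary() in the notebook should report the same totals.

```python
def dense_params(n_inputs, n_units):
    # One weight for every (input, unit) pair, plus one bias per unit.
    return n_inputs * n_units + n_units

hidden = dense_params(28 * 28, 32)   # first Dense layer: 784 inputs -> 32 units
output = dense_params(32, 10)        # second Dense layer: 32 inputs -> 10 units
total = hidden + output

print(hidden, output, total)  # 25120 330 25450
```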

Finally, compile and train the network.

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_images, train_labels, batch_size=64, epochs=10, validation_data=(test_images, test_labels))
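
The categorical_crossentropy loss compares the softmax output with the one-hot label and penalizes low probability on the true class. The following is a simplified pure-Python sketch of that idea for a single sample with three classes; the scores are made-up numbers, and Keras's own implementation differs in details such as clipping probabilities and averaging over a batch.

```python
import math

def softmax(scores):
    # Subtract the maximum score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_label, probs):
    # Only the term for the true class (where the label is 1) is non-zero.
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

# Hypothetical raw scores for one sample; class 0 gets the highest score.
probs = softmax([2.0, 1.0, 0.1])

loss_right = categorical_crossentropy([1, 0, 0], probs)  # true class is the favored one
loss_wrong = categorical_crossentropy([0, 0, 1], probs)  # true class is the least favored

print(probs)
print(loss_right, loss_wrong)
```

The loss is smaller when the network puts more probability on the correct class, which is what the optimizer tries to achieve during training.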

Plot the accuracy and loss values for the training and validation steps to compare them.

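# Older Keras releases store accuracy under 'acc'; newer ones use 'accuracy'.
acc_key = 'acc' if 'acc' in history.history else 'accuracy'
acc = history.history[acc_key]
val_acc = history.history['val_' + acc_key]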
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc)+1)

plt.plot(epochs, acc, 'bo', label='Training Accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation Accuracy')
plt.title('Training vs. Validation Accuracy')
plt.legend()  # show the labels defined above
plt.show()    # finish this figure before drawing the next one

plt.plot(epochs, loss, 'bo', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training vs. Validation Loss')
plt.legend()
plt.show()
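
The accuracy values plotted above are the fraction of images for which the class with the highest predicted probability matches the label. Here is a minimal sketch of that calculation in plain Python. The labels and predictions are hypothetical three-class values invented for the example, not output from the trained model, and the helpers argmax and accuracy are names introduced here.

```python
def argmax(values):
    # Index of the largest value in the list.
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(one_hot_labels, predicted_probs):
    # Count samples where the predicted class equals the true class.
    hits = sum(argmax(t) == argmax(p) for t, p in zip(one_hot_labels, predicted_probs))
    return hits / len(one_hot_labels)

# Hypothetical one-hot labels and softmax outputs for four samples.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]

print(accuracy(labels, preds))  # 0.75 (the third sample is misclassified)
```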

Done! You have created a neural network that classifies handwritten digits from 0 to 9 with an accuracy of about 97%.
Have a good time!