In this practical we work with a reduced version of the CIFAR10 dataset and train convolutional neural networks. We use a small subset of the dataset so that everything runs in a sensible time within the tutorial.

The main focus is to see how to “assemble” different convolutional layers.


1.1 Load the data

library(keras)

CIFAR10 <- dataset_cifar10()
# Scale the pixel values from [0, 255] to [0, 1]
x_train <- CIFAR10$train$x/255
x_test  <- CIFAR10$test$x/255
# One-hot encode the 10 class labels (coded 0 to 9)
y_train <- keras::to_categorical(CIFAR10$train$y, num_classes = 10)
y_test  <- keras::to_categorical(CIFAR10$test$y,  num_classes = 10)
(label_name <- c("flyer", "car", "bird", "cat", "deer", "dog", "frog ", "horse", "ship", "truck"))
 [1] "flyer" "car"   "bird"  "cat"   "deer"  "dog"   "frog " "horse" "ship" 
[10] "truck"
dim(x_train)
[1] 50000    32    32     3
dim(x_test)
[1] 10000    32    32     3
table(CIFAR10$train$y)

   0    1    2    3    4    5    6    7    8    9 
5000 5000 5000 5000 5000 5000 5000 5000 5000 5000 

We consider only 3 of the 10 classes.

We keep only “airplane” (label 0), “cat” (label 3) and “truck” (label 9), which makes the problem easier and quicker for the practical.

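1.2 Filter “airplane”, “cat” and “truck”

# Row indices of the three classes (labels 0, 3 and 9).
# The names index.train, index.test, y_train.filter.cat and y_test.filter.cat
# are assumed here: the original identifiers are missing from the source.
index.train <- which(CIFAR10$train$y==0|CIFAR10$train$y==3|CIFAR10$train$y==9)
index.test <- which(CIFAR10$test$y==0|CIFAR10$test$y==3|CIFAR10$test$y==9)
length(index.train)
[1] 15000
length(index.test)
[1] 3000
x_train.filter <- x_train[index.train,,,]
x_test.filter <- x_test[index.test,,,]
# Integer labels (0, 3, 9), used later for the confusion matrix
y_train.filter <- CIFAR10$train$y[index.train,]
y_test.filter <- CIFAR10$test$y[index.test,]
# One-hot labels restricted to the columns of the three classes
y_train.filter.cat <- y_train[index.train,c(1,4,10)]
y_test.filter.cat <- y_test[index.test,c(1,4,10)]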
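
The filtering step can be checked on a toy example. The sketch below uses base R only; the label vector `y` and the names `keep`, `idx`, `onehot` and `onehot.filter` are made up for illustration and stand in for the CIFAR10 objects. It shows why the one-hot columns are `c(1,4,10)`: labels are 0-based while R columns are 1-based.

```r
# Toy labels standing in for CIFAR10$train$y (values between 0 and 9)
y <- c(0, 3, 5, 9, 1, 3, 0, 7)
keep <- c(0, 3, 9)   # airplane, cat, truck

# Row indices of the kept classes; same result as y==0 | y==3 | y==9
idx <- which(y %in% keep)

# One-hot encoding over all 10 classes: row i has a 1 in column y[i] + 1
onehot <- diag(10)[y + 1, ]

# Keep the selected rows and the columns of the three classes
# (label 0 -> column 1, label 3 -> column 4, label 9 -> column 10)
onehot.filter <- onehot[idx, keep + 1]

print(idx)
print(onehot.filter)
```

Every row of `onehot.filter` should still contain exactly one 1, which is a quick sanity check that the row and column selections agree.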

We keep the training and validation sets small, again because of time constraints. For testing we can use the full filtered test set. We will also use a batch size of 50 (the value set in the fitting step).

Here I have chosen 8000 images for training and 4000 for validation, but I have a “good” laptop. You might try smaller sizes, such as 2000 for the training set and 1000 for the validation set.

trainLength <- 8000
validateLength <- 4000
trainRange <- 1:trainLength
validateRange <- (trainLength+1):(trainLength+validateLength)

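x_train.filter.small <- x_train.filter[trainRange,,,]
# y_train.filter.cat holds the one-hot labels of the filtered training set
# (three columns, one per kept class); the name is assumed because the
# identifier is missing from the source.
y_train.filter.small <- y_train.filter.cat[trainRange,]
x_val.filter.small <- x_train.filter[validateRange,,,]
y_val.filter.small <- y_train.filter.cat[validateRange,]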

1.3 A first CNN using Keras

img_width <- 32
img_height <- 32

model <- keras_model_sequential()

model %>%
    # Block 1: 8 filters of size 3x3; "same" padding keeps the 32x32 spatial size
    layer_conv_2d(filters = 8, kernel_size = c(3,3), strides = c(1,1), padding = "same", activation = "relu",
                  input_shape = c(img_height, img_width, 3)) %>%
    layer_batch_normalization() %>%
    layer_max_pooling_2d(pool_size = c(2,2)) %>%
    # Block 2: 4 filters of size 3x3; "valid" (no) padding shrinks 16x16 to 14x14
    layer_conv_2d(filters = 4, kernel_size = c(3,3), strides = c(1,1), padding = "valid", activation = "relu") %>%
    layer_batch_normalization() %>%
    layer_max_pooling_2d(pool_size = c(2,2)) %>%
    # Classifier: flatten, two dense layers with dropout, softmax over the 3 classes
    layer_flatten() %>%
    layer_dense(units = 80, activation = "relu") %>%
    layer_dropout(rate = 0.5) %>%
    layer_dense(units = 40, activation = "relu") %>%
    layer_dropout(rate = 0.5) %>%
    layer_dense(units = 3, activation = "softmax")

summary(model)

Model: "sequential"
Layer (type)                        Output Shape                    Param #     
conv2d_1 (Conv2D)                   (None, 32, 32, 8)               224         
batch_normalization_1 (BatchNormali (None, 32, 32, 8)               32          
max_pooling2d_1 (MaxPooling2D)      (None, 16, 16, 8)               0           
conv2d (Conv2D)                     (None, 14, 14, 4)               292         
batch_normalization (BatchNormaliza (None, 14, 14, 4)               16          
max_pooling2d (MaxPooling2D)        (None, 7, 7, 4)                 0           
flatten (Flatten)                   (None, 196)                     0           
dense_2 (Dense)                     (None, 80)                      15760       
dropout_1 (Dropout)                 (None, 80)                      0           
dense_1 (Dense)                     (None, 40)                      3240        
dropout (Dropout)                   (None, 40)                      0           
dense (Dense)                       (None, 3)                       123         
Total params: 19,687
Trainable params: 19,663
Non-trainable params: 24

1.3.1 TASK 0 : Try to understand the dimension of the layers and the number of parameters involved
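
One way to approach this task is to recompute the summary by hand. The sketch below uses base R only; the helper names `out_size`, `conv_params`, `dense_params` and `bn_params` are made up for this illustration and are not part of keras. It assumes stride-1 convolutions and 2x2 pooling with stride 2, as in the model above.

```r
# Output length along one spatial dimension for a convolution or pooling layer
out_size <- function(n, k, stride = 1, padding = c("valid", "same")) {
  padding <- match.arg(padding)
  if (padding == "same") ceiling(n / stride) else floor((n - k) / stride) + 1
}

# Parameter counts: each unit or filter has its weights plus one bias
conv_params  <- function(k, c_in, c_out) (k * k * c_in + 1) * c_out
dense_params <- function(n_in, n_out) (n_in + 1) * n_out
# Batch normalisation: gamma and beta (trainable), moving mean and variance (non-trainable)
bn_params    <- function(channels) 4 * channels

# Spatial sizes through the first model
s1 <- out_size(32, 3, padding = "same")   # conv 3x3, same padding
p1 <- out_size(s1, 2, stride = 2)         # max pooling 2x2
s2 <- out_size(p1, 3)                     # conv 3x3, valid padding
p2 <- out_size(s2, 2, stride = 2)         # max pooling 2x2
flat <- p2 * p2 * 4                       # flatten: height * width * channels

counts <- c(conv1  = conv_params(3, 3, 8),
            bn1    = bn_params(8),
            conv2  = conv_params(3, 8, 4),
            bn2    = bn_params(4),
            dense1 = dense_params(flat, 80),
            dense2 = dense_params(80, 40),
            dense3 = dense_params(40, 3))

print(c(s1 = s1, p1 = p1, s2 = s2, p2 = p2, flat = flat))
print(counts)
print(sum(counts))
```

The values can be compared line by line with the Param # column of the summary; the 24 non-trainable parameters are the moving means and variances of the two batch normalisation layers (2 x 8 + 2 x 4).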

1.4 Visualize this model

model %>% plot_model()
1.5 Task 1:

  • Modify the model to use \(5\times 5\) convolutions in the first layer and remove the max pooling layer after the second convolutional layer. Use no padding in the convolutional layers.

  • Check how this changes the number of inputs to the first dense layer and therefore its number of parameters (keras infers the input size from the flatten layer).

  • Call this model model2.
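
A possible solution is sketched below. It is reconstructed from the layer list in the summary that follows and is not necessarily the code that produced it: the activations and the dropout rate of 0.5 are assumptions carried over from the first model, since the summary does not show them.

```r
library(keras)

model2 <- keras_model_sequential()

model2 %>%
  # 5x5 convolution without padding: 32x32 -> 28x28
  layer_conv_2d(filters = 8, kernel_size = c(5,5), strides = c(1,1), padding = "valid",
                activation = "relu", input_shape = c(32, 32, 3)) %>%
  layer_batch_normalization() %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  # 3x3 convolution without padding: 14x14 -> 12x12, no pooling afterwards
  layer_conv_2d(filters = 4, kernel_size = c(3,3), strides = c(1,1), padding = "valid",
                activation = "relu") %>%
  layer_batch_normalization() %>%
  layer_flatten() %>%
  # Dense layers as in the first model (dropout rate assumed)
  layer_dense(units = 80, activation = "relu") %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 40, activation = "relu") %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 3, activation = "softmax")

summary(model2)
```

Without the second pooling layer the flattened vector has 12 x 12 x 4 = 576 entries, so the first dense layer has (576 + 1) x 80 = 46160 parameters.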

Model: "sequential_1"
Layer (type)                        Output Shape                    Param #     
conv2d_3 (Conv2D)                   (None, 28, 28, 8)               608         
batch_normalization_3 (BatchNormali (None, 28, 28, 8)               32          
max_pooling2d_2 (MaxPooling2D)      (None, 14, 14, 8)               0           
conv2d_2 (Conv2D)                   (None, 12, 12, 4)               292         
batch_normalization_2 (BatchNormali (None, 12, 12, 4)               16          
flatten_1 (Flatten)                 (None, 576)                     0           
dense_5 (Dense)                     (None, 80)                      46160       
dropout_3 (Dropout)                 (None, 80)                      0           
dense_4 (Dense)                     (None, 40)                      3240        
dropout_2 (Dropout)                 (None, 40)                      0           
dense_3 (Dense)                     (None, 3)                       123         
Total params: 50,471
Trainable params: 50,447
Non-trainable params: 24

1.6 Training your model

  • First compile the model
model2 %>% compile(
  loss = 'categorical_crossentropy',
  metrics = 'accuracy',
  # `lr` is the argument name in older keras releases; newer ones call it `learning_rate`
  optimizer = optimizer_adam(lr = 0.001)
)
  • Then fit the model
epochs <- 20
batch.size <- 50

start.time <- Sys.time()

history.model2 <- model2 %>% fit(
  x = x_train.filter.small, y = y_train.filter.small,
  validation_data = list(x_val.filter.small, y_val.filter.small),
  epochs = epochs, batch_size = batch.size, verbose = FALSE
)
end.time <- Sys.time()
(running.time <- end.time - start.time)
Time difference of 38.08836 secs

1.7 Visualise the training history of model 2

plot(history.model2)
1.8 Evaluate Your Model

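# y_test.filter.cat holds the one-hot test labels of the three classes
# (name assumed; the second argument is truncated in the source).
model.acc <- model2 %>% tensorflow::evaluate(x_test.filter, y_test.filter.cat)
  • Confusion matrix
# class.names and the levels/labels arguments below are reconstructed;
# the original lines are truncated.
class.names <- c("airplane","cat","truck")
# predict_classes() exists only in older keras releases; it returns 0, 1 or 2
pred <- model2 %>% keras::predict_classes(x_test.filter)
pred.class <- factor(pred, levels = 0:2, labels = class.names)
# y_test.filter contains the original CIFAR10 labels 0, 3 and 9
true.class <- factor(y_test.filter, levels = c(0,3,9), labels = class.names)
table(pred.class, true.class)
          true.class
pred.class airplane cat truck
  airplane      551  16    23
  cat           251 934   132
  truck         198  50   845
(ACC <- sum(diag(table(pred.class,true.class)))/3000)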
[1] 0.7766667
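
The confusion matrix says more than the overall accuracy. The sketch below re-enters the table printed above as a plain matrix (rows are predictions, columns are true classes) and derives per-class recall and precision with base R; the names `conf`, `recall` and `precision` are chosen for this illustration.

```r
class.names <- c("airplane", "cat", "truck")

# Confusion matrix of model2, copied from the output above
conf <- matrix(c(551,  16,  23,
                 251, 934, 132,
                 198,  50, 845),
               nrow = 3, byrow = TRUE,
               dimnames = list(pred = class.names, true = class.names))

# Overall accuracy: correctly classified images over all images
accuracy <- sum(diag(conf)) / sum(conf)

# Recall: share of each true class that is predicted correctly (column-wise)
recall <- diag(conf) / colSums(conf)

# Precision: share of each predicted class that is correct (row-wise)
precision <- diag(conf) / rowSums(conf)

print(accuracy)
print(round(recall, 3))
print(round(precision, 3))
```

Here the recall for “airplane” is only 0.551, while “cat” is predicted 1317 times for 1000 true cats, so most of the errors are airplanes classified as cats or trucks.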

1.9 Task 2

  • Attempt to create a different model architecture.
  • Use two additional convolutional layers.
  • Try to maximize the validation accuracy.

We’ll see in class who gets the best test accuracy (you may check the test accuracy only once).

Time difference of 1.812771 mins

1.10 Visualise the training history of my new model

plot(history.model3)
1.11 Evaluate my new model

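# y_test.filter.cat holds the one-hot test labels of the three classes
# (name assumed; the second argument is truncated in the source).
(model.acc <- model3 %>% tensorflow::evaluate(x_test.filter, y_test.filter.cat))
     loss  accuracy 
0.5492241 0.8166667 
  • Confusion matrix
# class.names and the levels/labels arguments below are reconstructed;
# the original lines are truncated.
class.names <- c("airplane","cat","truck")
# predict_classes() exists only in older keras releases; it returns 0, 1 or 2
pred <- model3 %>% keras::predict_classes(x_test.filter)
pred.class <- factor(pred, levels = 0:2, labels = class.names)
# y_test.filter contains the original CIFAR10 labels 0, 3 and 9
true.class <- factor(y_test.filter, levels = c(0,3,9), labels = class.names)
table(pred.class, true.class)
          true.class
pred.class airplane cat truck
  airplane      934 154   240
  cat            38 822    66
  truck          28  24   694
(ACC <- sum(diag(table(pred.class,true.class)))/3000)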
[1] 0.8166667