
Lab 03: Embedded Deep Learning¶
In this lab session, we will optimize the deep learning model that was trained in the last session. Afterwards, we will deploy the quantized model to the ESP32 MCU.

The first half of the session can be run on Google Colab. Open in Google Colab ->

The latter part must be run on your PC; below are tutorials for installing the required software on your PC:

VS Code Editor
Git Client

Tips: It is recommended to install all the software before you come to the lab, but don't worry if you cannot install everything.

First, we import some libraries for image processing and utilities, as well as TensorFlow. Note that the function image_dataset_from_directory is needed to load our dataset from disk after downloading it from Google.

import os

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

from tensorflow.keras.preprocessing import image_dataset_from_directory

# Set the seed value for experiment reproducibility.
seed = 42
tf.random.set_seed(seed)
np.random.seed(seed)

Import the Gesture dataset for evaluating the pretrained model¶
Download and extract the zip file containing the datasets with tf.keras.utils.get_file.

Tips: change the code accordingly if you have a model for another task.

# Download our dataset used for training
TRAIN_SET_URL = 'https://storage.googleapis.com/learning-datasets/rps.zip'
path_to_zip = tf.keras.utils.get_file('rps.zip', origin=TRAIN_SET_URL, extract=True, cache_dir='/content')
train_dir = os.path.join(os.path.dirname(path_to_zip), "rps")

# As well as the validation dataset
VAL_SET_URL = 'https://storage.googleapis.com/learning-datasets/rps-test-set.zip'
path_to_zip2 = tf.keras.utils.get_file('rps-test-set.zip', origin=VAL_SET_URL, extract=True, cache_dir='/content')
validation_dir = os.path.join(os.path.dirname(path_to_zip2), "rps-test-set")

Then we can generate tf.data.Dataset from image files in a directory.

BATCH_SIZE = 32
IMG_SIZE = (96, 96)

train_dataset = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

validation_dataset = image_dataset_from_directory(validation_dir,
                                                  shuffle=True,
                                                  batch_size=BATCH_SIZE,
                                                  image_size=IMG_SIZE)

Let's display some images of our dataset, as well as the class names.
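
A minimal sketch of how this could look, assuming the datasets were created with image_dataset_from_directory as above (which derives the class_names attribute from the sub-directory names):

class_names = train_dataset.class_names
print(class_names)

# Plot the first nine images of one training batch together with their labels.
plt.figure(figsize=(10, 10))
for images, labels in train_dataset.take(1):
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
plt.show()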

Split test set and validation set¶
We now take a fifth of the validation dataset to use as our test set. The validation set will be used to check for overfitting during training, while the test set is reserved for the final evaluation after training:

val_batches = tf.data.experimental.cardinality(validation_dataset)

test_dataset = validation_dataset.take(val_batches // 5)
validation_dataset = validation_dataset.skip(val_batches // 5)

print('Number of validation batches: %d' % tf.data.experimental.cardinality(validation_dataset))
print('Number of test batches: %d' % tf.data.experimental.cardinality(test_dataset))

Upload the model and load it from disk¶

Fetch the example model from GitHub; you can also upload your own model to Colab:

import h5py
import requests

url = 'https://github.com/SuperChange001/deeplearning_labs/raw/main/Lab03/pretrained_models/model_rps.h5'
r = requests.get(url, allow_redirects=True)

with open('/content/model_rps.h5', 'wb') as f:
    f.write(r.content)

An alternative way to fetch the pretrained model:

!wget https://github.com/SuperChange001/deeplearning_labs/raw/main/Lab03/pretrained_models/model_rps.h5

Load the model from disk:

model = tf.keras.models.load_model('/content/model_rps.h5')
model.summary()

Test the accuracy of the loaded model:

loss, accuracy = model.evaluate(test_dataset)
print('Test accuracy:', accuracy)

Convert model¶

Convert to a TensorFlow Lite model¶

#TF Lite model without quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

Convert using dynamic range quantization¶

# Parameters setting
optimization_config = [tf.lite.Optimize.DEFAULT]

#TF Lite model with dynamic range quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = optimization_config

tflite_model_dynamic_range = converter.convert()

Convert using float fallback quantization¶

# Extracts sample images needed for float fallback and full integer quantization
def representative_data_gen():
    for input_batch in train_dataset.take(4):
        for input_value in tf.data.Dataset.from_tensor_slices(np.array(input_batch[0])).batch(1).take(32):
            yield [input_value]

#TF Lite model with Float Fallback quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = optimization_config
converter.representative_dataset = representative_data_gen

tflite_model_float_fallback = converter.convert()

Convert using integer-only quantization¶

#TF Lite model with Full integer quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = optimization_config
converter.representative_dataset = representative_data_gen
# Ensure that if any ops can't be quantized, the converter throws an error
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to int8
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model_quant = converter.convert()
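
As an optional sanity check (a minimal sketch), you can load the converted model into the TF Lite interpreter and confirm that its input and output tensors are indeed int8:

interpreter = tf.lite.Interpreter(model_content=tflite_model_quant)
input_dtype = interpreter.get_input_details()[0]['dtype']
output_dtype = interpreter.get_output_details()[0]['dtype']
print('Input dtype:', input_dtype)    # expected: <class 'numpy.int8'>
print('Output dtype:', output_dtype)  # expected: <class 'numpy.int8'>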

Save the models as files¶

You’ll need a .tflite file to deploy your model on other devices. So let’s save the converted models to files and then load them when we run inferences below.

import pathlib

tflite_models_dir = pathlib.Path("/content/rps_tflite_models/")
tflite_models_dir.mkdir(exist_ok=True, parents=True)

# Save the unquantized/float model:
tflite_model_file = tflite_models_dir/"rps_model.tflite"
tflite_model_file.write_bytes(tflite_model)

# Save the dynamic range quantized model:
tf_model_dynamic_range_file = tflite_models_dir/"rps_model_dynamic_range.tflite"
tf_model_dynamic_range_file.write_bytes(tflite_model_dynamic_range)

# Save the float fallback quantized model:
tflite_model_float_fallback_file = tflite_models_dir/"rps_model_float_fallback.tflite"
tflite_model_float_fallback_file.write_bytes(tflite_model_float_fallback)

# Save the integer only quantized model:
tflite_model_quant_file = tflite_models_dir/"rps_model_quant.tflite"
tflite_model_quant_file.write_bytes(tflite_model_quant)

Test the models¶

def evaluate_model(tflite_file, dataset, model_type):
    interpreter = tf.lite.Interpreter(model_path=str(tflite_file))
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]

    total_seen = 0
    num_correct = 0
    is_int8_quantized = (input_details['dtype'] == np.int8)

    for image_batch, labels_batch in dataset:
        for i in range(tf.shape(image_batch)[0]):
            test_image = image_batch[i]

            if is_int8_quantized:
                input_scale, input_zero_point = input_details["quantization"]
                test_image = test_image / input_scale + input_zero_point

            test_image = np.expand_dims(test_image, axis=0).astype(input_details["dtype"])
            interpreter.set_tensor(input_details["index"], test_image)
            interpreter.invoke()
            output = interpreter.get_tensor(output_details["index"])[0]

            output = np.argmax(output)

            if labels_batch[i] == output:
                num_correct += 1
            total_seen += 1

            if total_seen % 50 == 0:
                print("Accuracy after %i images: %f" %
                      (total_seen, float(num_correct) / float(total_seen)))

    print('Num images: {0:}, Accuracy: {1:.4f}, Type: {2:}'.format(total_seen, float(num_correct / total_seen), model_type))

#Check accuracy on the test subset
evaluate_model(tflite_model_file, test_dataset, model_type="Float")
evaluate_model(tf_model_dynamic_range_file, test_dataset, model_type="Dynamic Range")
evaluate_model(tflite_model_float_fallback_file, test_dataset, model_type="Float Fallback")
evaluate_model(tflite_model_quant_file, test_dataset, model_type="Integer Quantized")
model.evaluate(test_dataset)

#Check accuracy on all validation data
evaluate_model(tflite_model_file, validation_dataset, model_type="Float")
evaluate_model(tf_model_dynamic_range_file, validation_dataset, model_type="Dynamic Range")
evaluate_model(tflite_model_float_fallback_file, validation_dataset, model_type="Float Fallback")
evaluate_model(tflite_model_quant_file, validation_dataset, model_type="Integer Quantized")
model.evaluate(validation_dataset)

Inspecting the models’ size¶

print("Float model in KB:", os.path.getsize(tflite_model_file) / float(2**10))
print("Dynamic Range model in KB:", os.path.getsize(tf_model_dynamic_range_file) / float(2**10))
print("Float fallback model in KB:", os.path.getsize(tflite_model_float_fallback_file) / float(2**10))
print("Integer Quantized model in KB:", os.path.getsize(tflite_model_quant_file) / float(2**10))

You now have an integer-quantized model with almost no loss in accuracy compared to the float model.

To learn more about other quantization strategies, read about TensorFlow Lite model optimization.

Generate a TensorFlow Lite for Microcontrollers Model¶
Convert the TensorFlow Lite model into a C source file that can be loaded by TensorFlow Lite for Microcontrollers.

# Install xxd if it is not available
!apt-get update && apt-get -qq install xxd
# Convert to a C source file, i.e., a TensorFlow Lite for Microcontrollers model
MODEL_TFLITE = "/content/rps_tflite_models/rps_model_quant.tflite"
MODEL_TFLITE_MICRO = "/content/rps_tflite_models/rps_model_quant.cc"
!xxd -i {MODEL_TFLITE} > {MODEL_TFLITE_MICRO}
# Update variable names
REPLACE_TEXT = MODEL_TFLITE.replace('/', '_').replace('.', '_')
!sed -i 's/'{REPLACE_TEXT}'/rps_model_tflite/g' {MODEL_TFLITE_MICRO}

Deploy to a Microcontroller¶

First, we download the generated model for the MCU.

from google.colab import files
files.download('/content/rps_tflite_models/rps_model_quant.cc')

We now need to move to the PC side.

Clone the code repository from GitHub to your PC, for example with the shell command:
git clone https://github.com/SuperChange001/deeplearning_labs.git

Open the folder Lab03/esp32-projects/rock_paper_scissors in VS Code.
Build the project and flash it to the MCU. Assuming you have installed the extension, you will see the following buttons at the bottom-left of the VS Code editor:

1: Choose the correct USB-to-serial device
2: Clean the build
3: Build the project
4: Flash the built binary to the ESP32
5: Monitor the log information from the ESP32
6: Build + flash + monitor

Tips: To connect the hardware correctly, you need to specify the USB device (/dev/ttyUSBxxx on Linux, or a COM port on Windows).

What's next?¶
Try your own model on the MCU!

This means replacing the rps_quant.cc in path_to_your_clone/Lab03/esp32-projects/rock_paper_scissors/main with the file generated by you.
Change the tensor arena size in file main_functions.cc:
constexpr int kTensorArenaSize = 530 * 1024;

which defines the size of the memory region in which the model's calculations take place and which must be chosen according to the needs of your model. Thankfully, if it is too small, the ESP32 will tell you at runtime that this is the case, and roughly how much memory is missing, so you can reach a near-optimal tensor arena size by trial and error.
Add the necessary operations in your main_functions.cc depending on your model structure:

// Pull in only the operation implementations we need.
// This relies on a complete list of all the ops needed by this graph.
// An easier approach is to just use the AllOpsResolver, but this will
// incur some penalty in code space for op implementations that are not
// needed by this graph.
// tflite::AllOpsResolver resolver;
// NOLINTNEXTLINE(runtime-global-variables)
static tflite::MicroMutableOpResolver<5> micro_op_resolver;
micro_op_resolver.AddAveragePool2D();
micro_op_resolver.AddConv2D();
micro_op_resolver.AddDepthwiseConv2D();
micro_op_resolver.AddReshape();
micro_op_resolver.AddSoftmax();

A list of supported operations can be found here.

For a different image recognition task, you may also need to change the kCategoryLabels field in the file model_settings.cc. In model_settings.h there are more options you can tweak; do you know what all these parameters mean?