An Introduction to the Intel OpenVINO Toolkit Inference Engine, Part 1/2

Model optimization through a simple example

Introduction

In this tutorial (github link), we will focus on the following four aspects:

  1. Train a CNN model
  2. Generate a TensorFlow frozen model (.pb file)
  3. Get the OpenVINO optimized intermediate representation (IR) model
  4. Run inference using the OpenVINO IR model

1. Train a CNN model (MNIST image classification using Keras)

1.1 Loading and splitting the MNIST data into train and test sets

# download MNIST data and split into train and test sets
import numpy as np
from keras.datasets import mnist
from keras.utils import to_categorical

(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape the data to fit the model: (samples, height, width, channels)
X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)
# one-hot encode the labels (required by categorical_crossentropy)
Y_train = to_categorical(y_train, 10)
Y_test = to_categorical(y_test, 10)
print(X_train.shape, X_test.shape)
MNIST dataset images (Source: https://en.wikipedia.org/wiki/MNIST_database)

1.2 Model architecture definition, training and evaluation

# model definition
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same', activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPool2D(strides=2))
model.add(Conv2D(filters=48, kernel_size=(5, 5), padding='valid', activation='relu'))
model.add(MaxPool2D(strides=2))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(84, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()
Model Architecture

After defining the model architecture, we need to define the loss function and the optimizer. We can also reduce the learning rate as training progresses, using a learning-rate annealer. The model.fit() call trains the model, taking arguments such as the training and validation data, the epoch count, and the batch size.

from keras.optimizers import Adam
from keras.callbacks import ReduceLROnPlateau

adam = Adam(lr=5e-4)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=adam)
# set a learning rate annealer: reduce the LR when validation accuracy plateaus
reduce_lr = ReduceLROnPlateau(monitor='val_acc', patience=3,
                              verbose=1, factor=0.2, min_lr=1e-6)

# train the model
results = model.fit(X_train, Y_train, batch_size=128,
                    validation_data=(X_test, Y_test),
                    epochs=10, callbacks=[reduce_lr])
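
A minimal sketch (assumed, not part of the listing above) of computing the accuracy figures reported below:

# evaluate accuracy on the train and test sets
train_loss, train_acc = model.evaluate(X_train, Y_train, verbose=0)
test_loss, test_acc = model.evaluate(X_test, Y_test, verbose=0)
print('Train Accuracy = %f' % train_acc)
print('Test Accuracy = %f' % test_acc)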

The trained model can be saved using the model.save() call. The model's performance on the train and test sets is shown below.

Train Accuracy = 0.999966
Test Accuracy = 0.991900

2. Generate a TensorFlow frozen model (.pb file)

2.1 Get frozen graph
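
A minimal sketch of this step, assuming TensorFlow 1.x and the Keras model trained in step 1 (the output path mirrors the one used in section 2.2):

# freeze the Keras session graph: convert variables to constants
# and write everything into a single .pb file (TF 1.x APIs)
import tensorflow as tf
from tensorflow.python.framework import graph_io
from keras import backend as K

K.set_learning_phase(0)                # put Keras into inference mode
sess = K.get_session()                 # session holding the trained graph
output_names = [model.output.op.name]  # e.g. 'dense_3/Softmax'
frozen_graph_def = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), output_names)
graph_io.write_graph(frozen_graph_def, './Models/2_tf_frozen_PB_model',
                     'frozen_model.pb', as_text=False)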

2.2 Infer using the frozen graph

import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile

# read the frozen graph definition from the .pb file
saved_pb_dir = r'./Models/2_tf_frozen_PB_model/frozen_model.pb'
f = gfile.FastGFile(saved_pb_dir, 'rb')
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
f.close()
# initiate a TF session and import the frozen graph
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)
sess.graph.as_default()
tf.import_graph_def(graph_def)
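
If the exact node names are unknown, they can be listed from the imported graph; a quick sketch:

# print every operation name in the imported graph to locate
# the input and output nodes (they carry the 'import/' prefix)
for op in sess.graph.get_operations():
    print(op.name)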

We need to specify the input and output node names to identify the input and output layers. In the example below, to verify the model's predictions, we run inference on the first 20 test images and compare the results with the ground truth.

# node names taken from model.input / model.output after loading the model
input_node = 'import/conv2d_1_input:0'
output_node = 'import/dense_3/Softmax:0'
# verify model predictions for the first 20 images
output_tensor = sess.graph.get_tensor_by_name(output_node)
predictions = sess.run(output_tensor, {input_node: X_test[:20]})
pred_num = np.argmax(predictions, axis=-1)
# print ground truth and predicted classes for the first 20 images
print('Ground Truth : ', y_test[:20])
print('Prediction : ', pred_num)

As we can see, the model's predictions match the ground truth, so we have successfully converted the Keras .h5 model to a TensorFlow .pb model.

Ground Truth : [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4]
Prediction : [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4]

In the next part, we will focus on optimizing our trained model with OpenVINO and achieving faster inference.

Experimental AI researcher 😊