Introduction
We are starting a new series of blog posts intended to introduce DeepLTK (Deep Learning Toolkit for LabVIEW) to both new and advanced users. In this post we cover the basic, essential functionality of DeepLTK by solving a simple problem: implementing Boolean logic with the help of a simple neural network. This serves as a foundation for understanding logistic regression. Logistic regression is a statistical method used for binary classification problems, where the outcome (target variable) is categorical and has only two possible classes (e.g., 0 or 1, true or false, yes or no). The goal of logistic regression is to model the probability that a given input belongs to a specific class.
The problem was chosen for its simplicity, but it helps to better understand the programming concepts behind DeepLTK.
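To make the idea of logistic regression concrete, here is a minimal Python sketch (plain Python, not DeepLTK code) of the model it evaluates: a weighted sum of the inputs passed through a sigmoid. The weights and bias below are hypothetical, hand-picked values chosen so that the output approximates Boolean AND; they are not learned and are not taken from DeepLTK.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(inputs, weights, bias):
    """Return the modelled probability that `inputs` belongs to class 1."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Hypothetical, hand-picked parameters (an assumption for illustration only).
WEIGHTS = [10.0, 10.0]
BIAS = -15.0

for x1 in (0.0, 1.0):
    for x2 in (0.0, 1.0):
        p = predict_probability([x1, x2], WEIGHTS, BIAS)
        print(f"inputs=({x1:.0f}, {x2:.0f})  P(class 1)={p:.4f}")
```

Only the input (1, 1) yields a probability above 0.5, which is the behavior the trained network in this post is expected to reproduce.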
This and other DeepLTK based LabVIEW examples can be accessed from our GitHub page.
The project consists of two main VIs:
1_DeepLTK_Boolean_Logic(Training).vi and 2_DeepLTK_Boolean_Logic(Inference).vi.
Training: 1_Boolean_Logic(Training).vi.
Front Panel of Training VI
The Front Panel of the training VI exposes all the necessary high-level configuration parameters.
NN_Train_Params. This cluster configures the training process. Its fields are described below.
Optimizer selects the optimization algorithm, e.g. SGD (Stochastic Gradient Descent) or Adam.
Beta_1 sets the first-order momentum coefficient.
Beta_2 sets the second-order momentum coefficient (applicable to the Adam optimizer only; ignored when using SGD).
Weight Decay(L1) and Weight Decay(L2) are the L1 and L2 regularization coefficients.
Data_Sampling specifies how the samples for a single minibatch are drawn from the dataset: either sequentially or randomly.
LR_Decay_Params defines the learning rate update policy.
Policy_Type specifies the learning rate update policy function, which can be Manual, Step, Exponential or 1/t.
LR0 specifies the initial learning rate.
k defines the learning rate decay speed. For example, with the "Step" update policy the learning rate is halved after every "k" epochs.
WarmUp_Policy specifies the policy for controlling the learning rate during the warm-up (initial) stage of the training.
WarmUp_Iter specifies the number of training iterations required to finish the warm-up process.
WarmUp_Ratio specifies the initial learning rate attenuation factor.
Loss Chart displays the history of training loss value across the training process.
Loss Value displays the current value of training loss.
Session Data contains information about current session of the training.
Epochs shows the total number of epochs passed. An epoch is one complete pass through the whole training dataset.
Tot_iter is the total number of training iterations completed since the beginning of the training process.
LR is the current learning rate. Note that the learning rate is updated according to the learning rate update policy (see above).
LR_iter displays the number of iterations that have passed since "LR_Decay_Params.Policy_Type" was last changed (see above).
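The learning rate settings described above (LR0, k and the warm-up parameters) can be illustrated with a short Python sketch. The "Step" rule follows the description given here (halving after every k epochs). The linear shape of the warm-up ramp is an assumption made for illustration; DeepLTK's actual warm-up policies and its Exponential and 1/t formulas are not reproduced here.

```python
def step_decay(lr0: float, k: int, epoch: int) -> float:
    """'Step' policy as described above: halve the rate after every k epochs."""
    return lr0 * (0.5 ** (epoch // k))


def warmup_scale(iteration: int, warmup_iter: int, warmup_ratio: float) -> float:
    """ASSUMPTION: linear ramp from warmup_ratio up to 1.0 over warmup_iter iterations."""
    if warmup_iter <= 0 or iteration >= warmup_iter:
        return 1.0
    return warmup_ratio + (1.0 - warmup_ratio) * iteration / warmup_iter


def learning_rate(lr0, k, epoch, iteration, warmup_iter, warmup_ratio):
    """Combine the decay policy with the warm-up attenuation."""
    return step_decay(lr0, k, epoch) * warmup_scale(iteration, warmup_iter, warmup_ratio)


# Illustrative values only; they are not the defaults of the example VI.
for epoch in (0, 10, 20, 30):
    print(epoch, step_decay(lr0=0.1, k=10, epoch=epoch))
```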
Block Diagram of the Training VI
Now that we have explored the Front Panel, let's delve into the Block Diagram.
Every training process designed with the help of DeepLTK should consist of 6 main sections:
reading and setting up the dataset
creating and configuring neural network
configuring training process
running and monitoring training process
saving trained model
releasing resources.
To better understand each section, let's go through them one by one.
1. Reading and setting up the dataset
To train a neural network, a dataset should be created, which is a collection of inputs and corresponding outputs (labels), as well as descriptive information. DeepLTK supports multiple types of datasets, which differ in input dimensionality and in output dimensionality and type. In this blog post we cover the dataset type with 1-dimensional inputs and 1-dimensional outputs. This type of dataset is appropriate for this specific problem, as a single input and output sample in the dataset can be represented with the help of a 1-dimensional array. The dataset in DeepLTK is represented with the help of the NN_DataSet(In1D_Out1D).ctl structure/cluster.
As the dataset basically represents the mappings between inputs and outputs, in case of the Boolean AND logic gate that would be its truth table (shown below).
Truth table for AND operation
Input1 | Input2 | Output |
---|---|---|
False | False | False |
True | False | False |
False | True | False |
True | True | True |
For the AND logic gate, the inputs of the dataset store the inputs (first and second columns) of the truth table, and the outputs store its output (third column). The dataset consists of 4 training samples with two input features each, so the input data becomes a 2-dimensional array with 4 rows and 2 columns. Similarly, the output data is a 4 by 1 array that contains the output values.
These arrays are converted to DVRs (Data Value References) and, together with additional descriptive information (the input and output dimensions as well as the size of the dataset), combined into a single NN_DataSet(In1D_Out1D).ctl cluster, which represents a dataset in DeepLTK.
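As a rough textual analogue of the dataset described above, the following Python sketch builds the 4 by 2 input array and the 4 by 1 output array from the AND truth table and bundles them with their descriptive information. The `DataSet1D` class and its field names are hypothetical stand-ins for the NN_DataSet(In1D_Out1D).ctl cluster, and the DVR wrapping is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSet1D:
    """Hypothetical stand-in for the NN_DataSet(In1D_Out1D).ctl cluster."""
    inputs: List[List[float]]   # one row per sample, one column per input feature
    outputs: List[List[float]]  # one row per sample, one column per output
    input_size: int
    output_size: int
    num_samples: int


def make_and_dataset() -> DataSet1D:
    # Rows follow the AND truth table; False maps to 0.0 and True to 1.0.
    truth_table = [
        (False, False, False),
        (True, False, False),
        (False, True, False),
        (True, True, True),
    ]
    inputs = [[float(a), float(b)] for a, b, _ in truth_table]
    outputs = [[float(y)] for _, _, y in truth_table]
    return DataSet1D(inputs, outputs, input_size=2, output_size=1,
                     num_samples=len(truth_table))


dataset = make_and_dataset()
print(dataset)
```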
2. Creating and configuring neural network
The process of creating a neural network consists of the following steps.
Creation of an instance of a neural network. This is done with the help of NN_Create.vi by specifying its name, the minibatch size and whether it is being created for training or not.
Input layer creation with the help of NN_Layer_Create.vi by specifying its name and size, i.e. the number of inputs. As the inputs in our dataset are 1-dimensional, the "Input1D" instance of the polymorphic NN_Layer_Create.vi is chosen.
Creation of the functional layers of the network with the help of NN_Layer_Create.vi. As we are going to model the AND operation with a single neuron, we use the "FC" (Fully Connected) instance of the polymorphic NN_Layer_Create.vi and set its size to 1. We also set its name and specify sigmoid as its activation function, since we expect the output of this layer to be between 0 and 1.
The network creation and configuration process is finished by specifying the loss function (with the help of NN_Set_Loss.vi), which is set to MSE (Mean Squared Error) in this case.
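The network described above (two inputs, one fully connected neuron with sigmoid activation, MSE loss) is small enough to write out by hand. The Python sketch below is an illustration of that structure, not a translation of the DeepLTK VIs. The class name, the seeded random initialization and its range are assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class SingleNeuron:
    """One fully connected neuron with sigmoid activation (illustrative only)."""

    def __init__(self, num_inputs: int, seed: int = 0):
        rng = random.Random(seed)
        # ASSUMPTION: small random initial weights; DeepLTK's initializer may differ.
        self.weights = [rng.uniform(-0.5, 0.5) for _ in range(num_inputs)]
        self.bias = 0.0

    def forward(self, x):
        z = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return sigmoid(z)


def mse_loss(predictions, targets):
    """Mean squared error over a list of scalar predictions."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


net = SingleNeuron(num_inputs=2)
untrained_output = net.forward([1.0, 1.0])
print("untrained output:", untrained_output)
print("loss for that sample:", mse_loss([untrained_output], [1.0]))
```

Because of the sigmoid, the output always lies strictly between 0 and 1, which is why it can later be interpreted as a Boolean.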
3. Configuring training process
This part of the code configures the hyperparameters and the learning rate update policy with the help of DeepLTK's NN_Set_Train_Config.vi and NN_Cfg_LR_Decay_Policy.vi. The configuration parameters of each VI have been described in the sections above.
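To show what hyperparameters such as Beta_1, Beta_2 and the weight decay coefficients control, here is a Python sketch of textbook update rules for SGD with momentum and for Adam, applied to a toy one-parameter problem. These are the standard published forms; whether DeepLTK uses exactly these formulas is an assumption, and all function names are hypothetical.

```python
import math


def regularized_gradient(grad, weight, l1=0.0, l2=0.0):
    """Add L1 and L2 weight-decay terms to a raw gradient (textbook form)."""
    sign = (weight > 0) - (weight < 0)
    return grad + l1 * sign + l2 * weight


def sgd_momentum_step(weight, grad, velocity, lr, beta_1):
    """SGD with first-order momentum; beta_1 plays the role of Beta_1."""
    velocity = beta_1 * velocity + grad
    return weight - lr * velocity, velocity


def adam_step(weight, grad, m, v, t, lr, beta_1=0.9, beta_2=0.999, eps=1e-8):
    """Adam: first and second moment estimates with bias correction (t starts at 1)."""
    m = beta_1 * m + (1.0 - beta_1) * grad
    v = beta_2 * v + (1.0 - beta_2) * grad * grad
    m_hat = m / (1.0 - beta_1 ** t)
    v_hat = v / (1.0 - beta_2 ** t)
    return weight - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
target = 3.0
w_sgd, velocity = 0.0, 0.0
w_adam, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    w_sgd, velocity = sgd_momentum_step(
        w_sgd, 2.0 * (w_sgd - target), velocity, lr=0.05, beta_1=0.9)
    w_adam, m, v = adam_step(w_adam, 2.0 * (w_adam - target), m, v, t, lr=0.1)

print("SGD with momentum:", w_sgd)
print("Adam:", w_adam)
```

Note that Beta_2 only appears in `adam_step`, which matches the remark that it is ignored when the SGD optimizer is selected.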
4. Implementing the training process
After creating the dataset, building the network and setting up the training hyperparameters, the network can be trained by calling NN_Train.vi iteratively.
During training, the loss value is monitored to decide when to stop the training process.
Additionally, "Session Data" is displayed to show how many training iterations and epochs have passed, as well as the current value of the learning rate.
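The iterative training described above can be sketched in plain Python for the same single-neuron AND problem. This is an illustration under stated assumptions, not the DeepLTK implementation: it uses plain gradient descent with a fixed learning rate, processes all four samples in every iteration (a minibatch equal to the dataset), starts from zero weights, and runs for a fixed number of iterations instead of being stopped manually.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


INPUTS = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
TARGETS = [0.0, 0.0, 0.0, 1.0]


def forward(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(inputs, targets, learning_rate=2.0, iterations=5000):
    """Gradient descent on the MSE loss of one sigmoid neuron."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    loss_history = []
    n = len(inputs)
    for _ in range(iterations):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, t in zip(inputs, targets):
            p = forward(weights, bias, x)
            loss += (p - t) ** 2 / n
            # Chain rule: d(MSE)/dz = 2 * (p - t) * sigmoid'(z) / n
            delta = 2.0 * (p - t) * p * (1.0 - p) / n
            for i, xi in enumerate(x):
                grad_w[i] += delta * xi
            grad_b += delta
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
        loss_history.append(loss)  # plays the role of the Loss Chart
    return weights, bias, loss_history


weights, bias, loss_history = train(INPUTS, TARGETS)
predictions = [forward(weights, bias, x) for x in INPUTS]
print("first loss:", loss_history[0], "last loss:", loss_history[-1])
print("predictions:", [round(p, 3) for p in predictions])
```

Plotting `loss_history` would give the same kind of curve as the Loss Chart: a fast initial drop followed by a long flat tail, which is the signal to stop training.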
5. Saving trained model and releasing resources
After finishing the training process, the model should be saved and the allocated resources released. Both are done by calling NN_Destroy.vi. The trained model is saved in the "working" directory, which by default is created next to the running VI or the current project. A custom location for the working directory can be specified when calling NN_Create.vi.
Now that we understand the implementation details of the training process, let's run the VI and observe the results.
Training the Network and Investigating the Results
Let's run the training VI. After running the VI we can see that the loss value starts to decrease rapidly. Once the loss value stops decreasing, we can stop the training process.
Note: If the network fails to train (the loss does not decrease), stop the VI's execution by pressing the Stop Train button and restart the training process.
Observing Training Results
After the training process finishes, the model should be created and saved in the working directory (next to the running VI). It should contain 3 files: model configuration (.cfg), trained weights (.bin) and topology visualization (.svg).
As can be seen, the file names contain the name of the model/network as well as a timestamp.
Timestamping helps to identify specific results among many files when running multiple experiments with the same configuration.
Now we can use the configuration and weights files to deploy the model for inference.
Inference: 2_Boolean_Logic(Inference).vi
After a neural network has been trained successfully, it should be tested on some examples to evaluate its performance. In this section we describe "2_DeepLTK_Boolean_Logic(Inference).vi", which recreates the network from the .cfg file, initializes its weights from the .bin file, feeds values to the inputs of the network and shows how it performs on test data.
The Front Panel is presented below.
The Front Panel provides two inputs for sourcing .cfg and .bin files, two Boolean controls (Input1 and Input2) for providing test inputs to the network and one Boolean indicator (Output) displaying the trained network's prediction for provided inputs.
Let's see how the code is organized for implementing the inference.
Overall, the block diagram can be divided into 5 logical parts:
network creation (from .cfg) and initialization (from .bin)
setting the inputs of the network
running for inference (forward propagation)
getting outputs of the network (prediction/s).
releasing allocated resources.
1. Network Creation and Initialization
Initially, we extract the configuration data using the NN_CFG(Read).vi and use it to reconstruct the architecture of the network using the NN_Create_from_CFG.vi.
Next, we load the pre-trained weights with help of NN_Weights(Load).vi.
2. Setting the Inputs of the Network
We combine the two Boolean inputs of the AND operator into a 1D array, convert them to the SGL type (the data type the network accepts) and set them as the inputs of the network with the help of NN_Set_Input.vi.
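The conversion described above can be mimicked in Python with the standard `array` module, whose "f" typecode stores single-precision floats, the closest analogue of LabVIEW's SGL. The function names are hypothetical, and the second helper shows how several samples could be stacked into a 2D batch.

```python
from array import array
from typing import List, Sequence


def booleans_to_sample(values: Sequence[bool]) -> array:
    """Convert Boolean inputs to a 1D array of single-precision floats
    (False -> 0.0, True -> 1.0)."""
    return array("f", (1.0 if v else 0.0 for v in values))


def booleans_to_batch(samples: Sequence[Sequence[bool]]) -> List[array]:
    """Stack several samples row by row, giving a 2D (batch x inputs) layout."""
    return [booleans_to_sample(s) for s in samples]


sample = booleans_to_sample([True, False])
batch = booleans_to_batch([[False, False], [True, True]])
print(list(sample))
print([list(row) for row in batch])
```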
Note: Multiple input samples (a batch) can be provided to the network by combining them into a 2D array.
3. Running Inference (Forward Propagation)
"NN_Forward.vi" performs the forward propagation of the network based on the specified inputs.
4. Getting the Outputs of the Network
To get the prediction results, we first call "NN_Get_Layer(Last).vi" to obtain the last layer of the network. Then we call "NN_Layer_Get_OutVals_as1D.vi" to get the output values of that layer. Finally, we convert the outputs (in this case a single output), represented as SGL, into a Boolean.
5. Releasing Allocated Resources
As in the training VI, we call "NN_Destroy.vi" to release the allocated resources. Note that in the inference case all Booleans of the "Save(T,T,T)" input are set to False when destroying the model, to avoid saving the same files again, as they would be mere replicas of the input files.
Once the .cfg and .bin files are provided the VI can be run and tested on different configurations of the inputs.
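The inference flow described in this section (Boolean inputs converted to single-precision floats, forward propagation, output converted back to a Boolean) can be sketched end to end in Python. The weights and bias are hypothetical placeholders for values that would normally be loaded from the .bin file, and the 0.5 decision threshold is an assumption about how the SGL output is turned into a Boolean.

```python
import math
from array import array

# Hypothetical placeholders for trained parameters (not read from a .bin file).
TRAINED_WEIGHTS = [6.0, 6.0]
TRAINED_BIAS = -9.0


def run_inference(input1: bool, input2: bool,
                  weights=TRAINED_WEIGHTS, bias=TRAINED_BIAS,
                  threshold: float = 0.5) -> bool:
    # Setting the inputs: Booleans become single-precision floats.
    inputs = array("f", [1.0 if input1 else 0.0, 1.0 if input2 else 0.0])
    # Forward propagation through the single sigmoid neuron.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    output = 1.0 / (1.0 + math.exp(-z))
    # Getting the output: ASSUMPTION that values above 0.5 are read as True.
    return output > threshold


for a in (False, True):
    for b in (False, True):
        print(a, b, "->", run_inference(a, b))
```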
Summary
In this blog post we demonstrated how to use DeepLTK to solve one of the simplest problems, namely modeling Boolean AND with the help of a single neuron. This trivial problem was chosen so that we could concentrate on the common aspects of creating, configuring, training and deploying neural networks with DeepLTK in LabVIEW.
Here we examined logistic regression with multiple inputs and a single output. In the next blog post we will delve into logistic regression with multiple inputs and multiple outputs. Get ready for an in-depth exploration of logistic regression and its diverse applications.
Things to Do Next
To reinforce what you have learned, we suggest solving the following similar problems based on the reference example.
Model Boolean OR. Hint: dataset modification.
Model Boolean XOR. Hint: increase the depth of the network (add a layer).
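As a starting point for both exercises, the Python sketch below generates the OR and XOR truth tables and shows why the XOR hint asks for an extra layer: a network with two hidden sigmoid neurons and one output neuron can represent XOR. The weights are hand-picked for illustration (hidden neurons approximating OR and NAND, output approximating AND), not trained, and the function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def truth_table(gate):
    """Return [((a, b), gate(a, b)), ...] for all four Boolean input pairs."""
    return [((a, b), gate(a, b)) for a in (False, True) for b in (False, True)]


or_dataset = truth_table(lambda a, b: a or b)    # OR exercise: only the labels change
xor_dataset = truth_table(lambda a, b: a != b)   # XOR exercise


def xor_network(x1: float, x2: float) -> float:
    """Two-layer network with hand-picked weights (illustrative, not trained)."""
    h_or = sigmoid(20.0 * x1 + 20.0 * x2 - 10.0)      # hidden neuron 1: approx. OR
    h_nand = sigmoid(-20.0 * x1 - 20.0 * x2 + 30.0)   # hidden neuron 2: approx. NAND
    return sigmoid(20.0 * h_or + 20.0 * h_nand - 30.0)  # output: approx. AND of both


for (a, b), expected in xor_dataset:
    predicted = xor_network(float(a), float(b)) > 0.5
    print(a, b, "->", predicted, "(expected", expected, ")")
```

A single sigmoid neuron draws one straight decision boundary in the input plane, which is enough for AND and OR but not for XOR. The hidden layer supplies the second boundary.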