Deep Neural Network from Scratch - Part 1
In this post I will explain how to design your own Keras-like neural network framework from scratch in Python 3.7+. It assumes a basic understanding of how a neural network works and familiarity with forward and backward propagation, loss functions, optimizers, regularization and so on. Without further ado, let’s jump right in.
Photo by Moritz Kindler on Unsplash
A basic neural network model implemented in Keras looks like the following. In our from-scratch implementation, we will design a similar, though more basic, framework.
# define the keras model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=150, batch_size=10)
Base Neural Network class
To keep all the functional elements together, we will create a class (Vanilla here) that organizes the functionality of our neural network. The class stores the variables that define how the network is structured and parameterized.
class Vanilla:
    def __init__(self):
        self.layers = []
        self.activations = []
        self.alpha_0 = 0.01
        self.loss_fn = "entropy"
        self.iterations = 0
        self.dict = {}
        self.decay_rate = 0.001
        self.problem_type = "c"  # c for classification & r for regression
Adding Layers
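This function lets users add layers and design the graph of the network. Note that it stores the number of units of each layer in a list named layers, and the matching activation function in a list named activations. When input_dim is given (for the first layer), the input size is stored in layers as well.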
def add_layer(self, units, input_dim=0, activation="sigmoid"):
    if input_dim == 0:
        self.layers.append(units)
        self.activations.append(activation)
    else:
        self.layers.append(input_dim)
        self.layers.append(units)
        self.activations.append(activation)
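To see what add_layer records, here is a minimal standalone sketch that mirrors the method outside the class. The function name add_layer_to and the plain lists are illustrative only and are not part of Vanilla. Because only the first call passes input_dim, layers ends up one entry longer than activations.

```python
def add_layer_to(layers, activations, units, input_dim=0, activation="sigmoid"):
    """Standalone mirror of Vanilla.add_layer, for illustration only."""
    if input_dim != 0:
        # The first layer also records the size of the input.
        layers.append(input_dim)
    layers.append(units)
    activations.append(activation)


layers, activations = [], []
add_layer_to(layers, activations, 12, input_dim=8, activation="relu")
add_layer_to(layers, activations, 8, activation="relu")
add_layer_to(layers, activations, 1, activation="sigmoid")

print(layers)       # [8, 12, 8, 1]
print(activations)  # ['relu', 'relu', 'sigmoid']
```

This is why the predict method later computes the number of layers as len(self.layers) - 1.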
Compile Function
This function sets the training hyperparameters (initial learning rate, decay rate and loss function) before training starts in the next step.
def compile(self, learning_rate=0.01, decay_rate=0.001, loss="entropy"):
    self.alpha_0 = learning_rate
    self.loss_fn = loss
    self.decay_rate = decay_rate
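The post does not show how alpha_0 and decay_rate are combined during training. One common choice is inverse time decay, so the sketch below assumes that schedule. The function name decayed_learning_rate is hypothetical, and the actual Vanilla code may use a different formula.

```python
def decayed_learning_rate(alpha_0, decay_rate, iteration):
    """Inverse time decay (an assumed schedule, not necessarily Vanilla's)."""
    return alpha_0 / (1.0 + decay_rate * iteration)


# With the default values from compile(), the step size shrinks slowly.
for iteration in (0, 1000, 10000):
    print(iteration, decayed_learning_rate(0.01, 0.001, iteration))
```

Under this assumption the learning rate starts at alpha_0 and halves after 1 / decay_rate iterations.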
Training the network
The fit method lies at the heart of this implementation. Although the full method does much more, we will look at its most important part: the forward pass, backpropagation, and the update of the weights and biases.
# Excerpt from fit(): a, no_of_layers, alpha, regularize and lambda_
# are defined earlier in the method (not shown here).
# Forward Pass
for i in range(no_of_layers):
    a.append(self.forward(
        a[i], self.dict['W' + str(i+1)], self.dict['b' + str(i+1)],
        activation=self.activations[i]))
dz = []
dW = []
db = []
dz.append(y_train)
# Backpropagation of Gradients (from the output layer back to the first layer)
for i in range(no_of_layers, 0, -1):
    if i == no_of_layers:
        [dzz, dWW, dbb] = self.backward(
            a[i-1], a[i], [], dz[no_of_layers-i],
            activation=self.activations[i-1], output_layer=True)
    else:
        [dzz, dWW, dbb] = self.backward(
            a[i-1], a[i], self.dict['W' + str(i+1)], dz[no_of_layers-i],
            activation=self.activations[i-1])
    dz.append(dzz)
    dW.append(dWW)
    db.append(dbb)
# Update Weights and Biases (dW and db are stored in reverse layer order)
for i in range(1, no_of_layers + 1):
    if regularize:
        dW[no_of_layers - i] += self.l1_reg(self.dict['W' + str(i)], lambda_)
    self.dict['W' + str(i)] -= (alpha * dW[no_of_layers-i])
    self.dict['b' + str(i)] -= (alpha * db[no_of_layers-i])
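The forward, backward and l1_reg helpers are not shown in this post. As a rough, self-contained sketch of the same three stages (forward pass, gradients, update), here is a single sigmoid layer trained with plain gradient descent. It assumes that each row of X is one sample and that W has shape (inputs, units). The helper names and the form of the L1 term are my assumptions, not the Vanilla implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(a_prev, W, b):
    """One dense layer: affine transform followed by a sigmoid."""
    return sigmoid(a_prev @ W + b)


def l1_penalty_grad(W, lambda_):
    """Subgradient of lambda_ * sum(|W|), one possible form of an L1 term."""
    return lambda_ * np.sign(W)


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                          # 5 samples, 3 features
y = rng.integers(0, 2, size=(5, 1)).astype(float)    # binary targets

# Parameters keyed like self.dict; zero init is fine for a single layer.
params = {"W1": np.zeros((3, 1)), "b1": np.zeros((1, 1))}

alpha, lambda_ = 0.1, 0.01
losses = []
for _ in range(200):
    # Forward pass
    y_hat = forward(X, params["W1"], params["b1"])
    y_clip = np.clip(y_hat, 1e-11, 1 - 1e-11)
    losses.append(-np.mean(y * np.log(y_clip) + (1 - y) * np.log(1 - y_clip)))
    # Gradients: for a sigmoid output with logistic loss, dL/dz = (y_hat - y) / m
    dz = (y_hat - y) / X.shape[0]
    dW = X.T @ dz + l1_penalty_grad(params["W1"], lambda_)
    db = dz.sum(axis=0, keepdims=True)
    # Update weights and biases
    params["W1"] -= alpha * dW
    params["b1"] -= alpha * db

print(losses[0], losses[-1])
```

A multi-layer network repeats the same pattern per layer, which is what the loops over no_of_layers do in the excerpt from fit.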
Activation Functions
Several activation functions (relu, sigmoid, softmax, tanh) are defined in the original Vanilla class. The sigmoid function and its derivative are shown below.
def sigmoid(self, x):
    return (1.0/(1.0+np.exp(-x)))

def d_sigmoid(self, x):
    a = self.sigmoid(x)
    return a * (1 - a)
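A quick way to convince yourself that the derivative is right is to compare it with a numerical estimate. The sketch below rewrites the two methods as standalone functions (so it runs without the class) and checks them against a central finite difference. The step size h and the test points are arbitrary choices.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def d_sigmoid(x):
    a = sigmoid(x)
    return a * (1 - a)


x = np.linspace(-5.0, 5.0, 11)
h = 1e-5
# Central difference approximation of the derivative
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
max_error = np.max(np.abs(numeric - d_sigmoid(x)))
print(max_error)
```

The same check works for any other activation function, as long as it is differentiable at the test points.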
Loss Functions
The Vanilla class contains three loss functions: cross entropy, logistic loss and mean squared error (MSE). An example implementation of cross entropy is shown below.
def cross_entropy_loss(self, y_train, y_hat, epsilon=1e-11):
    m = y_train.shape[0]
    n = y_train.shape[1]
    y_hat_clip = np.clip(y_hat, epsilon, 1 - epsilon)
    result = ((-1.0 / (m)) * np.sum(np.sum(y_train *
                                           np.log(y_hat_clip), axis=1), axis=0))
    if (np.any(np.isnan(result))):
        print("Error in Cross Entropy")
        exit()
    return result
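As a small worked example, the sketch below evaluates the same formula on two samples with one-hot labels. It is a standalone simplification of the method above (no class, no NaN check), and the label and probability values are made up for illustration.

```python
import numpy as np


def cross_entropy_loss(y_train, y_hat, epsilon=1e-11):
    m = y_train.shape[0]                               # number of samples
    y_hat_clip = np.clip(y_hat, epsilon, 1 - epsilon)  # avoid log(0)
    return (-1.0 / m) * np.sum(y_train * np.log(y_hat_clip))


y_train = np.array([[1.0, 0.0], [0.0, 1.0]])   # one-hot labels
y_hat = np.array([[0.9, 0.1], [0.2, 0.8]])     # predicted probabilities

loss = cross_entropy_loss(y_train, y_hat)
# Only the probability assigned to the true class contributes.
expected = -(np.log(0.9) + np.log(0.8)) / 2
print(loss, expected)
```

Both values come out at about 0.164. The clipping matters only when a predicted probability is exactly 0 or 1, where the logarithm would otherwise be infinite.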
Predicting the output
The predict method takes the input data (X) and performs a forward pass through the network in order to get prediction values (y).
def predict(self, X):
    no_of_layers = len(self.layers) - 1
    a = []
    a.append(X)
    for i in range(no_of_layers):
        a.append(self.forward(a[i], self.dict['W' + str(i+1)],
                              self.dict['b' + str(i+1)], activation=self.activations[i]))
    return a[-1]
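For classification, the values returned by predict are activations of the output layer, so they usually still need to be turned into class labels. The helper below is a hypothetical sketch of that last step and is not part of Vanilla. It assumes one row per sample, with either a single sigmoid column or one column per class.

```python
import numpy as np


def to_labels(y_prob, threshold=0.5):
    """Turn predicted probabilities into class labels (illustrative helper)."""
    y_prob = np.asarray(y_prob)
    if y_prob.ndim == 2 and y_prob.shape[1] > 1:
        # One column per class: pick the most probable class.
        return np.argmax(y_prob, axis=1)
    # Single sigmoid output: apply a threshold.
    return (y_prob.ravel() >= threshold).astype(int)


binary = to_labels([[0.2], [0.7], [0.5]])
multi = to_labels([[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]])
print(binary)  # [0 1 1]
print(multi)   # [1 0]
```

For regression (problem_type "r"), the output of predict can be used directly.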
This is a developing post and more details will be added soon. If you found this post helpful, follow me and share this post!
Github Link: Vanilla Neural Network Repository
Cheers.