PyTorch made simple
Creating tensors
A tensor is a multidimensional array that we can create and upload to the GPU. There are various types of tensors, in particular FloatTensor (32-bit float), ByteTensor (8-bit integer), and LongTensor (64-bit integer). You can also create NumPy arrays and convert them into tensors.
import torch
import numpy as np
# Create a float tensor
a = torch.FloatTensor([2,3])
print(a)
# Zero out the tensor in place
a.zero_()
# Create an array in NumPy and put it into a tensor
n = np.zeros((3, 2))
b = torch.tensor(n)
Tensors on the GPU
It is fairly straightforward: just create the tensors and send them to the GPU device. Use the .to(device) method to make a copy on that device, or, if you are sure you are sending them to CUDA, use .cuda().
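A minimal sketch of what that looks like, assuming a CUDA-capable GPU may or may not be available (the code falls back to the CPU otherwise):

import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor on the CPU and copy it to the chosen device
a = torch.FloatTensor([2, 3])
a_gpu = a.to(device)

# Equivalent shortcut when you know CUDA is available:
# a_gpu = a.cuda()

print(a_gpu.device)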
Adding gradients
Each created tensor has several attributes related to gradients:
- grad: Holds a tensor of the same shape containing the computed gradients.
- is_leaf: True if the tensor was constructed by the user, False if it is the result of an operation.
- requires_grad: Indicates whether gradients need to be calculated for this tensor. By default, the constructor sets it to False.
To make this clear, let's look at the following example, which builds a small computation graph. To calculate all the gradients in the graph, call the .backward() method on the result tensor.
import torch
# Define the tensors
v1 = torch.tensor([1.0, 1.0], requires_grad=True)
v2 = torch.tensor([2.0, 2.0])
# Create the graph
v_sum = v1 + v2
v_res = (v_sum * 2).sum()
# Should be True: requires_grad propagates from v1 to the results of operations
print(v_res.requires_grad)
# Compute the gradients
v_res.backward()
print(v1.grad)
Creating NNs in PyTorch
The torch.nn package provides plenty of predefined classes covering the basic functionality of neural networks. In the code below we use nn.Linear(2, 5) to construct a layer with 2 inputs and 5 outputs, with all the weights properly initialized. Some other useful methods include:
- .parameters(): Returns an iterator over the network's parameters (weights and biases).
- .zero_grad(): Sets the gradients of all parameters to zero.
- .to(device): Sends the network to the given device, e.g. CUDA.
- .state_dict(): Retrieves the state dictionary of the model.
- .load_state_dict(): Loads a previously saved state dictionary. Together, these two are useful for saving and loading different neural network states.
import torch.nn as nn
import torch
# Create a sample tensor
v = torch.FloatTensor([1,2])
# Create a NN with one layer of 2 inputs and 5 outputs
layer = nn.Linear(2,5)
# Pass the tensor to the layer and get the output
print(layer(v))
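To illustrate the helper methods listed above, here is a minimal sketch that reuses the layer from the previous snippet:

# Inspect the layer's parameters (weight matrix and bias vector)
for p in layer.parameters():
    print(p.shape)

# The state dictionary maps parameter names to tensors
print(layer.state_dict().keys())

# Reset any accumulated gradients to zero
layer.zero_grad()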
The Sequential class
It allows us to combine several layers into a single call. Here we create a three-layer neural network with ReLU activations, dropout, and a softmax output.
s = nn.Sequential(
    nn.Linear(2, 5),
    nn.ReLU(),
    nn.Linear(5, 20),
    nn.ReLU(),
    nn.Linear(20, 10),
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Softmax(dim=1)
)
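The resulting object can be called like any other module. Note that Softmax(dim=1) expects a batch dimension, so in this quick illustration we pass a 1x2 input:

out = s(torch.FloatTensor([[1, 2]]))
print(out)         # a 1x10 tensor of probabilities
print(out.sum())   # sums to 1 thanks to the softmax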
Loss Functions
Loss functions define our training objective. They evaluate how well or poorly our model is performing, or in simple terms, how close the network's prediction is to the desired result. We have a variety of loss functions to choose from, all included in the nn module:
- nn.MSELoss: Mean squared error.
- nn.BCELoss: Binary cross-entropy, used in binary classification.
- nn.CrossEntropyLoss: The widely used maximum likelihood criterion for multi-class classification.
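As a quick sketch, a loss is just a callable that compares predictions with targets and returns a scalar tensor; the values below are made up for illustration:

import torch
import torch.nn as nn

predictions = torch.tensor([0.9, 0.2, 0.8])  # hypothetical model outputs in (0, 1)
targets = torch.tensor([1.0, 0.0, 1.0])      # desired labels

bce = nn.BCELoss()
loss = bce(predictions, targets)
print(loss.item())  # a single scalar value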
Optimizers
Optimizers adjust the model parameters using the gradients of the loss function with respect to those parameters, in order to minimize it. They are all part of the torch.optim package. The main optimizers include:
- SGD: Stochastic gradient descent.
- RMSprop: An adaptive optimizer that scales the learning rate by a running average of squared gradients.
- Adagrad: An adaptive optimizer that adjusts the learning rate per parameter.
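A minimal sketch of how an optimizer is constructed and used for a single update step; the tiny model and random data here are stand-ins for whatever you are actually training:

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)                             # stand-in model
optimizer = optim.SGD(model.parameters(), lr=0.01)  # could also be optim.RMSprop or optim.Adagrad

x = torch.rand(4, 2)
target = torch.rand(4, 1)
loss = nn.MSELoss()(model(x), target)

optimizer.zero_grad()  # clear old gradients
loss.backward()        # compute new gradients
optimizer.step()       # update the parameters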
A sample training loop
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
# Define the dataset
def generate_data(num_samples=100, num_features=2):
    # Randomly generate input data and labels
    X = np.random.rand(num_samples, num_features)
    y = (np.sum(X, axis=1) > 1).astype(np.float32)  # Binary classification: sum > 1
    return torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)
# Define the neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.pipe = nn.Sequential(
            nn.Linear(2, 5),
            nn.ReLU(),
            nn.Linear(5, 1),
            nn.Sigmoid()  # Output layer for binary classification
        )

    def forward(self, x):
        return self.pipe(x)
# Initialize data, model, loss function, and optimizer
X, y = generate_data()
# Create the NN from the class from above
model = SimpleNN()
# Binary Cross-Entropy Loss
loss_function = nn.BCELoss()
# Create the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)
# Training loop
num_epochs = 20
batch_size = 10
num_batches = len(X) // batch_size
for epoch in range(num_epochs):
    epoch_loss = 0.0
    for i in range(num_batches):
        # Get batch data
        batch_start = i * batch_size
        batch_end = batch_start + batch_size
        batch_X = X[batch_start:batch_end]
        batch_y = y[batch_start:batch_end]
        # Forward pass
        outputs = model(batch_X)
        loss = loss_function(outputs.squeeze(), batch_y)
        # Backward pass
        optimizer.zero_grad()  # Reset gradients
        loss.backward()        # Compute gradients
        optimizer.step()       # Update parameters
        # Accumulate batch loss
        epoch_loss += loss.item()
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss/num_batches:.4f}")
# Save the model
torch.save(model.state_dict(), "simple_nn_model.pth")
# Load the model (optional)
model.load_state_dict(torch.load("simple_nn_model.pth"))
model.eval() # Switch to evaluation mode
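Once the model is back in evaluation mode, inference is just a forward pass with gradient tracking disabled. A quick sketch, reusing the synthetic data generator defined above:

# Run inference without tracking gradients
with torch.no_grad():
    test_X, test_y = generate_data(num_samples=10)
    predictions = (model(test_X).squeeze() > 0.5).float()  # threshold the sigmoid output
    accuracy = (predictions == test_y).float().mean()
    print(f"Accuracy on the synthetic test set: {accuracy.item():.2f}")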