7 Multiple Dimension Input

Diabetes Dataset


Multiple Dimension Logistic Regression Model

n denotes the feature index.

i denotes the sample index.

The product of x and w is a scalar, and the transpose of a scalar is itself.
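Written out, the model for sample i with 8 features is

\hat{y}^{(i)} = \sigma\left(\sum_{n=1}^{8} x_n^{(i)} w_n + b\right) = \sigma\left(\boldsymbol{w}^{\top} \boldsymbol{x}^{(i)} + b\right)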


Mini-Batch (N samples)

element-wise: the sigmoid is applied to each element in turn, so it can operate on vectors.

Converting the computation into matrix operations allows the GPU to parallelize it.
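Stacking the N samples as the rows of a matrix turns the N per-sample computations into one matrix operation:

\hat{Y} = \sigma(XW + b), \qquad X \in \mathbb{R}^{N \times 8},\; W \in \mathbb{R}^{8 \times 1},\; \hat{Y} \in \mathbb{R}^{N \times 1}

where σ is applied element-wise and the bias b is broadcast to every row.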

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Input dimension 8, output dimension 1
        self.linear = torch.nn.Linear(8, 1)
        # Sigmoid has no parameters, so one shared module is enough
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.linear(x))
        return x

model = Model()

Linear Layer

self.linear = torch.nn.Linear(8, 1)

With the output dimension set to 2, as in the snippet below, the 8-dimensional space is reduced to 2 dimensions: several of the original eight components are combined, through the weight matrix, into each value of the 2-dimensional output.

self.linear = torch.nn.Linear(8, 2)

A matrix can be viewed as a function that transforms one space into another.

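A quick sketch of what nn.Linear stores and does (illustrative, not from the slides):

import torch

linear = torch.nn.Linear(8, 2)
# nn.Linear keeps a (2, 8) weight matrix A and a (2,) bias b,
# and computes y = x A^T + b, mapping 8 dimensions down to 2.
print(linear.weight.shape)  # torch.Size([2, 8])
print(linear.bias.shape)    # torch.Size([2])
x = torch.randn(4, 8)       # a mini-batch of 4 samples
print(linear(x).shape)      # torch.Size([4, 2])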

Neural Network

Several successive changes of dimension, chained layer by layer.


Example: Artificial Neural Network


Example: Diabetes Prediction

Y indicates whether the condition will worsen.

import numpy as np
import torch

# Comma-separated values, loaded as float32
xy = np.loadtxt('diabetes.csv.gz', delimiter=',', dtype=np.float32)
# Convert slices of the NumPy array into PyTorch Tensor objects.
# All rows, all columns except the last one
x_data = torch.from_numpy(xy[:, :-1])
# Take the last column; the brackets keep it a matrix (N, 1) rather than a vector
y_data = torch.from_numpy(xy[:, [-1]])

Meaning of Python slicing expressions such as x[:], x[:-1], x[:, :] and x[:, -1]
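A minimal NumPy sketch of the difference between these slices:

import numpy as np

a = np.arange(12).reshape(3, 4)
print(a[:, :-1].shape)   # (3, 3) -- all rows, drop the last column
print(a[:, -1].shape)    # (3,)   -- last column as a 1-D vector
print(a[:, [-1]].shape)  # (3, 1) -- last column kept as a matrix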


Define Model

import torch

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear1 = torch.nn.Linear(8, 6)
        self.linear2 = torch.nn.Linear(6, 4)
        self.linear3 = torch.nn.Linear(4, 1)
        # Sigmoid has no parameters, so it only needs to be defined once
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.linear1(x))
        x = self.sigmoid(self.linear2(x))
        x = self.sigmoid(self.linear3(x))
        return x

model = Model()

Construct Loss and Optimizer

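A sketch consistent with the earlier binary-classification chapters, assuming BCE loss and plain SGD:

criterion = torch.nn.BCELoss(reduction='mean')
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)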

Training Cycle

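A sketch of the (full-batch) training cycle, following the forward/backward/update pattern used throughout these notes:

for epoch in range(100):
    # Forward
    y_pred = model(x_data)
    loss = criterion(y_pred, y_data)
    print(epoch, loss.item())
    # Backward
    optimizer.zero_grad()
    loss.backward()
    # Update
    optimizer.step()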

Using ReLU as the activation function

ReLU outputs lie in [0, +∞) and can be exactly 0, so the output layer still uses sigmoid when computing y_hat; otherwise the logarithm in the BCE loss could receive 0.

import torch

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear1 = torch.nn.Linear(8, 6)
        self.linear2 = torch.nn.Linear(6, 4)
        self.linear3 = torch.nn.Linear(4, 1)
        self.activate = torch.nn.ReLU()
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.activate(self.linear1(x))
        x = self.activate(self.linear2(x))
        # Use sigmoid on the output layer so y_hat lies in (0, 1)
        # and the BCE loss never takes log(0)
        x = self.sigmoid(self.linear3(x))
        return x

model = Model()

8 Dataset and DataLoader

Terminology: Epoch, Batch-Size, Iterations

# Training cycle
for epoch in range(training_epochs):
    # Loop over all batches
    for i in range(total_batch):
        ...

Definition: Epoch

One forward pass and one backward pass of all the training examples.

Definition: Batch-Size

The number of training examples in one forward/backward pass.

Definition: Iteration

The number of passes, each pass using [batch-size] examples; in other words, how many batches are drawn from the dataset.

Iterations × Batch-Size = Number of Samples. For example, 10,000 samples with a batch size of 1,000 gives 10 iterations per epoch.

DataLoader: batch_size = 2, shuffle = True

shuffle: randomizes the sample order so each epoch produces different batches.


How to define your Dataset

Dataset is an abstract class and cannot be instantiated; inherit from it to create your own dataset class. DataLoader is a concrete class and can be instantiated directly.

import torch
# Dataset is an abstract class. We can define our class inherited from this class.
from torch.utils.data import Dataset
# DataLoader is a class to help us load data in PyTorch.
from torch.utils.data import DataLoader

# DiabetesDataset is inherited from abstract class Dataset.
class DiabetesDataset(Dataset):
    def __init__(self):
        pass

    # The expression dataset[index] will call this magic function.
    def __getitem__(self, index):
        pass

    # This magic function returns the length of the dataset.
    def __len__(self):
        pass

# Construct DiabetesDataset object
dataset = DiabetesDataset()
# Initialize loader with batch-size, shuffle, and number of worker processes.
train_loader = DataLoader(dataset=dataset, batch_size=32, shuffle=True, num_workers=2)

Extra: num_workers in Windows

train_loader = DataLoader(dataset=dataset, batch_size=32, shuffle=True, num_workers=2)
......
# enumerate() wraps an iterable (e.g. a list, tuple, or string) into an indexed
# sequence, yielding both the index and the item; it is typically used in for loops.
for epoch in range(100):
    for i, data in enumerate(train_loader, 0):
        ......

The implementation of multiprocessing is different on Windows, which uses spawn() instead of fork().


So we have to wrap the code in an if-clause to protect it from executing multiple times.

if __name__ == '__main__':
    for epoch in range(100):
        for i, data in enumerate(train_loader, 0):
            # 1. Prepare data
            ...

Example: Diabetes Dataset

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class DiabetesDataset(Dataset):
    def __init__(self, filepath):
        xy = np.loadtxt(filepath, delimiter=',', dtype=np.float32)
        # xy has N rows and 9 columns, so shape is (N, 9);
        # shape[0] is N, the number of samples
        self.len = xy.shape[0]
        # Load everything into memory so it can be indexed later
        self.x_data = torch.from_numpy(xy[:, :-1])
        self.y_data = torch.from_numpy(xy[:, [-1]])

    def __getitem__(self, index):
        # Returning x, y returns the tuple (x, y)
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len

dataset = DiabetesDataset('diabetes.csv.gz')
train_loader = DataLoader(dataset=dataset, batch_size=32, shuffle=True, num_workers=2)

Using DataLoader

for epoch in range(100):
    # data holds the tuple (x, y)
    for i, data in enumerate(train_loader, 0):
        # 1. Prepare data
        # DataLoader automatically converts data to tensors
        inputs, labels = data
        # 2. Forward
        y_pred = model(inputs)
        loss = criterion(y_pred, labels)
        print(epoch, i, loss.item())
        # 3. Backward
        optimizer.zero_grad()
        loss.backward()
        # 4. Update
        optimizer.step()

Overall process


Available dataset loaders


Example: MNIST Dataset

The training set should be shuffled; the test set does not need shuffling, which makes it easier to observe and record results.

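A sketch of the loaders, assuming torchvision's built-in MNIST dataset:

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

train_dataset = datasets.MNIST(root='../dataset/mnist', train=True,
                               transform=transforms.ToTensor(), download=True)
test_dataset = datasets.MNIST(root='../dataset/mnist', train=False,
                              transform=transforms.ToTensor(), download=True)
# Shuffle the training set; keep the test set in a fixed order
train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=32, shuffle=False)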

9 Softmax Classifier

Design 10 outputs using Sigmoid?

Ten independent sigmoid outputs do not satisfy the requirements of a distribution for a 10-class problem: the probabilities must sum to 1.

We hope the outputs are competitive; actually, we hope the neural network outputs a distribution.

Output a Distribution of prediction with Softmax


Softmax Layer

The exponential is used because its value range is strictly positive.
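For K outputs z_0, …, z_{K-1}, the softmax layer computes

P(y = i) = \frac{e^{z_i}}{\sum_{j=0}^{K-1} e^{z_j}}, \qquad i \in \{0, 1, \dots, K-1\}

so every output is positive and the outputs sum to 1, i.e. they form a distribution.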


Example
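For z = (0.2, 0.1, -0.1): e^z ≈ (1.221, 1.105, 0.905), whose sum is ≈ 3.231, giving ŷ ≈ (0.378, 0.342, 0.280).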


Loss function - Cross Entropy

NLLLoss (Negative Log Likelihood Loss)
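For a one-hot label Y and predicted distribution Ŷ, the cross-entropy loss is

\text{Loss}(\hat{Y}, Y) = -Y \log \hat{Y} = -\sum_{i} Y_i \log \hat{Y}_i

torch.nn.CrossEntropyLoss combines LogSoftmax and NLLLoss in a single module, so the network should feed it raw scores (logits), not softmax outputs.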


Cross Entropy in NumPy

import numpy as np

# One-hot label: the true class is 0
y = np.array([1, 0, 0])
# Raw scores (logits) from the network
z = np.array([0.2, 0.1, -0.1])
# Softmax turns the scores into a distribution
y_pred = np.exp(z) / np.exp(z).sum()
# Cross-entropy: -sum(y * log(y_pred))
loss = (-y * np.log(y_pred)).sum()
print(loss)

Cross Entropy in PyTorch

import torch

# Class index as a long (int64) tensor
y = torch.LongTensor([0])
# One sample with three class scores; note the outer brackets:
# CrossEntropyLoss expects shape (N, C), here (1, 3)
z = torch.Tensor([[0.2, 0.1, -0.1]])
criterion = torch.nn.CrossEntropyLoss()
loss = criterion(z, y)
print(loss)

Mini-Batch: batch_size = 3

import torch

criterion = torch.nn.CrossEntropyLoss()
Y = torch.LongTensor([2, 0, 1])
Y_pred1 = torch.Tensor([[0.1, 0.2, 0.9],
                        [1.1, 0.1, 0.2],
                        [0.2, 2.1, 0.1]])
Y_pred2 = torch.Tensor([[0.8, 0.2, 0.3],
                        [0.2, 0.3, 0.5],
                        [0.2, 0.2, 0.5]])
l1 = criterion(Y_pred1, Y)
l2 = criterion(Y_pred2, Y)
print("Batch Loss1 = ", l1.data, "\nBatch Loss2 = ", l2.data)

Each row is one sample, and each column in that row holds the score for one class; the class whose score is largest is the predicted class.

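A minimal sketch of reading off the predicted class per row with torch.max:

import torch

scores = torch.Tensor([[0.1, 0.2, 0.9],
                       [1.1, 0.1, 0.2]])
values, predicted = torch.max(scores, dim=1)
print(predicted)  # tensor([2, 0]): the index of the largest score in each row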

Back to MNIST Dataset


Implementation of classifier to MNIST dataset


Import Package

import torch
# For constructing DataLoader
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
# For using function relu()
import torch.nn.functional as F
# For constructing Optimizer
import torch.optim as optim

Prepare Dataset

batch_size = 64
transform = transforms.Compose([
    # Convert the PIL Image to a Tensor
    transforms.ToTensor(),
    # Normalization: the parameters are the mean and standard deviation.
    # Normalizing the input images to roughly zero mean and unit variance
    # makes training more stable and converge faster.
    # A single value per tuple applies to the single (grayscale) channel;
    # e.g. a mean of (0.1307, 0.5, 0.2) would subtract 0.1307 from the first
    # channel, 0.5 from the second, and 0.2 from the third of an RGB image.
    transforms.Normalize((0.1307,), (0.3081,))
])
train_dataset = datasets.MNIST(root='../dataset/mnist/', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)

test_dataset = datasets.MNIST(root='../dataset/mnist/', train=False, download=True, transform=transform)
test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)

Image tensors

PyTorch needs images in channels-first layout, converting (W, H, C) format to (C, W, H).

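A minimal sketch of the conversion, assuming a single-channel 28×28 image stored as an (H, W, C) uint8 NumPy array:

import numpy as np
from torchvision import transforms

img = np.random.randint(0, 256, size=(28, 28, 1), dtype=np.uint8)  # (H, W, C)
tensor = transforms.ToTensor()(img)
print(tensor.shape)  # torch.Size([1, 28, 28]): channels first, values scaled to [0, 1]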

Normalization


Design Model

# Reshape into a 2-D tensor: (N, 1, 28, 28) -> (N, 784)
x = x.view(-1, 784)
# Linear layers, shrinking 784 -> 512 -> 256 -> 128 -> 64 -> 10
self.l1 = torch.nn.Linear(784, 512)
x = F.relu(self.l1(x))
self.l2 = torch.nn.Linear(512, 256)
x = F.relu(self.l2(x))
self.l3 = torch.nn.Linear(256, 128)
x = F.relu(self.l3(x))
self.l4 = torch.nn.Linear(128, 64)
x = F.relu(self.l4(x))
self.l5 = torch.nn.Linear(64, 10)
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.l1 = torch.nn.Linear(784, 512)
        self.l2 = torch.nn.Linear(512, 256)
        self.l3 = torch.nn.Linear(256, 128)
        self.l4 = torch.nn.Linear(128, 64)
        self.l5 = torch.nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))
        x = F.relu(self.l3(x))
        x = F.relu(self.l4(x))
        # No activation on the last layer: CrossEntropyLoss applies softmax itself
        return self.l5(x)

model = Net()

Construct Loss and Optimizer

criterion = torch.nn.CrossEntropyLoss()
# momentum takes a value in [0, 1]. When it is small (near 0), past gradients
# have little influence and each update relies mostly on the current gradient;
# when it is large (near 1), past gradients dominate and the updates keep more
# of their previous direction. Momentum can speed up training, help escape
# local minima, and damp oscillations in the parameter updates; it is
# especially useful on loss surfaces with high curvature or ravine-like shapes.
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

Train and Test

def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        optimizer.zero_grad()
        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        # Print the average loss every 300 batches
        if batch_idx % 300 == 299:
            # % formats the string, filling the placeholders with the values
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0
def test():
    correct = 0
    total = 0
    # No gradients are computed inside this block
    with torch.no_grad():
        for data in test_loader:
            images, labels = data
            outputs = model(images)
            # outputs is a matrix of class scores; dim=1 takes the max along
            # each row. torch.max returns (values, indices); the values are
            # ignored and predicted holds the index of the maximum, i.e. the
            # predicted class.
            _, predicted = torch.max(outputs.data, dim=1)
            # labels.size(0) is the number of samples in this batch;
            # total accumulates the number of test samples seen so far.
            total += labels.size(0)
            # (predicted == labels) compares the tensors element-wise and
            # returns a bool tensor with True where they match; .sum() counts
            # the True entries and .item() converts the count to a Python scalar.
            correct += (predicted == labels).sum().item()
    # %% prints a literal percent sign
    print('Accuracy on test set: %d %%' % (100 * correct / total))

if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()

10 Basic CNN

A linear layer is also called a fully connected layer.

Convolutional Neural Network

Preserves spatial information.

Subsampling (downsampling)

The number of channels stays the same while the width and height shrink, reducing the amount of data and the computational load.


Convolution


Single Input Channel


3 Input Channels

The number of input channels must match the number of channels of the convolution kernel.


N Input Channels


N Input Channels and M Output Channels

The number of convolution kernels determines the number of channels of the output feature map.


Convolutional Layer

import torch

in_channels, out_channels = 5, 10
width, height = 100, 100
kernel_size = 3
batch_size = 1
# Random input sampled from a standard normal distribution
input = torch.randn(batch_size, in_channels, width, height)
conv_layer = torch.nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size)
output = conv_layer(input)
print(input.shape)   # torch.Size([1, 5, 100, 100])
print(output.shape)  # torch.Size([1, 10, 98, 98])
# 10 output channels, 5 input channels, 3x3 kernel
print(conv_layer.weight.shape)  # torch.Size([10, 5, 3, 3])

padding = 1

Typically a 3×3 kernel is padded with one ring, a 5×5 kernel with two rings, to keep the output the same size as the input.

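The standard formula for the output size, with input width W_in, kernel size K, padding P, and stride S:

W_{\text{out}} = \left\lfloor \frac{W_{\text{in}} + 2P - K}{S} \right\rfloor + 1

For the 5×5 input below, a 3×3 kernel with padding 1 and stride 1 gives (5 + 2 - 3)/1 + 1 = 5, so the output stays 5×5; with stride 2 and no padding (next example) it becomes (5 - 3)/2 + 1 = 2, a 2×2 output.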
import torch

input = [3, 4, 6, 5, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
# B, C, W, H
input = torch.Tensor(input).view(1, 1, 5, 5)
# padding=1 pads one ring of zeros around the input
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
# Kernel shape: output channels, input channels, width, height
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output)

stride = 2

import torch

input = [3, 4, 6, 5, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
# B, C, W, H
input = torch.Tensor(input).view(1, 1, 5, 5)
# stride=2 moves the kernel two cells at a time
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, bias=False)
# Kernel shape: output channels, input channels, width, height
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output)

Max Pooling Layer

The number of channels stays the same.

import torch

input = [3, 4, 6, 5,
         2, 4, 6, 8,
         1, 6, 7, 8,
         9, 7, 4, 6]
input = torch.Tensor(input).view(1, 1, 4, 4)
# kernel_size=2 implies stride=2, so each 2x2 block is reduced to its maximum
maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)
output = maxpooling_layer(input)
print(output)

A Simple Convolutional Neural Network

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        # 20 channels x 4 x 4 = 320 features after the two conv/pool stages
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        # Flatten data from (n, 1, 28, 28) to (n, 320)
        batch_size = x.size(0)  # number of samples
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)  # flatten
        x = self.fc(x)
        return x

model = Net()

How to use GPU

Move Model to GPU

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        # Flatten data from (n, 1, 28, 28) to (n, 320)
        batch_size = x.size(0)  # number of samples
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)  # flatten
        x = self.fc(x)
        return x

model = Net()
# Define device as the first visible cuda device if we have CUDA available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Convert parameters and buffers of all modules to CUDA Tensor.
model.to(device)

Move Tensors to GPU

def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        # Send the inputs and targets at every step to the GPU.
        inputs, target = inputs.to(device), target.to(device)
        optimizer.zero_grad()
        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0
def test():
    correct = 0
    total = 0
    with torch.no_grad():
        for data in test_loader:
            images, labels = data
            # Send the inputs and targets at every step to the GPU.
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy on test set: %d %%' % (100 * correct / total))

Results
