Set the default tensor type (and therefore the default floating-point dtype): torch.set_default_tensor_type(torch.DoubleTensor)
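A minimal sketch of the effect (the tensor values are arbitrary examples): after the call, newly created floating-point tensors default to float64.

import torch

torch.set_default_tensor_type(torch.DoubleTensor)
# newer PyTorch versions also offer torch.set_default_dtype(torch.float64)
x = torch.tensor([1.0, 2.0])      # floating-point literals, so the default applies
print(x.dtype)                    # torch.float64
print(torch.get_default_dtype())  # torch.float64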
PyTorch has eight tensor types:

Data type | dtype | Tensor type
---|---|---
32-bit floating point | torch.float32 or torch.float | torch.*.FloatTensor
64-bit floating point | torch.float64 or torch.double | torch.*.DoubleTensor
16-bit floating point | torch.float16 or torch.half | torch.*.HalfTensor
8-bit integer (unsigned) | torch.uint8 | torch.*.ByteTensor
8-bit integer (signed) | torch.int8 | torch.*.CharTensor
16-bit integer (signed) | torch.int16 or torch.short | torch.*.ShortTensor
32-bit integer (signed) | torch.int32 or torch.int | torch.*.IntTensor
64-bit integer (signed) | torch.int64 or torch.long | torch.*.LongTensor
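A small sketch of how a tensor's dtype maps onto these Tensor types (the values are arbitrary):

import torch

x = torch.tensor([1, 2, 3], dtype=torch.int16)
print(x.dtype)    # torch.int16
print(x.type())   # 'torch.ShortTensor' ('torch.cuda.ShortTensor' if x lives on a GPU)
y = x.float()     # convert to torch.float32 / torch.FloatTensor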
Save the model:
def save_checkpoint(model, optimizer, scheduler, save_path):
    # If there are other variables you want to save, add them to this dict
    torch.save({
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'scheduler_state_dict': scheduler.state_dict(),
    }, save_path)

# Load the model
checkpoint = torch.load(pretrain_model_path)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
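A hedged usage sketch; the file name checkpoint.pth is a placeholder, not from the original:

# Save during or after training
save_checkpoint(model, optimizer, scheduler, 'checkpoint.pth')

# When loading on a different device, torch.load accepts a map_location argument
checkpoint = torch.load('checkpoint.pth', map_location='cpu')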
Print the model's gradients:

# Print the gradient of every named parameter
for name, parameters in model.named_parameters():
    print('{}\'s grad is:\n{}\n'.format(name, parameters.grad))
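Note that parameters.grad is None until a backward pass has run; a minimal sketch, where criterion, inputs, and targets are assumed placeholders:

loss = criterion(model(inputs), targets)
loss.backward()
# parameters.grad is now populated for every parameter that received a gradient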
Using a learning-rate decay strategy:

# Exponential decay
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
# Step decay
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
# Decay at custom milestones
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[400], gamma=0.5)
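The scheduler only changes the learning rate when scheduler.step() is called; a hedged training-loop sketch, where num_epochs, dataloader, and criterion are assumed placeholders:

for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch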
Gradient clipping:

def clip_gradient(optimizer, grad_clip):
    """Clips gradients computed during backpropagation to avoid exploding gradients.

    :param optimizer: optimizer holding the gradients to be clipped
    :param grad_clip: clip value
    """
    for group in optimizer.param_groups:
        for param in group["params"]:
            if param.grad is not None:
                param.grad.data.clamp_(-grad_clip, grad_clip)
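A hedged usage sketch (the threshold 5.0 is an arbitrary example); PyTorch also provides torch.nn.utils.clip_grad_norm_ for norm-based clipping:

loss.backward()
clip_gradient(optimizer, grad_clip=5.0)   # element-wise value clipping, as defined above
# Alternative: clip by global gradient norm (max_norm value is illustrative)
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()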
Example of a custom activation function:

class OutExp(nn.Module):
    def __init__(self):
        super(OutExp, self).__init__()

    def forward(self, x):
        x = -torch.exp(x)
        return x
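A quick sketch of using it like any other module (the layer sizes are arbitrary):

model = nn.Sequential(
    nn.Linear(10, 2),
    OutExp(),          # outputs are always negative: -exp(x)
)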
Modify the parameters of a layer of the model with nn.Parameter():

# Change the bias of layer 2 ('layer' is the name given when the model is defined)
model.layer[2].bias = nn.Parameter(torch.tensor([-0.01, -0.4], device=device, requires_grad=True))
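For this indexing to work, the model needs an indexable attribute named layer; a hypothetical definition consistent with the snippet above (the sizes are illustrative, with the module at index 2 having two output features so its bias holds two values):

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer = nn.Sequential(
            nn.Linear(4, 8),
            nn.ReLU(),
            nn.Linear(8, 2),   # index 2: its bias has two elements
        )

    def forward(self, x):
        return self.layer(x)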
Model parameter initialization:
# Custom weight initialization
def weight_init(m):
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight, gain=0.1)
        nn.init.constant_(m.bias, 0)
    # You can also check for nn.Conv2d and use an appropriate initialization
    elif isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    # Or for a batch-normalization layer
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)

# Apply the function recursively to every submodule of the model
model.apply(weight_init)