Set the default tensor type (and therefore the default floating-point dtype): torch.set_default_tensor_type(torch.DoubleTensor)
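A minimal sketch of the effect (the tensor values are arbitrary examples): after the call, newly created floating-point tensors default to float64.

import torch

torch.set_default_tensor_type(torch.DoubleTensor)
# newer PyTorch versions also offer torch.set_default_dtype(torch.float64)
x = torch.tensor([1.0, 2.0])      # floating-point literals, so the default applies
print(x.dtype)                    # torch.float64
print(torch.get_default_dtype())  # torch.float64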
PyTorch has eight tensor types:

Data type | dtype | Tensor type
---|---|---
32-bit floating point | torch.float32 or torch.float | torch.*.FloatTensor
64-bit floating point | torch.float64 or torch.double | torch.*.DoubleTensor
16-bit floating point | torch.float16 or torch.half | torch.*.HalfTensor
8-bit integer (unsigned) | torch.uint8 | torch.*.ByteTensor
8-bit integer (signed) | torch.int8 | torch.*.CharTensor
16-bit integer (signed) | torch.int16 or torch.short | torch.*.ShortTensor
32-bit integer (signed) | torch.int32 or torch.int | torch.*.IntTensor
64-bit integer (signed) | torch.int64 or torch.long | torch.*.LongTensor
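A small sketch of how a tensor's dtype maps onto these Tensor types (the values are arbitrary):

import torch

x = torch.tensor([1, 2, 3], dtype=torch.int16)
print(x.dtype)    # torch.int16
print(x.type())   # 'torch.ShortTensor' ('torch.cuda.ShortTensor' if x lives on a GPU)
y = x.float()     # convert to torch.float32 / torch.FloatTensor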
Save the model:
def save_checkpoint(model, optimizer, scheduler, save_path):
    # If there are other variables you want to save, add them to this dict
    torch.save({
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'scheduler_state_dict': scheduler.state_dict(),
    }, save_path)

# Load the model
checkpoint = torch.load(pretrain_model_path)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
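A hedged usage sketch; the file name checkpoint.pth is a placeholder, not from the original:

# Save during or after training
save_checkpoint(model, optimizer, scheduler, 'checkpoint.pth')

# When loading on a different device, torch.load accepts a map_location argument
checkpoint = torch.load('checkpoint.pth', map_location='cpu')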
Print the model's gradients:

# Print the gradient of every named parameter
for name, parameters in model.named_parameters():
    print('{}\'s grad is:\n{}\n'.format(name, parameters.grad))
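Note that parameters.grad is None until a backward pass has run; a minimal sketch, where criterion, inputs, and targets are assumed placeholders:

loss = criterion(model(inputs), targets)
loss.backward()
# parameters.grad is now populated for every parameter that received a gradient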
Using a learning-rate decay strategy:

# Exponential decay
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
# Step decay
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
# Decay at custom milestones
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[400], gamma=0.5)
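The scheduler only changes the learning rate when scheduler.step() is called; a hedged training-loop sketch, where num_epochs, dataloader, and criterion are assumed placeholders:

for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch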
Gradient clipping:

def clip_gradient(optimizer, grad_clip):
    """Clips gradients computed during backpropagation to avoid exploding gradients.

    :param optimizer: optimizer holding the gradients to be clipped
    :param grad_clip: clip value
    """
    for group in optimizer.param_groups:
        for param in group["params"]:
            if param.grad is not None:
                param.grad.data.clamp_(-grad_clip, grad_clip)
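A hedged usage sketch (the threshold 5.0 is an arbitrary example); PyTorch also provides torch.nn.utils.clip_grad_norm_ for norm-based clipping:

loss.backward()
clip_gradient(optimizer, grad_clip=5.0)   # element-wise value clipping, as defined above
# Alternative: clip by global gradient norm (max_norm value is illustrative)
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()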
Example of a custom activation function:

class OutExp(nn.Module):
    def __init__(self):
        super(OutExp, self).__init__()

    def forward(self, x):
        x = -torch.exp(x)
        return x
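A quick sketch of using it like any other module (the layer sizes are arbitrary):

model = nn.Sequential(
    nn.Linear(10, 2),
    OutExp(),          # outputs are always negative: -exp(x)
)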
Modify the parameters of a layer of the model with nn.Parameter():

# Change the bias of layer 2 ('layer' is the name given when the model is defined)
model.layer[2].bias = nn.Parameter(torch.tensor([-0.01, -0.4], device=device, requires_grad=True))
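For this indexing to work, the model needs an indexable attribute named layer; a hypothetical definition consistent with the snippet above (the sizes are illustrative, with the module at index 2 having two output features so its bias holds two values):

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer = nn.Sequential(
            nn.Linear(4, 8),
            nn.ReLU(),
            nn.Linear(8, 2),   # index 2: its bias has two elements
        )

    def forward(self, x):
        return self.layer(x)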
Model parameter initialization:
# Custom weight initialization
def weight_init(m):
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight, gain=0.1)
        nn.init.constant_(m.bias, 0)
    # You can also check for nn.Conv2d and use an appropriate initialization
    elif isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    # Or for a batch-normalization layer
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)

# Apply the function recursively to every submodule of the model
model.apply(weight_init)