
Software download and installation

Required software (this is what I use; alternatives also work)

1. Xshell: used to connect to the server and run commands (any other SSH client also works). Download and install: a free version can be found online (supporting the licensed version is recommended); download it and follow the installer's instructions.

2. Xmanager: only needed for graphical output; if you do not use graphical features, you can skip it. As with Xshell, a free version can be found online (supporting the licensed version is recommended).

3. PyCharm: available in a Community edition and a Professional edition. The Community edition is free and covers the general features. Here PyCharm is only used to upload code and download the weights we have trained, so the Community edition is all we need.

Configuring Xshell

In the Session Manager, right-click All Sessions and choose New -> Session

Note: the "New Session 1" entry in the list is a session the author created earlier. It will not be there the first time you use Xshell

Enter the session name (optional), host, and port number

Under SSH -> Tunneling, set X11 forwarding to Xmanager

Use Xshell to connect to the server

If the server is on an intranet, log in to the intranet first

Enter a user name

Enter the password

The connection is successful

Configuring PyCharm

Both the Community and Professional editions can be configured, although the menu locations may differ. The steps below use the Professional edition

Start by opening Tools in the menu bar

Select Deployment -> Configuration

Click + -> SFTP

Click '…' next to SSH configuration

In Host enter the IP address, in User name enter the user name, and in Password enter the password. If the port is not the default 22, change the port number as well

Select Save password and click Test Connection. If a success message appears, the connection works
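
If Test Connection fails, it can help to first check whether the SSH port is reachable at all. The following is a minimal sketch using only Python's standard library. The function name can_connect is illustrative and is not part of PyCharm, and the host and port are whatever you entered above.

```python
import socket


def can_connect(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A True result only shows that something is listening on that port. The user name and password are still verified by Test Connection itself.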

To open Remote Host, use Tools->Deployment->Browse Remote Host

The Remote Host panel is displayed

Click the down arrow and select the SFTP server that was just configured

In this directory you can see your own user folder. It is best to upload code into your own folder

To upload code to the server, select it in the project tree on the left side of PyCharm and drag it to the target folder on the right. To download trained weights, select the weight files on the server on the right and drag them to a folder on the left

Installing dependency packages

A freshly configured server environment usually lacks some of the packages the code depends on, so we need to install them before training.

pip timeout problem: Some servers are not connected to the Internet until they log in to the network, so pip times out. Running the command below on the server submits that network login from the command line. Note: this command cannot be used as-is. Several parts must be changed to match your own environment: replace every "server address" with your own server address, and change the (X11; Ubuntu; Linux x86_64; rv:81.0) part to match the Linux version of your server. The DDDDD and upass values in the last argument appear to be the account and password, so those need to be your own as well. Most of this information can be looked up with a command; if you cannot find it, ask the server administrator.

curl 'http://server address/0.htm' \
  -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' \
  -H 'Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2' \
  --compressed \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'Origin: http://server address' \
  -H 'DNT: 1' \
  -H 'Connection: keep-alive' \
  -H 'Referer: http://server address/0.htm' \
  -H 'Upgrade-Insecure-Requests: 1' \
  --data-raw 'R3=1&v6ip=&DDDDD=2018405A122&upass=260613&save_me=1&0MKKey=123'
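
For readers who prefer Python to curl, the same form submission could be sketched with the standard library as follows. This is only an illustration under assumptions: the form field names are copied from the curl command above, the function name build_login_request is made up, and 192.0.2.1, ACCOUNT and PASSWORD are placeholders, not real values. Whether your portal accepts exactly these fields is something to confirm with the server administrator.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(server: str, account: str, password: str) -> Request:
    """Build (but do not send) the portal login POST request."""
    url = f"http://{server}/0.htm"
    # Field names are taken from the curl command; the values are placeholders.
    form = {
        "R3": "1",
        "v6ip": "",
        "DDDDD": account,
        "upass": password,
        "save_me": "1",
        "0MKKey": "123",
    }
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Origin": f"http://{server}",
        "Referer": url,
    }
    body = urlencode(form).encode("ascii")
    return Request(url, data=body, headers=headers, method="POST")


# To actually send it, pass the request to urllib.request.urlopen, e.g.:
#     from urllib.request import urlopen
#     urlopen(build_login_request("192.0.2.1", "ACCOUNT", "PASSWORD"), timeout=10)
```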

Once the network is connected, pip works normally. However, many packages are hosted abroad and download very slowly, so use a mirror. Also pay attention to the package name pip expects, which can differ from the name you import. OpenCV is the example here: it is installed as opencv-python

pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple

The -i https://pypi.tuna.tsinghua.edu.cn/simple after the package name tells pip to use the Tsinghua mirror
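
When several packages have to be installed from a script, the same command can be assembled in Python. This is a small sketch, not a required step: the names pip_install_command, install and PIP_NAMES are illustrative, and the mapping only contains the one example from this article (the cv2 module is installed as opencv-python).

```python
import subprocess
import sys
from typing import List

TSINGHUA_MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"

# Import name -> name to pass to pip, for packages where the two differ.
PIP_NAMES = {"cv2": "opencv-python"}


def pip_install_command(import_name: str, index_url: str = TSINGHUA_MIRROR) -> List[str]:
    """Build the pip command for a module, using the given package index."""
    package = PIP_NAMES.get(import_name, import_name)
    # "python -m pip" installs into the environment of the running interpreter.
    return [sys.executable, "-m", "pip", "install", package, "-i", index_url]


def install(import_name: str) -> None:
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(import_name), check=True)
```

Calling install("cv2") would then run the same pip command as above inside the current Python environment.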

Problems with CUDA installation

There are plenty of guides online for installing CUDA and TensorFlow on Linux, so here I only look at cuDNN. Because of a problem with my environment variable configuration, some of the cuDNN libraries were not found when GPU acceleration was used. In this case, the program reports an error saying that a certain file could not be found in a certain path. What I did was search for the file and copy it to the path named in the error message. The file search command is:

locate <file name>
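
On servers where locate is not installed or its database is out of date, the search step can be approximated in Python. This is a sketch under assumptions: find_file is an illustrative helper, and the directories to search are up to you. It walks the directory tree instead of using the locate database, so it is slower on large trees.

```python
import os
from typing import Iterable, List


def find_file(name: str, roots: Iterable[str]) -> List[str]:
    """Return every path under the given roots whose file name equals name."""
    matches = []
    for root in roots:
        # os.walk silently skips directories it cannot read.
        for dirpath, _dirnames, filenames in os.walk(root):
            if name in filenames:
                matches.append(os.path.join(dirpath, name))
    return matches
```

Once the file is found, shutil.copy(found_path, target_directory) copies it to the directory named in the error message, which matches the workaround described above.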

Training time

Each server has an idle time limit: it disconnects automatically after about 30 minutes without activity, which is not enough time to train many models. We therefore need a way to extend the time or to prevent the disconnect altogether.

1. Right-click on the web page -> Inspect -> Console, and enter a script in the console that prevents the disconnect (it works by performing an action on the page at regular intervals). Colab and Kaggle are used as examples here.

Code for Colab

// Click Colab's Connect button once a minute.
function ConnectButton(){
    console.log("Connect pushed");
    document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click()
}
setInterval(ConnectButton,60000);

// Click the Dismiss button of the sessions dialog once a minute.
function closeButton(){
    console.log("close");
    document.querySelector("body > colab-dialog > paper-dialog > colab-sessions-dialog").shadowRoot.querySelector("#footer > div > paper-button.dismiss").click()
}
setInterval(closeButton,60000);


Code for Kaggle

// Click the button in the toolbar once a minute.
function clickToolbarButton(){
    console.log("close");
    document.querySelector("#root > div > div > div.AppView-sc-16eb2j.kZXkZl > div.App_Body-sc-16c8j4p.hxOBfv > div.Layout_Body-sc-6piylv.bXAYPy > div > div > div > div.ToolbarContainer_Body-sc-2h8iu7.fhvgBU > button").click()
}
setInterval(clickToolbarButton,60000);

// Click the button in the detailed status area once a minute.
// It has its own name so that it does not overwrite the function above.
function clickStatusButton(){
    console.log("close");
    document.querySelector("#root > div > div > div.AppView-sc-16eb2j.kZXkZl > div.App_Body-sc-16c8j4p.hxOBfv > div.Layout_Body-sc-6piylv.bXAYPy > div > div > div > div.ToolbarContainer_Body-sc-2h8iu7.fhvgBU > div.DetailedStatus_Body-sc-zfwb95.fMzpPO > button > i").click()
}
setInterval(clickStatusButton,60000);

Disadvantage: as the examples show, every web page needs its own script, which is troublesome. The nohup command, introduced next, keeps a job running after you disconnect and avoids this problem.

nohup command: run nohup from the Xshell or any other SSH command-line session. To keep a command running after you disconnect, put nohup in front of the command; the same applies to Python files. With nohup, the output that would otherwise appear on screen is written to the nohup.out file created in the current folder. Once the command is running, you can close Xshell or even shut down your own computer, and the server keeps running the job on its own.
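
A comparable effect can be obtained from Python itself. The sketch below does not call nohup. Instead it starts the command in a new session with subprocess, so the command is not tied to the terminal, and it appends the output to a log file. The function name run_detached is illustrative, train.py in the usage note is a hypothetical script name, and start_new_session only works on POSIX systems such as Linux.

```python
import subprocess
import sys
from typing import List


def run_detached(command: List[str], log_path: str = "nohup.out") -> subprocess.Popen:
    """Start command in its own session, appending its output to log_path."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # no input from the terminal
            stdout=log,                 # screen output goes to the log file
            stderr=subprocess.STDOUT,   # errors go to the same file
            start_new_session=True,     # detach from the terminal's session
        )
    finally:
        # The child keeps its own copy of the file handle.
        log.close()
```

For example, run_detached([sys.executable, "train.py"]) returns immediately, and the returned object's pid attribute is the process ID to use if the job has to be stopped later.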

nohup.out takes up too much space

When a model is trained for many iterations, nohup.out accumulates a large amount of output and takes up too much disk space. My approach is to delete the nohup.out file once the program is up and running. This has worked for me, but I am not sure why, so I cannot say whether every server supports it. Note that on Linux, deleting a file that a running process still has open removes only its name: the process keeps writing to it, and the space is released when the process exits.
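
An alternative to deleting the file is to empty it in place when it grows too large. The sketch below is an assumption-laden illustration rather than the author's method: trim_log and the 100 MB default are made up, and it relies on nohup opening nohup.out in append mode, so that later output continues from the start of the emptied file.

```python
import os


def trim_log(path: str = "nohup.out", max_bytes: int = 100 * 1024 * 1024) -> bool:
    """Empty the log file if it is larger than max_bytes; return True if emptied."""
    if not os.path.exists(path) or os.path.getsize(path) <= max_bytes:
        return False
    # Truncating keeps the same file, so a process that has it open in
    # append mode simply continues writing at the new (empty) end.
    os.truncate(path, 0)
    return True
```

The earlier output is lost when the file is emptied, so copy anything you still need before calling it.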

End the nohup process

First you need to find the process and enter the command:

ps aux | less 

In the output, note that the second column is the process ID (PID) and the last column is the COMMAND. For a Python program, the COMMAND looks like: python program-name arguments. Then kill the process:

kill -9 PID
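
The lookup and the kill can also be scripted. The following sketch parses the text printed by ps aux and sends the same signal as kill -9. The helper names find_pids and kill_all are illustrative, and the keyword you search for (for example the name of your training script) is your own choice.

```python
import os
import signal
from typing import List


def find_pids(ps_output: str, keyword: str) -> List[int]:
    """Return PIDs from `ps aux` output whose COMMAND column contains keyword."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) == 11 and keyword in fields[10]:
            pids.append(int(fields[1]))
    return pids


def kill_all(pids: List[int]) -> None:
    """Send SIGKILL (the signal used by `kill -9`) to each PID."""
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
```

The ps output can be captured with subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout. Check the list returned by find_pids before passing it to kill_all, because a short keyword can match processes you did not intend to stop.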