The official website for the PCIC 2021 Causal Inference and Recommendation track also provides the baseline.

Clicking “Submit” does not take you to NAIE, so I gave up on that. The service can also be opened at www.hwtelcloud.com/products/mt…, but I could not find any coupons there, so I gave up on that as well.

To proceed, we first create a Git repository for the project, for example competition_PCIC-2021-Causal-Inference-and-Recommendation, and then clone it to our machine. I’m using a Google Colab virtual machine.

Then follow the tutorial to create a training file, train_demo.py.

To connect VS Code to a Google Colab virtual machine instance, see “Running Google colab with VS code” by Simen Eide.

This screenshot shows VS Code connected to Colab. It works well enough; at least you can write code in VS Code:

Then create train_demo.py in the baseline code directory following the Zhihu tutorial, paste the code in, and annotate it.

Decompress the dataset at the command line:


cd /content/drive/MyDrive/git/competition_PCIC-2021-Causal-Inference-and-Recommendation/dataset

unzip PCIC2021-track2-baselines.zip -d PCIC2021-track2-baselines

sudo apt-get remove unrar

sudo apt-get install unrar

unrar x "PCIC Track 2 train_validation_dataset.rar" train_validation_dataset/

unrar x test_phase1.rar test_phase1/
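If you would rather do the .zip step from Python (for example inside a Colab cell), the following is a minimal sketch using only the standard library. The helper name extract_zip is hypothetical, and it only covers .zip archives; the .rar files still need unrar as above.

```python
import zipfile
from pathlib import Path


def extract_zip(archive, dest):
    """Extract `archive` into `dest` and return the archive's member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()


# Example call (paths as used in the commands above):
# extract_zip("PCIC2021-track2-baselines.zip", "PCIC2021-track2-baselines")
```

Returning the member names makes it easy to check that the expected files are actually there before moving on.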

The screenshot below shows the result after decompression:

Connect to colab-ssh using VS Code

Configure the Conda environment for Colab

The next step is to annotate the train_demo.py code.

Install Conda by following the article below. The terminal defaults to Python 2.7, which suggests that Jupyter is running in a virtual environment, so you need to set up a Conda virtual environment yourself: “Conda + Google Colab. A guide to installing Conda when using…” by David R. Pugh, Towards Data Science.

Follow the article above to complete the Conda configuration on Colab.
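To see which interpreter you are actually getting, both before and after installing Conda, a small check like the one below may help. It is only a sketch: describe_interpreter is a hypothetical helper name, and it reports on whichever Python runs it, so run it once from the terminal and once from a notebook cell to compare.

```python
import shutil
import sys


def describe_interpreter():
    """Return basic facts about the Python interpreter running this code."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # True if a `conda` command can be found on PATH
        "conda_on_path": shutil.which("conda") is not None,
    }


print(describe_interpreter())
```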

Annotate train_demo.py

The code uses naie, a machine learning package for the Huawei platform, which I could not find on pypi.org. That is fine: once you understand the code, you can swap it for other machine learning packages.
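One way to keep a single script usable both on the platform and locally is to check whether the package can be imported before relying on it. This is a minimal sketch, not part of the baseline: has_package and ON_HUAWEI_PLATFORM are hypothetical names, and no naie functions are called because I do not know its API.

```python
import importlib.util


def has_package(name):
    """Return True if the top-level package `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


# Assumption: naie is only installed inside the Huawei training environment.
ON_HUAWEI_PLATFORM = has_package("naie")

if ON_HUAWEI_PLATFORM:
    print("naie found: keep the platform-specific calls")
else:
    print("naie not found: use local file paths and open packages instead")
```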

For a long time I was afraid of annotating code, because I felt my machine learning was not good enough to understand it. In fact, as long as you can run the code, you can annotate it while running it.

If you do not understand a piece of code, say so in a comment and keep going; stopping will not make it any clearer.

I can’t understand the code mainly because it feels like a black box, which makes me uneasy when I read it. What should I do? I’ll just have to go through the course again.

I watched lectures 1 and 2 of Li Hongyi’s machine learning course, which cover the machine learning and deep learning workflow.

Run the training code on the local machine

Run train_demo.py on Colab or on your own computer directly with python train_demo.py.

On Huawei’s platform you submit the training script instead, and choosing the engine means choosing the deep learning package.

Submit the training code on the Huawei platform

If you cannot modify the code, use the Huawei platform.

The indentation was scrambled when I copied the code from the Zhihu article on PCIC 2021, so I fixed the indentation and then submitted the code to the Huawei platform to get training results. My plan was to annotate it, convert it into my own code, and run it on a local PC or Colab.

The Huawei platform’s workflow confused me at first, but after trying it I roughly understand how it works.

Subscribe to the model training service and the dataset service

We can see that models and training tasks are both things you submit, so the platform has no terminal interface; it mainly offers a Notebook interface.

This is NAIE’s console.

Create a project, subscribe to data, and create a new model training project

Open the Model Training Service console and click “New Model Training Project”.

Create a project on the NAIE Training platform. After creating the project, click “Subscribe data” on the Data Set page to subscribe to the data set:

Then open the “Model training” page, click “New Model Training Project”, and select WebIDE. Creating a WebIDE takes about 3-5 minutes:

Edit the training task file train_demo.py

Edit the code as follows:

Configure the training project

Start training

Click “Start Training” at the bottom of the above training task configuration page to jump to the training project page:

Training failed; check the printed messages:

The error was found, modified, and then retrained. The retraining procedure is as follows:

The error was reported again:

During the next retraining, no change in CPU, GPU, or RAM usage was visible, so it had probably failed again:

Checking errors this way is tedious, because many of them are just code editing problems:

It may be an indentation problem: the indentation was broken when I copied the code from Zhihu. I use 4 spaces in place of a tab, so I adjusted the indentation by hand.

This matters a lot because Python requires indentation to be strictly consistent within a block: not even one space more or less.
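Since each failed run on the platform costs a full submit cycle, it may be worth checking pasted code locally first. The sketch below shows two rough checks, with hypothetical helper names: find_indent_problems flags tabs and indents that are not a multiple of 4 spaces (a heuristic only, since aligned continuation lines can legitimately break that rule), and compile_error asks Python itself whether the source parses.

```python
def find_indent_problems(source, indent=4):
    """Return (line_number, reason) pairs for suspicious leading whitespace."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines cannot cause indentation errors
        leading = line[: len(line) - len(line.lstrip())]
        if "\t" in leading:
            problems.append((number, "tab in indentation"))
        elif len(leading) % indent != 0:
            problems.append((number, "indent of %d spaces" % len(leading)))
    return problems


def compile_error(source):
    """Return the name of the syntax-related error, or None if it compiles."""
    try:
        compile(source, "<pasted>", "exec")
    except SyntaxError as exc:  # IndentationError and TabError are subclasses
        return type(exc).__name__
    return None


# A pasted snippet whose last line has one space too many.
sample = "def f(x):\n    y = x + 1\n     return y\n"
print(find_indent_problems(sample))
print(compile_error(sample))
```

To check a real file, read it with open("train_demo.py").read() and pass the text to both functions.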

After fixing many indentation problems, the training finally completed, but the file /cache/submit.csv was nowhere to be found. The file existed only temporarily during training and was copied to cloud storage:

s3://bucket-kak0h2bo/72e24879924c4d45be34a6c1f86f6e29/02adc51b92ca4fa59f7c0af851041fea/Job/algo-train_demo/train_demo-74 302/output/submit.csv

mox.file.copy('/cache/submit.csv', '/home/ma-user/work/Algorithm/algo-train_demo', 'submit

Use the code above to copy submit.csv into the training project directory

os.system('cp /cache/submit.csv ./submit.csv')

np.savetxt("/home/ma-user/work/Algorithm/algo-train_demo/submit.csv", submit, fmt=('%d', '%d', '%f'))

Isn’t the environment used for editing code the same machine as the training environment?
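To try the write-then-copy step away from the platform, here is a minimal local sketch using only the standard library. It writes rows in the same "int int float" layout that fmt=('%d', '%d', '%f') gives with np.savetxt's default space delimiter, then copies the file. The rows, the column meaning, and the helper name write_submission are assumptions, and temporary folders stand in for /cache and the project directory.

```python
import os
import shutil
import tempfile


def write_submission(rows, path):
    """Write each row as 'int int float', one row per line."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write("%d %d %f\n" % tuple(row))


# Hypothetical rows: two integer IDs and a float score (an assumption).
submit = [(1, 10, 0.25), (2, 20, 0.75)]

cache_dir = tempfile.mkdtemp()    # stand-in for /cache
project_dir = tempfile.mkdtemp()  # stand-in for the training project directory

cache_file = os.path.join(cache_dir, "submit.csv")
write_submission(submit, cache_file)

# shutil.copy only works when both paths are visible from the same machine.
copied = shutil.copy(cache_file, os.path.join(project_dir, "submit.csv"))
print(os.path.exists(copied))
```

If the editing environment and the training job do run on different machines, a plain file copy like this cannot reach the project directory, which would explain why the file never showed up there.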

Submit training results

After training completes, the code saves its results; the full path of the output file is /cache/submit.csv. Submit this CSV file to the platform, which then calculates the accuracy and periodically publishes the rankings.

The test set comes without labels, so predictions have to be submitted before a score can be calculated.

Summary

That is it. The runnable train_demo.py code is at gist.github.com/eatcosmos/c…

Changelog

  • The whole process now works end to end; the write-up is still messy and will be tidied up later

  • June 23, 2021 Delayed