Project source: github.com/nickliqian/…
darknet_captcha
This project provides a set of quick-start scripts built on Darknet, designed to help newcomers to image recognition, as well as developers, quickly start an object detection (localization) project. If anything is unclear, please feel free to open an issue or a PR. I hope we can work together to improve it!
The project is divided into two parts:
- Two object detection examples (single-class and multi-class click captchas); working through them is a good way to get familiar with the YOLOv3 localization network.
- A set of APIs built on Darknet for training an object detection model on your own data, plus code for a web server.
Contents
- The project structure
- Let’s start with an example: single-type target detection
- Second example: multitype target detection
- Train your data
- The Web service
- The API documentation
- Other problems
- Use Ali Cloud OSS to speed up downloads
- GPU cloud recommendation
- Comparison of CPU and GPU recognition speed
- Error resolution
- TODO
The project structure
The project consists of three parts: darknet, extend, and app.
- darknet: the source code of the Darknet project, without any changes.
- extend: the extension code, including configuration generation, sample generation, training, the recognition demo, and the API programs.
- app: each new recognition task is kept in its own app, which holds its configuration files, samples, label files, and so on.
Let’s start with an example: single-type target detection
Darknet provides a series of deep learning algorithms. All we need to do is call Darknet in a few relatively simple steps to train our recognition model.
- The recommended operating system is Ubuntu, where you will run into fewer pitfalls.
- If you use Windows, install cygwin first so that Darknet is easier to compile. (See my blog: Install cygwin)
The following steps have all been tested on Ubuntu 16.04.
1. Download the project
git clone https://github.com/nickliqian/darknet_captcha.git
2. Compile darknet
Go to the darknet_captcha directory and download the Darknet project, overwriting the darknet directory:
cd darknet_captcha
git clone https://github.com/pjreddie/darknet.git
Go to the darknet directory and edit the darknet/Makefile configuration file:
cd darknet
vim Makefile
- For GPU training, set GPU=1 below
- For CPU training, set GPU=0 below
GPU=1
CUDNN=0
OPENCV=0
OPENMP=0
DEBUG=0
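For scripted setups, the same flag change can be made without opening an editor. The following is only a minimal sketch: the helper name set_makefile_flags is hypothetical, and it assumes the flags appear one per line in the NAME=value form shown above.

```python
import re


def set_makefile_flags(makefile_text, values):
    """Return makefile_text with lines such as 'GPU=1' rewritten to the given values."""
    for name, value in values.items():
        # Match one whole line of the form NAME=<anything>.
        pattern = re.compile(r"^{}=.*$".format(re.escape(name)), re.MULTILINE)
        if not pattern.search(makefile_text):
            raise KeyError("flag {} not found in Makefile".format(name))
        makefile_text = pattern.sub("{}={}".format(name, value), makefile_text)
    return makefile_text


# The five flags listed above, in the order they are shown.
sample = "GPU=1\nCUDNN=0\nOPENCV=0\nOPENMP=0\nDEBUG=0\n"
cpu_only = set_makefile_flags(sample, {"GPU": 0})
print(cpu_only.splitlines()[0])
```

To apply it to the real file, read darknet/Makefile into a string, pass it through the function, and write the result back before running make.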
Then use make to compile darknet:
make
CPU training is not recommended because both training and prediction take too long. If you need to rent an inexpensive GPU host temporarily for testing, some recommended GPU cloud services are described below. If errors occur during compilation, look for a solution in Darknet's issues, or email me to ask for the older version of Darknet.
3. Install the Python 3 environment
Install the dependencies with pip, and make sure tk is installed on your system:
pip install -r requirement.txt
sudo apt-get install python3-tk
4. Create an application
Go to the root directory and run the following program to generate the basic configuration for an application:
cd darknet_captcha
python3 extend/create_app_config.py my_captcha 1
By default the class is named classes_1; you can change it by opening app/my_captcha/my_captcha.names and replacing classes_1 with the class name you want.
To see the command-line parameters of create_app_config.py, run python create_app_config.py without arguments and they are printed to the console; the same applies to the scripts below.
If you are familiar with Darknet configuration, you can open the generated files and change the parameter values, but here we leave them as they are.
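For orientation, a Darknet .data file is a short list of key = value lines naming the class count, the sample lists, the names file, and the backup directory. The sketch below only illustrates that layout. The function render_data_file and the train.txt and valid.txt file names are hypothetical, and the files that create_app_config.py actually writes may differ.

```python
def render_data_file(app_name, n_classes, train_list, valid_list):
    """Build the text of a Darknet .data file for one application."""
    app_dir = "app/{}".format(app_name)
    lines = [
        "classes = {}".format(n_classes),
        "train = {}".format(train_list),
        "valid = {}".format(valid_list),
        "names = {}/{}.names".format(app_dir, app_name),
        "backup = {}/backup/".format(app_dir),
    ]
    return "\n".join(lines) + "\n"


# train.txt and valid.txt are assumed names used only for illustration.
text = render_data_file(
    "my_captcha", 1,
    "app/my_captcha/train.txt",
    "app/my_captcha/valid.txt",
)
print(text)
```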
5. Generate samples
Samples are generated with another project of mine, nickliqian/generate_click_captcha, which I have integrated here. Execute the following command to generate samples and their labels in the directory that YOLO expects inside the specified application:
python3 extend/generate_click_captcha.py my_captcha
Run python generate_click_captcha.py to see an explanation of the parameters.
6. Divide the training set and validation set
Run the following program to divide the samples into a training set and a validation set, and to convert the label values into a format that YOLO recognizes:
python3 extend/output_label.py my_captcha 1
The number of classes should be the same as above. Run python output_label.py to see an explanation of the parameters.
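The splitting step can be pictured with a few lines of Python. This is a sketch only, not the code in output_label.py: the function split_samples, the 20% validation ratio, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def split_samples(image_paths, valid_ratio=0.1, seed=0):
    """Shuffle image paths reproducibly and split them into (train, valid) lists."""
    if not 0.0 < valid_ratio < 1.0:
        raise ValueError("valid_ratio must be between 0 and 1")
    paths = sorted(image_paths)          # fixed starting order
    random.Random(seed).shuffle(paths)   # seeded, so the split is repeatable
    n_valid = max(1, int(len(paths) * valid_ratio))
    return paths[n_valid:], paths[:n_valid]


paths = ["app/my_captcha/images_data/JPEGImages/{}.jpg".format(i) for i in range(20)]
train, valid = split_samples(paths, valid_ratio=0.2)
print(len(train), len(valid))
```

Each list would then be written to its own text file, one image path per line, which is the form Darknet reads for its train and valid entries.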
7. Start training
At this point we need one more thing: download the pre-trained model provided by Darknet and place it in the darknet_captcha directory:
wget https://pjreddie.com/media/files/darknet53.conv.74
In the darknet_captcha directory, execute the following command to begin training:
./darknet/darknet detector train app/my_captcha/my_captcha.data app/my_captcha/my_captcha_train.yolov3.cfg darknet53.conv.74
During training, the model is saved every one hundred iterations under app/my_captcha/backup/, where you can inspect it.
8. Recognition effect
Training on a GTX 1060 takes about 1.5 hours, and the results become noticeable after about 1000 iterations.
python3 extend/rec.py my_captcha 100
Here 100 selects the hundredth image in app/my_captcha/images_data/JPEGImages for recognition. Run python rec.py to see an explanation of the parameters.
300 iterations:
9. Image cropping
This part is relatively simple, and there is plenty of sample code on the web. You can call the darknet_interface.cut_and_save method to crop out the located characters.
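The arithmetic behind cropping is small enough to show. The sketch below is not the implementation of darknet_interface.cut_and_save. It assumes each detected box is given as (x_center, y_center, width, height) in pixels, as the web service response later in this document suggests, and the 320x160 image size is made up for the example.

```python
def center_box_to_corners(box, image_width, image_height):
    """Convert (x_center, y_center, width, height) in pixels to a clamped
    (left, top, right, bottom) integer box."""
    x_center, y_center, width, height = box
    left = max(0, int(round(x_center - width / 2.0)))
    top = max(0, int(round(y_center - height / 2.0)))
    right = min(image_width, int(round(x_center + width / 2.0)))
    bottom = min(image_height, int(round(y_center + height / 2.0)))
    if right <= left or bottom <= top:
        raise ValueError("box lies outside the image")
    return left, top, right, bottom


# Box values copied from the sample API response; the image size is assumed.
box = (214.47508239746094, 105.97418212890625, 24.86412811279297, 33.40662384033203)
corners = center_box_to_corners(box, image_width=320, image_height=160)
print(corners)
```

The resulting tuple has the (left, upper, right, lower) order that Pillow's Image.crop accepts, so it can be passed to that method directly if Pillow is the imaging library in use.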
10. The classifier
Once you reach the classification step things are easy: use either Darknet's own classifier or CNN_Captcha, a project that uses convolutional neural networks to recognize captchas.
11. Summary
The general process of recognizing captchas is as follows:
- Collect samples
- Label them (mark coordinates and characters)
- Train the locator
- Detect the positions and crop the image
- Train the classifier
- Use the locator and classifier together to identify the position and class of each character in a click captcha
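The last step of the list, combining locator and classifier, can be sketched as a small function that takes the three stages as callables. Everything here is hypothetical scaffolding: the names recognize_captcha, fake_locate, fake_crop, and fake_classify do not exist in the project, and the stand-ins only let the sketch run without a trained model.

```python
def recognize_captcha(image, locate, crop, classify):
    """Run the locator, crop each detected box, and classify each crop.

    Returns a list of (character, box) pairs in the order the locator reports them.
    """
    results = []
    for box in locate(image):
        patch = crop(image, box)
        results.append((classify(patch), box))
    return results


# Stand-in stages: boxes are (x_center, y_center, width, height).
def fake_locate(image):
    return [(60, 20, 10, 10), (15, 22, 10, 10)]


def fake_crop(image, box):
    return (image, box)


def fake_classify(patch):
    return "A" if patch[1][0] < 30 else "B"


pairs = recognize_captcha("captcha.jpg", fake_locate, fake_crop, fake_classify)
print(pairs)
```

In a real setup the locator would be the trained YOLOv3 model and the classifier a separately trained character model.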
Second example: multitype target detection
The steps are basically the same as above, so the commands are simply listed:
# Generate the configuration file
python3 extend/create_app_config.py dummy_captcha 2
# Generate images
python3 extend/generate_click_captcha.py dummy_captcha 500 True
# Output labels to txt
python3 extend/output_label.py dummy_captcha 2
# Start training
./darknet/darknet detector train app/dummy_captcha/dummy_captcha.data app/dummy_captcha/dummy_captcha_train.yolov3.cfg darknet53.conv.74
# Recognition test
python3 extend/rec.py dummy_captcha 100
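Because both examples follow the same sequence, the commands can be generated from the app name and class count. The helper below, build_commands, is a hypothetical convenience and not part of the project. It only assembles the argument lists shown above and passes any extra generator arguments through unchanged, since their meaning is documented by generate_click_captcha.py itself.

```python
def build_commands(app_name, n_classes, generate_args=(), image_index=100):
    """Return argv lists for the config, sample, label, train, and test steps."""
    app_dir = "app/{}".format(app_name)
    return [
        ["python3", "extend/create_app_config.py", app_name, str(n_classes)],
        ["python3", "extend/generate_click_captcha.py", app_name]
        + [str(arg) for arg in generate_args],
        ["python3", "extend/output_label.py", app_name, str(n_classes)],
        ["./darknet/darknet", "detector", "train",
         "{}/{}.data".format(app_dir, app_name),
         "{}/{}_train.yolov3.cfg".format(app_dir, app_name),
         "darknet53.conv.74"],
        ["python3", "extend/rec.py", app_name, str(image_index)],
    ]


commands = build_commands("dummy_captcha", 2, generate_args=(500, True))
for argv in commands:
    print(" ".join(argv))
```

Each list can be handed to subprocess.run(argv, check=True) from the darknet_captcha directory so that a failing step stops the sequence.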
Train your data
The following process shows you how to train on your own data. Suppose we want to create an application that identifies cars and people on the road, so the number of classes is 2. Assuming you already have some raw images, you first need to label them. LabelImg is recommended for labeling. Tutorials are easy to find with a search engine; the software interface looks roughly as follows:
Labeling the people and cars in the image as person and car, respectively, generates an XML label file for each image. Next, we create an application named car with 2 classes and generate its configuration files:
python3 extend/create_app_config.py car 2
Place the raw images in app/car/JPEGImages and the XML label files in the corresponding Annotations directory. YOLO training uses the relative coordinates of each target in the image, so the XML coordinates are converted to relative form. In addition, car.data must define the sample paths of the training set and the validation set. The following command splits the samples into the two sets and generates two txt files that record their paths.
python3 extend/output_label.py car 2
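The coordinate conversion can be illustrated independently of the project's script. LabelImg in Pascal VOC mode writes absolute pixel corners (xmin, ymin, xmax, ymax), while a YOLO label line holds the class index followed by the box centre and size as fractions of the image size. The function voc_xml_to_yolo below is a hypothetical sketch of that conversion, not the code in output_label.py.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_yolo(xml_text, class_names):
    """Convert one Pascal VOC annotation (as written by LabelImg) to YOLO label lines."""
    root = ET.fromstring(xml_text)
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)
    lines = []
    for obj in root.findall("object"):
        class_id = class_names.index(obj.find("name").text)
        bndbox = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Box centre and size, each divided by the image dimension.
        x_c = (xmin + xmax) / 2.0 / img_w
        y_c = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append("{} {:.6f} {:.6f} {:.6f} {:.6f}".format(class_id, x_c, y_c, w, h))
    return lines


# A made-up annotation: one car in a 400x200 image.
sample_xml = """<annotation>
  <size><width>400</width><height>200</height><depth>3</depth></size>
  <object><name>car</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""
labels = voc_xml_to_yolo(sample_xml, ["car", "person"])
print(labels)
```

The order of class_names must match the order of the lines in the .names file, because the index written to the label is what the network learns.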
Note that you can open car.names and change class_1 and class_2 to car and person, respectively, so that the results are output as car and person. Then you can start training:
./darknet/darknet detector train app/car/car.data app/car/car_train.yolov3.cfg darknet53.conv.74
The recognition test is the same as above:
# Identification test
python3 extend/rec.py car 100
The web service
Start the Web service:
python3 extend/web_server.py
Before starting, modify the configuration parameters as required:
# Create the recognizer object
app_name = "car"  # app name
config_file = "app/{}/{}_train.yolov3.cfg".format(app_name, app_name)  # configuration file path
model_file = "app/{}/backup/{}_train.backup".format(app_name, app_name)  # model path
data_config_file = "app/{}/{}.data".format(app_name, app_name)  # data configuration file path
dr = DarknetRecognize(
    config_file=config_file,
    model_file=model_file,
    data_config_file=data_config_file
)
save_path = "api_images"  # path where images are saved
Use the script request_api.py to test recognition through the web service (remember to change the image path):
python3 extend/request_api.py
The response contains the target class and the center-point location:
Interface response: {"speed_time(ms)": 16469, "time": "15472704635706885", "value": [["word", 0.9995613694190979, [214.47508239746094, 105.97418212890625, 24.86412811279297, 33.40662384033203]], ...]}
The API documentation
None yet.
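The response shown in the web service section can be unpacked with the standard json module. The sketch below assumes each entry of "value" is [label, confidence, [x_center, y_center, width, height]], which matches the sample response. The function parse_detections and its min_confidence threshold are hypothetical additions.

```python
import json


def parse_detections(response_text, min_confidence=0.5):
    """Return [(label, confidence, (x_center, y_center)), ...] from an API response."""
    payload = json.loads(response_text)
    detections = []
    for label, confidence, box in payload["value"]:
        if confidence >= min_confidence:
            detections.append((label, confidence, (box[0], box[1])))
    return detections


# The single complete entry from the sample response above.
sample = (
    '{"speed_time(ms)": 16469, "time": "15472704635706885", '
    '"value": [["word", 0.9995613694190979, '
    "[214.47508239746094, 105.97418212890625, 24.86412811279297, 33.40662384033203]]]}"
)
detections = parse_detections(sample)
for label, confidence, center in detections:
    print(label, round(confidence, 4), center)
```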
Other problems
Use Ali Cloud OSS to speed up downloads
If you train on a cloud host located abroad, downloading the trained models can be very slow. It is recommended to use Ali Cloud OSS: upload the files from the cloud host, then download them through OSS. Configure the keys:
import os
# Get the keys from environment variables
AccessKeyId = os.getenv("AccessKeyId")
AccessKeySecret = os.getenv("AccessKeySecret")
BucketName = os.getenv("BucketName")
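Since os.getenv silently returns None for an unset variable, it can help to check the three settings before attempting an upload. The following is a sketch of such a check. The function load_oss_settings is hypothetical and is not part of upload2oss.py.

```python
import os

REQUIRED_KEYS = ("AccessKeyId", "AccessKeySecret", "BucketName")


def load_oss_settings(environ=None):
    """Read the OSS settings from the environment; raise if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}


# A plain dict stands in for os.environ here; the values are placeholders.
settings = load_oss_settings(
    {"AccessKeyId": "id", "AccessKeySecret": "secret", "BucketName": "bucket"}
)
print(sorted(settings))
```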
Upload images:
python3 extend/upload2oss.py app/my_captcha/images_data/JPEGImages/1_15463317590530567.jpg
python3 extend/upload2oss.py text.jpg
GPU cloud recommendation
I use a rented Vectordash GPU cloud host and connect over SSH to an Ubuntu 16.04 system with an integrated Nvidia deep learning environment, which contains the following tools and frameworks:
CUDA 9.0, cuDNN, Tensorflow, PyTorch, Caffe, Keras
Vectordash provides a client that can connect remotely, upload and download files, manage multiple cloud hosts, and more. Here are the rental prices for several graphics cards:
# Install the client
pip install vectordash --upgrade
# login
vectordash login
# list hosts
vectordash list
# SSH login
vectordash ssh <instance_id>
# open jupyter
vectordash jupyter <instance_id>
# upload file
vectordash push <instance_id> <from_path> <to_path>
# Download file
vectordash pull <instance_id> <from_path> <to_path>
Since Vectordash hosts are located abroad, uploads and downloads are very slow. It is suggested to temporarily rent an Ali Cloud spot burstable instance (about 7 cents per hour) as a relay.
Comparison of CPU and GPU recognition speed
GTX 1060: recognition takes about 1 s
[Load Model] Speed time: 4.691879987716675s
[detect image-i] Speed time: 1.002530813217163s
CPU: recognition takes about 13 s
[Load Model] Speed time: 3.313053846359253s
[detect image-i] Speed time: 13.256595849990845s
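The two logs can be compared directly by extracting the timings and dividing. The sketch below parses the log format shown above; the names LOG_PATTERN and parse_timings are hypothetical. On these figures detection on the GTX 1060 is roughly 13 times faster than on the CPU, while model loading takes a similar few seconds on both.

```python
import re

LOG_PATTERN = re.compile(r"\[(?P<stage>[^\]]+)\] Speed time: (?P<seconds>[0-9.]+)s")


def parse_timings(log_text):
    """Map each stage name in the log to its time in seconds."""
    return {
        match.group("stage"): float(match.group("seconds"))
        for match in LOG_PATTERN.finditer(log_text)
    }


gpu = parse_timings(
    "[Load Model] Speed time: 4.691879987716675s "
    "[detect image-i] Speed time: 1.002530813217163s"
)
cpu = parse_timings(
    "[Load Model] Speed time: 3.313053846359253s "
    "[detect image-i] Speed time: 13.256595849990845s"
)

speedup = cpu["detect image-i"] / gpu["detect image-i"]
print("GPU detection speedup: {:.1f}x".format(speedup))
```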
Error resolution
- UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f621' in position … (reference link)
- pip install fails with locale.Error: unsupported locale setting (reference link)
TODO
- Support multi-class detection, recognition, and training (done)
- Web server API calls (done)
- Classifier