“Digital Human” Vision Challenge – Intelligent Diagnosis of Cervical Cancer Risk

Eligibility

The competition is open to the public: individuals from universities, research institutions, Internet companies, and elsewhere may register. Note: staff of the organizer and technical support units who have had access to the competition's underlying business, products, or data automatically withdraw from the competition and forfeit their eligibility.

Registration and Real-Name Authentication (now through November 20, 2019)

1. Registration: log on to the official website of the competition and complete personal-information registration to enter.
2. A team may consist of a single player or 2–5 players; each player may join only one team.
3. Participants must ensure that their registration information is accurate and valid; the organizing committee reserves the right to cancel the qualification and awards of non-compliant teams.
4. The deadline for registration, team changes, and real-name authentication is 10:00 a.m. on November 20, 2019. Teams that have not completed authentication by 12:00 a.m. on November 20, 2019 will be eliminated and may not continue.
5. Please go to the official forum for communication.

Background of the problem

The contest provides a large-scale dataset of cervical liquid-based thin-layer cytology images annotated by medical professionals. Participants may propose and combine deep learning methods such as object detection to localize and classify abnormal squamous epithelial cells in cervical cytology images, improving the speed and precision of the detection model and assisting doctors in diagnosis.

Competition data

The contest provides thousands of cervical cancer cytology images with corresponding location annotations for abnormal squamous cells. The data is in KFB format and must be accessed through the SDK specified by the contest. Each image was acquired with a 20x digital scanner and is 300–400 MB in size.
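Because each slide is far too large to load into memory at once, the usual pattern is to read it tile by tile through the SDK. Below is a minimal Python sketch of that tiling loop; the KfbSlide class, its dimensions attribute, and its read_region method are hypothetical stand-ins for whatever the official KFB SDK actually provides.

# Tiling sketch for gigapixel slides. KfbSlide, .dimensions, and
# .read_region are HYPOTHETICAL placeholders for the contest's KFB SDK;
# the real API may differ.
TILE = 1024  # assumed tile edge length in pixels

def iter_tiles(slide, tile=TILE):
    """Yield (x, y, patch) tiles covering the whole slide at full resolution."""
    width, height = slide.dimensions  # hypothetical attribute
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            w = min(tile, width - x)
            h = min(tile, height - y)
            patch = slide.read_region(x, y, w, h)  # hypothetical call
            yield x, y, patch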

 

In the preliminary round, contestants may download the data, which consists of the following cervical cancer cytology images.

There are 800 images: 500 positive and 300 negative. Positive images provide multiple ROI areas in which the locations of abnormal squamous cells are marked; negative images contain no abnormal squamous cells and carry no annotations. The abnormal squamous cells covered by the preliminary round fall into four categories: ASC-US (atypical squamous cells of undetermined significance), LSIL (low-grade squamous intraepithelial lesion), ASC-H (atypical squamous cells, cannot exclude a high-grade lesion), and HSIL (high-grade squamous intraepithelial lesion). (Note: outside the ROI areas of positive images, the absence of abnormal squamous cells is not guaranteed.)

In the semi-final, data may not be downloaded; participants complete model training online through the online competition platform, which also supports the engineering work of reproducing each team's model code and deploying its results. The semi-final is expected to provide 1,000 cervical cancer cytology slides and goes a step further: by examining multiple cell categories, it determines the classification of the entire cytology image.

The competition divides the data into training and test sets and hides the test annotations as the basis for model evaluation. The preliminary data is split into train and test: train, provided to participants for model training, consists of the KFB cytology images and the corresponding labeled JSON files; test is used for evaluation. Each annotation JSON file contains a list recording the location of every ROI region and of every abnormal squamous cell (the upper-left coordinates of its bounding rectangle plus the rectangle's width and height). The class roi denotes a region of interest and pos denotes an abnormal squamous epithelial cell. An example annotation file:

[{"x": 33842, "y": 31905, "w": 101, "h": 106, "class": "pos"},
{"x": 31755, "y": 31016, "w": 4728, "h": 3696, "class": "roi"},
{"x": 32770, "y": 34121, "w": 84, "h": 71, "class": "pos"},
{"x": 13991, "y": 38929, "w": 131, "h": 115, "class": "pos"},
{"x": 9598, "y": 35063, "w": 5247, "h": 5407, "class": "roi"},
{"x": 25030, "y": 40115, "w": 250, "h": 173, "class": "pos"}]
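Since abnormal cells are only guaranteed to be fully annotated inside the ROI areas of positive images, a common preprocessing step is to group each pos box under the ROI that contains it and convert its coordinates to be ROI-local before cropping training patches. A minimal Python sketch under that assumption (function names and file handling are illustrative, not part of the official tooling):

import json

def load_annotations(path):
    """Split one annotation list into ROI boxes and pos (abnormal-cell) boxes."""
    with open(path) as f:
        items = json.load(f)
    rois = [b for b in items if b["class"] == "roi"]
    cells = [b for b in items if b["class"] == "pos"]
    return rois, cells

def contains(roi, box):
    """True if a pos box lies entirely inside an ROI rectangle."""
    return (roi["x"] <= box["x"] and roi["y"] <= box["y"]
            and box["x"] + box["w"] <= roi["x"] + roi["w"]
            and box["y"] + box["h"] <= roi["y"] + roi["h"])

def roi_local_labels(rois, cells):
    """For each ROI, yield its cells with coordinates relative to the ROI origin."""
    for roi in rois:
        local = [{"x": c["x"] - roi["x"], "y": c["y"] - roi["y"],
                  "w": c["w"], "h": c["h"]}
                 for c in cells if contains(roi, c)]
        yield roi, local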

This competition also sets up a supplementary race, the VNNI track. Its problems are the same as those of the final, but the deep learning training framework is restricted to TensorFlow and MXNet, models must be compressed with the model-compression tool provided by Intel, and inference is evaluated on the VNNI platform provided by Intel. The VNNI track opens after the semi-final and requires separate registration. Only the first 30 teams are eligible to participate, and they must submit a valid result within 10 days.

This competition ensures the security of the medical data. The dataset is desensitized with dedicated software, and all cervical cytology image data is de-identified in strict accordance with international standards for medical information. The de-identified information includes hospital information, patient information, and label information. All data is desensitized to ensure data security and protect patient privacy.

Submission instructions

Participants submit a folder of JSON files packed into a ZIP archive. The folder name is freely chosen and written in lowercase English (e.g., tianchi.zip). Each file in the folder corresponds to the detection result for one cervical cancer cytology image and is named with the image ID (e.g., 20160050033533_ano.json). The content of each JSON file is a list; each element gives the location of one detected abnormal cell: the x and y coordinates of the upper-left corner of its bounding rectangle, the rectangle's width w and height h, and the confidence p, in that order. An example:

20160050033533_ano.json
[{"x": 22890, "y": 3877, "w": 396, "h": 255, "p": 0.94135},
{"x": 20411, "y": 2260, "w": 8495, "h": 7683, "p": 0.67213},
{"x": 26583, "y": 7937, "w": 172, "h": 128, "p": 0.73228},
{"x": 2594, "y": 18627, "w": 1296, "h": 1867, "p": 0.23699}]
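A minimal Python sketch of packaging results in the required layout, writing one JSON file per image and zipping the folder; the folder name and the detections mapping are placeholders:

import json
import os
import zipfile

def write_submission(detections, folder="tianchi"):
    """detections maps image IDs to lists of {"x","y","w","h","p"} dicts."""
    os.makedirs(folder, exist_ok=True)
    for image_id, boxes in detections.items():
        with open(os.path.join(folder, image_id + ".json"), "w") as f:
            json.dump(boxes, f)
    # pack the whole folder into <folder>.zip
    with zipfile.ZipFile(folder + ".zip", "w", zipfile.ZIP_DEFLATED) as zf:
        for name in os.listdir(folder):
            zf.write(os.path.join(folder, name), os.path.join(folder, name))
    return folder + ".zip"

# Hypothetical result for one image:
# write_submission({"20160050033533_ano":
#     [{"x": 22890, "y": 3877, "w": 396, "h": 255, "p": 0.94135}]})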

 

Evaluation metrics

For the preliminary round, the organizers adopt mAP (mean Average Precision), the metric commonly used in object detection tasks, as the evaluation metric for cervical cancer cell detection. AP is calculated at two IoU thresholds (0.5 and 0.7), and the average of the two is the final evaluation result.

Specifically, for each cervical cancer cytology image, participants output the positions and confidences of multiple predicted boxes over the whole image. The back-end evaluation algorithm randomly generates ROI regions and computes mAP only within those regions.

AP calculation: first, fix an IoU threshold, compute the IoU between each predicted box and the ground-truth annotations, and use the threshold to decide whether a prediction is correct. Then sort the predictions by confidence and sweep the confidence threshold to obtain a series of recall and precision values. Averaging the precision over the different recall levels gives the AP.

Recall = TP / (TP + FN)

Precision = TP / (TP + FP)
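The Python sketch below illustrates this scoring procedure end to end: greedy matching of confidence-sorted predictions against ground truth at a fixed IoU threshold, precision/recall accumulation, and AP as precision averaged over recall, with the final score taken as the mean of AP at IoU 0.5 and 0.7. It is an illustrative reimplementation, not the official evaluation code, and it omits the ROI-sampling step described above.

def iou(a, b):
    """IoU of two boxes given as (x, y, w, h) tuples."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def average_precision(preds, gts, thresh):
    """preds: list of (box, confidence); gts: list of ground-truth boxes."""
    if not preds or not gts:
        return 0.0
    preds = sorted(preds, key=lambda p: p[1], reverse=True)
    matched = [False] * len(gts)
    tp = fp = 0
    precisions, recalls = [], []
    for box, _conf in preds:
        # greedily match the highest-IoU unmatched ground-truth box
        best_iou, best_i = 0.0, -1
        for i, g in enumerate(gts):
            if not matched[i]:
                v = iou(box, g)
                if v > best_iou:
                    best_iou, best_i = v, i
        if best_iou >= thresh:
            matched[best_i] = True
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / len(gts))
    # AP: precision summed over the recall increments
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap

# Final score per the rules: the mean of AP at the two IoU thresholds.
# score = (average_precision(preds, gts, 0.5) + average_precision(preds, gts, 0.7)) / 2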

 

Schedule

The competition is divided into three stages: Season 1 (the preliminary round), Season 2 (the semi-final), and the final. The specific arrangements and requirements are as follows:

Season 1, October 24, 2019 — November 21, 2019

1. After registering successfully, teams download the data from the Tianchi platform, debug their algorithms locally, and submit results online. If a team submits more than once in one day, the new version overrides the old one.
2. From October 24, one evaluation is provided per day; the system ranking time is 10:00 a.m.
3. After Season 1 ends (last evaluation at 10:00 a.m. on November 21), the organizing committee will review the top 100 finalist teams; some teams will need to submit code as supplementary review material. Teams found to rely solely on manual labeling with no algorithmic contribution will have their participation terminated, and the vacated promotion slots will be filled afterwards.

Season 2, November 28, 2019 — January 9, 2020

1. Semi-final teams obtain the training data and updated test data online, and submit their algorithms and results online.
2. The semi-final data cannot be downloaded. Contestants must use the PAI-DSW platform (data.aliyun.com/pai/dsw) to complete every step, including data processing, modeling, algorithm debugging, and producing results. PAI-DSW (Data Science Workshop) is a cloud deep learning development environment for algorithm developers; it ships with a TensorFlow build deeply optimized by the PAI team and also allows installing whatever third-party libraries a team needs.
3. From November 28, 2019, one evaluation and ranking opportunity is provided per day; system evaluation starts at 10:00 a.m., and the leaderboard is updated daily.
4. At the end of Season 2 (10:00 a.m., January 9, 2020), the top 20 teams enter code review. Code review requires a clear code structure: following the submitted instructions, reviewers must be able to run the training and inference scripts directly and reproduce the competition results. Code consistency is also checked during this phase. Representatives of the top 10 teams whose code passes review will be invited to the final.

VNNI Track, November 28, 2019 — January 20, 2020

Because pathological images are very large as inputs, typically several gigabytes and billions of pixels, conventional NVIDIA GPUs cannot hold much global image context and the inference process is inefficient. This track is supported by Intel: participants can escape the GPU memory limit and verify the engineering efficiency of Intel VNNI on ultra-high-resolution pathological images.
1. From November 28, the top 100 teams of the preliminary round may register for the track, with 30 seats opened in order of registration.
2. On December 20, teams with no results will be eliminated, and the vacated seats will be reopened in the order of first-phase registration.

Final (tentatively scheduled for February 2020)

1. The final takes the form of an on-site defense. Qualified teams must prepare defense materials in advance, including defense slides (PPT), a competition summary, and the core algorithm code.
2. Each team may send no more than 3 representatives to the final; travel expenses are borne by the organizing committee, with specific arrangements to be announced later.
3. During the on-site defense, the judges give a comprehensive score based on each contestant's technical thinking, theoretical depth, and on-site performance.
4. The final score is a weighted combination of each team's algorithm score and defense score; the specific weighting will be announced later, and awards are given according to the final score.

Awards

Main competition prizes:

Champion: 1 team, a prize of RMB 150,000 and an award certificate.
Runner-up: 2 teams, each with a prize and an award certificate.
Third place: 3 teams, each with a prize of RMB 50,000 and an award certificate.
Winning prize: 4 teams, each with a prize of RMB 10,000 and an award certificate.
(Awards are decided by the results of the final defense.)

VNNI track prizes:

First prize: 1 team, a bonus of $60,000.
Second prize: 2 teams, each with a bonus of $30,000.
Third prize: 3 teams, each with a bonus of $20,000.
Winning prize: 4 teams, each with a bonus of $5,000.

Organizing units:

Sponsor: Intel (China) Co., Ltd.

Directing unit: Peking Union Medical College Hospital

Partners: Ningbo Jiangfeng Biological Information Technology Co., Ltd.; Beijing Qingwutong Health Technology Co., Ltd.