Radiology
Basic framework of the radiology product
The services on each server all follow the uniform structure above.
End-to-end processing time statistics for the test environment

| Stage | Average time | Share of total |
|---|---|---|
| Whole process | 950 s | - |
| Waiting to pull data | 630 s | 66% |
| PACS module receives data | 19 s | 2% |
| Back-end processing | 27 s | 3% |
| Algorithm queue wait | 189 s | 20% |
| Algorithm module calculation | 85 s | 9% |
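Each percentage above is the stage's average time divided by the whole-process time. A minimal Python sketch of that arithmetic, using the figures from the statistics above (the dictionary keys are illustrative names, not identifiers from the system):

```python
# Average stage times in seconds, copied from the statistics above.
stage_seconds = {
    "waiting_to_pull": 630,
    "pacs_receive": 19,
    "backend_processing": 27,
    "algorithm_queue": 189,
    "algorithm_calculation": 85,
}

# The stages add up to the whole-process time.
total = sum(stage_seconds.values())


def share_percent(seconds: int, total_seconds: int) -> int:
    """Return a stage's share of the total, rounded to a whole percent."""
    return round(100 * seconds / total_seconds)


shares = {name: share_percent(s, total) for name, s in stage_seconds.items()}

for name, pct in shares.items():
    print(f"{name}: {stage_seconds[name]} s ({pct}%)")
```

Waiting to pull data and waiting in the algorithm queue together account for most of the total, which is why the sections below concentrate on the data pull module.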
Data pull module
Overall architecture diagram
Time consumption analysis of the existing scheme (using the default configuration)
Find: periodic check → DB (3 min)
Move: periodic check → DB (1 min) → delay (10 min)
(Receiving and compressing DICOM takes only seconds and is ignored.)
The overall delay is typically 10-11 minutes. (An incorrect server clock affects this value.)
New Design (first draft)
New design logic
The Find cycle queries PACS periodically. If the number of images under a studyUID is unchanged on the next query, we assume PACS has received the complete study. (Note: PACS itself has no way of knowing when a study is complete.)
Find and Move use RabbitMQ as the messaging channel, which responds more promptly and consumes less CPU and IO than periodically polling the database.
Find queries PACS using the original service back end's filter conditions. This reduces the need for RIS integration and lets data be fetched on demand, instead of pulling in bulk and then filtering and discarding (which wastes bandwidth, disk IO, disk storage, and CPU).
The interface between the data pull module and the service back end changes from the data pull module's RESTful API to an MQ boundary, and the service back end consumes messages according to its own processing capacity.
Image compression is deferred. The existing scheme compresses each image immediately after it is pulled, so the business back end and the algorithm module incur extra CPU overhead for decompression (typically 3 seconds) when processing it. The new scheme compresses images after the algorithm module has finished.
Best-case time analysis of the new design
Find: periodic check (1 min)
If the study's instance count stays the same, publish a message to Move via RabbitMQ
The best-case time is about 1.5 minutes (the actual time depends on how the hospital operates)
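The Find-side rule described above (publish to Move once a study's image count stops changing between two consecutive checks) can be sketched as follows. This is a minimal illustration, not the actual implementation: `StudyStabilityTracker`, `observe`, and the `publish` callback are hypothetical names, and the callback stands in for whatever publishes to the RabbitMQ queue.

```python
from typing import Callable, Dict, List, Set


class StudyStabilityTracker:
    """Tracks per-study image counts across Find cycles (hypothetical helper)."""

    def __init__(self, publish: Callable[[str], None]) -> None:
        self._publish = publish  # assumption: sends the studyUID to the Move queue
        self._last_count: Dict[str, int] = {}
        self._published: Set[str] = set()

    def observe(self, counts: Dict[str, int]) -> List[str]:
        """Process one Find cycle and return the studies published in it."""
        ready: List[str] = []
        for study_uid, count in counts.items():
            if study_uid in self._published:
                continue  # already handed over to Move
            if count > 0 and self._last_count.get(study_uid) == count:
                # Same count as the previous cycle: treat the study as complete.
                self._publish(study_uid)
                self._published.add(study_uid)
                ready.append(study_uid)
            else:
                self._last_count[study_uid] = count
        return ready


# Three consecutive Find cycles for one study.
sent: List[str] = []
tracker = StudyStabilityTracker(sent.append)
cycle1 = tracker.observe({"1.2.3": 120})  # first sighting
cycle2 = tracker.observe({"1.2.3": 300})  # still growing
cycle3 = tracker.observe({"1.2.3": 300})  # unchanged, so it is published
```

A study needs at least two checks before it is published, so the delay is bounded below by the Find interval.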
Pull by series
In the test environment, PACS reception takes about 20 seconds. Note that studies in the test environment contain a single series.
Typical studies include anteroposterior thin and thick layers, mediastinal window thin and thick layers, and positioning (localizer) images. Excluding the positioning images, which are very small, a study usually has 2 or 4 series; the average used here is 3.
Data is normally received at the study level, which takes about 60 seconds.
If only the target series is pulled at the series level, the saving is only about 40 seconds.
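The 60-second and 40-second figures follow from the numbers above. A small sketch of the arithmetic, assuming reception time scales linearly with the number of series and that only one target series is needed (both are assumptions, not measurements):

```python
SECONDS_PER_SERIES = 20    # reception time in the single-series test environment
AVG_SERIES_PER_STUDY = 3   # average of the typical 2 or 4 series per study
TARGET_SERIES = 1          # assumption: only one series is needed

# Study-level pull receives every series; series-level pull receives the target only.
study_level_seconds = SECONDS_PER_SERIES * AVG_SERIES_PER_STUDY
series_level_seconds = SECONDS_PER_SERIES * TARGET_SERIES
saved_seconds = study_level_seconds - series_level_seconds

print(f"study level: {study_level_seconds} s, "
      f"series level: {series_level_seconds} s, saved: {saved_seconds} s")
```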
Business back end
The back end takes an average of 1 minute 30 seconds from receiving data to sending the request to the algorithm module (this depends on slice thickness; the figure is the average for the Xiamen University data). It mainly does two things: (1) data archiving and (2) series screening. Most of the time goes to data archiving.
Data archiving reads the DICOM files one by one and puts images with the same seriesUID into the same folder. At the same time, key header information (patient, examination, series, and image information, etc.) is stored in the database.
Some fault tolerance is also needed: duplicate images within the same series (same instanceNumber) are deleted, and so are images with inconsistent pixels.
What can be adjusted:
- Move archiving into pacsInterface. pacsInterface already parses the DICOM header when it receives data, so it can archive by series and save the header information to the database directly, just as PACS servers do.
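The archiving and fault-tolerance steps described above (group by seriesUID, drop duplicate instanceNumber values, drop images with inconsistent pixels) could look roughly like the sketch below. It works on already-parsed header fields rather than real DICOM files. `ImageHeader` and `archive_by_series` are hypothetical names, and "inconsistent pixels" is interpreted here as a pixel matrix size that differs from the most common size in the series, which is an assumption.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple


class ImageHeader(NamedTuple):
    """Subset of DICOM header fields, assumed already parsed (hypothetical type)."""
    series_uid: str
    instance_number: int
    rows: int
    columns: int
    path: str


def archive_by_series(headers: List[ImageHeader]) -> Dict[str, List[ImageHeader]]:
    """Group images by series, then drop duplicates and odd-sized images."""
    by_series: Dict[str, Dict[int, ImageHeader]] = defaultdict(dict)
    for header in headers:
        # Keep the first image seen for each instanceNumber within a series.
        by_series[header.series_uid].setdefault(header.instance_number, header)

    archived: Dict[str, List[ImageHeader]] = {}
    for series_uid, images in by_series.items():
        # Assumption: the most common (rows, columns) pair is the valid size.
        sizes = Counter((h.rows, h.columns) for h in images.values())
        expected = sizes.most_common(1)[0][0]
        kept = [h for h in images.values() if (h.rows, h.columns) == expected]
        archived[series_uid] = sorted(kept, key=lambda h: h.instance_number)
    return archived


example = [
    ImageHeader("s1", 1, 512, 512, "a.dcm"),
    ImageHeader("s1", 1, 512, 512, "a_dup.dcm"),  # duplicate instanceNumber
    ImageHeader("s1", 2, 512, 512, "b.dcm"),
    ImageHeader("s1", 3, 256, 256, "c.dcm"),      # inconsistent matrix size
    ImageHeader("s2", 1, 512, 512, "d.dcm"),
]
result = archive_by_series(example)
kept_paths = {uid: [h.path for h in imgs] for uid, imgs in result.items()}
```

Because the grouping only needs header fields, it can run at the point where the headers are first parsed, which is the motivation for moving it into pacsInterface.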
Algorithm engineering
The small pulmonary nodule CT algorithm no longer converts data to FLOAT16, saving an average of 10 seconds per case.
Front-end UI
Front-end performance optimization
Nginx-based DICOM download (28% faster, 4 fewer CPU cores used)
With lossless compression, the image volume is reduced by nearly 50% and the page load time is roughly halved. (100M bandwidth, 1100-slice CT: 55 seconds to load before compression, only 26 seconds after lossless compression)
Operations module
Desensitization (de-identification) packaging
Main design (secondary logic omitted)
Efficiency improvement
LRU cache: use lru_cache to avoid repeating the volume estimation calculation (lru_cache does not support list arguments, so only the DB query can be cached).
Process pool: use a process pool to desensitize images. Desensitization is CPU-bound, so processes are chosen rather than threads, and a pool reuses processes instead of paying the cost of constantly starting new ones.
Query only necessary data: request only the necessary data from the database. Previously, for simplicity, the whole document was pulled from the DB; for documents such as study, pulling only the required fields is faster.
Memory filesystem: after desensitization, the DICOM images awaiting packaging are written to an in-memory filesystem, which reduces disk IO.
Archive only: DICOM images cannot be shrunk by traditional compression algorithms, so they are only packaged, not compressed, which reduces the CPU cost of packing and unpacking.
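Two of the points above can be illustrated with the standard library alone. The sketch below caches a per-series lookup with `functools.lru_cache` (the list-taking function itself cannot be cached because lists are unhashable) and builds an uncompressed tar archive. `FAKE_DB`, `series_size`, `estimate_volume`, and `pack_without_compression` are hypothetical names; the dictionary stands in for the database, and the `BytesIO` buffer stands in for the in-memory filesystem.

```python
import io
import tarfile
from functools import lru_cache
from typing import Dict, List

# Hypothetical stand-in for the database: series UID -> size in bytes.
FAKE_DB: Dict[str, int] = {"s1": 100, "s2": 250}


@lru_cache(maxsize=1024)
def series_size(series_uid: str) -> int:
    """Cached per-series lookup; a str argument is hashable, so caching works."""
    return FAKE_DB[series_uid]


def estimate_volume(series_uids: List[str]) -> int:
    """Takes a list, so it is left uncached and sums the cached lookups instead."""
    return sum(series_size(uid) for uid in series_uids)


def pack_without_compression(files: Dict[str, bytes]) -> bytes:
    """Build a plain tar archive in memory; mode "w" applies no compression."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


first = estimate_volume(["s1", "s2"])
second = estimate_volume(["s1", "s2"])  # served from the cache
cache_stats = series_size.cache_info()

archive = pack_without_compression({"a.dcm": b"\x00" * 16})
with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
    names = tar.getnames()
```

Writing with mode `"w"` rather than `"w:gz"` is what skips the compression step, so the CPU cost is limited to copying bytes into the archive.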
OCR Server
First, the plug-in client deduplicates screenshots
The plug-in hashes the image after taking a screenshot of the GUI. If the image has not changed, it is not sent to the OCR Server, which greatly reduces the load on the OCR Server.
Replace the float-based best model with the integer-based fast model
Previously, to pursue the best recognition rate, the best model was used, and sometimes even a fusion of multiple models.
The OCR recognition rate of the integer-based fast model is almost the same as that of the float-based best model, so the fast model is now selected by default.
Compared with the best model, the fast model saves 1/2 to 2/3 of the running time.
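The client-side check described above can be sketched as follows. This is only an illustration: the source does not say which hash the plug-in uses, so SHA-256 is an assumption, and `ScreenshotDeduplicator` and the `send` callback (standing in for the upload to the OCR Server) are hypothetical names.

```python
import hashlib
from typing import Callable, List, Optional


class ScreenshotDeduplicator:
    """Forwards a screenshot only when it differs from the previous one."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send  # assumption: uploads the image to the OCR Server
        self._last_digest: Optional[str] = None

    def submit(self, image_bytes: bytes) -> bool:
        """Return True if the image was sent, False if it was skipped."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest == self._last_digest:
            return False  # unchanged screenshot, nothing to recognize again
        self._last_digest = digest
        self._send(image_bytes)
        return True


sent: List[bytes] = []
dedup = ScreenshotDeduplicator(sent.append)
results = [
    dedup.submit(b"frame-1"),
    dedup.submit(b"frame-1"),  # identical to the previous frame, skipped
    dedup.submit(b"frame-2"),
]
```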
Foundations
Database
Any table field used in a find query must have an index.