Camelot installation tutorial: extracting tables from PDFs

The steps below have been tested on macOS and Windows 10.

Camelot: a friendly tool for extracting data from PDF tables

Camelot is a Python library and command-line tool that makes it easy for anyone to extract tabular data from PDF files.

How do you use Camelot?

Extracting data from PDF documents with Camelot is simple.

Camelot lets you control the extraction process precisely by adjusting its settings.

Bad tables can be identified by their whitespace and accuracy metrics and discarded without manual inspection.

Each table is a pandas DataFrame, which integrates easily into ETL and data analysis workflows.

You can export data to various formats such as CSV, JSON, Excel, and HTML.
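As a rough illustration of the points above, here is a minimal sketch of reading a PDF, discarding low-quality tables by their metrics, and exporting the rest to CSV. It assumes each table exposes a `parsing_report` dictionary with `accuracy` and `whitespace` entries, as Camelot's tables do. The threshold values, the helper names `keep_good_tables` and `extract_to_csv`, and the stand-in tables at the bottom are hypothetical and chosen only for this example.

```python
from types import SimpleNamespace


def keep_good_tables(tables, min_accuracy=90.0, max_whitespace=50.0):
    """Return the tables whose parsing report passes both thresholds.

    The default thresholds are illustrative assumptions, not Camelot defaults.
    """
    good = []
    for table in tables:
        report = table.parsing_report  # dict with 'accuracy' and 'whitespace'
        if (report["accuracy"] >= min_accuracy
                and report["whitespace"] <= max_whitespace):
            good.append(table)
    return good


def extract_to_csv(pdf_path, csv_prefix):
    """Read a PDF, drop low-quality tables, and write the rest as CSV files."""
    import camelot  # imported lazily so the filter above runs without Camelot

    tables = camelot.read_pdf(pdf_path)
    good = keep_good_tables(tables)
    for index, table in enumerate(good):
        print(table.df.head())  # each table's data is a pandas DataFrame
        table.to_csv(f"{csv_prefix}-{index}.csv")
    return good


# Stand-in objects (not real Camelot tables) to show the filter on its own.
fake_tables = [
    SimpleNamespace(parsing_report={"accuracy": 99.0, "whitespace": 12.0}),
    SimpleNamespace(parsing_report={"accuracy": 40.0, "whitespace": 80.0}),
]
print(len(keep_good_tables(fake_tables)), "table(s) kept")
```

With Camelot installed, calling `extract_to_csv("report.pdf", "report")` (both names are placeholders) would write one CSV file per table that passes the filter.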

  • Installing with pip:

First install Python 3.6 on your computer, then run the following on the command line:

pip install camelot-py
  • Test the installation from the Python prompt
(CLOT) C:\Users\yss>python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 28 2018, 19:44:12) [MSC v.1915 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import camelot as cl
......
    import chardet # For str encoding detection in Py3
ModuleNotFoundError: No module named 'chardet'
>>>

If the import fails as above with No module named 'chardet', return to the system command line and run:

pip install chardet

Once chardet has installed successfully, start Python and test the import again:

(CLOT) C:\Users\yss>python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 28 2018, 19:44:12) [MSC v.1915 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import camelot as cl
  File "F:\APP\Ides\Anaconda3\envs\CLOT\lib\site-packages\camelot\image_processing.py", line 5, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'
>>>

This time the error is ModuleNotFoundError: No module named 'cv2', which means the OpenCV library is not installed. Go back to the system command line and install it:

pip install opencv-python

Once these steps are complete, Camelot should be installed.
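Rather than discovering the missing modules one import error at a time, you can check for all of them up front. The following is a minimal sketch that uses the standard library's `importlib.util.find_spec` to see whether each module from the steps above can be found, and prints the matching pip command for any that cannot. The helper name `missing_packages` is hypothetical. The check only confirms that a module is findable, not that it imports cleanly.

```python
import importlib.util

# Importable module name -> pip package that provides it,
# taken from the install steps in this tutorial.
REQUIRED = {
    "camelot": "camelot-py",
    "chardet": "chardet",
    "cv2": "opencv-python",
}


def missing_packages(required=REQUIRED):
    """Return the pip package names whose modules cannot be found."""
    missing = []
    for module_name, pip_name in required.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None:
            missing.append(pip_name)
    return missing


# Print one install command per missing package (nothing if all are present).
for pip_name in missing_packages():
    print(f"pip install {pip_name}")
```

If the script prints nothing, all three modules were found in the current environment.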

  • Verifying the installation

Enter Python again and type:

import camelot as cl

The import should no longer raise an error. Print the version number:

print(cl.__version__)

The full test session looks like this:

(CLOT) C:\Users\yss>python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 28 2018, 19:44:12) [MSC v.1915 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import camelot as cl
>>> cl.__version__
'0.3.2'
>>>

With the installation complete, you can start using Camelot. I will post an update on my experience with it when I get the chance.