model name | description | inference model size | download | dict path |
---|---|---|---|---|
picodet_lcnet_x1_0_fgd_layout | English layout analysis model trained on the PubLayNet dataset, based on PicoDet LCNet_x1_0 and FGD. The model can recognize 5 types of areas: Text, Title, Table, Picture, and List | 9.7M | inference model / trained model | PubLayNet dict |
ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model trained on the PubLayNet dataset, based on PP-YOLOv2 | 221.0M | inference model / trained model | same as above |
picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset. The model can recognize 10 types of areas, such as Table, Figure, Figure caption, Table caption, Header, Footer, Reference, and Equation | 9.7M | inference model / trained model | CDLA dict |
picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on the table dataset. The model can detect tables in Chinese and English documents | 9.7M | inference model / trained model | Table dict |
ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model trained on the TableBank Word dataset, based on PP-YOLOv2. The model can detect tables in English documents | 221.0M | inference model | same as above |
ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model trained on the TableBank Latex dataset, based on PP-YOLOv2. The model can detect tables in English documents | 221.0M | inference model | same as above |
model name | description | inference model size | download |
---|---|---|---|
en_ppocr_mobile_v2.0_table_det | Text detection model of English table scenes trained on PubTabNet dataset | 4.7M | inference model / trained model |
en_ppocr_mobile_v2.0_table_rec | Text recognition model of English table scenes trained on PubTabNet dataset | 6.9M | inference model / trained model |
If you need to use other OCR models, you can download one from the PP-OCR model_list, or use a model you trained yourself, and set its path in the `det_model_dir` and `rec_model_dir` fields.
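The `det_model_dir` and `rec_model_dir` settings mentioned above are typically supplied as command-line options. The following is a minimal sketch of assembling such a command, not a definitive invocation: the helper `build_predict_args` is hypothetical, and the script path, the `--image_dir` option, the image path, and the `inference/..._infer` directory names are assumptions chosen for illustration. Only `det_model_dir` and `rec_model_dir` come from the text above.

```python
import shlex


def build_predict_args(script, image_dir, det_model_dir, rec_model_dir, extra=None):
    """Return an argv list that passes custom OCR model directories.

    `script` and `image_dir` are assumptions for illustration; adjust them
    to match the prediction script you actually run.
    """
    args = [
        "python3",
        script,
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",  # text detection model directory
        f"--rec_model_dir={rec_model_dir}",  # text recognition model directory
    ]
    # Optional additional options, given as {"name": "value"} pairs.
    for key, value in (extra or {}).items():
        args.append(f"--{key}={value}")
    return args


# Hypothetical paths: replace with wherever you unpacked the downloaded models.
cmd = build_predict_args(
    script="ppstructure/table/predict_table.py",
    image_dir="docs/table/table.jpg",
    det_model_dir="inference/en_ppocr_mobile_v2.0_table_det_infer",
    rec_model_dir="inference/en_ppocr_mobile_v2.0_table_rec_infer",
)
print(shlex.join(cmd))
```

Building the arguments as a list (rather than one string) keeps paths with spaces intact if the list is later handed to `subprocess.run`.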
model name | description | inference model size | download |
---|---|---|---|
en_ppocr_mobile_v2.0_table_structure | English table recognition model trained on PubTabNet dataset based on TableRec-RARE | 6.8M | inference model / trained model |
en_ppstructure_mobile_v2.0_SLANet | English table recognition model trained on PubTabNet dataset based on SLANet | 9.2M | inference model / trained model |
ch_ppstructure_mobile_v2.0_SLANet | Chinese table recognition model based on SLANet | 9.3M | inference model / trained model |
On the XFUND_zh dataset, the accuracy and time cost of different models on a V100 GPU are as follows.
Model | Backbone | Task | Config | Hmean | Time cost (ms) | Download link |
---|---|---|---|---|---|---|
VI-LayoutXLM | VI-LayoutXLM-base | SER | ser_vi_layoutxlm_xfund_zh_udml.yml | 93.19% | 15.49 | trained model |
LayoutXLM | LayoutXLM-base | SER | ser_layoutxlm_xfund_zh.yml | 90.38% | 19.49 | trained model |
LayoutLM | LayoutLM-base | SER | ser_layoutlm_xfund_zh.yml | 77.31% | - | trained model |
LayoutLMv2 | LayoutLMv2-base | SER | ser_layoutlmv2_xfund_zh.yml | 85.44% | 31.46 | trained model |
VI-LayoutXLM | VI-LayoutXLM-base | RE | re_vi_layoutxlm_xfund_zh_udml.yml | 83.92% | 15.49 | trained model |
LayoutXLM | LayoutXLM-base | RE | re_layoutxlm_xfund_zh.yml | 74.83% | 19.49 | trained model |
LayoutLMv2 | LayoutLMv2-base | RE | re_layoutlmv2_xfund_zh.yml | 67.77% | 31.46 | trained model |
Test environment: V100 GPU + CUDA 10.2 + cuDNN 8.1.1 + TensorRT 7.2.3.4
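The Hmean column above is, as the term is commonly used in OCR and key information extraction evaluation, the harmonic mean of precision and recall (equivalent to the F1 score). A minimal sketch of that calculation follows. The precision and recall values are hypothetical and chosen only for illustration, since the table reports the final Hmean alone.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was predicted correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs for illustration; not taken from the tables above.
example = hmean(0.90, 0.80)
print(f"{example:.4f}")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a model cannot reach a high Hmean by trading away recall for precision or the reverse.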
On the wildreceipt dataset, the results are as follows:
Model | Backbone | Config | Hmean | Download link |
---|---|---|---|---|
SDMGR | VGG6 | configs/kie/sdmgr/kie_unet_sdmgr.yml | 86.70% | trained model |