RECENT UPDATES

  • 2022.5.9 Release PaddleOCR v2.5, including:
    • PP-OCRv3: At comparable speed, performance on Chinese scenes improves by 5% over PP-OCRv2, performance on English scenes improves by 11%, and the average recognition accuracy of the multilingual models covering 80 languages improves by more than 5%.
    • PPOCRLabelv2: Add annotation support for table recognition, key information extraction, and irregular text images.
    • Interactive e-book "Dive into OCR", covering the cutting-edge theory and code practice of full-stack OCR technology.
  • 2022.5.7 Add support for logging metrics and models to Weights & Biases during training.
  • 2021.12.21 The open-source OCR online course starts. Lessons begin at 8:30 every night, and the course lasts for ten days. Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207
  • 2021.12.21 Release PaddleOCR v2.4, which adds 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR, SEED, SAR), 1 key information extraction algorithm (SDMGR) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
  • 2021.9.7 Release PaddleOCR v2.3, which introduces PP-OCRv2. The CPU inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
  • 2021.8.3 Release PaddleOCR v2.2, which adds a new structured document analysis toolkit, PP-Structure, supporting layout analysis and table recognition (one-click export of chart images to Excel files).
  • 2021.4.8 Release the end-to-end text recognition algorithm PGNet, published at AAAI 2021 (a tutorial is available). Release multilingual recognition models supporting more than 80 languages; in particular, the performance of the English recognition model is optimized.

  • 2021.1.21 Update 25+ multilingual recognition models (see the models list), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic and so on. Models for more languages will continue to be updated (see the Develop Plan).

  • 2020.12.15 Update the data synthesis tool Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.

  • 2020.11.25 Add a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. Moreover, the labeling results can be used directly to train the PP-OCR system.

  • 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941

  • 2020.9.19 Update the ultra-lightweight compressed ppocr_mobile_slim series models; the overall model size is 3.5M, suitable for mobile deployment.

  • 2020.9.17 Update the English recognition model and the multilingual recognition models; English, Chinese, German, French, Japanese and Korean are now supported. Models for more languages will continue to be updated.

  • 2020.8.24 Support using PaddleOCR through whl package installation; please refer to PaddleOCR Package.

  • 2020.8.16 Release text detection algorithm SAST and text recognition algorithm SRN

  • 2020.7.23 Release the playback and slides of the live class "PaddleOCR Introduction" on BiliBili.

  • 2020.7.15 Add a mobile app demo supporting both iOS and Android (based on EasyEdge and Paddle Lite).

  • 2020.7.15 Improve deployment capabilities: add C++ inference and serving deployment. In addition, benchmarks of the ultra-lightweight Chinese OCR model are provided.

  • 2020.7.15 Add several related datasets, as well as data annotation and synthesis tools.

  • 2020.7.9 Add a new model that supports recognizing the "space" character.

  • 2020.7.9 Add data augmentation and learning rate decay strategies for training.

  • 2020.6.8 Add datasets and keep updating

  • 2020.6.5 Support exporting attention model to inference_model

  • 2020.6.5 Support separate prediction and recognition, and output the result score

  • 2020.5.30 Provide Lightweight Chinese OCR online experience

  • 2020.5.30 Support model prediction and training on Windows

  • 2020.5.30 Open source general Chinese OCR model

  • 2020.5.14 Release PaddleOCR Open Class

  • 2020.5.14 Release PaddleOCR Practice Notebook

  • 2020.5.14 Open source 8.6M lightweight Chinese OCR model