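OCR software, free and offline. Open-source, free, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Ships with built-in language libraries for multiple languages.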
Updated May 31, 2025 - Python
CnOCR: Chinese/English OCR Python toolkit based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. (A Chinese/English OCR Python package based on PyTorch/MXNet.)
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
OCR, Archive, Index and Search: Implementation-agnostic OCR framework.
Lightweight & fast OCR models for license plate text recognition.
A carefully designed OCR pipeline for universal bordered table recognition and reconstruction.
Perform text detection in a variety of languages with your computer webcam using Google Tesseract OCR and OpenCV. This script achieves a real-time OCR effect via multi-threading.
Manga OCR snipping application for desktop
Collection of PDF parsing libraries such as the AI-based docling, Claude, OpenAI, Gemini, Meta's llama-vision, and unstructured-io, as well as pdfminer, pymupdf, pdfplumber, etc., for efficient snapshot, text, table, and metadata extraction.
Anansi is a Python-based crawler combining computer vision (cv2 and FFmpeg) with OCR (EasyOCR and Tesseract) for finding and extracting questions and correct answers from video files of popular TV game shows in the Balkan region.
Python 3 package for Chinese/English OCR that uses the paddleocr-v5 ONNX model (~20 MB), with ultra-fast inference speed. Inference is based on the ppocr-v5-onnx model, an open-source SOTA for Chinese/English OCR.
FLOSS software for Persian optical character recognition
PDF text data extraction web app with OCR for scanned documents
Easter2.0: Improving Convolutional Models for Handwritten Text Recognition
Multimodal document parser for high-quality data understanding and extraction
OCR Tamil detects and recognizes Tamil text in images, with high accuracy on natural scenes
Custom C++ implementation of deep-learning-based OCR
Turn any OCR model into an online inference API endpoint 🚀 🌖