mirror of https://github.com/tldr-pages/tldr.git synced 2025-04-21 21:22:06 +02:00
tldr/pages/common/tesseract.md
Marius Hoch bd7c605ed6
tesseract: fix --psm parameter name (#15040)
This parameter has been renamed in tesseract 3.05.00,
and the legacy `-psm` alias has been removed in version 4.
2024-12-07 15:09:57 +01:00

tesseract

OCR (Optical Character Recognition) engine. More information: https://github.com/tesseract-ocr/tesseract.

  • Recognize text in an image and save it to output.txt (the .txt extension is added automatically):

tesseract {{image.png}} {{output}}

  • Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):

tesseract -l deu {{image.png}} {{output}}

  • List the ISO 639-2 codes of available languages:

tesseract --list-langs

  • Specify a custom page segmentation mode (default is 3):

tesseract --psm {{0_to_13}} {{image.png}} {{output}}

  • List page segmentation modes and their descriptions:

tesseract --help-psm
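
The commands above can be combined to OCR a whole directory. The following is a minimal sketch, not part of the original page: `ocr_dir`, `TESSERACT`, `OCR_LANG` and `OCR_PSM` are hypothetical names chosen for this example, and only the `-l` and `--psm` options shown above are used. It relies on tesseract adding the `.txt` extension to the output base itself.

```shell
#!/bin/sh
# Sketch: OCR every PNG in a directory, writing NAME.txt next to NAME.png.
# ocr_dir, TESSERACT, OCR_LANG and OCR_PSM are hypothetical names, not tesseract features.

ocr_dir() {
    dir=$1
    for img in "$dir"/*.png; do
        # Skip the unexpanded pattern when the directory contains no PNG files.
        [ -e "$img" ] || continue
        # Strip the extension; tesseract appends .txt to the output base.
        base=${img%.*}
        # Defaults mirror the page: English language, page segmentation mode 3.
        "${TESSERACT:-tesseract}" -l "${OCR_LANG:-eng}" --psm "${OCR_PSM:-3}" "$img" "$base"
    done
}
```

For example, `OCR_LANG=deu; OCR_PSM=6; ocr_dir ./scans` would run tesseract once per PNG in `./scans` with German language data and mode 6, assuming that language data is installed.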