mirror of https://github.com/tldr-pages/tldr.git synced 2025-04-23 19:02:09 +02:00
tldr/pages/common/tesseract.md
Lucas Gabriel Schneider a5fe31bc47
multiple pages: format technical tokens (#5119)
Co-authored-by: bl-ue <54780737+bl-ue@users.noreply.github.com>
Co-authored-by: Starbeamrainbowlabs <sbrl@starbeamrainbowlabs.com>
2021-01-31 12:05:18 -05:00


# tesseract

> OCR (Optical Character Recognition) engine.
> More information: <https://github.com/tesseract-ocr/tesseract>.

- Recognize text in an image and save it to `output.txt` (the `.txt` extension is added automatically):

`tesseract {{image.png}} {{output}}`

- Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):

`tesseract -l deu {{image.png}} {{output}}`

- List the ISO 639-2 codes of available languages:

`tesseract --list-langs`

- Specify a custom page segmentation mode (default is 3; releases before 4.0 spell the option `-psm`):

`tesseract --psm {{0_to_13}} {{image.png}} {{output}}`

- List page segmentation modes and their descriptions:

`tesseract --help-psm`
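
The commands above can be combined to process a whole directory. The following is a minimal sketch, not part of the upstream page: the function name `ocr_dir` and the variables `TESSERACT` and `OCR_LANG` are hypothetical names chosen for this example. It assumes the images are PNG files and relies only on the `-l` option and the automatic `.txt` suffix described above.

```shell
#!/bin/sh
# Sketch: OCR every PNG in a directory, writing page.txt next to page.png.
# TESSERACT, OCR_LANG, and ocr_dir are hypothetical names for this example.
TESSERACT="${TESSERACT:-tesseract}"   # override to dry-run, e.g. TESSERACT=echo
OCR_LANG="${OCR_LANG:-eng}"           # language code passed to -l

ocr_dir() {
    dir="$1"
    for image in "$dir"/*.png; do
        # Skip the unexpanded pattern when the directory has no PNG files.
        [ -e "$image" ] || continue
        # Tesseract appends ".txt" itself, so pass the path without an extension.
        output_base="${image%.png}"
        "$TESSERACT" -l "$OCR_LANG" "$image" "$output_base"
    done
}
```

Calling `ocr_dir ./scans` would then run one `tesseract` invocation per image. Setting `TESSERACT=echo` first prints the argument lists instead of running the OCR, which is a cheap way to check the derived output names.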