From cefbc950dac4e25f2d90c48126a0dfef1d5207ae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?M=C3=A1t=C3=A9=20Gy=C3=B6ngy=C3=B6si?=
Date: Wed, 9 Jul 2025 05:12:10 +0200
Subject: [PATCH] ocrmypdf: replace semi-duplicate with optimization (#17175)

Co-authored-by: Managor <42655600+Managor@users.noreply.github.com>
---
 pages/common/ocrmypdf.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/pages/common/ocrmypdf.md b/pages/common/ocrmypdf.md
index a3ae7c8273..57135a2a72 100644
--- a/pages/common/ocrmypdf.md
+++ b/pages/common/ocrmypdf.md
@@ -5,11 +5,7 @@
 
 - Create a new searchable PDF/A file from a scanned PDF or image file:
 
-`ocrmypdf {{path/to/input_file}} {{path/to/output.pdf}}`
-
-- Replace a scanned PDF file with a searchable PDF file:
-
-`ocrmypdf {{path/to/file.pdf}} {{path/to/file.pdf}}`
+`ocrmypdf {{path/to/input}} {{path/to/output.pdf}}`
 
 - Skip pages of a mixed-format input PDF file that already contain text:
 
@@ -17,11 +13,15 @@
 
 - Clean, de-skew, and rotate pages of a poor scan:
 
-`ocrmypdf --clean --deskew --rotate-pages {{path/to/input_file}} {{path/to/output.pdf}}`
+`ocrmypdf --clean --deskew --rotate-pages {{path/to/input.pdf}} {{path/to/output.pdf}}`
 
-- Set the metadata of the searchable PDF file:
+- Perform lossy optimization on a PDF without performing any OCR:
 
-`ocrmypdf --title "{{title}}" --author "{{author}}" --subject "{{subject}}" --keywords "{{keyword; key phrase; ...}}" {{path/to/input_file}} {{path/to/output.pdf}}`
+`ocrmypdf --tesseract-timeout 0 --optimize 2 --skip-text {{path/to/input.pdf}} {{path/to/output.pdf}}`
+
+- Set the metadata of a searchable PDF file:
+
+`ocrmypdf --title "{{title}}" --author "{{author}}" --subject "{{subject}}" --keywords "{{keyword; key phrase; ...}}" {{path/to/input.pdf}} {{path/to/output.pdf}}`
 
 - Display help:
 