mirror of https://github.com/tldr-pages/tldr.git synced 2025-04-22 21:42:08 +02:00
tldr/pages/common/htmlq.md
J Wong 549ba9c42f
htmlq: add page (#15625)
Co-authored-by: Wiktor Perskawiec <git@spageektti.cc>
Co-authored-by: Juri Dispan <juri.dispan@posteo.net>
Co-authored-by: Managor <42655600+Managor@users.noreply.github.com>
2025-02-07 08:21:00 +02:00


htmlq

Use CSS selectors to extract content from HTML files. More information: https://github.com/mgdm/htmlq.

  • Return all elements of class `card`:

cat {{path/to/file.html}} | htmlq '.card'

  • Get the text content of the first paragraph:

cat {{path/to/file.html}} | htmlq --text 'p:first-of-type'

  • Print the `href` attribute of every link in a page:

cat {{path/to/file.html}} | htmlq --attribute href 'a'

  • Remove all images and SVGs from a page:

cat {{path/to/file.html}} | htmlq --remove-nodes 'img' --remove-nodes 'svg'

  • Pretty-print an input file and write the result to an output file:

htmlq --pretty --filename {{path/to/input.html}} --output {{path/to/output.html}}
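
The options above can be combined in a single call. The following is a minimal sketch rather than part of the upstream page: the sample document and its contents are hypothetical, and the selector is placed before `--remove-nodes` on the assumption that this option may accept more than one value and could otherwise consume the selector.

```shell
#!/bin/sh
# Hypothetical sample document, written to a temporary file for illustration only.
sample="$(mktemp)"
trap 'rm -f "$sample"' EXIT

cat > "$sample" <<'EOF'
<html><body>
<div class="card"><p>First paragraph</p><a href="https://example.com/a">A</a></div>
<div class="card"><p>Second paragraph</p><img src="logo.png"></div>
</body></html>
EOF

if command -v htmlq >/dev/null 2>&1; then
  # Print only the href values of links found inside .card elements.
  htmlq --filename "$sample" --attribute href '.card a'

  # Print the text of each .card element after removing images.
  # The selector comes first so it is not read as a --remove-nodes value.
  htmlq --filename "$sample" --text '.card' --remove-nodes 'img'
else
  # Assumption: htmlq may not be installed on this machine.
  echo "htmlq is not installed; skipping the example" >&2
fi
```

Reading the input with `--filename` avoids the extra `cat` process; piping through `cat` as in the examples above gives the same result.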