Turn HTML into equivalent Markdown-structured text.
Meta
Author: Aaron Swartz
Maintainer: Alireza Savand
Requires Python: >=3.9
Classifiers
Development Status
- 5 - Production/Stable
Intended Audience
- Developers
Operating System
- OS Independent
Programming Language
- Python
- Python :: 3
- Python :: 3.9
- Python :: 3.10
- Python :: 3.11
- Python :: 3.12
- Python :: 3.13
- Python :: 3 :: Only
- Python :: Implementation :: CPython
- Python :: Implementation :: PyPy
html2text
html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).
Usage: html2text [filename [encoding]]
| Option | Description |
|---|---|
| --version | Show program's version number and exit |
| -h, --help | Show this help message and exit |
| --ignore-links | Don't include any formatting for links |
| --escape-all | Escape all special characters. Output is less readable, but avoids corner-case formatting issues. |
| --reference-links | Use reference links instead of inline links to create Markdown |
| --mark-code | Mark preformatted and code blocks with [code]...[/code] |
For a complete list of options, see the docs.
Or you can use it from within Python:
>>> import html2text
>>>
>>> print(html2text.html2text("<p><strong>Zed's</strong> dead baby, <em>Zed's</em> dead.</p>"))
**Zed's** dead baby, _Zed's_ dead.
Or with some configuration options:
>>> import html2text
>>>
>>> h = html2text.HTML2Text()
>>> # Ignore converting links from HTML
>>> h.ignore_links = True
>>> print(h.handle("<p>Hello, <a href='https://www.google.com/earth/'>world</a>!"))
Hello, world!
>>> # Don't ignore links anymore; I like links
>>> h.ignore_links = False
>>> print(h.handle("<p>Hello, <a href='https://www.google.com/earth/'>world</a>!"))
Hello, [world](https://www.google.com/earth/)!
Originally written by Aaron Swartz. This code is distributed under the GPLv3.
How to install
html2text is available on PyPI:
https://pypi.org/project/html2text/
$ pip install html2text
Development
How to run unit tests
$ tox
To see the coverage results:
$ coverage html
Then open the ./htmlcov/index.html file in your browser.
Code Quality & Pre Commit
The CI runs several linting steps, including:
- mypy
- Flake8
- Black
To make sure the code passes the CI linting steps, run:
$ tox -e pre-commit
Documentation
Documentation lives here
Release history
| Version | Release date |
|---|---|
| 2025.4.15 | Apr 15, 2025 |
| 2024.2.26 | Feb 27, 2024 |
| 2024.2.25 | Feb 25, 2024 |
| 2020.1.16 | Jan 16, 2020 |
| 2019.9.26 | Sep 26, 2019 |
| 2019.8.11 | Aug 11, 2019 |
| 2018.1.9 | Jan 10, 2018 |
| 2017.10.4 | Oct 04, 2017 |
| 2016.9.19 | Sep 18, 2016 |
| 2016.5.29 | May 29, 2016 |
| 2016.4.2 | Apr 01, 2016 |
| 2016.1.8 | Jan 08, 2016 |
| 2015.11.4 | Nov 04, 2015 |
| 2015.6.21 | Jun 21, 2015 |
| 2015.6.12 | Jun 12, 2015 |
| 2015.6.6 | Jun 05, 2015 |
| 2015.4.14 | Apr 14, 2015 |
| 2015.4.13 | Apr 13, 2015 |
| 2015.2.18 | Feb 18, 2015 |
| 2014.12.29 | Dec 29, 2014 |
| 2014.12.24 | Dec 24, 2014 |
| 2014.12.5 | Dec 05, 2014 |
| 2014.9.25 | Sep 25, 2014 |
| 2014.9.8 | Sep 08, 2014 |
| 2014.9.7 | Sep 07, 2014 |
| 2014.7.3 | Jul 03, 2014 |
| 2014.4.5 | Apr 05, 2014 |
| 3.200.3 | Jan 07, 2012 |
| 3.200.2 | Jan 06, 2012 |
| 3.200.1 | Dec 21, 2011 |
| 3.200.0 | Dec 21, 2011 |
| 3.101 | Nov 09, 2011 |
| 3.2 | Apr 19, 2011 |
| 3.1 | Feb 16, 2011 |
| 2.38 | Feb 04, 2010 |
| 2.37 | Feb 02, 2010 |
| 2.35 | Dec 14, 2008 |
No dependencies