Pypandoc provides a thin wrapper for pandoc, a universal document converter.
Pypandoc uses pandoc, so it needs an available installation of pandoc. For some common cases (wheels, conda packages), pypandoc already includes pandoc (and pandoc-citeproc) in its prebuilt package.
If pandoc is already installed (i.e. pandoc is in the
PATH), pypandoc uses the version with the
higher version number, and if both are the same, the already installed version. See Specifying the location of pandoc binaries for more.
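The selection rule above can be illustrated with a small sketch. The helpers `version_key` and `pick_pandoc` are hypothetical names used only for illustration and are not part of pypandoc's API; the code mirrors the stated rule (the higher version wins, ties go to the already installed copy) and nothing more.

```python
def version_key(version):
    """Turn a dotted version string such as '1.19.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def pick_pandoc(bundled_version, installed_version):
    """Return 'bundled' or 'installed' following the rule described above.

    Hypothetical helper for illustration only; not pypandoc's own code.
    """
    if version_key(bundled_version) > version_key(installed_version):
        return "bundled"
    # Equal versions (or a newer installed copy) favour the installed pandoc.
    return "installed"


print(pick_pandoc("1.19.2.1", "1.17"))      # bundled
print(pick_pandoc("1.19.2.1", "1.19.2.1"))  # installed
```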
To use pandoc filters, you must have the relevant filters installed on your machine.
Install via pip install pypandoc.
If you use Linux and have your own wheelhouse,
you can build a wheel which includes pandoc with
python setup.py download_pandoc; python setup.py bdist_wheel. Be aware that this works only
on 64bit intel systems, as we only download it from the
Pypandoc is included in conda-forge. The conda packages will also install the pandoc package, so pandoc is available in the installation.
conda install -c conda-forge pypandoc.
You can also add the channel to your conda config via
conda config --add channels conda-forge. This makes it possible to
conda install pypandoc directly and also lets you update via
conda update pypandoc.
If you don't get pandoc installed via a prebuilt wheel which includes pandoc or via the conda package dependencies, you need to install pandoc yourself.
Installing via pypandoc is possible on Windows, Mac OS X or Linux (Intel-based, 64-bit):
    # expects an installed pypandoc: pip install pypandoc
    from pypandoc.pandoc_download import download_pandoc
    # see the documentation how to customize the installation path
    # but be aware that you then need to include it in the `PATH`
    download_pandoc()
The default install location is included in the search path for pandoc, so you
don't need to add it to the
By default, the latest pandoc version is installed. If you want to specify your own version, say 1.19.1, use
Installing manually via the system mechanism is also possible. Such installation mechanisms make pandoc available on many more platforms:
sudo apt-get install pandoc
sudo yum install pandoc
sudo pacman -S pandoc
brew install pandoc pandoc-citeproc Caskroom/cask/mactex
pkg install hs-pandoc
Be aware that not all install mechanisms put pandoc in the PATH, so you either have to change the PATH yourself or set the full path to the pandoc binary in PYPANDOC_PANDOC. See the next section for more information.
You can point to a specific pandoc version by setting the environment variable PYPANDOC_PANDOC to the full path to the pandoc binary.
If this environment variable is set, this is the only place where pandoc is searched for.
In certain cases, e.g. pandoc is installed but a web server with its own user cannot find the binaries, it is useful to specify the location at runtime:
    import os
    os.environ.setdefault('PYPANDOC_PANDOC', '/home/x/whatever/pandoc')
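When the binary may live in one of several places, the runtime approach can be extended to try each location in turn. The following is a minimal sketch under stated assumptions: `point_pypandoc_at` and its `candidates` list are hypothetical and not part of pypandoc, and the fallback path is the placeholder from the example above.

```python
import os
import shutil


def point_pypandoc_at(candidates, env=os.environ):
    """Set PYPANDOC_PANDOC to the first candidate that is an existing file.

    `candidates` is a hypothetical list of paths to try. Returns the chosen
    path, or None if no candidate exists (the environment is left untouched).
    """
    for path in candidates:
        if path and os.path.isfile(path):
            env["PYPANDOC_PANDOC"] = path
            return path
    return None


# Try whatever `pandoc` is on PATH first, then a site-specific fallback.
chosen = point_pypandoc_at([shutil.which("pandoc"), "/home/x/whatever/pandoc"])
print(chosen)
```

`shutil.which` returns None when nothing is found on the PATH, which the helper skips. Because a set PYPANDOC_PANDOC is the only place searched, the helper leaves the variable unset when no candidate exists.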
There are two basic ways to use pypandoc: with input files or with input strings.
    import pypandoc

    # With an input file: it will infer the input format from the filename
    output = pypandoc.convert_file('somefile.md', 'rst')

    # ...but you can overwrite the format via the `format` argument:
    output = pypandoc.convert_file('somefile.txt', 'rst', format='md')

    # alternatively you could just pass some string. In this case you need to
    # define the input format:
    output = pypandoc.convert_text('# some title', 'rst', format='md')
    # output == 'some title\r\n==========\r\n\r\n'
convert_text expects this string to be unicode or utf-8 encoded bytes.
convert_* will always
return a unicode string.
It's also possible to let pandoc write the output directly to a file. This is the only way to
convert to some output formats (e.g. odt, docx, epub, epub3, pdf). In that case
convert_* will return an empty string.
    import pypandoc

    output = pypandoc.convert_file('somefile.md', 'docx', outputfile="somefile.docx")
    assert output == ""
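A script that converts to many formats has to decide when an output file is needed. The sketch below is one way to do that; `output_path_for` and the extension table are hypothetical, and the table covers only the formats named above rather than everything pandoc treats as file-only.

```python
import os

# Formats the text above lists as needing an output file, with the file
# extension this sketch assumes for each of them.
FILE_ONLY_FORMATS = {
    "odt": ".odt",
    "docx": ".docx",
    "epub": ".epub",
    "epub3": ".epub",
    "pdf": ".pdf",
}


def output_path_for(source, to):
    """Return an output file name for file-only formats, else None."""
    extension = FILE_ONLY_FORMATS.get(to)
    if extension is None:
        return None
    base, _ = os.path.splitext(source)
    return base + extension


print(output_path_for("somefile.md", "docx"))  # somefile.docx
print(output_path_for("somefile.md", "rst"))   # None
```

The result could then be passed along as `pypandoc.convert_file(source, to, outputfile=output_path_for(source, to))`, assuming that `outputfile=None` behaves like omitting the argument.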
In addition to format, it is possible to pass extra_args.
That makes it possible to access various pandoc options easily.
    output = pypandoc.convert_text(
        '<h1>Primary Heading</h1>', 'md', format='html',
        extra_args=['--atx-headers'])
    # output == '# Primary Heading\r\n'

    # convert_text is used here as well, since plain convert() is deprecated
    output = pypandoc.convert_text(
        '# Primary Heading', 'html', format='md',
        extra_args=['--base-header-level=2'])
    # output == '<h2 id="primary-heading">Primary Heading</h2>\r\n'
pypandoc now supports easy addition of pandoc filters.
    filters = ['pandoc-citeproc']
    pdoc_args = ['--mathjax', '--smart']
    output = pypandoc.convert_file(filename, to='html5', format='md',
                                   extra_args=pdoc_args, filters=filters)
Please pass any filters in as a list and not as a string.
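If filter names come from configuration or the command line, a small guard can make sure a list is always passed. This is a sketch only; `as_filter_list` is a hypothetical helper and not part of pypandoc.

```python
def as_filter_list(filters):
    """Wrap a single filter name in a list; pass lists and tuples through."""
    if filters is None:
        return []
    if isinstance(filters, str):
        return [filters]
    return list(filters)


print(as_filter_list("pandoc-citeproc"))    # ['pandoc-citeproc']
print(as_filter_list(["pandoc-citeproc"]))  # ['pandoc-citeproc']
```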
Please refer to
pandoc -h and the
official documentation for further details.
Note: the old way of using convert(input, output) is deprecated as in some cases it wasn't possible to determine whether the input should be used as a filename or as text.
Pandoc supports custom formatting through the -V parameter. In order to use it through pypandoc, use code such as this:
output = pypandoc.convert_file('demo.md', 'pdf', outputfile='demo.pdf', extra_args=['-V', 'geometry:margin=1.5cm'])
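When several variables are needed, the argument list can be built from a mapping so that the flag and its value always end up as separate list items. The helper name `variable_args` is hypothetical; the `name:value` form follows the example above.

```python
def variable_args(variables):
    """Build pandoc `-V` arguments from a mapping of variable names to values.

    Each variable becomes two list items: the flag and 'name:value'.
    """
    args = []
    for name, value in variables.items():
        args.extend(["-V", "{}:{}".format(name, value)])
    return args


print(variable_args({"geometry": "margin=1.5cm"}))
# ['-V', 'geometry:margin=1.5cm']
```

The returned list can be passed as extra_args, or concatenated with other options first.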
Note: it's important to separate
-V and its argument within a list like that or else it won't work. This gotcha has to do with the way
It can sometimes be useful to check which pandoc version is available on your system or which particular pandoc binary is used by pypandoc. For that, pypandoc provides the following utility functions. Example:
    print(pypandoc.get_pandoc_version())
    print(pypandoc.get_pandoc_path())
    print(pypandoc.get_pandoc_formats())
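The format listing can be used to check a conversion before attempting it. The sketch below assumes that get_pandoc_formats() returns a pair of lists (input formats, output formats); `can_convert` is a hypothetical helper, and stand-in data is used so that the example runs without pandoc installed.

```python
def can_convert(source_format, target_format, formats):
    """Check a conversion against a (from_formats, to_formats) pair."""
    from_formats, to_formats = formats
    return source_format in from_formats and target_format in to_formats


# Stand-in data; with pypandoc available, pass
# pypandoc.get_pandoc_formats() instead (assumed to return such a pair).
sample_formats = (["markdown", "html", "rst"], ["html", "rst", "docx"])
print(can_convert("markdown", "docx", sample_formats))  # True
print(can_convert("docx", "html", sample_formats))      # False
```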
convert_text similar to pypandoc's. Its focus is on writing and running pandoc filters though.
Contributions are welcome. When opening a PR, please keep the following guidelines in mind:
flake8 pypandoc/*.py tests.py
README.md unless you are already there. In that case tweak your contributions.
Note that for citeproc tests to pass you'll need to have pandoc-citeproc installed. If you installed a prebuilt wheel or conda package, it is already included.
setup.py fail hard if pandoc is missing, Travis, Dockerfile, PyPI badge, Tox, PEP-8, improved documentation
extra_args example to README
_get_pandoc_urls for installing arbitrary version as well as the latest version of pandoc. Minor: README, Travis, setup.py.
Pypandoc is available under the MIT license. See LICENSE for more details. Pandoc itself is available under the GPL2 license.