Slate is a Python package that simplifies the process of extracting text from PDF files. It depends on the PDFMiner package.
Slate provides one class, PDF. PDF takes a file-like object and extracts all text from the document, presenting each page as a string of text:
    >>> import slate
    >>> with open('example.pdf', 'rb') as f:  # PDFs are binary files
    ...     doc = slate.PDF(f)
    ...
    >>> doc
    [..., ..., ...]
    >>> doc[1]
    'Text from page 2...'
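Because each page is exposed as a string, ordinary list and string operations can be applied to the result. The following is a minimal sketch rather than part of Slate itself: `pages_containing` is a hypothetical helper, and `sample_pages` is a plain list standing in for a `slate.PDF` object, on the assumption that it behaves like a list of page strings as shown above.

```python
def pages_containing(pages, keyword):
    """Return 1-based page numbers whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [number for number, text in enumerate(pages, start=1)
            if needle in text.lower()]


# Stand-in for a slate.PDF object: a plain list of page strings (assumption).
sample_pages = ["Introduction and scope", "Text from page 2...", "Appendix: page index"]
print(pages_containing(sample_pages, "page"))  # prints [2, 3]
```

With a real document, `doc` from the example above could be passed in place of `sample_pages`.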
If your PDF is password-protected, pass the password as the second argument:
    >>> with open('secrets.pdf', 'rb') as f:
    ...     doc = slate.PDF(f, 'password')
    ...
    >>> doc[0]
    "My mother doesn't know this, but..."
Slate only extracts text. If you need access to images, font files, or other information in the document, take some time to learn the PDFMiner API.