How to convert web pages to PDF with Python

When writing code documents in Jupyter Notebook, you sometimes need to export a PDF version, but Jupyter reports an error. I had assumed that, apart from debugging the export online, there was no other way to generate the PDF.

A search on Baidu turned up many blogs recommending pdfkit, a third-party Python library that can generate PDF files from web pages, HTML files, and strings.

Plenty of software offers PDF generation, but pdfkit keeps the whole workflow in Python, so let's see how to use it!

**Three steps to automatically generate PDF documents:**

  1. Use pip to install the pdfkit library

With Python 3.x, run the following on the command line:

pip install pdfkit

Installation is usually trouble-free. When the message Successfully installed pdfkit-0.6.1 appears, the installation has succeeded.
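
If the pip output has already scrolled away, the installation can also be checked from Python. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is made up for illustration and is not part of pdfkit.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if pdfkit is installed, otherwise None
print(installed_version("pdfkit"))
```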

  2. Install the wkhtmltopdf.exe file

Note: pdfkit is a Python package built on top of wkhtmltopdf, so wkhtmltopdf.exe must be installed as well. wkhtmltopdf is lightweight and very easy to install.

Download link:

https://wkhtmltopdf.org/downloads.html

After the download completes, run the installer and click Next through each step to install wkhtmltopdf.

Be sure to note the installation directory and find the absolute path of the wkhtmltopdf.exe file; you will need it later.

On my machine the default path is "C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"
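
If you are not sure where the executable ended up, a small lookup can help. This is only a sketch: the function name `find_wkhtmltopdf` is hypothetical, and it assumes the executable is either on the system PATH or at the default Windows path quoted above.

```python
import os
import shutil

# Default install location mentioned above (Windows); adjust for your machine
DEFAULT_WINDOWS_PATH = r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"


def find_wkhtmltopdf(default=DEFAULT_WINDOWS_PATH):
    """Return the path to wkhtmltopdf, or None if it cannot be found."""
    # First check whether the executable is reachable through PATH
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Otherwise fall back to the given default location
    if os.path.isfile(default):
        return default
    return None


print(find_wkhtmltopdf())
```

A return value of None suggests the installation directory was changed during setup, in which case the path has to be supplied by hand.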

  3. Use the pdfkit library to generate PDF files

As mentioned earlier, pdfkit can generate pdf files from webpages, html files, and strings.

# Import the library
import pdfkit


def url_to_pdf(url, to_file):
    """Generate a PDF file from a web page."""
    # Pass the absolute path of wkhtmltopdf.exe to the configuration object
    path_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
    config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
    # Generate the PDF file; to_file is the output file path
    pdfkit.from_url(url, to_file, configuration=config)
    print('Done')


# Pass in the URL of a Zhihu column article and convert it to PDF
url_to_pdf(r'https://zhuanlan.zhihu.com/p/69869004', 'out_1.pdf')

# Import the library
import pdfkit


def html_to_pdf(html, to_file):
    """Generate a PDF file from an HTML file."""
    # Pass the absolute path of wkhtmltopdf.exe to the configuration object
    path_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
    config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
    # Generate the PDF file; to_file is the output file path
    pdfkit.from_file(html, to_file, configuration=config)
    print('Done')


html_to_pdf('sample.html', 'out_2.pdf')

# Import the library
import pdfkit


def str_to_pdf(string, to_file):
    """Generate a PDF file from a string."""
    # Pass the absolute path of wkhtmltopdf.exe to the configuration object
    path_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
    config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
    # Generate the PDF file; to_file is the output file path
    pdfkit.from_string(string, to_file, configuration=config)
    print('Done')


str_to_pdf('This is test!', 'out_3.pdf')
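
The three pdfkit functions also accept an `options` dictionary that is passed through to wkhtmltopdf as command-line options. The sketch below is an illustration rather than part of the original walkthrough: the option values and the function name `str_to_pdf_with_options` are assumptions, and `wkhtmltopdf_path` must point at your own installation.

```python
# wkhtmltopdf options, written without the leading "--" (values are examples)
PDF_OPTIONS = {
    "page-size": "A4",
    "encoding": "UTF-8",
    "margin-top": "15mm",
}


def str_to_pdf_with_options(string, to_file, wkhtmltopdf_path, options=None):
    """Generate a PDF from a string, passing layout options to wkhtmltopdf."""
    import pdfkit  # imported here so the options dict can be inspected without pdfkit

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    pdfkit.from_string(
        string,
        to_file,
        options=options or PDF_OPTIONS,
        configuration=config,
    )
```

Setting the encoding explicitly is mainly useful when the string contains non-ASCII text.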

  4. Conclusion

This article showed how to use the pdfkit library to generate PDF files in Python. The approach is convenient and fast, and well suited to batch automation.
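
As a rough idea of what batch automation could look like, here is a minimal sketch that converts a list of URLs in a loop. The names `output_name` and `batch_to_pdf` and the file naming scheme are assumptions made for illustration; only `pdfkit.configuration` and `pdfkit.from_url` come from the library.

```python
from pathlib import Path
from urllib.parse import urlparse


def output_name(url, index):
    """Build a file name such as '01_example.com.pdf' (naming scheme is an assumption)."""
    host = urlparse(url).netloc or "page"
    # A colon (from a port number) is not allowed in Windows file names
    return f"{index:02d}_{host.replace(':', '_')}.pdf"


def batch_to_pdf(urls, out_dir, wkhtmltopdf_path):
    """Convert each URL to a PDF in out_dir and return the list of failures."""
    import pdfkit  # imported here so output_name works without pdfkit installed

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for index, url in enumerate(urls, start=1):
        target = out_dir / output_name(url, index)
        try:
            pdfkit.from_url(url, str(target), configuration=config)
        except OSError as exc:
            # pdfkit reports wkhtmltopdf failures as IOError/OSError
            failed.append((url, str(exc)))
    return failed
```

Collecting failures instead of stopping at the first error lets a long batch finish, after which the failed URLs can be retried.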

Here is how the generated PDF looks:

(Image: display of the generated PDF)

The overall page looks good, so give it a try!
