How to Write a PDF File Using Python: A Comprehensive Guide

By JoeVu, at: 20:58 Ngày 30 tháng 11 năm 2023

Thời gian đọc ước tính: 5 min read

How to Write a PDF File Using Python: A Comprehensive Guide
How to Write a PDF File Using Python: A Comprehensive Guide

Introduction

Python provides a versatile set of libraries for working with PDF files, enabling developers to create, modify, and manipulate PDF documents effortlessly. In this comprehensive guide, we will explore various Python libraries for writing PDF files, covering fundamental operations and showcasing the capabilities of popular libraries such as PyPDF2, ReportLab, pdfkit, pdfrw and PyMuPDF.

There is another blog post about reading pdf files

 

1. PyPDF2: Basic PDF Writing

Installation: pip install PyPDF2

Basic PDF Writing Example:

import PyPDF2

with open('output_pypdf2.pdf', 'wb') as output_file:
    pdf_writer = PyPDF2.PdfFileWriter()
    # Add a new page to the PDF
    page = pdf_writer.addPage()
    # Set content for the page
    page.mergePage(PyPDF2.PdfReader('input.pdf').getPage(0))
    # Encrypt the PDF (optional)
    pdf_writer.encrypt('password')
    # Write the modified PDF to the output file
    pdf_writer.write(output_file)

 


This example demonstrates creating a new PDF file, adding a page, setting its content, and optionally encrypting the PDF using PyPDF2.

 

2. ReportLab: Advanced PDF Generation

Installation: pip install reportlab

Advanced PDF Generation Example:

from reportlab.pdfgen import canvas

def generate_pdf(file_path):
    # Create a new PDF document
    pdf_canvas = canvas.Canvas(file_path)

    # Add content to the PDF
    pdf_canvas.drawString(100, 750, "Hello, ReportLab!")

    # Save the PDF
    pdf_canvas.save()

# Generate a PDF using ReportLab
generate_pdf('output_reportlab.pdf')

 


In this example, ReportLab is used to generate a simple PDF document with text content at specified coordinates.

 

3. pdfkit: HTML to PDF Conversion

Installation: pip install pdfkit

HTML to PDF Conversion Example:

import pdfkit

# Convert HTML to PDF
pdfkit.from_file('input.html', 'output_pdfkit.pdf')


pdfkit leverages the wkhtmltopdf library to convert HTML content to PDF. In this example, we convert an HTML file to a PDF document.

 

4. PyMuPDF: PDF Writing with Text and Images

Installation: pip install PyMuPDF

PDF Writing with Text and Images Example:

import fitz  # PyMuPDF

def write_pdf_with_text_and_images(file_path):
    # Create a new PDF document
    doc = fitz.open()

    # Add a page to the PDF
    page = doc.newPage()

    # Add text to the page
    page.insertText((100, 100), "Hello, PyMuPDF!")

    # Add an image to the page
    image_path = 'image.jpg'
    image_rect = fitz.Rect(100, 200, 300, 400)
    page.insertImage(image_rect, filename=image_path)

    # Save the PDF
    doc.save(file_path)
    doc.close()

# Write a PDF with text and images using PyMuPDF
write_pdf_with_text_and_images('output_pymupdf.pdf')


In this PyMuPDF example, we create a new PDF, add a page, insert text and an image, and save the resulting document.

 

5. pdfrw: Low-Level PDF Writing

Installation:  pip install pdfrw

Basic PDF Writing Example:

from pdfrw import PdfWriter, PdfDict, IndirectPdfDict

def write_pdf_with_pdfrw(file_path):
    # Create a new PDF document
    pdf_writer = PdfWriter()

    # Add a page to the PDF
    page = PdfDict(
        Type='/Page',
        Parent=IndirectPdfDict(),
        Contents=IndirectPdfDict(stream="q 1 0 0 1 72 720 cm /F1 12 Tf (Hello, pdfrw!) Tj Q"),
    )

    # Set the page size
    page.MediaBox = [0, 0, 612, 792]

    # Add the page to the PDF
    pdf_writer.pages.append(page)

    # Save the PDF
    with open(file_path, 'wb') as output_file:
        pdf_writer.write(output_file)

# Write a PDF with pdfrw
write_pdf_with_pdfrw('output_pdfrw.pdf')

 


In this example, pdfrw is used for low-level PDF writing. We create a new PDF document, add a page, set its content (text), and save the resulting document. pdfrw provides flexibility for direct manipulation of PDF objects, making it suitable for scenarios requiring precise control over PDF structure and content.

 

Conclusion

Python offers a rich ecosystem of libraries for writing PDF files, each catering to different needs and scenarios. Whether you're performing basic operations with PyPDF2, generating sophisticated documents with ReportLab, converting HTML content with pdfkit, or combining text and images with PyMuPDF, you have a plethora of tools at your disposal.

Choose the library that best aligns with your specific requirements, and dive into the world of PDF generation and customization with Python. As you explore these libraries, you'll discover the flexibility and power they bring to your PDF-related projects. Happy coding!


Liên quan

Python Learning

How to Read PDF Files with Python

Đọc thêm
Python

Common Python Problems

Đọc thêm
Experience Python

Handling Python Date and Time Issues

Đọc thêm
Theo dõi

Theo dõi bản tin của chúng tôi và không bao giờ bỏ lỡ những tin tức mới nhất.