Cropping PDFs with Python

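First, let's be clear about what cropping means and how it differs from the page extraction covered earlier.

Extraction pulls a subset of pages out of a PDF: for example, you want pages 1-20, so you extract them.

Cropping, the topic this time, keeps only a specified region within each individual page.

In other words, it removes the red area:

The code is as follows: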

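# Note: PdfFileReader / PdfFileWriter are the legacy PyPDF2 API (removed in PyPDF2 3.0).
import PyPDF2

input_file_path = '**.pdf'

# Margins to cut off, in PDF points (1 point = 1/72 inch).
# Set these by hand and tune them based on the cropped result.
left_margin = right_margin = 80
top_margin = 130
bottom_margin = 80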


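def split(page):
    """Crop one page in place by shrinking its media box.

    Relies on the module-level `width` and `height` read from the first page.
    PDF coordinates start at the bottom-left corner of the page.
    """
    page.mediaBox.lowerLeft = (left_margin, bottom_margin)
    page.mediaBox.lowerRight = (width - right_margin, bottom_margin)
    page.mediaBox.upperLeft = (left_margin, height - top_margin)
    page.mediaBox.upperRight = (width - right_margin, height - top_margin)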
    
    
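# The input file must stay open until the output has been written.
input_file = PyPDF2.PdfFileReader(open(input_file_path, 'rb'))
output_file = PyPDF2.PdfFileWriter()
page_info = input_file.getPage(0)  # use the first page as the reference for width and height
width = float(page_info.mediaBox.getWidth())    # page width
height = float(page_info.mediaBox.getHeight())  # page height
print(width, height)  # print the original width and height of the PDF
page_count = input_file.getNumPages()  # number of pages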

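for page_num in range(page_count):
    this_page = input_file.getPage(page_num)
    split(this_page)
    output_file.addPage(this_page)

# Write once, after all pages have been added, and close the file when done.
with open('out.pdf', 'wb') as out_stream:
    output_file.write(out_stream)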

For reference, the four corner points:
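To make the four points concrete, here is a small sketch in plain Python (no PDF library) that computes them from the margins. PDF coordinates start at the bottom-left corner of the page and are measured in points. The helper name `crop_corners` and the 595 x 842 page size (roughly A4) are assumptions for illustration only; the keys mirror the attribute names used in the script above.

```python
def crop_corners(width, height, left, right, top, bottom):
    """Return the four corners of the region kept after cropping.

    Coordinates are in points, with the origin at the page's bottom-left.
    """
    x0, y0 = left, bottom                  # bottom-left of the kept region
    x1, y1 = width - right, height - top   # top-right of the kept region
    if x0 >= x1 or y0 >= y1:
        raise ValueError("margins leave no visible area")
    return {
        "lowerLeft": (x0, y0),
        "lowerRight": (x1, y0),
        "upperLeft": (x0, y1),
        "upperRight": (x1, y1),
    }


# Hypothetical page size of 595 x 842 points with the margins from the script.
corners = crop_corners(595, 842, left=80, right=80, top=130, bottom=80)
for name, point in corners.items():
    print(name, point)
```

Checking the margins this way before touching the PDF catches the case where they are larger than the page, which would otherwise produce an empty or inverted box.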