Merge multiple PDF files into one file
One excellent feature of the XSane scanning application is that it can save your scanned documents in PDF format, merely by giving them a .pdf extension, as in examplefile.pdf. One not-so-nice feature is that there’s really no expedient way to save multiple scans into a single PDF file. If you do 3 scans, you have 3 PDF files. What now? You need something that can merge (also called “join”) these files into one big beautiful file.
Enter pdftk, the PDF Toolkit. This is a comprehensive set of tools that can, among other things, merge, split, encrypt and decrypt, password-protect, rotate, and repair PDF files.
The merge (or join) can be done in a few different ways. I will show three methods here, each with the same two examples, to illustrate why you might want to use one over another. Let’s say we want to merge our files into a single PDF named final_tps_report.pdf. In Example 1 we have 3 sequentially named files that need to go in this order: page1.pdf, page2.pdf, page3.pdf. In Example 2 we have 3 files with very different names that need to go in this order: coversheet.pdf, tps_report.pdf, appendix.pdf. (Your 9 bosses will be happy to explain to you why it’s so important for the coversheet to go on the front.) Each command goes on a single line, even though it may wrap onto two here because of its length.
Method 1: list the files in the order you want them merged
Format: pdftk file1 file2 file3 cat output mergedfilename
Example 1:
g33kgrrl@home$ pdftk page1.pdf page2.pdf page3.pdf cat output final_tps_report.pdf
Example 2:
g33kgrrl@home$ pdftk coversheet.pdf tps_report.pdf appendix.pdf cat output final_tps_report.pdf
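Because pdftk merges the inputs in exactly the order you type them, a small wrapper can catch a mistyped name before anything is written. What follows is only a sketch: merge_pdfs is a hypothetical helper name of my own, not something pdftk provides, and the second check assumes that pdftk will not accept an output file that is also one of its inputs.

```shell
# merge_pdfs OUTPUT INPUT...
# Hypothetical helper (not part of pdftk): checks the inputs, then runs Method 1.
merge_pdfs() {
    out=$1
    shift
    for f in "$@"; do
        # Refuse to continue if an input file is missing.
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            return 1
        fi
        # Refuse to continue if the output name is also listed as an input.
        if [ "$f" = "$out" ]; then
            echo "output is also an input: $f" >&2
            return 1
        fi
    done
    # Inputs in the given order, then "cat output" and the merged file name.
    pdftk "$@" cat output "$out"
}
```

With that function defined, Example 2 would be run as: merge_pdfs final_tps_report.pdf coversheet.pdf tps_report.pdf appendix.pdf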
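Method 2: give each file a handle (A, B, C) and list the handles in the cat step
Format: pdftk A=file1 B=file2 C=file3 cat A B C output mergedfilename
The order of the handles after cat decides the page order, regardless of the order in which the files are listed before it.
Example 1:
g33kgrrl@home$ pdftk A=page1.pdf B=page2.pdf C=page3.pdf cat A B C output final_tps_report.pdf
Example 2:
g33kgrrl@home$ pdftk A=coversheet.pdf B=tps_report.pdf C=appendix.pdf cat A B C output final_tps_report.pdf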
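Method 3: use a wildcard
Format: pdftk *.pdf cat output mergedfilename
The shell expands *.pdf in name order, so the files are merged alphabetically. Gaps in the numbering are fine (e.g., page1.pdf, page2.pdf, page3.pdf, page6.pdf), but page10.pdf sorts before page2.pdf, so zero-pad the numbers (page01.pdf, page02.pdf, and so on) once you have 10 or more files. The wildcard also picks up every other .pdf in the directory, including an older final_tps_report.pdf, so make sure only the files you want merged are there.
Example 1:
g33kgrrl@home$ pdftk *.pdf cat output final_tps_report.pdf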
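To see how the wildcard orders numbered files once the numbers reach two digits, here is a small sketch you can run without pdftk. It uses a throwaway directory and empty placeholder files; the directory, the variable names, and the page numbers 8 through 11 are made up for the demonstration.

```shell
# Work in a throwaway directory with empty placeholder files (pdftk is not run).
demo_dir=$(mktemp -d)
for n in 8 9 10 11; do
    touch "$demo_dir/page$n.pdf"
done

# The shell sorts the wildcard matches by name, character by character.
plain_order=$(cd "$demo_dir" && printf '%s ' *.pdf)
echo "$plain_order"     # page10.pdf page11.pdf page8.pdf page9.pdf

# Zero-pad the single-digit pages so name order and page order agree.
for n in 8 9; do
    mv "$demo_dir/page$n.pdf" "$demo_dir/$(printf 'page%02d.pdf' "$n")"
done
padded_order=$(cd "$demo_dir" && printf '%s ' *.pdf)
echo "$padded_order"    # page08.pdf page09.pdf page10.pdf page11.pdf

rm -r "$demo_dir"
```

The first listing is the order pdftk would receive from *.pdf with the unpadded names; the second is the order after renaming.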
Example 2:
This won’t work the way we want it to. The wildcard puts appendix.pdf in front of coversheet.pdf and tps_report.pdf, because it starts with the letter “a”. We’d have to use Method 1 or 2 here.
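A quick way to check what the wildcard will do before merging is to put echo in front of the command, so the shell prints the expanded command line instead of running it. This is a general shell trick rather than a pdftk feature. The sketch below tries it on empty placeholder files named as in Example 2, in a throwaway directory created just for the demonstration.

```shell
# Throwaway directory with empty placeholder files named as in Example 2.
demo_dir=$(mktemp -d)
touch "$demo_dir/coversheet.pdf" "$demo_dir/tps_report.pdf" "$demo_dir/appendix.pdf"

# echo prints the command line pdftk would receive; pdftk itself is not run.
preview=$(cd "$demo_dir" && echo pdftk *.pdf cat output final_tps_report.pdf)
echo "$preview"
# pdftk appendix.pdf coversheet.pdf tps_report.pdf cat output final_tps_report.pdf

rm -r "$demo_dir"
```

If the printed order is not the order you want, fall back to Method 1 or 2.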