Merge multiple PDF files into one file

September 30, 2009

One excellent feature of the XSane scanning application is that it can save your scanned documents as PDF files, merely by saving them with a .pdf extension, as in examplefile.pdf. But one not-so-nice feature is that there’s really no expedient way to save multiple scans into a single PDF file. If you do 3 scans, you have 3 PDF files. What now? You need something that can merge (also called “join”) these files into one big beautiful file.

Enter pdftk, the PDF Toolkit. This is a comprehensive command-line tool that can, among other things, merge, split, encrypt/decrypt, password-protect, rotate, and repair PDF files.

The merge or join function can be done in a few different ways. I will show different examples here to illustrate why you might want to use one over the other. Let’s say we want to merge our files into a single PDF named final_tps_report.pdf. In Example 1 we will have 3 sequentially-named files that need to go in this order: page1.pdf, page2.pdf, page3.pdf. In Example 2 we’ll have 3 files with very different names that need to go in this order: coversheet.pdf, tps_report.pdf, appendix.pdf. (Your 9 bosses will be happy to explain to you why it’s so important for the coversheet to go on the front.) Each command goes all on one line, even though it may look like two here because of its length.

Method 1:

Specify the full names of the files to be merged, in the order you want them in.  This is the most straightforward method.  It’s okay for 2 or 3 files, but any more than that and it starts to get tedious.
Format: pdftk file1 file2 file3 cat output mergedfilename
Example 1:
g33kgrrl@home$ pdftk page1.pdf page2.pdf page3.pdf cat output final_tps_report.pdf
Example 2:
g33kgrrl@home$ pdftk coversheet.pdf tps_report.pdf appendix.pdf cat output final_tps_report.pdf

Method 2:

Assign each file to a “handle” – that is, essentially a variable (remember algebra class?) that stands for each filename – and then use the handles to specify what order the files go in.  This can help you keep everything straight if the filenames are long and confusing – otherwise, it just makes for unnecessary extra typing.
Format: pdftk A=file1 B=file2 C=file3 cat A B C output mergedfilename
Example 1:
g33kgrrl@home$ pdftk A=page1.pdf B=page2.pdf C=page3.pdf cat A B C output final_tps_report.pdf
Example 2:
g33kgrrl@home$ pdftk A=coversheet.pdf B=tps_report.pdf C=appendix.pdf cat A B C output final_tps_report.pdf
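
Handles earn their keep when you want only part of a file, because the cat operation accepts a handle followed by a page range (A1-3, B4-end, and so on). Here is a minimal sketch, not the only way to do it: the helper name merge_with_ranges is made up for illustration, and the page ranges assume tps_report.pdf has at least 4 pages.

```shell
# Hypothetical helper: slip the appendix into the middle of the report.
# Assumption: tps_report.pdf has at least 4 pages; adjust the ranges to fit.
merge_with_ranges() {
    # $1 is the name of the merged file to write
    pdftk A=coversheet.pdf B=tps_report.pdf C=appendix.pdf \
        cat A B1-3 C B4-end \
        output "$1"
}
```

Calling merge_with_ranges final_tps_report.pdf would write the coversheet, pages 1-3 of the report, the whole appendix, and then the rest of the report, in that order.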

Method 3:

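You need to process all the PDF files in a given folder, and their names already sort into the order you want (e.g., page1.pdf, page2.pdf, page3.pdf). The shell expands *.pdf into an alphabetically sorted list of filenames before pdftk even runs, and pdftk merges the files in that order. Be aware that the sort goes character by character, not by numeric value, so page10.pdf lands before page2.pdf. If you have ten or more files, pad the numbers with zeros (page01.pdf, page02.pdf, and so on). Also make sure the merged file from an earlier run isn't sitting in the same folder, or the wildcard will pick it up as an input too. This is extra convenient when you have a large number of files to merge. No one wants to type in 10, 20, or 50 filenames, never mind the potential for errors. Here is your solution.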
Format: pdftk *.pdf cat output mergedfilename
Example 1:
Files will be merged in name order, even if some numbers are skipped (e.g., page1.pdf, page2.pdf, page3.pdf, page6.pdf). This works as-is for single-digit page numbers.
g33kgrrl@home$ pdftk *.pdf cat output final_tps_report.pdf
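
If you're not sure what order the wildcard will produce, you can preview it with echo before handing it to pdftk. The sketch below uses a throwaway folder of empty placeholder files (the names and the demo_dir variable are just for illustration), so no real PDFs are needed. LC_ALL=C is set only to make the result reproducible; other locales may sort punctuation a little differently.

```shell
# Sketch: list the names the way the shell would hand them to pdftk.
# Uses a throwaway folder of empty placeholder files; no real PDFs needed.
demo_dir=$(mktemp -d)
touch "$demo_dir/page1.pdf" "$demo_dir/page2.pdf" "$demo_dir/page10.pdf"

# LC_ALL=C pins the sort to plain byte order so the result is reproducible.
glob_order=$(cd "$demo_dir" && export LC_ALL=C && echo *.pdf)
echo "$glob_order"    # page1.pdf page10.pdf page2.pdf

rm -r "$demo_dir"
```

Note that page10.pdf comes out ahead of page2.pdf, because the names are compared one character at a time.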
Example 2:
This won’t work the way we want it to. It will put appendix.pdf in front of coversheet.pdf and tps_report.pdf, because it starts with the letter “a”. We’d have to use Method 1 or 2 here.
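
For numbered files that aren't zero-padded, one possible workaround is to sort the names yourself before passing them to pdftk. This is only a sketch under a few assumptions: merge_numbered is a made-up helper name, it relies on the -V (version sort) option of GNU sort, and it expects filenames without newlines in them. It also leaves the output file out of the input list in case it already exists.

```shell
# Hypothetical helper: merge every PDF in the current folder in natural
# (version) order, leaving out the output file if it already exists.
# Assumptions: GNU `sort -V` is available; filenames contain no newlines.
# The body runs in a subshell, so the IFS and `set` changes stay local.
merge_numbered() (
    output=$1
    names=$(printf '%s\n' *.pdf | grep -vxF -- "$output" | sort -V)
    IFS='
'
    set -f            # keep the sorted names from being glob-expanded again
    set -- $names     # one positional parameter per filename
    [ -e "$1" ] || { echo "no PDF files to merge" >&2; exit 1; }
    pdftk "$@" cat output "$output"
)
```

With page1.pdf, page2.pdf, and page10.pdf in the folder, merge_numbered final_tps_report.pdf would pass them to pdftk in that numeric order. It does nothing for Example 2, where the order isn't in the names at all.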

Happy merging!

Ω

5 Comments
  1. Edward Hagihara
    October 12, 2009 6:11 am

    I’ve been using pdftk since I had to merge some music that I scanned in. Awesome tool.

    • g33kgrrl
      October 12, 2009 11:22 am

      Yeah it’s quite powerful. You really couldn’t ask for more functionality than that.

  2. Martin
    December 1, 2012 2:24 am

    Thank you for describing this tool. I was searching for a tool to remove pages from PDF files and gave up. But with pdftk it is very easy:

    pdftk A=in.pdf cat A1-3 A5-10 A13-18 output out.pdf

    • g33kgrrl
      December 27, 2012 4:24 pm

      You’re welcome, and I’m pleased to hear that it was helpful to you. I love pdftk because it’s so simple and powerful. Thanks for sharing your example here.
