
Total PDF pages in subfolders across folder structure

Last week, I wrote a script that ran through a folder structure and output the page count of every PDF in all folders and sub-folders, and also spit out a grand total.

While this worked well, what I really wanted was a script that just totaled PDF pages by sub-folder, without seeing all the file-by-file detail. After trying to retrofit the first script, I realized that was a waste of time, and started over from scratch.

The resulting script works just as I'd like it to, traversing a folder structure and showing PDF page counts by folder:

$ countpdfbydir
    47: ./_Legal
     2: ./_Medical-Dental
    15: ./_Medical-Dental/Kids
    11: ./_Medical-Dental/Marian
     2: ./_Medical-Dental/Rob
    35: ./_Personal Documents/Kids
    87: ./_Personal Documents/Marian
    28: ./_Personal Documents/Rob
    10: ./_Personal Documents/Rob/Golf
    12: ./_Personal Documents/Rob/Travel
-------------------------------------------------------------------
   249: Total PDF Pages

It took a few revisions, but I like this version; it even does some simplistic padding to keep the figures lined up in the output.

Here's what I came up with:

I feared this would be incredibly slow, but it took only about 40 seconds to traverse a folder structure holding roughly a gigabyte of PDFs: some 1,500 files spread across 160 subfolders, totalling 5,306 PDF pages.
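For anyone who wants to experiment with the idea, here is a minimal sketch of a per-folder tally. It is an illustration under stated assumptions, not necessarily the script described in this post. It assumes macOS, where Spotlight's `mdls` can report a `kMDItemNumberOfPages` attribute for PDFs; the function names `page_count` and `count_pdf_by_dir` are made up for this example, and folder names are assumed not to contain newlines.

```shell
#!/bin/sh
# Sketch only: per-folder PDF page totals (hypothetical function names).

# Print the page count of one PDF, or nothing if no count is available.
# Assumption: macOS Spotlight metadata read via mdls; swap in another tool if needed.
page_count() {
    mdls -raw -name kMDItemNumberOfPages "$1" 2>/dev/null | grep -E '^[0-9]+$'
}

# Walk the tree under $1 (default: current folder) and print one
# right-aligned total for each folder that directly contains PDFs.
# The body runs in a subshell so its variables do not leak into the caller.
count_pdf_by_dir() (
    find "${1:-.}" -type d | sort | {
        grand=0
        while IFS= read -r dir; do
            subtotal=0
            found=0
            for file in "$dir"/*; do
                # Keep only regular files ending in .pdf (any letter case).
                case $file in
                    *.[Pp][Dd][Ff]) [ -f "$file" ] || continue ;;
                    *) continue ;;
                esac
                found=1
                pages=$(page_count "$file")
                subtotal=$((subtotal + ${pages:-0}))
            done
            if [ "$found" -eq 1 ]; then
                printf '%6d: %s\n' "$subtotal" "$dir"
                grand=$((grand + subtotal))
            fi
        done
        printf '%s\n' '-------------------------------------------------------------------'
        printf '%6d: %s\n' "$grand" 'Total PDF Pages'
    }
)
```

To try it, paste the functions into a shell session, `cd` to the top of the folder structure, and run `count_pdf_by_dir`. Each folder's figure covers only the PDFs directly inside it, so the grand total is simply the sum of the lines above it. The `%6d` in `printf` is what handles the padding.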

Once I had this version working, I repurposed the original script to output file-level PDF page counts only for the current directory, so I can use that one when I want the details:

$ cd Home\ Stuff
$ pdfcountbyfile
     2: 2015-03-27 - Lowes.pdf
     4: 2015-07-14 - Home Depot.pdf
     1: 2015-09-03 - Home Depot.pdf
-----------------------------------------------------------------
     7: Total PDF pages in this folder

In case you want it, here's the modified script that generates the file-level PDF page counts:
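A minimal sketch of a file-level counter along these lines follows. It is an illustration, not necessarily the script referred to above. As assumptions: it relies on macOS's `mdls` and the `kMDItemNumberOfPages` Spotlight attribute, the function names `page_count` and `count_pdf_by_file` are invented for the example, and the "Skipped" message is modeled on the one quoted in the comments below.

```shell
#!/bin/sh
# Sketch only: page counts for the PDFs in one folder, no recursion.

# Print the page count of one PDF, or nothing if no count is available.
# Assumption: macOS Spotlight metadata read via mdls.
page_count() {
    mdls -raw -name kMDItemNumberOfPages "$1" 2>/dev/null | grep -E '^[0-9]+$'
}

# List each PDF in $1 (default: current folder) with its page count,
# then print a total. Runs in a subshell so variables stay local.
count_pdf_by_file() (
    total=0
    for file in "${1:-.}"/*; do
        # Keep only regular files ending in .pdf (any letter case).
        case $file in
            *.[Pp][Dd][Ff]) [ -f "$file" ] || continue ;;
            *) continue ;;
        esac
        name=${file##*/}            # strip the leading folder path
        pages=$(page_count "$file")
        if [ -z "$pages" ]; then
            printf '%s : ** Skipped - no page count **\n' "$name"
            continue
        fi
        printf '%6d: %s\n' "$pages" "$name"
        total=$((total + pages))
    done
    printf '%s\n' '-----------------------------------------------------------------'
    printf '%6d: %s\n' "$total" 'Total PDF pages in this folder'
)
```

Quoting `"$file"` everywhere is what keeps names with spaces, such as `2015-03-27 - Lowes.pdf`, in one piece.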

These are clearly not need-every-day scripts, but I like the information they provide (because I'm a data geek), and they were fun for my shell-scripting-challenged brain to figure out. I'm 99.9% positive the efficiency could be improved by a factor of 100, but this works well enough for my needs.

4 thoughts on “Total PDF pages in subfolders across folder structure”

  1. This sounds like exactly what I need. But I am not that technical. How do I make this script work?

    1. A full-blown shell script primer is beyond my scope here, but there are lots of tutorials out there. In a nutshell, you need to:

      1) Copy the script.
      2) Paste it into a new document in a plain-text editor, such as BBEdit, or TextEdit in plain text mode.
      3) Save the file.
      4) Make it executable in Terminal with chmod 755 scriptname.
      5) Run the script—but this requires either saving it somewhere on your path, or referencing the full path to the file each time. And this is where things get complicated, so the tutorials would be useful.
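Steps 4 and 5 can be sketched as a small helper. This is only one way to do it: the function name `install_script` and the `~/bin` default folder are assumptions made for the example, not anything the steps above prescribe.

```shell
#!/bin/sh
# install_script FILE [DIR]: copy FILE into DIR (default: ~/bin),
# creating DIR if needed, and mark the copy executable (step 4).
# Hypothetical helper; runs in a subshell so its variables stay local.
install_script() (
    src=$1
    dest_dir=${2:-$HOME/bin}
    mkdir -p "$dest_dir" &&
    cp "$src" "$dest_dir/" &&
    chmod 755 "$dest_dir/${src##*/}"
)
```

After something like `install_script ~/Desktop/countpdfbydir`, adding `export PATH="$HOME/bin:$PATH"` to your shell's startup file puts that folder on your path, so the script can be run by name from any directory (step 5).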

      regards;
      -rob.

  2. Hi Rob, when I run the script line by line it works, but when I make it into a countpdf.sh file it says:
    F : ** Skipped - no page count **
    .0_BE_02May2017_cl : ** Skipped - no page count **
    : ** Skipped - no page count **
    it : ** Skipped - no page count **
    01_ : ** Skipped - no page count **
    : ** Skipped - no page count **
    ...pdf : ** Skipped - no page count **
    6x : ** Skipped - no page count **
    : ** Skipped - no page count **
    : ** Skipped - no page count **
    d : ** Skipped - no page count **
    pi : ** Skipped - no page count **
    : ** Skipped - no page count **
    .0_BE_02May2017_cl : ** Skipped - no page count **
    : ** Skipped - no page count **
    it : ** Skipped - no page count **
    02_ : ** Skipped - no page count **
    : ** Skipped - no page count **
    ...pdf : ** Skipped - no page count **
    6x : ** Skipped - no page count **
    : ** Skipped - no page count **
    : ** Skipped - no page count **
    d : ** Skipped - no page count **
    pi : ** Skipped - no page count **
    : ** Skipped - no page count **
    .0_BE_02May2017_cl : ** Skipped - no page count **
    : ** Skipped - no page count **
    it : ** Skipped - no page count **
    03_ : ** Skipped - no page count **
    : ** Skipped - no page count **
    ...pdf : ** Skipped - no page count **

    Do you know what I'm doing wrong?
    Hope so :)

    many thanks Patrick
