Wednesday, April 23, 2014

Parsing a PDF file in Excel

Most Linux distros come with a handy utility called pdftotext, but you can use it on a Windows machine as well. Using the browser of your choice, download the precompiled binaries for x86 Windows.

Different Windows versions and installs give you different default directories, so I'll tell you what I did.

1. The file you downloaded is a zip, so first unzip it. Then look in the subfolders - on my PC:


     if you are running on a 32 bit OS, there is also a


2. Go to Start > Run and enter cmd. That puts me in C:\Users\Bruce. Then enter cd Documents. My DOS prompt now says


    That maps to the Documents folder on the Start menu. In an Explorer window, copy the pdftotext.exe file from the folder in step 1 to your Documents folder.

3. Put your PDF in the Documents folder. Now, from the DOS prompt, enter:

      pdftotext <filename>.pdf -layout

In the Explorer window, you should now see a file named <filename>.txt
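
If you would rather check the text output from a script before bringing it into Excel, here is a rough Python sketch of step 3. It is only an illustration under a few assumptions: the exe path is the one from my machine, the function names (build_command, split_columns, pdf_to_rows) are made up for this example, and splitting on runs of two or more spaces is just a guess at how -layout separates columns in your PDF.

```python
import re
import subprocess
from pathlib import Path

# Assumed location; point this at wherever your pdftotext.exe ended up.
PDFTOTEXT = r"C:\Users\Bruce\Documents\pdftotext.exe"


def build_command(pdf_path, exe=PDFTOTEXT):
    """Return the pdftotext command line and the .txt path it will write."""
    txt_path = str(Path(pdf_path).with_suffix(".txt"))
    # Same call as step 3, with the output file named explicitly.
    return [exe, "-layout", str(pdf_path), txt_path], txt_path


def split_columns(line):
    """Split one -layout line into cells wherever two or more spaces occur.

    Assumed heuristic: it breaks if a cell contains a double space, or if
    two columns are separated by only a single space.
    """
    return [cell for cell in re.split(r"\s{2,}", line.strip()) if cell]


def pdf_to_rows(pdf_path, exe=PDFTOTEXT):
    """Run pdftotext, then return the non-blank lines as lists of cells."""
    command, txt_path = build_command(pdf_path, exe)
    subprocess.run(command, check=True)  # raises if pdftotext reports an error
    # The output encoding depends on your pdftotext build; UTF-8 is assumed.
    text = Path(txt_path).read_text(encoding="utf-8", errors="replace")
    return [split_columns(line) for line in text.splitlines() if line.strip()]
```

Calling pdf_to_rows on a PDF in your Documents folder should give one list of cells per line of text. If the cells do not line up with the columns you expect, the two-space rule is the first thing to adjust.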

If that gives you the results you are looking for, then this Excel macro will probably make things easier. Just change the line:

        exe = "C:\Users\Bruce\Documents\pdftotext.exe"

to reflect where your exe ended up:

