So I ran into a problem the other day when I had to copy some text from a PDF file and paste it into a presentation I was working on. The problem was I could not copy the text! Hmm, I thought, I must be doing something stupid, since I am pretty sure I have copied text from a PDF file before.
Luckily, I wasn’t that stupid. It turned out that several pages of the PDF were scanned bitmap images that had been inserted into the file, so there was no actual text on them in the first place. On top of that, where there was actual text that could normally be copied, the PDF had security permissions set so that content copying was not allowed! Grrrr!
I still needed that text and I was going to figure out a way to get it. In this article, I’ll walk through the simple way to copy text, which works if the document is not protected and the text is not a scanned image. I’ll also go over what to do in the trickier scenario where you are not allowed to copy the text. It’s not an ideal solution, but it’s better than nothing, especially if you have to copy a lot of text. Even if you only save yourself from typing 80% of it manually, that’s great!
Selecting Text in a PDF
In Adobe Reader, if text is copyable, all you have to do is select it, right-click, and choose Copy.
In other PDF viewer programs like Foxit, you have to click on Tools and then Select Text.
Obviously, if you were able to do this, you would not be reading this post! But just in case, that’s how you select text. Now on to the tougher issue of copying text from images or secured PDF files.
Use OCR to Copy PDF Text
You can quickly check whether a PDF file is secured in Adobe Reader by looking in the title bar for the word SECURED.
You can see the specific permissions by clicking Edit, then Protection, and then Security Properties.
As you can see below, content copying is not allowed and the security is protected by a password. If you know the password, then you could remove the security and copy all you want.
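For the curious, the PDF format stores these permissions as bit flags in a single integer (the /P entry of the file's encryption dictionary), and "content copying" is one of those bits. Here is a minimal sketch in Python of how such a value can be decoded, assuming you have already obtained the number by some other means. The function name and the sample values are made up for illustration; this only reads the flags and does not remove any protection.

```python
# Permission bits from the PDF specification's standard security handler.
# The spec numbers bits from 1, so "bit 5" corresponds to 1 << 4.
PERMISSION_BITS = {
    "print": 1 << 2,     # bit 3: print the document
    "modify": 1 << 3,    # bit 4: modify the contents
    "copy": 1 << 4,      # bit 5: copy or extract text and graphics
    "annotate": 1 << 5,  # bit 6: add or modify annotations
}


def decode_permissions(p_value: int) -> dict:
    """Return which operations a /P permissions value allows (hypothetical helper)."""
    return {name: bool(p_value & mask) for name, mask in PERMISSION_BITS.items()}


# Illustrative /P values, not taken from any particular file.
locked = decode_permissions(-3904)  # a restrictive value
unlocked = decode_permissions(-1)   # every bit set

print(locked["copy"])    # False
print(unlocked["copy"])  # True
```

If the "copy" flag comes back False, that matches what Adobe Reader reports as content copying not being allowed.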
Unless you’re a hacker, breaking the password is not an option. So the only other thing you can do is take a screenshot of the text and then run it through an OCR program. It sounds like too much work, but it’s really not. You can take a screenshot on a Mac or PC without additional software.
Mac – Just press Command + Shift + 4 on the keyboard
Windows – Just use the Windows Snipping Tool
Here’s a screenshot I took of some text that I could not copy from a secured PDF file:
When you take the screenshot, make sure the document zoom is set to 100% so that the text is crisp and clear. Once you have the screenshot, download this free OCR software:
Obviously, if you already have OCR software, then just use that program instead. This program works well, but when you are installing it, make sure you do not accept any of the other software “offers”, which will just install junk on your computer. As long as you decline those, the software has no spyware or anything like that. It has also been tested by CNET to ensure this.
Anyway, once you have the program installed, click on the big Open button and choose your image.
It’ll show you a preview of the image in the left-hand pane. Then click on OCR and Start OCR Process.
That’s it! The text will now show up on the right-hand side and you can copy it to the clipboard or export it to Microsoft Word.
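Text that comes out of OCR often keeps the hard line breaks and end-of-line hyphens of the original page, which looks odd once pasted into a presentation. If you are comfortable with a little Python, here is a rough sketch of how the copied text could be tidied up. The function name and sample string are my own inventions for illustration, and this is not something the OCR program does for you.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Rejoin words split by end-of-line hyphens and unwrap hard-wrapped lines."""
    # Join a word broken across lines, e.g. "docu-\nment" becomes "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Treat blank lines as paragraph breaks.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside each paragraph, collapse newlines and repeated spaces to one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


sample = "This docu-\nment was scanned\nfrom paper.\n\nSecond  paragraph."
print(tidy_ocr_text(sample))
```

One caveat: a genuinely hyphenated word that happens to break at the end of a line (like "well-known") will lose its hyphen too, so give the result a quick read afterwards.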
Overall, the program did a very good job, with a few minor mistakes here and there. Even so, it saved me a lot of time since I did not have to type all that text manually. Hopefully, this will help you copy the text you need from a PDF document. Post any comments or questions and I’ll respond. Enjoy!