Despite the promises made by children’s cartoons and most works of science fiction, we continue to endure life without personal jetpacks, flying cars, robotic housemaids named Rosie, and an exhaustive and fully digitized record of human knowledge. And while there’s not much advice I can offer to get you any closer to those first three goals, I’m pretty certain that you’ve got the tools at your disposal to digitize any document or publication you can get your hands on.
You might assume that the proper tool for this kind of work would be a flatbed scanner. For a while that was definitely true. These days, however, your digital camera can almost certainly handle the job. Yes, even that point-and-shoot camera that you bought years ago to take on vacations and photograph your cat for her very own blog. I use my camera all the time to photograph manuscript material at archives and sections of books or articles that I can’t or don’t want to lug home from libraries.
That said, there are a few scenarios for which a scanner might be your best bet. If the documents you need digitized are printed on flat, standard-size paper and you have access to a scanner with a document feeder, such as a large, full-featured copy machine, then, by all means, go that route. Also, if you are scanning images for publication, a flatbed scanner will probably provide a more accurate rendering of the colors in your document than a camera because the scanner provides its own well-controlled light environment under its hood. Finally, if it is vital to you that text in your document be searchable, then a flatbed scanner might be right for you since it may produce straighter lines of text that will be easier for optical character recognition (OCR) software to read. But if you wish to digitize pages from tightly bound books or articles, fragile documents, or those not appropriately sized for a scanner, you can use your camera to handle the job.
Camera Settings and Ambient Light
You don’t need a multi-thousand-dollar DSLR to take pictures of text. You simply need to make the most of the camera and the light conditions at your disposal. In order to get crisp, well-focused photos of documents, the first thing you’ll need to do is get to know the advanced settings of your camera. Most digital cameras sold in the last decade have a manual mode. Depending on the quality of your camera, this mode might allow you access to settings similar to those found on the DSLRs used by professionals and photography buffs. On the other hand, you might only have access to a small group of pre-set modes. At the very least you should be able to tell the camera not to use its flash.
Your goal for these settings is to let the most non-flash light reach the camera’s sensor in the shortest exposure that still gives you a sharp shot. Here’s a rundown of suggestions for the options you might encounter on your camera:
- Flash: Turn the flash off. Libraries and archives often don’t allow flash photography. And even if you’re at home, keeping the flash on at such close range will result in photos ruined by bright white glare spots.
- Type of Light: Many manual modes can adjust for the differing color temperatures and color ranges provided by different light sources. Being able to tell your camera that you are standing near a window in daylight or under fluorescent lights will help it accurately reproduce the colors you are photographing. Leave this setting on auto if you’ve got no idea what type of light you are under.
- ISO Speed: This variable is meant to mimic the effect that the speed of the film once had on your photos. When shooting in low light, such as when you are indoors digitizing documents, you want the ISO speed to be higher. This will make the camera more sensitive to the light it detects. Choose the highest ISO your camera offers, and back off a step if the images come out visibly grainy.
- Shutter Speed and Aperture (how wide the opening in the lens gets): Each of these advanced settings offers a way to get more light into your camera. A slow shutter speed, however, is likely to result in a blurry photo unless you are using a tripod. And part of the value, at least for me, in using my camera to digitize documents is that I can do it very quickly without setting up a tripod. Your goal for these settings, then, should be to get a wider aperture (thus allowing more light into the camera) and either let the camera determine the shutter speed or do some trial and error to find the point at which you get really crisp photos. Those of you who’ve never dabbled in photography should note that the numbers used to indicate aperture (called f-stops) get smaller as the opening gets wider.
- Preview Zoom: When experimenting with these settings, use the zoom feature on your camera’s preview screen to see if you have good focus. I usually zoom all the way in to a single word. If it looks clear at that level of magnification, then you can be sure that it will be readable on screen. Usually the same mechanism that zooms in while you are photographing zooms in to the existing photo when you are in display mode.
- Auto Focus: You probably can’t turn off autofocus on your point-and-shoot camera. And that’s a good thing because it will do a great job of focusing on your text document. It’s important to point out, however, that you should get to know your autofocus mechanism and understand how it works. On many cameras, pressing the shutter button halfway down will activate the autofocus and lock it on the target in question. You might even see blocks or lines on your preview screen indicating where the camera has established focus. Learning to focus first and then take the picture will help you avoid shots that are out of focus because you didn’t give the camera a chance to establish that focus. This system may also help you determine whether or not focus is possible. If I hold my camera too close to a document (resulting in a lack of sufficient light between it and the camera), my autofocus won’t lock, indicating to me that I’ve got to make an adjustment.
- Sounds and Post-Shot Viewing: While we are discussing your camera’s many features and settings, I’ll mention two additional tips. First, dig around in your settings to find the audio options for your camera. If you are working in an archive or library, silence every noise that your camera can make. Everyone else around you will appreciate it. And second, if you are taking many photos at a time, you probably should tell the camera not to display each photograph for a few seconds after it is taken. Turning this feature off helps when you are in the groove of turning pages and snapping photos. After you’ve established at the start that you have the settings right for quality pictures, you don’t need to check each one to see if it looks okay.
- Ambient light: Obviously without the flash your camera has no control over the amount of light in the room. However, you can often take steps to maximize the light around you. In a library or archive you can move closer to a window or stand so that your own body isn’t between a light source and your material. At home you can turn on additional lights in a room to increase the ambient light. It also helps to know what room in your house has the best light to begin with. I’ve often done this type of photography on my bathroom counter because that room boasts a big fluorescent light above the mirror.
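The trade-off among aperture, ISO, and shutter speed described in the list above comes down to simple arithmetic, and a rough sketch may make it easier to reason about. The light a lens admits is proportional to one over the square of the f-stop number, and doubling the ISO halves the exposure time needed. The function names and the starting values (1/15 second at f/5.6 and ISO 100) are made up for illustration; your camera's actual options will differ.

```python
import math


def light_ratio(f_from: float, f_to: float) -> float:
    """How many times more light the lens admits at f_to than at f_from."""
    # Light gathered is proportional to 1 / (f-number squared).
    return (f_from / f_to) ** 2


def stops_gained(f_from: float, f_to: float) -> float:
    """The same change expressed in stops (each stop doubles the light)."""
    return math.log2(light_ratio(f_from, f_to))


def equivalent_shutter(shutter_s: float, f_from: float, f_to: float,
                       iso_from: float, iso_to: float) -> float:
    """Shutter time giving the same exposure after changing aperture and ISO."""
    return shutter_s / light_ratio(f_from, f_to) * (iso_from / iso_to)


if __name__ == "__main__":
    # Hypothetical starting point: 1/15 s at f/5.6 and ISO 100.
    new_time = equivalent_shutter(1 / 15, 5.6, 2.8, 100, 400)
    print(f"f/5.6 -> f/2.8 admits {light_ratio(5.6, 2.8):.1f}x the light")
    print(f"Equivalent shutter time: 1/{round(1 / new_time)} s")
```

In this made-up case, opening up from f/5.6 to f/2.8 and raising the ISO from 100 to 400 takes a shaky 1/15 second exposure down to roughly 1/240 second, which is fast enough to shoot handheld.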
When I’m digitizing printed or written material with my five-year-old Canon PowerShot SD500, I place the camera in manual mode, turn off the flash, set the ISO to 400 (the highest option on this camera), and tell the system what type of light I’m shooting in. With enough ambient light, these settings allow me to take crisp images of text that easily withstand the tight zooming I sometimes need to do in order to decipher nineteenth-century handwriting.
Making a PDF
Once you have a pile of JPEGs downloaded from your camera to your computer, you might also want to turn them into a PDF file. I prefer to keep shots of manuscript material in JPEG format so that I can zoom more easily. But for excerpts from books and articles, PDF makes more sense to me. This allows you to join multiple images together in one file and annotate them with ease. If you have the full version of Adobe Acrobat, you can convert JPEG files to single or multiple PDFs directly. Adobe Photoshop or other full-featured image editing software will also allow you to make that conversion.
If, like most people, you don’t have and can’t afford those programs, converting JPEGs to PDFs is still pretty easy. In another post I recommended a program called doPDF as one of many options for printing Word files into PDFs. If you are a Windows user with one of these PDF printing programs installed, you can simply print the JPEGs to PDF via Windows Explorer. First you’ll want to make sure your images are oriented properly. If you need to rotate some or all of your images, you can do so by selecting them and right-clicking in Windows Explorer. You’ll see “rotate clockwise” and “rotate counterclockwise” in the menu. Next, simply select the images you want to print in Explorer, right-click on them, and select Print from the menu.
That will launch the photo print program built into Windows. Follow through its prompts and select your PDF printer when asked to select a printer for the output. It may take some time given the number and size of the files you’ve selected, but the end result will be a PDF that you can read in any PDF reader. I’ve recommended Foxit Reader in the past for annotating PDFs. The PDF creation software bundled into the Mac OS will allow for a very similar process if you happen to be a Mac user.
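If you are comfortable running a script, the same JPEG-to-PDF step can be done without a PDF printer at all. What follows is only a rough sketch of an alternative, not the print-to-PDF route described above. It relies on the fact that a PDF file can carry JPEG data as-is, and it uses nothing beyond Python's standard library. The function names are my own inventions, and the sketch assumes ordinary grayscale or RGB JPEGs that are already rotated correctly. It gives each photo its own page sized at one point per pixel.

```python
import struct
from pathlib import Path

# Start-of-frame markers; their segments hold the image dimensions.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_info(data: bytes) -> tuple:
    """Return (width, height, color_components) read from a JPEG's header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte in JPEG header")
        marker = data[pos + 1]
        if marker == 0xFF:  # padding byte before a marker
            pos += 1
            continue
        if marker == 0xDA:  # image data begins; no frame header was found
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            _bits, height, width, comps = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10])
            return width, height, comps
        pos += 2 + length  # skip this segment
    raise ValueError("no frame header found")


def jpegs_to_pdf(jpeg_paths, pdf_path) -> int:
    """Write one PDF page per JPEG and return the number of pages."""
    paths = list(jpeg_paths)
    if not paths:
        raise ValueError("no JPEG files given")
    objects = {1: b"<< /Type /Catalog /Pages 2 0 R >>"}
    page_ids = []
    next_id = 3  # object 2 is reserved for the page list
    for path in paths:
        data = Path(path).read_bytes()
        width, height, comps = jpeg_info(data)
        # Assumption: 1 component is grayscale, anything else is RGB.
        space = "DeviceGray" if comps == 1 else "DeviceRGB"
        img_id, content_id, page_id = next_id, next_id + 1, next_id + 2
        next_id += 3
        # The JPEG bytes are stored untouched; /DCTDecode tells the reader so.
        objects[img_id] = (
            f"<< /Type /XObject /Subtype /Image /Width {width} "
            f"/Height {height} /ColorSpace /{space} /BitsPerComponent 8 "
            f"/Filter /DCTDecode /Length {len(data)} >>\nstream\n"
        ).encode() + data + b"\nendstream"
        # Drawing commands: stretch the image to fill the whole page.
        content = f"q {width} 0 0 {height} 0 0 cm /Im0 Do Q".encode()
        objects[content_id] = (
            f"<< /Length {len(content)} >>\nstream\n".encode()
            + content + b"\nendstream")
        objects[page_id] = (
            f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
            f"/Resources << /XObject << /Im0 {img_id} 0 R >> >> "
            f"/Contents {content_id} 0 R >>").encode()
        page_ids.append(page_id)
    kids = " ".join(f"{i} 0 R" for i in page_ids)
    objects[2] = (
        f"<< /Type /Pages /Kids [{kids}] /Count {len(page_ids)} >>").encode()

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num in sorted(objects):
        offsets.append(len(out))  # byte position needed by the xref table
        out += f"{num} 0 obj\n".encode() + objects[num] + b"\nendobj\n"
    xref_pos = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode()
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode()
    out += (f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
            f"startxref\n{xref_pos}\n%%EOF\n").encode()
    Path(pdf_path).write_bytes(bytes(out))
    return len(page_ids)
```

Something like `jpegs_to_pdf(sorted(Path("scans").glob("*.jpg")), "chapter.pdf")` would then bundle a folder of photos in filename order. The folder and file names there are placeholders.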
If you have scanned typed text and want it to be searchable, you’ll need to run it through OCR software. The full version of Acrobat does an okay job of this. And if you upload a file to Google Docs, it will also scan it for text, as Phillip described in an earlier post. Unfortunately, this Google feature only applies to files 2MB or smaller. The popular program Evernote also has some OCR ability, but it seems limited to the Premium version. This post from the folks at Lifehacker details some of the better OCR options out there. Be sure to read through the comments to see what their readers recommended. Unfortunately, it doesn’t look like the market has produced really robust free software that performs this task.
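Since that 2MB ceiling is easy to trip over with camera photos, a small script can flag the offenders before you start uploading. This is only a sketch. The function name is hypothetical, and it assumes "2MB" means 2 × 1024 × 1024 bytes, which may not match how the service actually counts.

```python
from pathlib import Path

# Assumption: "2MB" is read as 2 * 1024 * 1024 bytes.
LIMIT_BYTES = 2 * 1024 * 1024
CHECKED_SUFFIXES = {".jpg", ".jpeg", ".pdf"}


def files_over_limit(folder, limit=LIMIT_BYTES):
    """Return (name, size_in_bytes) for each JPEG or PDF larger than limit."""
    too_big = []
    for path in sorted(Path(folder).iterdir()):
        # Compare the extension case-insensitively; cameras often write .JPG.
        if path.is_file() and path.suffix.lower() in CHECKED_SUFFIXES:
            size = path.stat().st_size
            if size > limit:
                too_big.append((path.name, size))
    return too_big


if __name__ == "__main__":
    # Checks the current folder and prints anything too large to upload.
    for name, size in files_over_limit("."):
        print(f"{name}: {size / (1024 * 1024):.1f} MB")
```

Anything it lists would need to be resized or split into smaller files before that upload route will work.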
If you’ve got suggestions for OCR software or tips on digitizing documents with your camera, we’d love to hear them. Drop us a line in the comments box below. And remember to adhere to copyright law when doing your digitizing. Just because you have the ability to make a personal copy of a whole book in a matter of minutes doesn’t mean that it has suddenly become legal.