Use Your Point-and-Shoot Digital Camera as Document Scanner

Despite the promises made by children’s cartoons and most works of science fiction, we continue to endure life without personal jetpacks, flying cars, robotic housemaids named Rosie, and an exhaustive and fully digitized record of human knowledge. And while there’s not much advice I can offer to get you any closer to those first three goals, I’m pretty certain that you’ve got the tools at your disposal to digitize any document or publication you can get your hands on.

You might assume that the proper tool for this kind of work would be a flatbed scanner. For a while that was definitely true. These days, however, your digital camera can almost certainly handle the job. Yes, even that point-and-shoot camera that you bought years ago to take on vacations and photograph your cat for her very own blog. I use my camera all the time to photograph manuscript material at archives and sections of books or articles that I can’t or don’t want to lug home from libraries.

Use Creative Commons Search to Find Text, Images, and Other Media

A few weeks back I posted some advice about how to best display images in PowerPoint. That post ended with mention of a few online resources for finding images. Today I want to highlight another, more comprehensive option. The fine folks over at Creative Commons have put together a page which provides quick and convenient access to a variety of search engines. The Creative Commons Search page can draw data from Google, Google Image Search, Flickr, blip.tv, Jamendo, SpinXpress, and Wikimedia Commons. With one search at Creative Commons you can tab through the results from each of these engines.

For those of you unfamiliar with Creative Commons, it is a non-profit organization which promotes the sharing of creative and intellectual property. Here at DIY Ivory Tower, we’ve elected to publish our posts under a Creative Commons “share and share alike” style license. It allows others to share, copy, distribute, and adapt our work so long as they provide proper attribution and make the resulting content available in a similar manner. You can learn more about our license in particular and the organization in general by following the Creative Commons link at the bottom of the content bar on the right side of this and every DiYiT page.

As you might expect from Creative Commons, their search page automatically limits your results to the text, images, videos, music, or other media available for legal reuse. The search page gives you the option to find results which can be used for commercial purposes or those which can be legally adapted, modified, or built upon. Creative Commons does note that they have no control over the results displayed by these search engines, so you should definitely double-check the copyright status of any media you intend to use.

OCR in Google Docs makes transcription simple

While running an online seminar in professional development the other night, I had a request for the text of the Nebraska folk song with which historian Louis Warren concluded his presentation, “Settling with Debt: Western Development in the Railroad Era.” All Louis had with him was a hard copy of the song’s lyrics, which were printed at the bottom of his last page of notes. He was happy to share the text, but he understandably did not want to let go of his notes. So, I snapped a photo with my phone, focusing on the bottom portion of the page.

This morning, I opened the original image in Picasa for some simple tweaks. First, I cropped out all the irrelevant surrounding text, and then I brightened the image and heightened the contrast. The result is a whiter background and darker, clearer text.
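For readers curious about what a brighten-and-contrast tweak does under the hood, here is a rough sketch in plain Python. It assumes a grayscale image stored as rows of 0-255 values; the function names, the mid-gray pivot of 128, and the sample numbers are all made up for illustration and are not Picasa's actual algorithm.

```python
def adjust_pixel(value: int, brightness: int = 0, contrast: float = 1.0) -> int:
    """Adjust one grayscale value (hypothetical helper, not a Picasa function)."""
    # Stretch the value away from mid-gray (128), then shift it by the brightness offset.
    adjusted = (value - 128) * contrast + 128 + brightness
    # Clamp to the valid 0-255 range.
    return max(0, min(255, round(adjusted)))


def adjust_image(rows, brightness=0, contrast=1.0):
    """Apply the same adjustment to every pixel in a list of rows."""
    return [[adjust_pixel(v, brightness, contrast) for v in row] for row in rows]


# A tiny made-up "page": light values are paper, dark values are ink.
page = [[200, 190, 60], [210, 70, 196]]
print(adjust_image(page, brightness=20, contrast=1.5))
```

With a contrast factor above 1, values lighter than mid-gray move toward white and darker ones move toward black, which is roughly the "whiter background, darker text" effect described above.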

Next, I uploaded the image to Google Docs. I had read that Google Docs now supports OCR (optical character recognition), and this was my first opportunity to test it. When you upload an image and want Google to attempt OCR, be sure to check the box to convert text in images and PDFs to documents (see below).

Google Docs OCR

The result, as you can see in the image below, is an image in the top portion of the page and editable text in the bottom portion.

OCR makes editing simple - Nebraska folk song

Toward the bottom of my photograph, the image bends a little. I’m not sure whether this is an effect of the wide-angle lens on my phone or whether I did not lay the sheet of paper flat on the table. Either way, the angled lines of text keep the OCR process from accurately recognizing where one line ends and the next begins.

OCR makes editing simple - Nebraska folk song

I went back to the image in Picasa, straightened it, then uploaded it once again to Google Docs. The straightened image produced better results.

To finish it up, all I needed to do was clean up some odd spacing in the text (see image below).
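On a longer transcription, that kind of spacing cleanup could be scripted rather than done by hand. The following is a minimal sketch in Python, assuming the OCR text has been copied out of the document as a plain string. The helper name `tidy_ocr_text` and the sample string are hypothetical, and the rules shown (collapse repeated spaces, drop a stray space before punctuation, keep at most one blank line between stanzas) are only a guess at what "odd spacing" might include.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Normalize odd spacing in OCR output (hypothetical helper)."""
    cleaned_lines = []
    for line in raw.splitlines():
        # Collapse runs of spaces and tabs into a single space.
        line = re.sub(r"[ \t]+", " ", line).strip()
        # Remove a stray space before punctuation, e.g. "County ," -> "County,".
        line = re.sub(r" ([,.;:!?])", r"\1", line)
        cleaned_lines.append(line)
    text = "\n".join(cleaned_lines)
    # Keep at most one blank line between stanzas.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


# Made-up sample imitating badly spaced OCR output.
sample = (
    "Hurrah  for Lane County ,  the land of the free,\n\n\n\n"
    "The home of   the grasshopper, bedbug and flea,"
)
print(tidy_ocr_text(sample))
```

The result still needs a read-through against the original image, since spacing rules cannot catch misrecognized characters.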

While this folk song is a short piece of text, one that surely would not have been a burden to retype, this sample demonstrated to me the value of an accurate OCR process. I’m happy to have this tool in my belt for when I need to take on a larger, longer transcription project.

OCR makes editing simple - Nebraska folk song

Hurrah for Lane County, the land of the free,
The home of the grasshopper, bedbug and flea,
I’ll holler its praises, and sing of its fame,
While starving to death on a government claim.

My clothes are all ragged, my language is rough,
My bread is case-hardened, both solid and tough,
The dough is scattered all over the room,
And the floor would get scared at the sight of a broom

How happy I am on my government claim,
I’ve nothing to lose, I’ve nothing to gain
I’ve nothing to eat and I’ve nothing to wear,
And nothing from nothing is honest and fair.

– traditional folk song, Nebraska

This post originally appeared at nicomachus.net.