1.5 Scanning Documents
There are many reasons why researchers, communities, and cultural heritage organizations might want to digitize documents. The Federal Agencies Digitization Guidelines Initiative (FADGI) outlines the following common intentions:
- To make materials accessible for the purposes of:
  - Exhibits
  - Publications
  - Web use
- To establish long-term preservation of materials
- To protect the collection from information loss due to:
  - Obsolete technology
  - Natural deterioration of materials
  - Improper handling of materials
  - Theft of materials
Documents that cultural heritage organizations might scan for a digital collection could include, but are not limited to:
- Letters
- Field notes
- Maps
- Books
- Dictionaries
- Sketches of the grammar
- Articles describing the language, culture, and people of the organization
Materials to exclude from digitization:
Some materials may be too fragile, or in need of too much repair, to be scanned easily. For these items, the Library of Congress (LOC) advises consulting with preservation specialists and discussing optimal treatment options with the curator of the collection. The LOC recommends excluding the following materials from digitization:
- Documents that consist of acidic, fragile, brittle, torn, missing or sticky paper.
- Pages with iron gall ink that has deteriorated the paper.
- Unstable or fragile media such as crayon, charcoal, chalk, or soft pencil.
- Bound books with severe leather deterioration or many missing pages.
- Fragile letter copy books including carbon copy correspondence, and tracing paper drawings.
- Non-traditional textual formats such as scrolls.
Prior to Scanning
Before beginning a digital archiving project that involves scanning documents, there are several factors to consider and plan for. The majority of this module will discuss the following factors:
- Equipment for Scanning
- Environment for Scanning
- Managing and tracking materials after scanning
Additionally, the LOC recommends assessing the following factors of the collection at large before beginning a scanning project:
- Is the collection in a condition to be handled safely throughout the project?
- How much stabilization treatment is necessary for the collection to be managed safely?
- What efforts can be made to reduce the amount of physical contact with the collection?
- What is the most cost-effective digitization method that allows for timeliness without damaging the collection?
It is also important to look through the collection before scanning. Assess the general structure of the materials to determine how many different folders you are likely to create. Note whether the pages going into each folder are numbered. If they are not, lightly number them in the corner of the page with a pencil. This will help you manage your items and folders later in the archiving process.
Equipment for Scanning
There is more than one way to go about scanning documents. Each method has advantages and disadvantages and varies in practicality depending on the type of document being scanned.
Scanning app method for iPhone
A scanner app is an especially practical option when one does not have access to a flatbed scanner. It effectively turns one's iPhone into a high-quality portable scanner. One iOS app that works well for this purpose is Scanner Pro: PDF Scanner App.
Screenshot of the Scanner Pro app from the AppStore
To use Scanner Pro, follow these simple instructions:
- Grant access: Allow the app to access the phone camera.
- Format: Use the in-app camera to take a photo. Adjust the borders of the scan surface to fit the border of the document.
- Generate: Click “Save Selection” and the app will generate a JPEG.
- Export: Click “Select” in the upper right corner and select all the images you wish to export. Then click “Share” in the bottom left. The app will present many export destinations, including email and Google Drive (if the Google Drive app is installed on the iPhone). Select the intended destination and follow the instructions for uploading.
Scanner Pro costs $3.99 as of this writing. There are also plenty of free options that receive consistently high reviews. This New York Times article provides an up-to-date list of the best apps (iOS and Android) designed for mobile scanning: The Best Mobile Scanning Apps.
Scanning app method for Android
The ScanPro (not the same as Scanner Pro) app for Android is a very powerful and highly rated option. It is praised in the AppStore for the following features:
1. Generates high-quality scans starting at 200 dpi
2. Can be saved as either PDF or JPEG
3. Good OCR
4. Quick, automatic edge detection
5. Cloud integration so scans can be easily uploaded directly to Google Drive, Dropbox, iCloud, etc.
6. Claims to be able to scan anything including notes on whiteboards and sticky notes.
Screenshot of the ScanPro app from the AppStore
It should be noted that this app is not free. There is a yearly subscription cost of $34.99 or a monthly subscription cost of $5.99. If the funds are available to you, this may be the best option.
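To put the 200 dpi figure listed above in concrete terms, the arithmetic can be sketched in Python. This is only an illustration: the function names are hypothetical, and the calculation assumes the page fills the entire frame, which a phone photo rarely does exactly.

```python
def pixels_needed(width_in, height_in, dpi=200):
    """Pixel dimensions required to capture a page at the given dpi.

    Assumes the page fills the whole image (no margin around it).
    """
    return round(width_in * dpi), round(height_in * dpi)


def effective_dpi(pixel_width, pixel_height, width_in, height_in):
    """Resolution actually achieved for a page of known physical size.

    The smaller of the horizontal and vertical values is returned,
    since that is the limiting resolution.
    """
    return min(pixel_width / width_in, pixel_height / height_in)


# A US Letter page (8.5 x 11 inches) at 200 dpi:
letter_at_200 = pixels_needed(8.5, 11)
print(letter_at_200)
```

Comparing the pixel dimensions reported for an exported image against `pixels_needed` gives a rough check that a scan meets the resolution a project has chosen.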
Digital camera, tripod and clip method
This method, as described by the digital naval history collection The Subchaser Archives, delivers a high-quality scan and can be done relatively quickly once one becomes familiar with the process. The necessary equipment is as follows:
- Digital camera
- Tripod
- Metal coat hanger and pliers
- Table
- White poster board or a large piece of paper
- Pencil
- Ruler
- Scanning location with good natural light
For step-by-step instructions on assembling the setup and scanning with this method, see Woofenden (2009) in the References.
The tripod and clip method, while effective, is not always the most practical. The amount of equipment needed means one must prepare beforehand and pack accordingly. There may be a learning curve before one is able to use this method optimally. Additionally, if one does not already own a digital camera and tripod, not to mention the smaller items required, there will be a cost. The final method listed in this module is the most efficient in terms of equipment, cost, and ease of use.
Flatbed scanner method
If a flatbed scanner is available, it may give the best scan quality because the light source and orientation are stable. However, not all documents are easily scanned using this method. For example, it is not the most practical method for scanning books and other bound materials. Additionally, one may not always have access to a flatbed scanner.
Environment for Scanning
If you plan to use either a digital camera or a phone camera to scan, there are some environmental factors to be mindful of. The following list is adapted from both the Library of Congress and The Subchaser Archives:
- Clean and flat surface, preferably a table
- White background; a white sheet or poster board is the best option if the flat surface is not white
- Natural lighting, which is more effective than an overhead lamp or a camera flash
- Clean and dry hands when handling documents
- Use of only pencils near the work zone
- No food or drink near the work zone
- Close books and cover the work zone when not in use
Managing and Tracking Scanned Materials
Collate individual JPEGs of the same document into one folder. The following four steps will then help you manage your scans:
- Name the image files and the folders they are in
  - Preserve the name of the file as produced by the camera software, as this will give the ordering of images or text as in the original
  - See also the discussion in Modules 3 and 6
- Establish where the digital files will be stored (see example)
- Track the status of scanning in a spreadsheet
- Create metadata describing the scanned materials
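The collation step can be sketched in Python as follows. This is a minimal illustration, not part of the original workflow: the function name `collate_scans` and the example folder name are hypothetical. It copies the scans of one document into a single folder and keeps the filenames produced by the camera software, on the assumption that the camera numbers its files sequentially so that sorting by name reproduces the original page order.

```python
from pathlib import Path
import shutil


def collate_scans(image_paths, dest_root, folder_name):
    """Copy the scans of one document into a single folder.

    Camera-generated filenames are kept unchanged. Files are copied,
    not moved, so the originals stay in place as a safety net.
    """
    dest = Path(dest_root) / folder_name
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(p) for p in image_paths):
        target = dest / src.name
        if target.exists():
            # Never silently overwrite an existing scan.
            raise FileExistsError(f"{target} already exists")
        shutil.copy2(src, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied
```

A call such as `collate_scans(["IMG_0001.jpg", "IMG_0002.jpg"], "archive", "letters_01")` would place both images in `archive/letters_01`.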
Item status checklist
The columns indicate the following:
- Folder name
- Filename
- Item type
- A notes field
- Scanned status
- Collation status
- Upload to Google Drive status (project drive)
- Upload to Digital Library status (digital archive)
- Upload to S Drive status (personal long-term storage)
The status fields are color-coded according to where we are in the process:
- Green = completed
- Yellow = current status
- Red = not yet begun
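For projects that prefer a plain file to a color-coded spreadsheet, the same checklist can be approximated as a CSV file. The sketch below is an assumption-laden illustration: the column identifiers, the status strings, and the function names are all hypothetical stand-ins for the columns and colors described above.

```python
import csv

# Hypothetical column identifiers mirroring the checklist columns.
COLUMNS = [
    "folder_name", "filename", "item_type", "notes",
    "scanned", "collated", "google_drive", "digital_library", "s_drive",
]
STATUS_FIELDS = COLUMNS[4:]

# Status text paired with the color used in the spreadsheet.
STATUS_COLORS = {
    "completed": "green",
    "current": "yellow",
    "not yet begun": "red",
}


def new_row(folder_name, filename, item_type, notes=""):
    """Create a checklist row with every status set to 'not yet begun'."""
    row = {
        "folder_name": folder_name,
        "filename": filename,
        "item_type": item_type,
        "notes": notes,
    }
    row.update({field: "not yet begun" for field in STATUS_FIELDS})
    return row


def set_status(row, field, status):
    """Update one status field, rejecting unknown fields or statuses."""
    if field not in STATUS_FIELDS:
        raise ValueError(f"unknown status field: {field}")
    if status not in STATUS_COLORS:
        raise ValueError(f"unknown status: {status}")
    row[field] = status


def write_checklist(rows, stream):
    """Write the checklist rows as CSV to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
```

Restricting status values to a fixed set keeps the file consistent, so the CSV can later be imported into a spreadsheet and color-coded by rule.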
References
Apple App Store. (2009, October 08). Scanner Pro: PDF Scanner App. Retrieved July 01, 2020, from https://apps.apple.com/us/app/scanner-pro-pdf-scanner-app/id333710667
Apple App Store. (2014, April 02). ScanPro App - Docs, PDF & OCR. Retrieved July 22, 2020, from https://apps.apple.com/us/app/scanpro-app-docs-pdf-ocr/id834854351
Federal Agencies Digitization Guidelines Initiative. (2020). Digitization Activities Project Planning and Management Outline. Retrieved July 01, 2020, from http://www.digitizationguidelines.gov/guidelines/DigActivities-FADGI-v1-20091104.pdf
Keough, B. (2020). The Best Mobile Scanning Apps. Retrieved July 01, 2020, from https://www.nytimes.com/wirecutter/reviews/best-mobile-scanning-apps/
Library of Congress. (2020). Collections Care. Retrieved July 01, 2020, from https://www.loc.gov/preservation/care/scan.html
Woofenden, T. (2009, November 30). Photographing and Scanning Old Photos and Documents. Retrieved July 01, 2020, from https://www.subchaser.org/photographing-documents