4 Challenges in Downloading Historical Newspaper Articles
Jul 27, 2013 12:48 pm | By: Kenneth R Marks
I have been researching online historical newspaper sites for several years, both the free ones and the subscription-based sites. Their capabilities, independent of the size and quality of their scanned collections, break down into two parts:
- How do you search the site and find "stuff"?
- How do you download and save the articles that you find?
This post is all about the second part - how do you download and save materials. These sites run on different underlying software, and they differ considerably in how they provide a download capability.
The methods that the sites provide or suggest vary. Among those I have found:
- Download a .pdf of the page that contains your desired article
- Download a .jpg of the page that contains your desired article
- Use a software snipping tool such as the Windows 7 "Snipping Tool" or some commercial offerings, such as SnagIt, Shutter, Jing, etc. There are many of these tools.
- The newspaper library software will snip the article for you and present it as a "whole" article.
- The newspaper library software will snip the article for you and present it in several pieces that comprise the entire article, such as the headline, and each paragraph as an independent .jpg file
The concern with all of these methods is "How do you end up with a complete article that is large enough to read or is zoomable?"
The 4 challenges are presented below with visual examples:
Highlights - Almost every online site presents the selected article after a search with highlighted search terms like these:
So if you do NOT want the highlighted text in your saved copy, make sure that you download the article the way that you want it, using whatever options the site software provides. There is always a way to get the article without the highlights. You can experiment with the software or check some of the many tutorials I have created for the repository of interest.
Chopped Up Articles - A few of the sites will present the article in a "chopped up" format, depending on the length of the article. Usually the title or headline is separated from the text, and if it is a long article, there may be several parts. For example:
Just as with highlighted text, you need to be careful when downloading articles that are broken up, as in the examples. It is always safest to download the article (or entire page) in .pdf format and crop to your liking later. With a .pdf you can always zoom to the size you desire and then crop or snip the article in one piece.
The Article is Too Small - This is the challenge that requires the most forethought when saving the article for later use. Much of the time, the repository software will allow you to download or snip articles in several different ways. Unfortunately, if you do not check your downloaded image before you leave the site, you might be disappointed in the size or the quality of the download. Make sure that the article is either zoomable after the download or, if not, zoomed to a readable size before you download it. Depending on how the site prepares the clipping, enlarging an image that was too small when downloaded may leave it so fuzzy that you can't read it.
Just like the other challenges, forethought will lead to the best results. Here too, the safest choice is to download the article (or entire page) in .pdf format, which lets you zoom to the size you desire and then crop or snip the article in one piece.
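For readers who like to script their checks, the "too small" test can be reduced to simple arithmetic: compare the pixel width of the downloaded image with the width at which you want to view or print it. The sketch below is only an illustration. The function names and the 1.5x stretch limit are my own assumptions, not something any newspaper site provides, and the limit that works for you will depend on the scan.

```python
def upscale_factor(image_width_px: int, target_width_px: int) -> float:
    """Return how many times the image must be stretched to fill the target width."""
    if image_width_px <= 0 or target_width_px <= 0:
        raise ValueError("widths must be positive")
    return target_width_px / image_width_px


def is_likely_readable(image_width_px: int, target_width_px: int,
                       max_upscale: float = 1.5) -> bool:
    """Rough check: stretching beyond max_upscale (an assumed limit) tends to look fuzzy."""
    return upscale_factor(image_width_px, target_width_px) <= max_upscale


if __name__ == "__main__":
    # Hypothetical numbers: a 400-pixel-wide clipping viewed at 1200 pixels wide.
    print(upscale_factor(400, 1200))
    print(is_likely_readable(400, 1200))
```

If the check fails, go back to the site while you still have the article open and download a larger version or the full-page .pdf.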
Oops, I Forgot the Citation - This is easily the biggest mistake one can make, and it creates many "smack your head" moments. When I first started searching newspapers, I got so excited when I found an article that told more of an ancestor's life story that I just downloaded the article and did not record the details of the newspaper I found it in. To find those articles online again, I will have to perform all those searches again. Many of the repositories have a function or a link where the source details are presented. Here is an example from The Missouri Digital Newspaper collection:
In this example, the left image shows you the name of the newspaper and the date of publication. To obtain more details, click the "View Description" link, which yields the image on the right containing much more information.
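One way to make the citation hard to forget is to save it in a small text file right next to the article. The sketch below assumes you keep your downloads as ordinary files on your computer. The field names, the `Citation` class, and the `.citation.json` file naming are my own hypothetical choices, not a format used by any repository.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Citation:
    """Source details to record at the moment of download (hypothetical fields)."""
    newspaper: str
    publication_date: str
    page: str
    repository: str
    url: str
    accessed: str


def save_citation(article_file: str, citation: Citation) -> Path:
    """Write the citation beside the article, e.g. obit.pdf -> obit.citation.json."""
    article = Path(article_file)
    sidecar = article.with_name(article.stem + ".citation.json")
    sidecar.write_text(json.dumps(asdict(citation), indent=2), encoding="utf-8")
    return sidecar
```

Calling `save_citation` with the path of the file you just downloaded and a filled-in `Citation` writes the details next to it, so the article and its source travel together when you move or share the folder.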
The moral of the story is that online newspaper collections can be an incredibly valuable tool for your family history and genealogy research. Fortunately, the challenges presented above are not hard to overcome. Just think ahead and make sure that the articles that you download are readable and in a format and size that pleases you and your future readers. And make sure you document where the articles came from!