Archives and Technology

Last week Eric Wittenberg posted a swell rant on his blog regarding the use of Google Book Search, particularly with public domain resources.  The discussion intersected with the day job on several points, and I couldn’t hold back.  Three responses to a single blog post means I probably should have taken my discussion points over here in the first place!

Mr. Wittenberg’s problem involved the permissions settings allowed to him, as a viewer, of public domain documents and books.  The system allowed him to browse, but not to print, copy, download, etc.  It was much like being in an archive microfilm room without benefit of a printer, copier, or perhaps even paper and pencil.  The issue isn’t directly linked to a particular regulation or governance guideline.  Rather, technology is finally catching up to some long-standing knowledge management requirements, left unfilled since the beginning of this www thing.

During the “analog” days, as I like to call them, we had books and papers.  Our portability options were limited to some variation of photography: photocopy, microfilm, or microfiche.  The paradigm was simple.  If you wanted access to the information presented in the resource, you went to the resource: you bought the book, visited the library or archive, etc.  You could reproduce the information either by written notes or, if allowed, photo reproduction.  While portability was limited, integrity of the resource was high.  It was nearly impossible within the bounds of the “analog” format to change the artifact.  The information “was what it was because that was all there was…”

I recall laboring under such constraints during undergraduate work.  The Winston Churchill Memorial in Fulton, Missouri contains a rather sizable set of papers and documents pertaining to both the man and the British War Cabinet.  Perhaps the most complete set in the Western Hemisphere, or at least it was at that time.  I arrived in college about the time the staff was acquiring, through grants, microfilm copies of the cabinet papers.  Wonderful stuff for a young historian to go browsing through.  I produced no fewer than six papers based on research from those reels.  One of them, my “magnum opus” for undergraduate studies, was a 100-page typed thesis.  No telling how many hours I spent in the basement of the Church of St. Mary the Virgin, Aldermanbury.  Wonderful apprenticeship for a young history major.  But, after I had become probably the one person in the United States most acquainted with the British War Cabinet papers from the CAB 65 series, none of it was really useful once I left the bosom of Fulton, Mo.  I had, and still have, my notes.  But these aren’t authoritative.  Short of another trip back to the memorial, I could not say without doubt that a citation was or is accurate.

However, all that type of work is becoming as obsolete as 45 RPM singles, fins on Chevys, and Burma Shave signs.  Shortly, if not already, all those documents that I was “forced” to browse in the rather chilly reading room, through an eye-branding microfilm viewer screen, will be available in electronic format.  Great for those of us on a travel budget.  But this opens another can of worms: the validity of the digital media.

Take, for instance, one of my treasured research trophies.  Looking at the cabinet minutes for a particular day in April 1940, I came across Churchill’s personal copy of the meeting agenda.  Next to a short paragraph regarding the French naval forces’ dispositions, I found a scribbled word that looked like “Nelson.”  After a week of fact checking and validation, I announced the conclusion that Churchill had already made up his mind to strike the French fleet, as a pre-emptive measure, well before the fall of France.  The reference to “Nelson” alluded to an action during the Napoleonic Wars involving Lord Nelson and the Danish fleet.  This little scribble was about as close to a smoking gun as I was allowed.  Big stuff for the undergraduate world.  Not much for bragging rights, though.  The point, to me, was that not all the “information” was simple printed text.  Sometimes a written note on the paper was important too.

Now let’s say that same “trophy” was digitized (and it probably is today).  Given a good set of editing software, I could easily paste that scribble anywhere on the page, move it to another page, or even remove it entirely.  In short, the artifact isn’t as tamper-proof now as it was in the “analog” format.  Now the information may well not be what it seems to be.  (For instance, I’ve already seen a few “photoshopped” Gardner photographs from the Civil War which claim to show evidence of flying saucers buzzing around in the 1860s!)  Back to my “trophy”: in the digital format, how can I ensure I’m not being duped and at the same time legitimately claim to the world that the scribble is original?

Thus one of the long-standing requirements from the knowledge management perspective since the dawn of the Internet age has been some form of tamper-resistant, or integrity-preserving, technology that could be cheaply extended to a whole archive of documents.  The solution passed around today is referred to in the trade fliers as “digital rights management” or “resource rights management,” depending on the presenter.  Large players in the software business have positioned themselves over the last few years to meet the demand, from the government in particular.

In the 1990s, the solution most mentioned was porting a document to PDF format.  Sounded good, mostly because 99% of us had only Acrobat Reader.  The publisher part, which created the PDF in the first place and allowed editing of the PDF, was expensive.  But nothing prevented a user from printing, screenshotting, or saving a local copy of the file.  The current technology is able to wrap the PDF or other file in a permissions set.  Without the right level of access (a simple password or, in the deluxe option, a digital certificate), a user is only allowed to open the file.  I’ve had to implement solutions where the users were only allowed to view a Word document and could not cut/paste, screen print, print, or save.  All of which was designed to ensure that what was said or displayed at a given time is indeed recorded accurately beyond a shadow of a doubt.  Some sources call this “non-repudiation” of data.  I find that term somewhat misleading.  Basically, we are talking about ensuring the validity of the artifact, in the state it was declared a record.
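For the technically inclined, here is a minimal sketch of the basic idea behind checking that an artifact is still in the state it was declared a record.  It is not how any particular rights management product works.  It simply records a SHA-256 fingerprint of the file's bytes at declaration time and compares against it later.  The function names and the sample bytes are made up for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a document's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_fingerprint: str) -> bool:
    """Check a document against the fingerprint recorded when it was declared a record."""
    return fingerprint(data) == recorded_fingerprint


# Hypothetical stand-in for the bytes of a scanned agenda page.
original_scan = b"Cabinet agenda, April 1940. Margin note: Nelson"

# Fingerprint taken at the moment the scan is declared a record.
recorded = fingerprint(original_scan)

# Someone later edits the scribble out of their copy.
tampered_scan = original_scan.replace(b"Nelson", b"")

print(is_unaltered(original_scan, recorded))  # True
print(is_unaltered(tampered_scan, recorded))  # False
```

A fingerprint by itself only shows that a file matches what was recorded.  It says nothing about who recorded it or when, which is the gap a digital signature backed by a certificate is meant to close.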

One major consumer of this technology, within the government sector, is of course the Department of Defense.  Federal statutes are very explicit regarding the handling and disposal of what are defined as “War Records.”  The evolution of the war records is interesting, at least to me.  We are all familiar with the Official Records of the War of the Rebellion from the Civil War era.  While they are a great source of information, most writers I know of have sought out primary sources where details are lacking.  Oftentimes the reports of key leaders are missing.  For example, the report of Capt. John C. Tidball, of Battery A, 2nd US Artillery, regarding the Battle of Antietam somehow missed inclusion in the ORs, but it was preserved in the Henry Hunt papers.

Moving forward to World War II, we were still in the “analog” days, but the Army was somewhat less conserving of paper.  Even short inquiries to the National Archives regarding World War II topics often net ample results.  Still, much was left out of the whole.  Veterans’ recollections often reveal a whole other story.

The researcher fifty years from now who looks at the current wars will, in my opinion, be overwhelmed with data.  Replacing Sir Winston’s scribbles in the margins, nowadays document version histories, emails, and even audio recordings are preserved as parts of these war records.  The problem will be, particularly given the controversies of the Iraq War, how the historian can validate a source and ensure his “trophy” was not altered from the original.

My prediction is that, in the future, the validation of digital certificates as part of a good author’s references will be as important as MLA style.

Sorry for the long post. 
