A Civil War Battle Database Part II

So we have a nifty database that supports organization charts and unit rosters.  Now we’ve got to indicate how those organizations (collective groups of participants) moved and reacted on the battlefield.  Here’s the part where many will have different definitions of the elements, but allow me to use some simple conventions here, which may gloss over some details for the moment, in order to facilitate the examples.

For my hypothetical “Battle of Mudpuddle” web resource, I’d define an “event” as any recorded or recordable (giving options for deductive reasoning) action taken or reacted to by a participant or unit.  I know it is not a sound legal definition, but work with me here, this is a blog, not a requirements document!  Under this definition we have different flavors of events.  Arbitrarily I’ll call out two in particular – situation reports and movement reports.  There certainly will and should be other types, but allow me to work with these two for clarity.

Those nice dispatches reporting enemy contact translate easily to situation reports.  But while those specify when the “friendlies” saw the “enemy,” movement reports should detail what the “friendlies” are doing to the enemy.  It is possible to break down after-action reports and other accounts written after the battle into a set of situation reports and movement reports.  Consider that the reporting officer is restating the chain of events encountered in the battle.  It might require some work on our end, however, to define the breaks.

The relation is straightforward – “Events” break down into different “flavors” which have different attributes:

Event Tables

Events then are defined by a time, the person recording the event, and a type ID.  In addition, I’d provide maybe a “narrative” column for remarks, even though I deplore such unstructured data holes.  A “Bookmark” pointer might be used to reference the line in a document from which the event is derived.  Works well if we have the “artifacts” cataloged as mentioned in the previous post.  Again for this example, I’d only use two event types:

Type ID 1 is a Situation Report

Type ID 2 is a Movement Report
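For those who like to see it typed out, here is a rough sketch of the event tables using Python's built-in sqlite3 module.  The table and column names (event_types, events, event_time, and so on) are my own placeholders, not a finished design:

```python
import sqlite3

# Hypothetical schema: every table and column name here is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_types (
    type_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    event_time TEXT NOT NULL,     -- when the event happened, stored as text
    person_id  INTEGER NOT NULL,  -- who recorded it (key into the Participants table)
    type_id    INTEGER NOT NULL REFERENCES event_types(type_id),
    narrative  TEXT,              -- free-form remarks
    bookmark   TEXT               -- pointer to the source line in a cataloged artifact
);
INSERT INTO event_types VALUES (1, 'Situation Report'), (2, 'Movement Report');
""")

# Read the two flavors back as a {type_id: name} lookup.
type_names = dict(conn.execute("SELECT type_id, name FROM event_types"))
```

Adding another flavor later is then just one more row in event_types plus a table for its particular attributes.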

Some sample event table entries would look like this:


For reference here’s the Participants table from the earlier post, from which the PersonID is derived:

So we have a movement order from Col. Smith (bookmarked to an Official Record entry), followed by a situation report from an OR, a diary entry from Mrs. Johnston, reference to a letter from Pvt. Butler, and lastly another reference to Col. Smith’s OR.

Now let’s look at some generic data for a “Movement Report” table:


Purists looking at normalization will quickly point out the re-use of the Unit name and time columns.  I’m presenting them here for ease of reading (to those not acquainted with the elegance of SQL).  The important columns here are the waypoint and endpoint locations.  Translated to English this entry into the table should mean “the 111th was moving from point 1 to point 2 at 9 am on the 2nd of July.”  The event ID ensures it is identified coming from Col. Smith.
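As a sketch only, that one movement entry could be stored and read back like this.  The locations table, the column names, and the text time format are assumptions of mine for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (
    location_id INTEGER PRIMARY KEY,
    label       TEXT NOT NULL,
    lat         REAL,             -- may stay NULL until a position is worked out
    lon         REAL
);
CREATE TABLE movement_reports (
    event_id    INTEGER PRIMARY KEY,  -- ties back to the row in the events table
    unit_name   TEXT NOT NULL,        -- repeated here for readability, not normalized
    event_time  TEXT NOT NULL,
    waypoint_id INTEGER REFERENCES locations(location_id),
    endpoint_id INTEGER REFERENCES locations(location_id)
);
INSERT INTO locations (location_id, label) VALUES (1, 'point 1'), (2, 'point 2');
INSERT INTO movement_reports VALUES (1, '111th', 'July 2, 9 am', 1, 2);
""")

# Join the report to its two locations and render it as a readable line.
row = conn.execute("""
    SELECT m.unit_name, w.label, e.label, m.event_time
    FROM movement_reports m
    JOIN locations w ON w.location_id = m.waypoint_id
    JOIN locations e ON e.location_id = m.endpoint_id
""").fetchone()
sentence = "{}: {} -> {} ({})".format(*row)
```

Keeping the points in their own table means the Lat/Long for “point 1” is entered once and shared by every report that mentions it.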

And then the “Situation Report” Table:


Those familiar with the Army’s SitRep format know this well.  SALUTE – Size, activity, location, unit, time, and equipment.  It’s a nice, concise format that works for many generic field reports.  Arguably it isn’t the most useful for the Civil War.  But for this example, it works for a simple demonstration.

In this set we have good Capt. Jones reporting an encounter with the 5th Arizona’s skirmish line.  Shortly after that, Mrs. Johnston recalled seeing a battery of the 1st Nevada deploy their 3-inch rifled cannon nearby.  Private Butler said “we ran like the devil” in a letter home, which matches to around 10:30 a.m. that morning.  And the good Col. Smith let us know by 11:00 a.m. the Arizonians were advancing through.
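One way to picture a SALUTE record in code is a small Python dataclass where every field is optional, since few period sources fill in all six.  The class name, the field names, and the event_id value below are my own inventions for illustration:

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SituationReport:
    """One SALUTE-style report; any field the source omits stays None."""
    event_id: int
    size: Optional[str] = None
    activity: Optional[str] = None
    location: Optional[str] = None
    unit: Optional[str] = None
    time: Optional[str] = None
    equipment: Optional[str] = None

    def missing(self):
        """Return the names of the SALUTE fields the source left blank."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Capt. Jones's contact report; only the activity and unit are known from the text.
jones = SituationReport(
    event_id=2,  # made-up identifier
    activity="encountered skirmish line",
    unit="5th Arizona",
)
gaps = jones.missing()
```

Here gaps comes back as the size, location, time, and equipment fields, which is a handy checklist of what still needs digging out of the sources.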

I could, if I had time, further refine our events by “flavors.”  One lacking in this example is a category for friendly action reports – “We stood and fired” type events.  However, we can begin to link this into the larger data structure now:

Participants and Events

Given these basic relations it is possible, if our “Lat/Long” entries support it, to overlay the events, particularly unit locations, onto a map display.  This display could be managed to show all activities within a set time frame, or to display all the events related to a particular unit.  The “hard work” of relating the data is entrusted to all those lines pointing to the tables, which represent either the coded application logic rules or defined database relations.  Fairly easy at this point to call up “events where participant is Col. Smith,” or better still “events where the participant is anyone who is included within the unit 111th Alaska.”
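Those two lookups might be sketched as follows, again with sqlite3.  The table layout, the unit memberships, and the times for Capt. Jones and Mrs. Johnston are made up for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE units (unit_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE participants (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    unit_id   INTEGER REFERENCES units(unit_id)  -- NULL for civilians
);
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES participants(person_id),
    event_time TEXT NOT NULL                     -- 'HH:MM', which sorts correctly as text
);
INSERT INTO units VALUES (1, '111th Alaska');
-- Unit membership below is invented for the example.
INSERT INTO participants VALUES
    (1, 'Col. Smith', 1), (2, 'Capt. Jones', 1),
    (3, 'Mrs. Johnston', NULL), (4, 'Pvt. Butler', 1);
INSERT INTO events VALUES
    (1, 1, '09:00'), (2, 2, '09:30'), (3, 3, '10:00'),
    (4, 4, '10:30'), (5, 1, '11:00');
""")


def events_for_person(name):
    """Event IDs recorded by one named participant, in time order."""
    rows = conn.execute(
        "SELECT e.event_id FROM events e "
        "JOIN participants p ON p.person_id = e.person_id "
        "WHERE p.name = ? ORDER BY e.event_time", (name,))
    return [r[0] for r in rows]


def events_for_unit(unit_name, start="00:00", end="23:59"):
    """Event IDs recorded by anyone in the unit, limited to a time window."""
    rows = conn.execute(
        "SELECT e.event_id FROM events e "
        "JOIN participants p ON p.person_id = e.person_id "
        "JOIN units u ON u.unit_id = p.unit_id "
        "WHERE u.name = ? AND e.event_time BETWEEN ? AND ? "
        "ORDER BY e.event_time", (unit_name, start, end))
    return [r[0] for r in rows]
```

The optional start and end arguments cover the “set time frame” case, so a map display could ask for just the 111th Alaska’s events between 10:00 and 11:00.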

Personally I’d tweak it one step further.  How about an additional flag for “visible to” attributes for events?  In other words, given some analysis of the first person accounts, and perhaps some assumptions, could a particular “event” be seen by a participant?  Would be nice to lay out what “was” and then compare to what Col. Smith “saw” at the Battle of Mudpuddle.  Such would at least offer some great fodder for the “what if” conversations.
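A minimal sketch of that flag, assuming it lives in its own link table between events and participants.  The event_visibility table, its basis column, and the sample rows are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_id INTEGER PRIMARY KEY, summary TEXT NOT NULL);
CREATE TABLE event_visibility (
    event_id  INTEGER NOT NULL REFERENCES events(event_id),
    person_id INTEGER NOT NULL,
    basis     TEXT,               -- e.g. 'first-person account' or 'assumed'
    PRIMARY KEY (event_id, person_id)
);
INSERT INTO events VALUES
    (1, '111th moves from point 1 to point 2'),
    (2, 'Contact with 5th Arizona skirmish line'),
    (3, '1st Nevada battery deploys');
-- person_id 1 stands in for Col. Smith; the visibility row is invented.
INSERT INTO event_visibility VALUES (1, 1, 'first-person account');
""")


def what_was():
    """Every event on record, regardless of who could see it."""
    return [r[0] for r in conn.execute(
        "SELECT event_id FROM events ORDER BY event_id")]


def what_person_saw(person_id):
    """Only the events flagged as visible to this participant."""
    rows = conn.execute(
        "SELECT event_id FROM event_visibility "
        "WHERE person_id = ? ORDER BY event_id", (person_id,))
    return [r[0] for r in rows]


# Events that happened but that this participant had no view of.
unseen = sorted(set(what_was()) - set(what_person_saw(1)))
```

The basis column is there so an assumed line of sight can be told apart from one backed by a first-person account.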

That’s my Civil War Battle database on a bar napkin, if you will.  As a card carrying solutions consultant, I’m the first to say “this is just a general idea of how it could be done, and the solution defined here is not intended to be implemented as is….”


9 thoughts on “A Civil War Battle Database Part II”

  1. Thanks for putting up the napkin, Craig. Very nice stuff.

    Your advice about data structure is excellent. I’ve tried to do some of the event/time/place work myself, for Antietam, but it’s a formidable task. Not in the data tables, I have those. Rather the rub is in the content.

    First, there’s just so much of it, I could spend the rest of my useful life just populating the database.

    Second, and show-stopping for me, is how to find a way to express (or allow for) ambiguity, conflict, and inaccuracy in the data – not to mention the holes. Holes due either to my work being necessarily incomplete or the inevitable lack of data. Also, I think it would be misleading to present a collection of data items as unalloyed “facts”, giving the impression that the unknowable is in fact precisely known.

    Then there’s the challenge of how to interpret and display a synthesis or representation of the data, transforming it into useful information. All the while maintaining some transparency through to the original source information, interpreted, weighed and balanced appropriately.

    I think it’s tempting to think that if we just plot enough data points we’ll be able to see the big picture. The trouble, at least with Antietam data, is that I expect it would look more like a scatter plot than a linear graph, if you know what I mean.

    Or maybe I worry too much 🙂

  2. Brian,
    I agree with you that the information we often have is such a daunting pile of work, we might never even get half done!

    I’ve worked on some “real time” systems that do exactly what is explained above. The difference is, instead of collecting “stuff” from the dusty archives, the data input is based on reports from participants at the time the event occurs. Even that is a mass of data to collect!

    The key, I think, to properly framing the “event” is defining them by the “flavors.” That done, you have a nice template for data input. That template would then allow for some errors, confusion, fog of war, or just flat inaccuracy in the reporting of the event.

    For example, let me go to one of my favorite Generals at Antietam. DH Hill wrote a short passage in his report:

    It was now apparent that the Yankees were massing in our front, and that their grand attack would be made upon my position, which was the center of our line. I sent several urgent messages to General Lee for re-enforcements, but before any arrived a heavy force (since ascertained to be Franklin’s corps) advanced in three parallel lines, with all the precision of a parade day, upon my two brigades.

    I’d break that down into a SitRep:
    Size: Federal Corps.
    Activity: Forming for advance (massing)
    Location: “in front of us” which I’d submit can be narrowed down to the area between the Mumma and Roulette Farms.
    Unit: Franklin’s Corps (in error)
    Time: Before 0900hrs, Sept. 17, 1862.
    Equipment: None noted.

    Now we know Franklin’s Corps was not his opponent, but we do have General French’s rather brief report:

    The enemy, who was in position in advance, opened his batteries, under which fire my lines steadily moved until the first line, encountering the enemy’s skirmishers, charged them briskly, and, entering a group of houses on Roulette’s farm, drove back the force, which had taken a strong position for defense.

    Size: Unknown
    Activity: Skirmish line in front of formal defense
    Location: Roulette Farm.
    Unit: Unknown
    Time: around 0900 hrs, Sept. 17, 1862.
    Equipment: Several batteries supporting the defense.

    For all fairness I should have a stack of these, perhaps hundreds, from which to paint the picture. But just looking at these two snips of the official reports in isolation we get something of note:

    Hill thought he was against Franklin’s Corps. On the other hand French did not relate, even after the battle, any knowledge of his foe’s composition.

    So what the framework did allow us to present is the event even with ambiguity and errors, formalize that to a degree, then allow it to be related into the larger system. I actually would want the reports to be in conflict or have some gaps in the details. Such would facilitate that “is visible?” flag.


  3. I do indeed like your data elements and relationships, and the flavor idea for events.

    I was probably too dogmatic or “can’t be done” in describing my concerns about presenting the overall picture or what the events mean in total. I was coming from the perspective that we’ll never be able to know exactly where everybody (or even every unit) was, when, and what they were doing there. Not to mention the idea that there isn’t just one history of Antietam, but potentially one from every participant or witness.

    But … offering each of the items in the stack to the reader in some kind of context is in itself an excellent technique. Make them accessible based on criteria the reader is interested in, and let the reader decide what all the relevant datapoints convey.

    I’ve tried to do that at some basic level on AotW with a “multi-node network” of information. The nodes are people, places (maps), units, ORs, tablets, etc. Each node connects to the others through the magic of relational data. Not much “flavor” injection, though. Some opinion and interpretation, but most information presented as given by the source(s).

    Thanks again for your thoughtful approach. It looks like a sophisticated and practical model for backing a virtual view of a battle.

    So which battle are you going to implement?

  4. Brian,

    You mention…”Second, and show-stopping for me, is how to find a way to express (or allow for) ambiguity, conflict, and inaccuracy in the data – not to mention the holes.”

    As I just mentioned in Harry’s blog, what about using JavaScript to create popups/mouseovers to interject/overlay your commentary with the original data as presented, for example, in a report?


    Your sitrep stuff sounds like an interesting way to make things like “fog of war” a “virtual experience” in presenting data about battles/battlefields on the Web. It’s almost like we are creeping over into game design…

  5. You folks are really beginning to make my head hurt, but I think I’m actually following most of this. It seems to me that it’s the fluidity of a battle that makes this difficult, so something more “static” like a unit and its history would be easier. It would still encompass the various nodes that Brian mentions. In fact, to give a complete picture, if there’s an OR report from Blue’s Brigade on a battle, there should probably be a report from the unit opposing them.

    Robert, I think for there to be a virtual experience in digital history, we’re going to have to combine history and game design. “Have to” may be too strong — I think it will be more effective if combined with game design.

  6. Craig,

    If you’re interested in seeing if it will work off the napkin, I have an engagement that might make a decent test case. 4-5 hour fight, not too many reports (but enough, I think), only a couple of brigades’ worth of units. I’d be happy to do the data gathering, but it’ll take one of you digital wizards to work the magic.

  7. Don, matching up those OR reports is right in line with where I’d take the data set. Harry alludes to the Rashomon effect on one of his blog entries. Same sort of thing. If everyone agreed on the events, it would make for boring history!

    As for dressing this out beyond the bar napkin, I’m game for building the database side, but someone’s going to have to code the application….


  8. :groan: Robert’s initial Frankenstein allusion is making more and more sense the deeper we get into this. So to make such an excellent digital piece of historical work, we need a historian, a database person, a code person, a cartographer…. Or a person who’s composed of the necessary parts, but such a person is sounding more and more rare.

    Craig, still sounds like an interesting exercise if we can find a code person. May really chew up a good deal of time, but I’m willing to do the digging.
