“Facts all come with points of view / Facts don’t do what I want them to.” – Mr. Byrne

Read the introduction to Raw Data Is an Oxymoron just now and it’s put a few things into my head, and maybe it’ll get me to take this concept of digital humanities more seriously.  Of course, data and raw data are just two out of many concepts related to this course, so I shouldn’t jump the gun.

I appreciate how Gitelman and Jackson seem to be asking more questions than providing answers, seem to be introducing the subject matter and the essays in the volume they’re editing as jumping-off points for further discussion.  That seems like a good way to go about it.  We prefer questions, right?  Answers are so off-putting.

The authors stress that data is never raw – it never just exists on its own.  It’s always “cooked” (p.2).  Just as in phenomenology thingness comes into being by what we choose to perceive out of our notion of where said thing begins and ends, data is just as much, if not more, about what we’re choosing not to look at.  A cat is a cat insofar as it doesn’t exhibit dog qualities or behaviors or features, at least to how we see them.  If I am choosing to count red, white, and blue jelly beans in a jar, I’m also choosing not to count the other colors.  Data are sets of things we want to see, and we use technology to see ever larger – or smaller? – sets of like or related data.

Which is great.  And fine.  And helpful, very often, because I’m sure these kinds of processes are behind some very important science.  But the authors here do reveal that data is a choice.  We choose to have data in our lives.  Or we don’t, insofar as the information technologies we use every day are often not thought of as things we choose to interact with.  I mean, we’d rather not deal with that inbox, right?  But again I’m reminded that values drive activities, and we probably value data and its use and interpretation… because it makes our masters richer!  Ha, no.  Not quite.

The pull-quote from this piece for my money is “Data need to be imagined as data to exist and function as such, and the imagination of data entails an interpretive base” (p.3).  Bullseye.  Data as the result of our imaginative doing, certainly not as objective as we like to think it is.

For me, maybe, this turns things right back to my central question for this course: what are the digital humanities, and do these kinds of questions or approaches have any real relevance for the humanities?  If data is something we imagine, couldn’t we do our authors and creators a better service by imagining what they might have hoped and wished for?  Or better, why can’t we imagine ourselves pleasantly reading or experiencing humanistic works for their own sakes?  Why imagine data, when it’s so much more enjoyable to imagine characters, scenes, and, I don’t know, beauty?




As a starting exercise, I’ll take the professor’s kind bait and compare two online text toys.  These would be Voyant, which I had not seen before the professor suggested it, and Wordle, which I’d had the misfortune of discovering in a class last fall.  For anyone who hasn’t come across either site / toy, these are essentially text boxes where you can copy and paste a spate of verbiage, and the toy will produce for you an image of the words in the text, arranged haphazardly in a blob, with the words that appear most frequently written larger than those that appear maybe once or twice.  I suppose like other toys, it’s all for good fun.  And as I enjoy fun, I thought I’d put in a choice bit of the old MD (Moby Dick) and see what happens.

It seems, not much.  With either site you get essentially what I’ve described: a blob of words.  You play by taking what a thoughtful human mind took hours to assemble and instantly disassembling it, prioritizing words as you would integers in a crude code.  Ah yes, that big one in the middle there appears… the most.  Glad we checked.
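For the curious, the trick behind a word cloud fits in a few lines of Python.  This is only a rough sketch of the general idea, not how Wordle or Voyant actually do it: the function names, the crude tokenizer, and the 10-to-72 size range are all my own assumptions, and the real toys presumably also throw out common words like ‘the’ and ‘of’.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word appears, ignoring case and punctuation."""
    # Assumption: a "word" is any run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def font_sizes(counts, min_size=10, max_size=72, top=10):
    """Map the `top` most frequent words to font sizes (hypothetical scale).

    The most frequent word gets max_size, the least frequent of the
    selected words gets min_size, and the rest fall in between.
    """
    most_common = counts.most_common(top)
    if not most_common:
        return {}
    highest = most_common[0][1]
    lowest = most_common[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts tie
    return {
        word: min_size + (count - lowest) * (max_size - min_size) // span
        for word, count in most_common
    }


if __name__ == "__main__":
    sample = "the whale the sea the whale"
    for word, size in font_sizes(word_frequencies(sample)).items():
        print(word, size)
```

Everything else a word cloud does (the colors, the haphazard placement) is decoration on top of that count.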

Wordle lost me the moment I tried to use it and ran into issues with Java.  Crash, bang, install, reboot, fail, try-on-another-browser, and I got the toy to finally work.  I was rewarded by big words in the middle: ‘whenever’ ‘get’ ‘nothing’ ‘little’ ‘time’ ‘find’.  So glad I didn’t bother reading the text as sentences with subjects, verbs, and predicates, never mind voice, tone, pace, etc.  There are options to change the look of your word cloud, if you didn’t initially appreciate the text vomit the algorithm gave you.  I like mine in blue.

Voyant adds a nice level of importance by referring to their word cloud as ‘cirrus’.  Cute.  Voyant on the whole does have a more utilitarian feel to it, as if the user is expected to put in lots and lots of text for analysis.  It’s probably the case that my one paragraph yields unimpressive results because it’s so small.  Maybe I should try whole chapters of Moby Dick, or the whole novel.  While Wordle seems content to be a toy, Voyant dares you to really take it for a spin.

So I put in the whole chapter.  This is where Voyant really shines.  You can click on a word in the text, and instantly get a frequency line graph to show you how often and where your chosen word appears.  I’ll admit it’s fancy.  It looks like I could use this toy more like a tool, to, as the tag read, “see through your text”.  Tempting.
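As far as I can tell, a graph like that comes from chopping the text into equal slices and counting the chosen word in each one.  Here is a minimal sketch under that assumption; the function name and the default of ten slices are mine, and I am not claiming this is what Voyant does under the hood.

```python
import re


def segment_counts(text, word, segments=10):
    """Count occurrences of `word` in each of `segments` equal slices of the text."""
    # Assumption: same crude tokenizer as a basic word count.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return [0] * segments
    # Ceiling division, so every token lands in some slice.
    size = -(-len(tokens) // segments)
    target = word.lower()
    return [
        tokens[i * size:(i + 1) * size].count(target)
        for i in range(segments)
    ]


if __name__ == "__main__":
    sample = "whale whale sea sea sea whale"
    print(segment_counts(sample, "whale", segments=3))
```

Plot those numbers left to right and you have the line graph: where it rises, the word clusters; where it sits flat at zero, the word is absent.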

Or I could read it.  I can read the words in the order in which the author intended them to fall.  I could pleasurably speak the sentences softly as I read, noticing how different American English was back in the mid-19th century.  I could dive deep into the gloom of this massive paean to American genius and hubris.

I know, I know.  I should give this stuff a chance.  It’s easy to poke fun, and for all I know there’s a lot to unpack by analyzing text like it was computer code.  It’s a bit like holding an x-ray machine to an airline passenger and seeing his flab: it’s a neat trick, but it’s not really necessary, is it?