Project Management




So, I thought I’d write about a real-life thing I do for work that is, I think, technically a DH project.  Might be better than whining about how I don’t understand text mining.

I manage a sheet music digitization project for the music library that I work at.  It was designed and set up by my predecessor, so for me it’s a legacy project I inherited (I often wish I could spend my time doing other things, but that’s a different matter).  This is, for me, a decent chance to get my hands dirty with project management.

Here’s the site, if you want to take a gander:

Basically, we have lots of old sheet music in boxes.  Before my time, almost all of the article metadata was created and uploaded to a paid site platform that is saved to a local (WRLC) server.  From there, a sheet music consortium out of UCLA harvests the metadata to share with the consortium.  

The consortium is here:

Thus, anyone interested in looking at old sheet music has access to tons of digitized scores from numerous institutions across the nation.  The metadata and thumbnails exist on the consortium’s website, and the real images stay on our server.  

My task has been to scan and upload – with beefier metadata – sheet music that isn’t already in the consortium.  No reason to scan and upload a sheet if another school has already done it.  

In terms of project management, it’s been a bit of a bear understanding all this.  I had no experience with this sort of thing.  My predecessor left me with plenty to read over, but there was still a hefty learning curve on my part.  The part-time staff who have been doing the work have been helpful, but often they understood just one narrow aspect, and didn’t understand how that aspect fit in with the whole project.  I’m also pretty dense, and get frustrated easily – neither of which has helped.  

But recently I have had the time and person-power to really sink my teeth into it, and I feel better about the whole thing.  It helps that there’s really no timetable for its completion – no pressure to get it done fast.

Speaking of which, re-reading over the Cult of Done Manifesto has reminded me to breathe and relax about this project.  I especially appreciate the laughing at perfection part – it’s good to want to get it right, but not at the expense of never moving forward.  

While this does count as a Digital Humanities project, it also just seems to me like a project, with the humanities aspect left to someone else.  I personally don’t feel all that invested in the thing, but it is my job, so I do it.  Hopefully someone somewhere is getting something out of these scans.

Have a look at the sites I’ve linked above, and let me know if you have thoughts or questions. 





Team.  TAPoR.

TAPoR is a website that catalogs lots of text mining tools that are free to use on the open web.  It doesn’t have any tools itself, but is rather a place someone can go to explore different kinds of text rendering.  The benefit here is that you can compare different tools that do the same thing.  Better: you can discover tools that you didn’t know existed, that do things that you might not have thought of, but which could be really valuable for whatever project you’re doing.  

The site groups similar tools together for easy browsing.  Two groupings stand out.  One is popular.  The site doesn’t seem to say how these tools have been deemed popular, but presumably this is a good place to go to see what others are finding useful.  The other is reviewed.  Tools are briefly reviewed, letting new users get a better sense of what each tool can do and if it might be relevant for them.  That’s a nice feature, if you ask me.

The site also functions like a community hub for tool users and developers.  Presumably, you could share information on your experience with a tool, and ask questions.  

So this is a pretty nice site to know about if you have a text and want to know different ways you could view it.  






Team.  This is where I talk about Juxta.

Juxta seems to be an online tool for comparing texts.  You put in two different versions of the same text, and Juxta helps you to visualize all the different ways that the texts are dissimilar.  You can upload text files, and before you can compare the two files, you have to do something called turning them into witnesses.  I have no idea what this means.  Maybe there’s some kind of digital authentication that happens behind the scenes whereby a HAL 9000 blesses each text to be used in public comparisons on the open web.  Maybe Juxta is simply checking with their legal team.  It’s anyone’s guess.

I’m not a literary scholar, but I suppose the point of this is to help those so inclined see how an author’s vision changed over time, or perhaps how an editor changed things around to make a book, poem, essay, etc. more marketable.

To test this tool out, I used the first bit of Leaves of Grass.  I used this text mainly because it was given in class, but it also makes sense, as the 1855 and 1897 versions are very, very different from one another (at least how they look from the downloads – on 2nd viewing, there’s something fishy about the ’55 version that I can’t quite put my finger on).  Also, I haven’t read Whitman since I smugly dismissed him as too wide-eyed in college, and it was kinda nice to see these lines again, taking in the trademark exuberance.

I’ll talk about how Juxta is different from Voyant and the art-rock-band-name-like TAPoR.

It’s hard for me to talk about what results were received from this tool.  I’m happy I finally got it to work.  And, again, I’m not a Whitman scholar and have no intention of becoming one.  In short, I was able to compare two texts in a way that had more digitized blue highlighter-esque graphics than if I had printed out the texts and just sat them beside one another on the kitchen table.  For example, I can very clearly see, thanks to this blue stuff, that the 2nd two stanzas in the first part of the poem were entirely added way after 1855.  After reading the famous opening salvo again, I’m happy that those stanzas made it in.  Certainly makes for a better read.
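
For the curious, here is a rough sketch of the basic idea of comparing two versions of a text line by line.  To be clear, this is not how Juxta works under the hood (I have no idea how it does its thing); it just uses Python’s built-in difflib module.  The two short samples are typed in by hand for illustration, and the names version_1855, version_later, and changed_lines are my own made-up ones.

```python
import difflib

# Two versions of the opening lines, typed in by hand for illustration
# (not the full texts I uploaded to Juxta).
version_1855 = [
    "I celebrate myself,",
    "And what I assume you shall assume,",
    "For every atom belonging to me as good belongs to you.",
]
version_later = [
    "I celebrate myself, and sing myself,",
    "And what I assume you shall assume,",
    "For every atom belonging to me as good belongs to you.",
]


def changed_lines(a, b):
    """Return (tag, old_lines, new_lines) for each span where a and b differ."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # skip the parts that are identical
    ]


differences = changed_lines(version_1855, version_later)
for tag, old, new in differences:
    print(tag, old, "->", new)
```

Each entry says whether a span of lines was replaced, inserted, or deleted between the two versions, which is roughly the information the blue highlighting gives you, minus the pretty graphics.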