The average information worker spends two hours every day searching for information (a number that’s probably higher for document-swamped legal workers).
But a bit of overpriced tea set off the American Revolution — and the next information revolution is hatching from something as small as four little crossing lines. #GetReady.
Attorneys and their legal teams have a lot to gain (and hours of frustration to lose) from the next stage of personal information retrieval.
Nathan Morris wrote recently about how the ghosts of filing cabinets past haunt our modern offices. Even if we’ve cleared out the dusty file room and put in an office foosball table, the old metal box continues to exist in our brains.
Why is this boxed-brain a problem?
It’s the man-on-a-horse conundrum: suppose you’ve got to file away a picture of a man riding a horse. Should it be filed under horses or horseriding? And what if your image contains a poem by Rumi and the rider happens to be a shirtless Vladimir Putin?
The truth is, our documents are all as complex as bare-chested, mounted world leaders — typically they can’t be crammed into a single taxonomic scheme.
Suppose you’ve written a complex motion for partial summary judgment in the case of an Uber driver fighting arbitration agreements. You likely save your file in the folder for the case, with a file name that includes the date and motion title.
But what happens when you want this document in the future? Maybe you need to write another summary judgment motion. Maybe you’ve forgotten what this specific motion was about, but you’re up against a similar arbitration agreement and want to find your former arguments. Maybe you’ve got a new client fighting for workers’ rights in the same industry. Maybe you just want to compile your most stellar legal writing as an example for newcomers.
Whatever it is, do you honestly expect yourself to be able to remember the folder name and find the right file?
More likely, you’ll use your limited memory and limited search capacity, and spend your daily two hours hunting down that one great thing you wrote that you know has gotta be here somewhere . . .
The Future is 1993.
In the early nineties, even before there were websites, there were search engines. The first one was named ARCHIE, and it could search based only on the names of files. Soon websites grew from the hundreds to the tens of thousands, and innovators needed a way to comb through the sheer megabytes of data. Within just a year or two, bots were created that could automatically crawl websites and build an index of titles, content, and even metadata to find the right results for users.
This was a revolution. If the internet functioned like a filing cabinet, we’d whittle away our lives memorizing a hierarchy of categories (OK, that article on ransomware might be filed in the ‘virus’ folder in the ‘computer’ folder in the ‘technology’ folder in the . . . ). It’s unimaginable.
Instead, we now can search for abstract concepts and get relevant results that we didn’t even know were out there.
So, with technology like Google basically reading our minds, why are we still using ARCHIE-level search for our own documents? Why can we only comb through our disastrous file systems (does anybody else save everything to the desktop?) or hopelessly try to remember the title of the Word file we made two years ago?
There are a few reasons for that:
3 Anchors That Keep Us Stuck in the Document-Retrieval Past
Problem 1: Data
Suppose you’re organizing your music collection, and you need a home for your favorite David Hasselhoff cover “Hooked on a Feeling.” Does it belong in the Hasselhoff file along with Baywatch reruns, or should it be with the song’s writer (Mark James) or original performer (B.J. Thomas)? Or maybe it belongs in a separate folder for Inappropriate Greenscreening (along with Kanye West) or that one for Absurd Use of Dachshunds.
A scrupulous file organizer might put a copy in each location. Similarly, in the example above of the motion for summary judgment, you could store duplicates in these folders:
- The case file
- Motions for summary judgment
- Arbitration agreements
- Employment status
- Uber
- Good Writing
Duplication like this would have been unthinkable in the days before cheap storage.
But even now that storage is cheap, the rise of collaboration still makes this unworkable. If you updated one copy, you’d have to hunt down and replace every duplicate. We need canonical documents that we can consistently alter and always refer to. Duplication is flat out.
Problem 2: Privacy
Let’s say you write a really great paper on finding motivation on a daily basis. Because you work for BuzzFeed, you title and save it as “6 mindhacks that are so effective you won’t believe it!” But all your computer’s built-in search function knows about this document is its name.
If, years from now, you wanted to find it again, you might remember that it had something to do with motivation, psychology, or even a specific theory like “decision fatigue.” But none of those appear in the title, so your computer can’t help you find it.
If you’re storing the file as a Google Doc, then you’ll have some more search capabilities. But in exchange you’ll have signed on to terms of service that seem to give Google unrestricted license to all your files. Tech journalist Rafe Needleman concludes: “Despite what Google, Microsoft, and other providers offering limitless storage in the sky will claim, a cloud storage drive does not offer the same privacy protections as a personal hard drive.”
This may be fine for your mindhacks, but attorneys should be more concerned. As attorney Jeff Bennion notes, policies like Google’s could expose attorneys who use it to ethics complaints.
Problem 3: Scalability
Systems that try to tidy our organizing messes often rely on everyone buying into the same all-powerful structure. But ask ten attorneys for the best document organizing schema, and you’re likely to get eleven different answers. How we organize our files is our little taxonomic fingerprint, whether we’re the save-everything-on-the-desktop Chaos Muppet or the kind who makes spreadsheets about our spreadsheets.
And we’re not inclined to just accept someone else’s system. Creating a global taxonomy for the sake of organizing everybody’s personal data is just nonsense.
But even if we found the answer to these problems, as long as the filing-cabinet mentality is rooted deep in our brains, would we be willing to accept the change?
The answer is #rightinfrontofus.
Whoever invented the mouse first called it the X-Y Position Indicator for a Display System. Unwieldy things usually get weeded out of the world of computers: now we’re accustomed to little rodents on our desks, and we’re already finding paths out of clunky hierarchical organizing systems.
What we’re looking for is something called “multiple categorization.” That’s where we store our data in categories instead of folders. Think of it this way: instead of paying attention to where the document is, we’re interested in creating multiple roads to it.
Here’s where stodgy old data storage gets turned on its head: Despite all those problems, despite the growing pains we’ve been having, we’re already doing some fairly sophisticated multiple categorization of our data every day. And it was born in the hivemind of Twitter.
We might think of hashtags as the little frills of social media, there to insist that a laugh was out loud or to note when Beyoncé was on beat (the answer is: always). But the history of hashtags is already rich in revolutions, social movements, and humanitarian responses to natural disasters. And in any setting, each of our tweets (these tiny data points, mere 140 characters of information) already has multiple categorization built in.
Tags are already part of the way we think about information.
From Tweet to Trial
Let’s go back to our earlier example of the legal motion you want to find years down the road.
Well, if it’s stored on paper only, you’ll never find it. There’s just no way, unless you happen to remember the name of the client from years ago.
If it’s stored digitally you’ve got a chance. Digging through your client folders might spark a memory, but most likely the right combination of search terms in your operating system’s search function will bring up a slew of results through which you can scan and maybe find what you’re looking for. Maybe.
But what if your document system let you add tags? What if, every time you wrote something particularly impressive, you could tag it with #wow or ‘good work’ along with other relevant tags like ‘Motion to dismiss’ or ‘arbitration agreement’? You could also tag particular case law or legal concepts you used.
When you save a document, you no longer have to predict your own future: you don’t have to divine what one situation you’ll be in when you search for that motion, and what features you’ll remember about it. You can help out your future self by recognizing several possibilities, and essentially creating multiple homes for one document (just as the shirtless Vlad photo could be stored simultaneously under #horse, #horseriding, #Putin, and #machismo).
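For the curious, here is a minimal sketch of what “multiple homes for one document” could look like under the hood. It assumes a simple in-memory index that maps each tag to the documents carrying it; the `TagIndex` class and the file names are made up for illustration and don’t describe any particular product.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical index: maps each tag to the set of document IDs carrying it."""

    def __init__(self):
        self._docs_by_tag = defaultdict(set)

    @staticmethod
    def _clean(tag):
        # Treat "#Putin", "putin" and " Putin " as the same tag.
        return tag.strip().lstrip("#").lower()

    def tag(self, doc_id, *tags):
        """Attach one or more tags to a document."""
        for t in tags:
            self._docs_by_tag[self._clean(t)].add(doc_id)

    def find(self, *tags):
        """Return the IDs of documents that carry every one of the given tags."""
        if not tags:
            return set()
        matches = [self._docs_by_tag.get(self._clean(t), set()) for t in tags]
        return set.intersection(*matches)


index = TagIndex()
# File names below are invented for the example.
index.tag("putin-horse.jpg", "#horse", "#horseriding", "#Putin", "#machismo")
index.tag("uber-msj.docx", "motion for summary judgment",
          "arbitration agreement", "Uber", "#wow")

print(index.find("#wow"))                      # one route to the motion
print(index.find("uber", "arbitration agreement"))  # another route, same file
```

Each tag is just another road to the same file, so your future self can arrive by whichever road comes to mind first.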
Tag: You’re It
The metadata of a tag has the same effect as duplicating a file, without taking up more storage space. You don’t need to store the file in multiple locations because instead you have multiple routes to it. That also leaves collaboration intact: there is still only one canonical copy to update.
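A toy illustration of that point, assuming documents are stored once under an ID and tags hold only that ID (all names here are invented for the sketch):

```python
# One canonical copy of each document, keyed by a made-up ID.
documents = {"doc-1": "Motion for partial summary judgment, draft 1"}

# Tags hold only the ID (a pointer to the document), never a copy of its text.
tags = {
    "arbitration agreement": ["doc-1"],
    "uber": ["doc-1"],
    "good writing": ["doc-1"],
}


def lookup(tag):
    """Follow a tag to the current text of every document it points to."""
    return [documents[doc_id] for doc_id in tags.get(tag, [])]


# A collaborator revises the single canonical copy...
documents["doc-1"] = "Motion for partial summary judgment, final"

# ...and every route now leads to the revised text, with no duplicates to chase.
print(lookup("uber"))
print(lookup("good writing"))
```

Contrast this with six folder copies, where the same revision would have to be repeated six times.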
Security is still preserved, since you’re not giving a Googlebot access to the data itself to peruse. Instead you’re supplying your own metadata in the form of tags. The search program doesn’t have to try to guess what the document is about, as Google does with websites based on their content and links. Instead you decide what it should be known for, and how important it is.
The only remaining concern is scalability. Though the one-right-answer stringency of file-folder categorizing is loosened by using tags, you and your team still require some shared vocabulary. You no longer need the document in exactly the right place with the precisely correct heading, but you do need some internal consistency.
Will all your motions in limine be labelled that way? Will your best motions be under #wow or #goodwork?
In fact, there are multiple programs out there that allow you to categorize or tag your data. But such systems are not widely used, and each has its own method which isn’t transferable to others. Unless and until a standard surfaces, law offices taking on the tag will need to establish their own protocols for which terms are used.
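One way an office might enforce such a protocol is a small shared vocabulary that folds everyone’s variants into a single agreed tag. The sketch below is only an assumption about how that could work; the `VOCABULARY` table and the `normalize` helper are hypothetical, and the terms in it are examples rather than a recommended list.

```python
# Hypothetical office vocabulary: agreed tag -> variants people actually type.
VOCABULARY = {
    "motion-in-limine": {"motion in limine", "limine", "mil"},
    "good-work": {"wow", "goodwork", "good work", "good writing"},
    "arbitration-agreement": {"arbitration agreement", "arbitration"},
}

# Invert the table so any accepted spelling looks up its agreed tag.
_AGREED = {}
for agreed, variants in VOCABULARY.items():
    _AGREED[agreed] = agreed
    for variant in variants:
        _AGREED[variant] = agreed


def normalize(raw_tag):
    """Return the office's agreed tag, or None if the term isn't in the vocabulary."""
    key = raw_tag.strip().lstrip("#").strip().lower()
    return _AGREED.get(key)


print(normalize("#wow"))               # good-work
print(normalize("#goodwork"))          # good-work
print(normalize(" Motion in Limine"))  # motion-in-limine
print(normalize("#yolo"))              # None
```

Returning `None` for unknown terms is a deliberate choice in this sketch: it gives the team a moment to decide whether a new tag deserves a place in the vocabulary instead of letting near-duplicates pile up.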
Metadata has Arrived at Legal
Though the stodgy legal world has often lagged on adopting new tech, it’s leading the way on better document retrieval (our docs mean that much to us).
The developers of Filevine have already made hashtags functional for documents and messages stored in its platform. It’s now simple to store searchable metadata along with your documents, while keeping them secure within a closed system. Even the system itself can’t read the documents, but the tags are searchable by users with the right privileges.
This is the sort of thing that even established cloud storage systems like Dropbox can’t do. The Dropbox world is still just files and folders. For all its innovation, in this regard it’s still 1950.
There likely won’t be a shiny new Google for your personal documents. But the principles that transformed us from information-moles digging through folders to googlers can already be applied to our personal and professional settings.
To see for yourself how Filevine is leading the way in legal document storage and retrieval, you could check out the free demo.