What is Document Indexing?

Posted on May 18, 2016

Document management is tough. Even when you have your document scanning workflow all sorted, and you've invested in a top-notch electronic document management system (EDMS) or some variety of electronic content management (ECM), there's still the matter of document indexing.

In comparison to the rest of your document management project, indexing may seem like one of the less important components. It may be tempting to handle indexing later, well after document capture, or to oversimplify the indexing method to expedite the process or save on costs, but this could cause trouble down the road.

Every project is different. One organization may need something specific from document management that another does not. But if I can impart anything about indexing, it is this: a plan should absolutely be considered before capture, and if a specific indexing process is determined to be necessary, it should be implemented at the earliest opportunity. Not only is retroactively applying indexing methods costly and time-consuming (above and beyond doing it at the time of capture), but effective indexing will yield significant savings over time from the moment it is put in place. Delaying good indexing practices will only delay your return on investment.

But What is Document Indexing Though?

This is a great question, if for no other reason than it's actually the title of the blog! Simply put, indexing is any number of methods by which electronic documents are digitally organized, usually within EDMS or ECM software (indexing can technically apply to organizing paper files as well, but for the purposes of this blog, we will be talking about electronic documents).

Now this is the most important part: Because the documents are organized so well, they can be retrieved quickly and efficiently. Being able to immediately call up a vital invoice or student record or case file means less time hunched over a filing cabinet and more time focusing on other, more important tasks.

There are a number of ways to index your documents. You can even combine methods to create the best organization strategy for your business. Here are four key indexing methods:

File Naming

This one is so obvious, it almost feels like cheating, but for simple or small projects, a sound file naming strategy alone might be enough to organize your documents. Think about it: You probably already do this at the office or at home. Naming your photos something like "Florida_Vacation_001.jpg" or your taxes something like "W2_2014_Acme_Company.pdf" is, at its most basic level, indexing. You've organized your files in a meaningful way that allows you to quickly retrieve them later!

Some EDMS applications and even some MFPs have automatic naming conventions that can be applied, such as adding the current date to the file name, or adding specified prefixes to number sequences to produce something like "ABC_001.pdf," "ABC_002.pdf," and so on. If you are relying solely on file names, it's important to have a consistent and understandable convention to prevent confusion down the line. Because a file name can only carry so much information before it becomes hard to read and hard to keep consistent, relying just on file names is not recommended for large or complex projects.
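To make the prefix-plus-number convention concrete, here is a minimal Python sketch. It is not how any particular EDMS or MFP does it; the function name sequential_name and its parameters are made up for illustration, and the three-digit padding is simply taken from the "ABC_001.pdf" example above.

```python
from datetime import date
from typing import Optional


def sequential_name(prefix: str, number: int, ext: str = "pdf",
                    stamp: Optional[date] = None, width: int = 3) -> str:
    """Build a name such as ABC_001.pdf (hypothetical helper).

    If a date is supplied, it is placed between the prefix and the
    zero-padded sequence number in ISO format (YYYY-MM-DD).
    """
    parts = [prefix]
    if stamp is not None:
        parts.append(stamp.strftime("%Y-%m-%d"))
    # Zero-pad the number so names sort correctly in a file browser.
    parts.append(f"{number:0{width}d}")
    return "_".join(parts) + "." + ext


names = [sequential_name("ABC", n) for n in range(1, 4)]
print(names)  # ['ABC_001.pdf', 'ABC_002.pdf', 'ABC_003.pdf']

dated = sequential_name("ABC", 7, stamp=date(2016, 5, 18))
print(dated)  # ABC_2016-05-18_007.pdf
```

The zero padding matters more than it looks: without it, "ABC_10.pdf" would sort ahead of "ABC_2.pdf" in most file listings.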

Read More: What is Electronic Document Management?

Folder Structure

Utilizing a good folder structure is another simple but very effective way to index your documents. Nesting folders is good for situations when just a simple file name won't do. For instance, if we take our tax example from above, "W2_2014_Acme_Company.pdf" may be great if you work just one job. But what if you work multiple jobs, take on several freelancing gigs and need to manage your expenses?

File names may become unwieldy after just a few years. Instead, creating a folder called "2014 Taxes" and, inside that folder, adding folders for "W2s," "1099s," "Receipts," and so on, and then putting consistently named documents in those folders will allow you to better locate tax documents from 2014!
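The tax folder layout described above can be sketched in a few lines of Python. This is only an illustration: the helper name tax_path and the fixed list of categories are assumptions, and the sketch only builds the path rather than creating folders on disk.

```python
from pathlib import PurePosixPath

# Category folders taken from the tax example; extend as needed.
CATEGORIES = {"W2s", "1099s", "Receipts"}


def tax_path(year: int, category: str, file_name: str) -> PurePosixPath:
    """Return '<year> Taxes/<category>/<file_name>' (hypothetical helper)."""
    if category not in CATEGORIES:
        # Rejecting unknown folders keeps the structure consistent.
        raise ValueError(f"unknown category: {category}")
    return PurePosixPath(f"{year} Taxes") / category / file_name


location = tax_path(2014, "W2s", "W2_2014_Acme_Company.pdf")
print(location)  # 2014 Taxes/W2s/W2_2014_Acme_Company.pdf
```

Refusing unknown categories is a deliberate choice here: a folder structure only works as an index if everyone files into the same small set of folders.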

Searchable Text / OCR

Now this is where we really get cooking with powerful indexing methods! OCR stands for optical character recognition, and it's the means by which your scanning or document management software reads the words on a printed page once you've scanned it in, building a searchable database of your files. OCR relies on standard letters, numbers, and other characters, so it doesn't work so well with atypical fonts or especially with handwriting, but for your garden-variety printed pages, OCR gives you the means to quickly find a unique word or a specific phrase amongst the multitude of documents in your system.

With our tax example, if you're looking for a specific line item through all the years of all your documents, you can simply search for the text of the line item and bring that document right up! Often, EDMS software will even have granular options to better locate what you're looking for.
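To show what "a searchable database of your files" can mean, here is a rough Python sketch of a word index. It assumes the OCR step has already happened and that you have the recognized text for each file; the sample file names and text are invented, and real EDMS search is considerably more sophisticated. Note that this version finds documents containing all of the query words, not necessarily as an exact phrase.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased word to the set of files that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for file_name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(file_name)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the files that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Invented OCR output, keyed by file name.
pages = {
    "2013_receipts.pdf": "Office supplies 42.10 Mileage 118.00",
    "2014_receipts.pdf": "Home office deduction 300.00 Mileage 95.50",
    "W2_2014_Acme_Company.pdf": "Wages tips other compensation",
}
index = build_index(pages)

print(sorted(search(index, "home office")))  # ['2014_receipts.pdf']
print(sorted(search(index, "mileage")))      # ['2013_receipts.pdf', '2014_receipts.pdf']
```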

Metadata

Whoa boy. Metadata could be (and is!) a blog of its own. Simply put, metadata is data about data... which is, perhaps, not very helpful. OK, let's try this. If you use iTunes at home to listen to your music, your MP3 music files have a file name and they're stored in folders. iTunes, all on its own, is very good at naming and organizing things in useful ways. It's programmed to do this for your benefit, and its own benefit. So you might have an MP3 file named "[01] - No Action - Elvis Costello - This Years Model.mp3" and this is terrific indexing that's readable both by you and by the application, iTunes. This file is the first track, titled No Action, from Elvis Costello's album This Year's Model. Easy! But the truth is, the file could be called "23wreff5fs.mp3" and iTunes still has ways to know what it is.

How? Metadata.

When you search in iTunes by artist or track or album name, you're not necessarily searching for the file name. You're searching using additional information attached to the file. Similarly, document management software often allows you to attach helpful information to a document or file, such as invoice number, case ID or hire date. Just as iTunes can quickly call up anything by Elvis Costello, you can quickly call up anything with the invoice number "AA123450." Metadata is one of the most time-consuming indexing tools to implement and maintain, but it's also one of the most powerful assets at your disposal.
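Here is a small Python sketch of the same idea: the file name tells you nothing, but the attached metadata still finds the document. The Document class, the field names (invoice_number, case_id) and the sample values other than the invoice number are all hypothetical; actual document management software stores and queries metadata in its own way.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """A file plus the descriptive fields attached to it (illustrative)."""
    file_name: str
    metadata: Dict[str, str] = field(default_factory=dict)


def find_by_metadata(docs: List[Document], key: str, value: str) -> List[Document]:
    """Return every document whose metadata field `key` equals `value`."""
    return [doc for doc in docs if doc.metadata.get(key) == value]


docs = [
    # The file name is meaningless; the metadata carries the meaning.
    Document("23wreff5fs.pdf", {"invoice_number": "AA123450"}),
    Document("scan_0042.pdf", {"case_id": "C-17"}),
]

matches = find_by_metadata(docs, "invoice_number", "AA123450")
print([doc.file_name for doc in matches])  # ['23wreff5fs.pdf']
```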

Read More: What is Metadata?

Document indexing can be quite daunting, but hopefully we've taken some of the mystery out of it. Approaching your scanning project with a solid indexing plan in place may require a little extra time, a little extra effort and, if I'm being honest, a little extra money, but the return on investment is unbeatable.

When your employees or colleagues are locating documents in seconds and breezing on through to new tasks seamlessly, you'll be thankful you planned and executed your indexing strategy ahead of time.

Now that you know more about indexing, why not contact A&A Office Systems to learn more about document scanning, document management, workflow automation, and more?

