GPSOLO June 2008
Why I Still Hate Scanners
Moving Toward the Paper LESS Office
Lawyers and their staff universally have one thing in common: They are drowning in an unending sea of paper. Pleadings, correspondence, briefs, exhibits, memos, pink phone message slips, sticky notes—you name it, paper is everywhere, choking and clogging the flow of work in both private and public law practices. Sometimes getting client work out is more an issue of managing mounds of paper than applying legal brilliance. Is there any hope at the end of the paper-lined tunnel? Maybe, just maybe…
For years, lawyers have been on a holy quest for the mythical and fabled “paperless office.” This endlessly elusive concept is likely the “Greatest Lie of the Technology Age.” We’re never going to become paperless, at least in the foreseeable future. We just need to accept the fact that even if we reduce the amount of paper we generate ourselves, others will continue to send us more.
Microfiche was supposed to be the answer, at least at one point. But microfiche isn’t used very often in law firms because of the general inability to access it from the computers on which we do our work.
Scanning was the next great answer. But let’s be realistic. How many of you have had “bad scanning experiences”? Yes, I see all of you raising your hands out there.
Even in 2008, when many lawyers use their mega–copier–multifunction devices to scan directly to portable document format (PDF), scanning still can drive even the most tech–aware firms completely batty. Scanning has been—and for many still is—a sore spot in law firms. Why? Lawyers have traditionally viewed scanning as being synonymous with optical character recognition (OCR). Even with the best OCR products today, including the latest releases of Nuance’s OmniPage ( www.nuance.com) and Abbyy’s FineReader ( www.abbyy.com), results often fall short of expectations. Many documents are not good candidates for recognition. Without a clean, laser-printed source document, you’ll end up with gobbledygook. Your staff will tell you it would have been faster to type the document than to OCR it and have to clean up the resulting mess.
Instead, lawyers must view scanning as a way to turn physical paper into digital paper. This is like photocopying documents onto your computer screen. Scanning documents as images can be 20 times faster than the processing-intensive OCR approach. Further, imaged documents on screen look precisely like the originals: handwriting, preprinted lines, boxes—all scan perfectly. This is a core part of the concept that, for nearly 15 years, I have called the Paper LESS Office.
What’s Wrong with OCR?
As noted above, most people equate scanning with identifying the characters on a page and turning it into an editable word-processing document. Good idea conceptually, but bad in practice. Even with the cream of the modern OCR software crop hitting accuracy levels in text recognition as high as 99 percent, it’s just not good enough.
There are four problems here, and any OCR veteran/victim will immediately identify with all of them:
- Ninety-nine percent accuracy in text recognition is akin to a package of bologna proudly trumpeting its “99 percent fat-free” status. Because about 90 percent of the calories are from fat, the thing is a veritable artery–clogging, love–handle–expanding nightmare. With OCR, think of 99 percent accuracy this way: That’s one screwed–up character out of every 100, and with a single–spaced page of text containing about 2,200 characters on average, that’s 22 errors per page. And what if one of those errors is a nearly–impossible–to–detect–but–a–bet–the–case–on–it number? Not good. Not at all.
- OCR software has significant problems retaining the formatting and layout of the original scanned document. For example, you get a local state court pleading and give it to your secretary to scan. Seems like a pretty simple request, doesn’t it? It’s a “clean” document that has all appearances of being a solid candidate for being OCR’d: a mainstream typeface and an original laser–printed document (not some smudged, skewed, third–generation photocopy of a fax of a photocopy). Should be no problem, right? Wrong. What you likely get back could very well be a nightmare of reformatting, with a caption that defies cleanup and tab stops that are equally baffling. OCR software tries its best to figure out what codes or styles to apply in the target word–processing format, but it’s really just guessing, and often it guesses wrong.
- OCR is not terribly speedy. Even if you have a high–end, Dual Core, gazillion GHz computer, the OCR process can be pretty slow, and it seems with every increase in accuracy we have a geometric leap in the processing requirements. You need heavy–duty PC horsepower just for adequate text recognition. Forget about those decrepit, late–vintage, single–core Pentium 4 systems with a paltry 512 MB of RAM that have already been off–lease for more than three years. But either way, with OCR there’s waiting involved.
- Finally, there’s the expectation gap between what we think is OCRable and what can actually be OCR’d. I can’t tell you how many times I’ve talked to someone who has said, “When I try and scan this thing, all I get is garbage. How come?” when the document in question is a preprinted, state–specific divorce financial disclosure form replete with boxes and lines galore. OCR should be able at least to read the text, right? Wrong. What we technologists have to realize is that the average lawyer who expects this to work has a more legitimate claim to reality than those of us who make excuses for present technology by saying, “Of course it won’t be right. Look at all those lines and boxes. Nothing can recognize those.”
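The arithmetic behind the first problem above can be sketched in a few lines of Python. This is only an illustration: it treats the 2,200 figure as a count of characters per page and assumes errors are spread evenly. The accuracy levels other than 99 percent are hypothetical values added for comparison.

```python
def expected_errors(chars_per_page: int, accuracy: float) -> float:
    """Expected number of misrecognized characters on one page."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return chars_per_page * (1.0 - accuracy)


# 2,200 characters per page is the figure used in the text;
# 99.5 and 99.9 percent are illustrative assumptions.
for accuracy in (0.99, 0.995, 0.999):
    errors = expected_errors(2200, accuracy)
    print(f"{accuracy:.1%} accurate: about {errors:.0f} errors per page")
```

Even at the hypothetical 99.9 percent level, a page would still carry a couple of wrong characters, and any one of them could be the number that matters.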
The Paper LESS Office
Equating scanning with OCR is a fallacy you no longer need to accept. This is where my “Paper LESS Office” concept comes in. I first put forth the idea in a 1995 article of the same name in Law Office Computing, and presentation audiences all over the world have favorably received the concept. In fact, take a look at an article from the October 2007 edition of GP|Solo’s Technology eReport, which I co–authored with one of my clients, a partner in a Wyoming litigation firm, about their office’s Paper LESS experience: www.abanet.org/genpractice/ereport/2007/oct/paperlessoffice.html.
Here’s the concept in a nutshell:
Using low-cost, high-simplicity image scanning, physical paper is turned into “digital paper.” In effect, image scanning photocopies your documents onto your computer system. This creates digital paper, ideally stored in Adobe Acrobat’s universally readable PDF. In the Paper LESS system, you take advantage of what your scanner does best—creating images—rather than always bumping up against its limitations with OCR.
Digital paper takes up no physical space and is manipulated easily by software on your computer systems. And the beauty of digital paper is that it is perfect—it is a picture of the original document, exact in every way without any of the vagaries of the OCR process. Of course, you don’t have editable text at this point—you merely have a picture of the document. But most of the time, that’s all you need.
So you have your digital paper/PDF of the document you scanned—a letter from opposing counsel, a set of interrogatory responses, the CV of a prospective expert witness, a stack of hospital records. What does it accomplish having all these pictures, these pieces of non-editable text? Glad you asked. What you accomplish is saving the time it takes to track down the physical file or rummage through a roomful of banker’s boxes to find the documents. All that time spent searching for a document—whether it is lawyer time or staff time—wastes money.
And it gets better. One of the core problems in working on client files is that they are always split into two locations. The documents we create are located internally on our computer systems, but the client documents we receive from outside sources are stored in our paper filing systems. So, if you want to view all the correspondence on a client’s file, you have to look in two separate places.
This problem vanishes with digital paper. Whether you are using a great document manager such as WORLDOX ( www.worldox.com) or the document management capabilities inside such case managers as Time Matters ( www.timematters.com), Amicus Attorney ( www.amicusattorney.com), PracticeMaster ( www.tabs3.com), or ProLaw ( www.elite.com), you simply go to a client file’s folder/directory on your system and look in the “folder” where you store the correspondence for that client. There you’ll find document names that begin with “Letter to” (word–processed documents you created) and document names that begin with “Letter from” (scanned images of externally generated documents). Internally created and externally received documents are all in the same convenient place.
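For readers who like to see the idea concretely, here is a minimal Python sketch of the “Letter to” / “Letter from” naming convention described above. It is not how WORLDOX or any case manager works internally. The function names and the sample file names are hypothetical, and it assumes every client folder follows the naming convention.

```python
from pathlib import Path


def split_correspondence(names):
    """Group file names by the 'Letter to' / 'Letter from' convention."""
    groups = {"sent": [], "received": [], "other": []}
    for name in sorted(names):
        lowered = name.lower()
        if lowered.startswith("letter to "):
            groups["sent"].append(name)        # documents created in-house
        elif lowered.startswith("letter from "):
            groups["received"].append(name)    # scanned incoming documents
        else:
            groups["other"].append(name)
    return groups


def list_correspondence(folder):
    """Apply the same grouping to the files actually present in a folder."""
    return split_correspondence(
        p.name for p in Path(folder).iterdir() if p.is_file()
    )


# Hypothetical file names, for illustration only.
sample = [
    "Letter to opposing counsel 2008-03-01.doc",
    "Letter from opposing counsel 2008-03-10.pdf",
    "Interrogatory responses.pdf",
]
print(split_correspondence(sample))
```

The point of the sketch is simply that once both kinds of letters live in one folder, a single listing shows the whole correspondence trail.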
When your client files become electronic and totally contiguous, you just can’t help but save all sorts of nonbillable time you would otherwise waste just looking for things. Not to mention the ease of bringing a few client files home for the weekend or on the road to a depo or a trial, without lugging back-breaking boxes of paper (and subjecting the potentially irreplaceable originals to coffee spills, misplacement, and other forms of folding, spindling, and mutilation).
And when you close the file, it’s already digital—you can store it in a convenient, byte-sized package (sorry, pun intended). This is a far better alternative for closed-file storage than the costly, space-hungry archives required for physical paper files, which usually end up commandeering an area the size of a small starter home.
Buying More for LESS
What kind of scanner should a firm deploy? What software should be used to scan, organize, and search through the content of digital paper? Factors to consider: (1) intended volume of documents to be scanned, (2) number of pages scanned per job, and (3) budget for in–house scanning versus cost–effectiveness of outsourced scanning.
When evaluating volume, read the specifications for duty cycles. Buying a $100 scanner rated for 2,000 pages monthly when your firm needs to scan 10,000 pages monthly will surely smoke that “bargain” scanner. The scanning market stratifies this way, roughly:
- Entry–level flatbed scanners ($50–$300). These scanners usually come without automatic document feeders and are unsuitable for law firm use because of cumbersome paper handling.
- Portable scanners (under $300). Fujitsu’s latest ScanSnap S300 ( www.fujitsu.com) is under two pounds and is a “PDF Machine” just like its wildly popular big brothers, the ScanSnap S510 and S510M. This egg–carton–sized scanner can pull eight imaged pages per minute (ppm) into your computer system (16 if the pages are double–sided), directly into PDF file format.
- Entry– to mid–level document–fed desktop scanners ($250–$1,000). These come equipped with automatic document feeders and are suitable for lower–volume scanning situations up to 15,000 pages monthly. Look at Fujitsu’s high–value ScanSnap series, which comes bundled with a full copy of Adobe Acrobat Standard (Windows version) or Professional (Mac version); Visioneer’s Strobe Series ( www.visioneer.com); or Xerox’s category–buster, the 50 ppm DocuMate 252 ( www.xeroxscanners.com), as examples.
Above this level, the sky is the limit. Spend enough money and you’ll end up owning a riding model with a 12 horsepower engine and a pull start! Well, almost. Fujitsu, Panasonic ( www.panasonic.com), Bell + Howell ( www.bellhowell.com), Canon ( www.usa.canon.com), Ricoh ( www.ricoh-usa.com), and Kodak ( www.kodak.com) produce scanners that push the 100 ppm mark with massive paper handling ability.
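The duty-cycle and speed comparisons above come down to simple arithmetic, sketched below in Python. The function names and the 500-page job are hypothetical. The time estimate assumes the scanner runs at its rated single-sided speed the whole time, which ignores loading, jams, and staple removal.

```python
import math


def within_duty_cycle(pages_per_month: int, rated_pages_per_month: int) -> bool:
    """True if the firm's monthly volume fits the scanner's rated duty cycle."""
    return pages_per_month <= rated_pages_per_month


def minutes_to_scan(pages: int, pages_per_minute: int) -> int:
    """Whole minutes needed to scan a job at the rated speed."""
    return math.ceil(pages / pages_per_minute)


# The "bargain" scanner from the text: rated for 2,000 pages, firm needs 10,000.
print(within_duty_cycle(10_000, 2_000))

# A hypothetical 500-page job at 8 ppm (portable) and 50 ppm (desktop).
print(minutes_to_scan(500, 8), minutes_to_scan(500, 50))
```

Running the numbers this way before buying makes it easier to see whether a cheaper scanner really fits the firm's volume.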
Take a new view of scanners: Instead of seeing them as optional peripherals, it is time to consider them to be an essential part of the law office desktop computer system. You would never order a new computer without an optical drive to burn CDs or DVDs. By the same token, a scanner to create digital paper should be a minimum requirement of your system.
Okay, so now you have these digital images in your computer system. What’s next? Organizing and searching them. Document management and work product retrieval systems are the best answer. These software systems can gently impose a file–cabinet–like consistency on the way any law practice organizes both its internally created documents and its externally received and scanned documents. WORLDOX is the undisputed leader in the small firm marketplace and has been digging into the larger firm segment for several years with great success. For larger firms, Interwoven Worksite ( www.interwoven.com, formerly iManage) and Open Text eDOCS ( www.opentext.com, formerly PC DOCS) are popular. Most legal case management systems also incorporate document management functions that can adroitly handle electronic paper.
All document managers let you organize and search scanned image files. This presumes, of course, that the images are stored in a format that actually permits content searching of what would otherwise be only a picture. Fortunately, PDF files can be adapted so that management programs can recognize and search the underlying text. You can convert an “image” PDF into an “accessible” or “searchable” PDF either one file at a time or in batches with Adobe Acrobat 8 Professional ( www.adobe.com) or third–party tools such as Autobahn DX ( www.aquaforest.com). (Quick tip: Scanning at a lower resolution, say 150 to 300 dpi, actually results in better text recognition because scanners can get “confused” by the fibers of the paper at higher resolutions.) Once your files are “accessible,” WORLDOX excels at searching them as part of its overall complement of document organization, management, and retrieval functions, but it isn’t the only tool that can accomplish this. Desktop search systems from Windows Desktop Search, to Google Desktop ( www.desktop.google.com), to X1 ( www.x1.com), to Copernic ( www.copernic.com) can search “accessible” PDFs as well.
The bottom line is clear. Paperless is never going to happen. No matter how diligently you try to reduce or even eliminate the paper you generate, others will still send you paper for years to come. Learn to love your scanner by becoming Paper LESS in your practice. By employing a creative and commonsense approach to scanning, you can transform your desktop landscape. Piles recede, billable time increases. Touch the paper LESS often and you’ll find more profits, more enjoyment, and better client responsiveness in your practice.
Ross L. Kodner is a lawyer who in 1985 founded Milwaukee, Wisconsin’s MicroLaw, Inc., an international legal technology consultancy and continuing legal education company. Widely recognized as one of the top legal technology experts in the world, Ross Kodner is Former Chair of the ABA Law Practice Management Section’s Computer & Technology Division and an ABA TECHSHOW Board member. He is a 1999 Recipient of the Technolawyer @Award as Technology Consultant of the Year (a lifetime achievement award) and a five–time Technolawyer @Award Contributor of the Year. He may be reached at firstname.lastname@example.org, via www.microlaw.com, and at 414/540-9433.