John Markoff
When five television studios became entangled in a US justice department antitrust lawsuit against CBS, the cost was immense. As part of the obscure task of "discovery" (providing documents relevant to a lawsuit), the studios examined six million documents at a cost of more than $2.2 million, much of it to pay for a platoon of lawyers and paralegals who worked for months at high hourly rates.
But that was in 1978. Now, thanks to advances in artificial intelligence, "e-discovery" software can analyse documents in a fraction of the time for a fraction of the cost. In January, for example, Blackstone Discovery of Palo Alto, Calif., helped analyse 1.5 million documents for less than $100,000.
Some programs go beyond just finding documents with relevant terms at computer speeds. They can extract relevant concepts, like documents relevant to social protest in the Middle East, even in the absence of specific terms, and deduce patterns of behaviour that would have eluded lawyers examining millions of documents.
"From a legal staffing viewpoint, it means that a lot of people who used to be allocated to conduct document review are no longer able to be billed out," said Bill Herr, who as a lawyer at a major chemical company used to muster auditoriums of lawyers to read documents for weeks on end.
Computers are getting better at mimicking human reasoning, as viewers of "Jeopardy!" found out when they saw Watson beat its human opponents, and they are claiming work once done by people in high-paying professions. The number of computer chip designers, for example, has largely stagnated because powerful software programs replace the work once done by legions of logic designers and draftsmen.
Software is also making its way into tasks that were the exclusive province of human decision makers, like loan and mortgage officers and tax accountants. These new forms of automation have renewed the debate over the economic consequences of technological progress. David Autor, an economics professor at the Massachusetts Institute of Technology, says the United States economy is being "hollowed out." New jobs, he says, are coming at the bottom of the economic pyramid, jobs in the middle are being lost to automation and outsourcing, and now job growth at the top is slowing because of automation.
"The economic impact will be huge," said Tom Mitchell, chairman of the machine learning department at Carnegie Mellon University in Pittsburgh. "We're at the beginning of a 10-year period where we're going to transition from computers that can't understand language to a point where computers can understand quite a bit about language."
Nowhere are these advances clearer than in the legal world.
E-discovery technologies generally fall into two broad categories that can be described as "linguistic" and "sociological." The most basic linguistic approach uses specific search words to find and sort relevant documents. More advanced programs filter documents through a large web of word and phrase definitions. A user who types "dog" will also find documents that mention "man's best friend" and even the notion of a "walk."
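The difference between plain keyword search and search through a web of related phrases can be illustrated with a minimal Python sketch. It is not how any particular e-discovery product works: the `CONCEPTS` map, the `expand` and `search` functions and the sample documents are all hypothetical, and a real system would rely on a far larger, curated set of definitions.

```python
import re

# Hypothetical concept map: each search term points to related phrases.
# The "dog" entry mirrors the example in the text; everything else is assumed.
CONCEPTS = {
    "dog": ["dog", "man's best friend", "walk"],
}


def expand(term):
    """Return the phrases associated with a term, or just the term itself."""
    return CONCEPTS.get(term.lower(), [term.lower()])


def search(term, documents):
    """Return indices of documents that mention the term or a related phrase."""
    patterns = [
        re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for phrase in expand(term)
    ]
    return [
        i for i, text in enumerate(documents)
        if any(p.search(text) for p in patterns)
    ]


# Made-up sample documents for illustration only.
documents = [
    "Please feed the dog before you leave.",
    "He took man's best friend to the park.",
    "Quarterly revenue figures are attached.",
    "Going for a walk at lunch.",
]

print(search("dog", documents))      # [0, 1, 3]
print(search("revenue", documents))  # [2]
```

A basic keyword search for "dog" would return only the first document; the expanded search also picks up the second and fourth, which never use the word.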
The sociological approach adds an inferential layer of analysis, mimicking the deductive powers of a human Sherlock Holmes. Engineers and linguists at Cataphora, an information-sifting company based in Silicon Valley, have their software mine documents for the activities and interactions of people: who did what when, and who talks to whom. The software seeks to visualise chains of events. It identifies discussions that might have taken place across e-mail, instant messages and telephone calls.
Then the computer pounces, so to speak, capturing "digital anomalies" that white-collar criminals often create in trying to hide their activities. For example, it finds "call me" moments: those incidents when an employee decides to hide a particular action by having a private conversation. This usually involves switching media, perhaps from an e-mail conversation to instant messaging, telephone or even a face-to-face encounter.
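One way to picture such a check is the rough Python sketch below. It is an assumption-laden illustration, not Cataphora's method: the `Message` record, the cue phrases, the 60-minute window and the sample log are all invented here. It flags a message that contains a cue phrase and is followed shortly afterwards by contact between the same two people on a different medium.

```python
from dataclasses import dataclass


@dataclass
class Message:
    time: int       # minutes since the start of the log (hypothetical unit)
    sender: str
    recipient: str
    medium: str     # e.g. "email", "im", "phone"
    text: str


# Assumed cue phrases; a real system would need a much richer model.
CUE_PHRASES = ("call me", "let's talk offline", "stop by my office")


def call_me_moments(messages, window=60):
    """Pair each cue-phrase message with the next contact between the same
    two people on a different medium, if one occurs within `window` minutes."""
    ordered = sorted(messages, key=lambda m: m.time)
    flagged = []
    for i, msg in enumerate(ordered):
        if not any(cue in msg.text.lower() for cue in CUE_PHRASES):
            continue
        pair = {msg.sender, msg.recipient}
        for later in ordered[i + 1:]:
            if later.time - msg.time > window:
                break  # too late to count as the same exchange
            if {later.sender, later.recipient} == pair and later.medium != msg.medium:
                flagged.append((msg, later))
                break
    return flagged


# Made-up communication log for illustration only.
log = [
    Message(0, "alice", "bob", "email", "Did you see the Q3 numbers?"),
    Message(5, "bob", "alice", "email", "Call me."),
    Message(12, "alice", "bob", "phone", ""),
    Message(300, "carol", "bob", "email", "Call me when the report is ready."),
]

for cue, follow_up in call_me_moments(log):
    print(cue.sender, "->", cue.recipient, "switched from", cue.medium, "to", follow_up.medium)
```

In this log only the exchange between Bob and Alice is flagged, because the e-mail is followed by a phone call seven minutes later; Carol's message contains the same phrase but no change of medium follows. A sketch like this can see only media that leave a record, so a face-to-face encounter would go undetected.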