Report from the European Text Analytics Summit | Intelligent Enterprise Blog
Breakthrough Analysis, by Seth Grimes
Seth Grimes is an analytics strategist with Washington, DC-based Alta Plana Corporation. He consults on data management and analysis systems.

Report from the European Text Analytics Summit

Posted by Seth Grimes
Tuesday, May 1, 2007
8:53 AM

I had the privilege of chairing last week's European Text Analytics Summit in Amsterdam. The event was very enjoyable, in no small part because of the diversity of attendee backgrounds and roles. I've never attended any other computing event (outside the summit series) that mixes scientists, police investigators, and media-company product managers with technologists. While I can't say I learned anything completely new (to me), quite a few points surfaced that are worthy of note. I'll report some of them, grouped under the headings user stories, market, and technology.

Start with the users present, who reminded us that text analytics, like other IT, is for them a tool and not an end in itself. Luca Toldo of Merck KGaA and Ian Harrow of Pfizer notably protested that they're scientists, working on pharmaceutical drug discovery, and that IT done right – in both technology and cost – facilitates their work, which touches us all. Chris Bowman, a Louisiana school administrator, is similarly passionate in stating that his job is to help kids succeed in school. It is safe to say that he is not interested in mastering support vector machines or the like.

Another user, Randy Collica of HP, reported what I'd characterize as extreme ROI via text analytics. Where HP CRM survey analysis used to take two-and-one-half weeks for a team of six staff, Randy can now build a predictive model himself in four days. These figures mirror the experience reported at last year's Boston summit by Greg Talkington, an EDS human-resources analyst, who spoke of reducing an eight-person-week effort to half a day's work for one staffer.
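To put those two reports on a common scale, here is a back-of-the-envelope calculation in person-days. It is only a sketch: the five-day work week is my assumption (neither speaker defined one), the function names are illustrative, and the HP before-and-after tasks may not be strictly like-for-like.

```python
WORKDAYS_PER_WEEK = 5  # assumption: the reports do not define a work week


def person_days(people: float, weeks: float = 0.0, days: float = 0.0) -> float:
    """Total effort in person-days for a team working the given duration."""
    return people * (weeks * WORKDAYS_PER_WEEK + days)


def reduction_factor(before: float, after: float) -> float:
    """How many times less effort the 'after' process takes."""
    return before / after


# HP: six staff for two and a half weeks, versus one person for four days.
hp_before = person_days(people=6, weeks=2.5)  # 75 person-days
hp_after = person_days(people=1, days=4)      # 4 person-days

# EDS: eight person-weeks, versus half a day for one staffer.
eds_before = person_days(people=1, weeks=8)   # 40 person-days
eds_after = person_days(people=1, days=0.5)   # 0.5 person-days

print(f"HP:  {reduction_factor(hp_before, hp_after):.2f}x less effort")
print(f"EDS: {reduction_factor(eds_before, eds_after):.2f}x less effort")
```

Under that assumption the HP effort drops by a factor of roughly 19 and the EDS effort by a factor of 80.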

Presentations and off-line discussions confirmed that the text-analytics market is expanding strongly to include traditional BI approaches and users, what one vendor calls Unified BI. This is a trend I examined in my white paper for the upcoming North American summit, June 12-13 in Boston. So on the one hand we have traditional (for text analytics) investigatory analyses – automated discovery of needles in haystacks, typified by pharma research and national-security investigations – and on the other a more rapidly growing market segment that applies the technology to fact and sentiment extraction from online social media, surveys, and call-center notes. Vendor SPSS has gone so far as to brand this type of work Enterprise Feedback Management. We'll see if that label sticks. Another interesting point: SPSS market-strategy VP Olivier Jouve reports that in the US and Europe, 30%-40% of their new data-mining customers license text-mining tools. The figure in Japan is 70%.

On the technology front, it seems that users are still struggling to find workable approaches to cross-language text analytics, which is important if you take in source materials in multiple languages. Thoughts on the accuracy of machine translation are mixed, but there seems to be consensus that analysis in the originating language and consolidation of results is preferable to translation to a canonical language such as English. A couple of vendors with significant linguistic capabilities, Inxight and TEMIS, were represented at the summit, and I hope to learn about another vendor's approach at Basis Technology's government user conference next month in Washington.
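The "analyze in the originating language, then consolidate" pattern can be sketched in a few lines. This is a toy illustration and not any vendor's method: the lexicons, the concept labels, and the `analyze` and `consolidate` functions are all hypothetical, and real systems would use full linguistic processing per language instead of word lookup.

```python
from collections import Counter

# Hypothetical per-language lexicons: each maps surface terms in one language
# to a shared, language-neutral concept label.
LEXICONS = {
    "en": {"headache": "HEADACHE", "nausea": "NAUSEA"},
    "de": {"kopfschmerzen": "HEADACHE", "übelkeit": "NAUSEA"},
    "fr": {"céphalée": "HEADACHE", "nausée": "NAUSEA"},
}


def analyze(text: str, lang: str) -> Counter:
    """Count concept mentions using the lexicon for the document's own language."""
    lexicon = LEXICONS[lang]
    counts = Counter()
    for token in text.lower().split():
        token = token.strip(".,;:!?")
        if token in lexicon:
            counts[lexicon[token]] += 1
    return counts


def consolidate(docs) -> Counter:
    """Merge per-document results; no document is translated along the way."""
    total = Counter()
    for lang, text in docs:
        total += analyze(text, lang)
    return total


docs = [
    ("en", "Patient reports headache and nausea."),
    ("de", "Patient klagt über Kopfschmerzen."),
    ("fr", "Nausée après la prise."),
]
print(consolidate(docs))
```

The only thing shared across languages is the set of concept labels, so translation quality never enters the picture; the cost is that someone has to maintain resources for every source language.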

Lastly, it's amusing what small bits of flashiness catch an audience's attention. Last June in Boston it was David Bean of Attensity showing, in passing, real-time syntactic analysis that included part-of-speech tagging. The software – not a component users would normally see – builds a tree from text typed in a box: a small element of a larger system but eye-catching nonetheless. This year it was Henk Alles of Infolution controlling the length of a text summary – shortening and expanding it – by moving a slider control. A small point but a graphic illustration of what the technology can do.
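For readers curious how a slider can drive summary length, here is a minimal sketch using simple frequency-based extractive summarization. I don't know how Infolution's software works, so treat this as a stand-in technique: the `fraction` parameter plays the role of the slider, and the sentence splitter, scoring rule, and sample text are my own assumptions.

```python
import re
from collections import Counter


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text: str, fraction: float) -> str:
    """Keep about `fraction` of the sentences, chosen by score, in original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> float:
        # Average document-wide frequency of the words in this sentence.
        words = re.findall(r"\w+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    # Clamp so that at least one sentence, and at most all of them, survive.
    keep = max(1, min(len(sentences), round(fraction * len(sentences))))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])  # restore reading order
    return " ".join(sentences[i] for i in chosen)


TEXT = ("Text analytics turns text into data. The data feeds models. "
        "Models support decisions. Users want results.")

for slider in (0.25, 0.5, 1.0):
    print(slider, "->", summarize(TEXT, slider))
```

Because the sentences are ranked once and the slider only changes how many are kept, moving it up adds sentences to the summary without dropping any already shown.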


