Saturday, October 5, 2013

Natural Language Processing

Regardless of the source of the speech-to-text processing, be it a TV show or a cell phone call, the resulting text is the raw material for Natural Language Processing (NLP).  Better speech-to-text accuracy gives NLP cleaner input to work with.  NLP is the stage where raw data becomes information, and information becomes knowledge.

Machine NLP is an extremely interesting area of artificial intelligence.  It fascinates me.  I have looked into it from time to time over the last 10 years.  It is time to catch up on the latest and greatest in this area.

Starting with a Google News search

What the biggies are doing in this area tells something

Intel has quietly made another international acquisition in its push into artificial intelligence technology: it has bought Indisys, a Spanish startup focused on natural language recognition. The terms of the deal have not been disclosed, but it is reportedly “north” of €20 million ($26 million). It comes just two months after news broke that Intel acquired Omek, an Israeli maker of gesture-based interfaces, reportedly for about $40 million.

Siri 2.0?  Link here.  More on the question-and-answer side of NLP.

Stanford is a big player

MITRE publications in this area are here.  Enough for weeks of reading!  Like this one:

Abstract

We describe an experiment to elicit judgments on the validity of gene-mutation relations in MEDLINE abstracts via crowdsourcing. The biomedical literature contains rich information on such relations, but the correct pairings are difficult to extract automatically because a single abstract may mention multiple genes and mutations. We ran an experiment presenting candidate gene-mutation relations as Amazon Mechanical Turk HITs (human intelligence tasks). We extracted candidate mutations from a corpus of 250 MEDLINE abstracts using EMU combined with curated gene lists from NCBI. The resulting document-level annotations were projected into the abstract text to highlight mentions of genes and mutations for review. Reviewers returned results within 36 hours. Initial weighted results evaluated against a gold standard of expert curated gene-mutation relations achieved 85% accuracy, with the best reviewer achieving 91% accuracy. We expect performance to increase with further experimentation, providing a scalable approach for rapid manual curation of important biological relations.
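To get a feel for what "weighted results evaluated against a gold standard" could look like, here is a minimal sketch in Python. The abstract does not say how the reviewers were weighted, so everything here is my own assumption: the function names `weighted_vote` and `accuracy`, the per-reviewer weights, and the made-up relation ids are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(judgments, reviewer_weight):
    """Aggregate yes/no judgments per candidate relation by weighted vote.

    judgments: iterable of (relation_id, reviewer_id, is_valid) tuples.
    reviewer_weight: dict mapping reviewer_id to a non-negative weight
        (an assumed scheme; unknown reviewers default to weight 1.0).
    Returns a dict mapping relation_id to the aggregated boolean label.
    """
    score = defaultdict(float)
    for relation_id, reviewer_id, is_valid in judgments:
        w = reviewer_weight.get(reviewer_id, 1.0)
        # A "valid" vote adds the reviewer's weight, an "invalid" vote subtracts it.
        score[relation_id] += w if is_valid else -w
    # Ties (score == 0) are treated as "not valid" in this sketch.
    return {rid: s > 0 for rid, s in score.items()}


def accuracy(predicted, gold):
    """Fraction of gold-standard relations whose predicted label matches."""
    if not gold:
        raise ValueError("gold standard is empty")
    correct = sum(1 for rid, label in gold.items() if predicted.get(rid) == label)
    return correct / len(gold)


# Made-up example data: two candidate gene-mutation pairs, three reviewers.
judgments = [
    ("geneA:mut1", "r1", True),
    ("geneA:mut1", "r2", True),
    ("geneA:mut1", "r3", False),
    ("geneB:mut1", "r1", False),
    ("geneB:mut1", "r2", True),
    ("geneB:mut1", "r3", False),
]
weights = {"r1": 0.9, "r2": 0.6, "r3": 0.8}  # hypothetical reviewer weights
gold = {"geneA:mut1": True, "geneB:mut1": False}  # hypothetical expert labels

labels = weighted_vote(judgments, weights)
print(labels, accuracy(labels, gold))
```

One plausible way to pick the weights would be each reviewer's accuracy on a small set of control items with known answers, but that too is a guess on my part and not something the abstract states.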
