Processing Language the 'Natural' Way - Using OpenAmplify
Sep 04, 2009 Published in technology
Keywords: analyse any textual information, Natural Language Processing (NLP), openAmplify
Why NLP
We tested openAmplify to understand how Natural Language Processing (NLP) works in practice. This post also gives prospective users a picture of the software as a whole.

OpenAmplify as an NLP Tool 
“Amplify is a web service based upon patented linguistic technology. This technology, sometimes described as Natural Language Processing (NLP), permits Amplify to automatically analyse any textual information and surface its meaning: not just simple things like topics, but the emotions, styles, actions, intentions, and even clues as to the demographics of the author or intended audience. We call the resulting information signals, because they are indicative and predictive of attitudes, behaviour and characteristics.”

OpenAmplify is Linux-based, written in C++, and runs on Amazon’s EC2 platform. The tool is delivered as an API and web service, which means we do not need any additional infrastructure and updates are applied automatically. As the tool improves over time, so do we. It is also free to use within a limit of 1000 requests per day.

To process and analyse a given text, we submit the text, along with other parameters, to Amplify’s API URL.

If you specify an absolute URL (e.g. "http://en.wikipedia.org/wiki/Semantic_web"), Amplify will scrape the text from that page and analyse the content. However, be aware that the scraper may pick up ads, navigation and other visible text on the page that you don't really want analysed.
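As a rough illustration of such a request, the sketch below assembles a query URL in Python. The endpoint and the parameter names (apiKey, inputText, analysis, outputFormat) are placeholders assumed for this example, not names confirmed by the Amplify documentation, so check them against the official API reference before use.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/amplify"


def build_request_url(api_key, input_text, analysis="all", output_format=None):
    """Build a request URL for a text (or an absolute page URL) to analyse.

    All parameter names and the default analysis value are assumptions.
    """
    params = {
        "apiKey": api_key,        # assumed name for the account key
        "inputText": input_text,  # raw text, or an absolute URL to scrape
        "analysis": analysis,     # which part of the analysis to return
    }
    if output_format is not None:
        params["outputFormat"] = output_format
    # urlencode percent-escapes the values so they are safe in a query string.
    return f"{API_URL}?{urlencode(params)}"


url = build_request_url(
    "YOUR_API_KEY",
    "http://en.wikipedia.org/wiki/Semantic_web",
    analysis="topics",
)
print(url)
```

Leaving output_format unset omits the parameter entirely, so the service's default format would apply.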

The API returns a linguistic analysis of the text submitted. Data is returned in whichever of the available formats you set in the output format parameter. If you don't specify a format, the results are returned as XML. Which data are returned depends on the value you set in the analysis parameter of your request.
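A client therefore has to parse the response according to the format it asked for, falling back to XML when none was given. Here is a minimal sketch of that dispatch. It assumes JSON is one of the available formats, and the sample payload is made up for illustration, not a real Amplify response.

```python
import json
import xml.etree.ElementTree as ET


def parse_amplify_response(body, output_format=None):
    """Parse a response body according to the requested output format.

    Falls back to XML when no format was requested. The format labels
    "xml" and "json" are assumptions for this sketch.
    """
    fmt = (output_format or "xml").lower()
    if fmt == "xml":
        return ET.fromstring(body)
    if fmt == "json":
        return json.loads(body)
    raise ValueError(f"unsupported output format: {output_format!r}")


# Made-up payload; the element names are hypothetical.
sample = "<response><topic>cricket</topic></response>"
root = parse_amplify_response(sample)
print(root.find("topic").text)
```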

Several aspects of the text, such as Topics, Actions, Demographics and Styles, are considered. Topics gives the topical content of a text. It also indicates whether each topic is described in positive or negative terms ("Polarity"), that is, how the topic is perceived in the text. For example, “Cricket is a good sport” returns a positive polarity of +0.5, whereas “Cricket is a very boring game” returns a negative polarity of -1.0.

Style describes the stylistic characteristics of the input text, such as whether it is flamboyant. The API returns a measure of the degree to which the style used in the text is elaborate or flowery.

Demographics gives estimates of factors such as the age range of the author or audience of the text and the level of education of the author, among many others.
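To make the polarity scores quoted for Topics concrete, the sketch below maps a score to a label by its sign. The function name and the treatment of zero as "neutral" are assumptions for this example. The two scores are the ones quoted in this post, hard-coded rather than fetched from the API.

```python
def polarity_label(score):
    """Map a polarity score to a coarse label based on its sign."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumption: a score of zero means no clear polarity


# Scores as quoted in the post, not live API results.
examples = [
    ("Cricket is a good sport", 0.5),
    ("Cricket is a very boring game", -1.0),
]
for sentence, score in examples:
    print(f"{score:+.1f} {polarity_label(score)}: {sentence}")
```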

NLP - Strong Essentials
The tool is strong in its fundamentals. It gives a fair picture of the kind of text being analysed, the polarity of its topics, and the age of its authors and audience, along with the keyword relevance and polarity that a textual search engine needs. Its ability to extract locations could also help in automatic geo-coding of data. Deciphering writing styles could help in building an editorial notification system, reducing the time editors need to spend correcting text. The tool also helps gather a lot of information through data mining, and offers a new set of capabilities for managing, creating and monetising content.



Edited by Shobha Sivakumar
By Sameer Mehra
1 comment(s)
 
Alexandra Stålnacke - Senior Linguist OpenAmplify wrote at 04:38:12 PM on Sep 18, 2009
Thanks for a really nice post about OpenAmplify, Sameer!
(c) 2009 woodappleTM