Calais, a project sponsored by Reuters, offers a few handy plugins that use its API to auto-tag all the posts in your blog (see our coverage). It goes through your content, extracts the relevant keywords, and adds them as tags in your CMS.
But Open Calais isn’t open source. Here are a few open source tools you can use to extract key terms from text. As far as I know, none have been turned into CMS plugins… yet.
Tagger
Tagger is a fairly new Python project by Alessandro Presta. Right now it only works in English.
Via the comments on Hacker News, I found a similar Ruby-based project…
Phrasie
Phrasie is a very simple Ruby-based key term extractor. It turns out Phrasie is based on a different Python library called…
Topia’s Term Extractor
Topia’s Term Extractor is an older Python package for extracting key terms, written by Stephan Richter, Russ Ferriday and the Zope Community.
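To give a rough feel for what a key term extractor does at its very simplest, here is a minimal sketch in plain Python. It is not how Tagger, Phrasie, or Topia's Term Extractor work internally (those are more sophisticated); it just counts how often each non-trivial word appears. The function name `extract_terms`, its parameters, and the tiny stopword list are all made up for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stopword list (an assumption for this sketch;
# real extractors use far larger lists or part-of-speech tagging).
STOPWORDS = {
    "the", "and", "for", "that", "this", "with", "you", "your",
    "are", "was", "but", "not", "have", "has", "from", "its",
}

def extract_terms(text, top_n=5, min_length=3):
    """Return the top_n most frequent candidate terms as (term, count) pairs."""
    # Lowercase the text and pull out runs of letters (and apostrophes).
    words = [w.strip("'") for w in re.findall(r"[a-z']+", text.lower())]
    # Drop very short words and common stopwords.
    candidates = [
        w for w in words
        if len(w) >= min_length and w not in STOPWORDS
    ]
    # Rank the remaining words by raw frequency.
    return Counter(candidates).most_common(top_n)

if __name__ == "__main__":
    post = "Python is great. Python tagging tools tag posts. Tagging helps."
    for term, count in extract_terms(post, top_n=2):
        print(term, count)
```

The (term, count) pairs it returns could then be fed to whatever tagging mechanism your CMS exposes. A raw frequency count like this misses multi-word phrases and treats "tag" and "tagging" as different terms, which is exactly the kind of problem the libraries above try to handle.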
See Also
Overview of Text Extraction Algorithms
Image by Andrew Mason