Category: LINUX

2008-05-27 21:29:35

Extracting content from text remains an important research problem in information processing and management. Approaches to capturing the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory. As the volume of digitized textual media continues to grow, so does the need for robust, scalable indexing and search software that meets a variety of user needs. Extracting or creating knowledge from text requires systematic, reliable processing that can be codified and adapted to changing needs and environments. This book will draw upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It will address document identification, document clustering and categorization, text cleaning, and the visualization of semantic models of text.
