Enterprise Search Engine Based on Keyword Selected Split-word Algorithm
【Abstract】 As computer technology and database research have advanced, digital storage has become the primary way to keep data, and large-scale search engines let users locate the information they need quickly. Efficient search engines for enterprise use have therefore become an important research topic. This paper proposes a search engine mechanism based on Keyword Selection (KWS), designed around the data structures of intelligent management systems for power grids and large power plants. It builds a double-character hash dictionary, segments text with double-character coupling-degree disambiguation, and filters the segmentation results semantically. The filtered results are then passed to Sphinx and a MySQL database for full-text search and are cached, which improves both search speed and search accuracy.
【Keywords】 Enterprise Search Engine; Hash Structure; Coupling Degree of Double Characters; Cache
- 【Source】 Microcomputer Applications (微型电脑应用), 2010, Issue 07
- 【Classification No.】 TP391.3
- 【Downloads】 155
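
The abstract names a double-character hash dictionary as the basis of the segmentation step but does not give its layout. The sketch below shows one common way such a dictionary can be organized: words are bucketed by their first two characters, and lookup uses forward maximum matching. The names `DoubleCharHashDict`, `longest_match`, and `segment` are hypothetical, forward maximum matching is a standard technique substituted here for illustration, and the paper's coupling-degree disambiguation and semantic filtering are not reproduced because the abstract does not specify them.

```python
from collections import defaultdict


class DoubleCharHashDict:
    """Hypothetical dictionary indexed by the first two characters of each word."""

    def __init__(self, words):
        # two-character prefix -> set of suffixes that complete a known word
        self.buckets = defaultdict(set)
        self.max_len = 2
        for w in words:
            if len(w) < 2:
                continue  # single characters are emitted as fallback tokens
            self.buckets[w[:2]].add(w[2:])
            self.max_len = max(self.max_len, len(w))

    def longest_match(self, text, start):
        """Return the longest dictionary word beginning at `start`, or None."""
        suffixes = self.buckets.get(text[start:start + 2])
        if not suffixes:
            return None
        limit = min(self.max_len, len(text) - start)
        # Try the longest candidate first, down to the bare two-character word.
        for length in range(limit, 1, -1):
            if text[start + 2:start + length] in suffixes:
                return text[start:start + length]
        return None


def segment(text, dictionary):
    """Forward maximum matching; unknown characters become single tokens."""
    tokens, i = [], 0
    while i < len(text):
        word = dictionary.longest_match(text, i) or text[i]
        tokens.append(word)
        i += len(word)
    return tokens


if __name__ == "__main__":
    d = DoubleCharHashDict(["发电", "发电厂", "电网", "智能", "管理", "系统"])
    print(segment("发电厂智能管理系统", d))  # ['发电厂', '智能', '管理', '系统']
```

Bucketing on two characters keeps each lookup to one hash probe plus a handful of suffix checks, which is the usual motivation for this layout; a disambiguation pass such as the one the paper describes would sit on top of `segment` to resolve overlapping candidates.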