Automatic acquisition and classification system for agricultural network information based on Web data
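Published in | 农业工程学报 (Transactions of the Chinese Society of Agricultural Engineering) Vol. 32; no. 12; pp. 172-178 |
---|---|
Main Authors | Duan Qingling, Wei Fangfang, Zhang Lei, Xiao Xiaoyan |
Format | Journal Article |
Language | Chinese |
Published | 2016 |
Affiliations | College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Beijing Agricultural Internet of Things Engineering Technology Research Center, Beijing 100083, China |
Subjects | agriculture; text processing; information systems; information; the Internet of things |
ISSN | 1002-6819 |
DOI | 10.11975/j.issn.1002-6819.2016.12.025 |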
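Summary: | To acquire agricultural Web information quickly and efficiently, and to address information silos and information asymmetry, this study focused on three techniques: automatic acquisition and extraction of agricultural Web data, text classification based on support vector machine (SVM), and heterogeneous data collection from the Internet of Things. The automatic acquisition and classification system for agricultural network information was described using the Unified Modeling Language (UML). The system automatically crawls and shares data from agricultural websites and the Internet of Things. It provides users with online queries of agricultural news, agricultural product market prices, and supply and demand information, together with real-time environmental data monitoring and personalized information services. Application results showed an information crawling accuracy of 98.2% on the sample set of websites and a news classification accuracy of 92.5%. The system features strong real-time data collection, good user engagement, and high generality, and provides a reference for agricultural information integration and services. |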
---|---|
Bibliography: | Duan Qingling, Wei Fangfang, Zhang Lei, Xiao Xiaoyan (1. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; 2. Beijing Agricultural Internet of Things Engineering Technology Research Center, Beijing 100083, China). CN 11-2047/S. Keywords: agriculture; text processing; information systems; information; the Internet of things. The purpose of this study is to obtain agricultural web information efficiently and to provide users with personalized services by integrating agricultural resources scattered across different sites and fusing heterogeneous environmental data. The research improved several key information technologies: agricultural web data acquisition and extraction, text classification based on support vector machine (SVM), and heterogeneous data collection based on the Internet of Things (IoT). We first add high-quality target seed sites to the system and obtain each website's URL (uniform resource locator) and category information. The web |