Research on Scientific Topic Prediction from the Perspective of Knowledge Unit Reorganization


【Authors】 Liang Jiwen (梁继文); Yang Jianlin (杨建林); Wang Wei (王伟)

【Affiliations】 School of Information Management, Nanjing University; Jiangsu Key Laboratory of Data Engineering & Knowledge Service

【Abstract】 Accurate prediction of scientific topics can clarify the future direction of a discipline and inform development planning and management decisions in scientific research. This paper focuses on predicting emerging scientific topics. Taking the perspective of knowledge unit reorganization, it treats the relationship between a topic and its feature words as analogous to the relationship between a scientific concept and its knowledge units, and proposes a topic prediction method on that basis. First, an LDA (latent Dirichlet allocation) topic model is used to obtain the global topics, the feature words, and the probability matrix, and the vector space is transposed to obtain a vector for each feature word. Second, an ARIMA (autoregressive integrated moving average) model predicts the frequency of each feature word, and a vector adjustment coefficient is calculated from the prediction to obtain the feature word prediction vectors; the t-SNE (t-distributed stochastic neighbor embedding) algorithm then reduces the dimensionality of the prediction vectors, and the fuzzy C-means algorithm clusters the low-dimensional vectors into predicted topics, which realizes the reorganization of knowledge units. Finally, predicted topics that aggregate several original topics and carry a new interpretation are selected as the prediction results. An empirical study in the field of “knowledge management-knowledge organization-knowledge service” predicts new terms and related topics that had not yet appeared in existing research in the field, such as think tanks, digital humanities, and knowledge payment, and recovers the basic meaning and related research content of these emerging topics through two topic mapping modes: direct aggregation of feature words and concept integration. The results show that the proposed method can accurately predict emerging topics.
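
The steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and several parts are assumptions: the probability matrix is taken to be topics by words, the adjustment coefficient is assumed to be the predicted frequency divided by the last observed frequency (the abstract does not give its form), a random-walk-with-drift forecast (the ARIMA(0,1,0)-with-constant special case) stands in for a fitted ARIMA model, the t-SNE step is omitted, and the words, matrix values, and yearly counts are hypothetical toy data.

```python
import math

def transpose(matrix):
    """Turn a topics x words matrix into one vector per word."""
    return [list(column) for column in zip(*matrix)]

def drift_forecast(series):
    """One-step forecast of a random walk with drift (stand-in for ARIMA)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return series[-1] + sum(diffs) / len(diffs)

def adjust(vector, series):
    """Scale a word vector by predicted / last observed frequency.

    The ratio form is an assumption; it also assumes series[-1] > 0.
    """
    coeff = max(drift_forecast(series), 0.0) / series[-1]
    return [coeff * v for v in vector]

def memberships(points, centers, m):
    """Fuzzy memberships: u_ik proportional to d_ik ** (-2 / (m - 1))."""
    rows = []
    for p in points:
        weights = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Plain fuzzy C-means with caller-supplied initial centers."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = []
        for k in range(len(u[0])):
            # Each center is the mean of all points weighted by u_ik ** m.
            w = [row[k] ** m for row in u]
            centers.append([sum(wi * p[j] for wi, p in zip(w, points)) / sum(w)
                            for j in range(dim)])
    return centers, memberships(points, centers, m)

# Hypothetical toy data: 2 topics x 4 feature words, 4 years of word counts.
words = ["ontology", "taxonomy", "library", "archive"]
topic_word = [[0.40, 0.35, 0.05, 0.20],
              [0.05, 0.10, 0.45, 0.40]]
yearly_counts = [[10, 12, 14, 16], [8, 8, 8, 8], [20, 18, 16, 14], [5, 6, 7, 8]]

word_vectors = transpose(topic_word)
predicted = [adjust(v, s) for v, s in zip(word_vectors, yearly_counts)]
centers, u = fuzzy_c_means(predicted, centers=[predicted[0], predicted[2]])
labels = [row.index(max(row)) for row in u]
print(dict(zip(words, labels)))
```

Because fuzzy C-means returns a membership degree for every word in every cluster, a word can contribute to more than one predicted topic, which fits the idea of a predicted topic aggregating feature words from several original topics. The hard labels printed here are only a convenience for reading the toy result.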

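【Funding】 National Social Science Fund of China, Key Project “Research on Domain Knowledge Processing and Organization Models in the Big Data Environment” (20ATQ006)
  • 【Source】 Journal of the China Society for Scientific and Technical Information (情报学报), 2023, Issue 05
  • 【Classification No.】 G353.1
  • 【Downloads】 99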