Logistic Regression with Regularization Based on Network Structure
【Abstract】 Logistic regression is a widely used classification model. However, high-dimensional classification tasks are increasingly common in practice and pose a serious challenge to it. Regularization is an effective way to meet this challenge. Many existing regularized logistic regression models apply the L1-norm penalty directly as the regularization term and ignore the complex relationships among features. Other work designs the penalty from group information about the features, but assumes that this group information is given in advance. This paper mines latent patterns in the feature data from a network perspective and, on that basis, proposes a regularized logistic regression model based on network structure. First, the feature data are described as a network, yielding a feature network. Second, the feature network is observed and analyzed from the perspective of network science, and a penalty function is designed from these observations. Third, this penalty function is used as the regularization term to obtain a network-structured Lasso logistic regression. Finally, the solution procedure for the model is derived by combining Nesterov's accelerated proximal gradient method with Moreau-Yosida regularization. Experiments on real datasets show that the proposed network-structured Lasso logistic regression performs very well, which suggests that observing and analyzing feature data from a network perspective is a promising direction for research on regularized models.
【Keywords】 Regularized penalty term; Logistic regression; Network structure; Feature selection; Proximal gradient method
- 【Source】 计算机科学 (Computer Science), 2021, Issue 07
- 【Classification code】 TP311.13; O212.1
- 【Citation count】 7
- 【Download count】 431
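
The abstract names Nesterov's accelerated proximal gradient method as the core of the solver. The following is a minimal sketch of that general scheme, not the paper's algorithm: the abstract does not define the network-structured penalty, so the sketch assumes a plain L1 (Lasso) penalty, whose proximal operator is elementwise soft-thresholding. In the paper, that proximal step would instead come from the network-structured penalty via Moreau-Yosida regularization. All function names and parameters here (`soft_threshold`, `logistic_loss_grad`, `fista_l1_logreg`, `lam`, `n_iter`) are hypothetical, and labels are assumed to be in {0, 1}.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def logistic_loss_grad(w, X, y):
    """Mean logistic loss and its gradient, for labels y in {0, 1}."""
    z = X @ w
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    p = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    grad = X.T @ (p - y) / X.shape[0]
    return loss, grad


def fista_l1_logreg(X, y, lam, n_iter=500):
    """Accelerated proximal gradient for L1-regularized logistic regression.

    Assumption: the L1 penalty stands in for the paper's network-structured
    penalty, which the abstract does not specify.
    """
    n, d = X.shape
    # The gradient of the mean logistic loss is Lipschitz with
    # constant ||X||_2^2 / (4n), which gives a safe fixed step size.
    lipschitz = np.linalg.norm(X, 2) ** 2 / (4.0 * n)
    step = 1.0 / lipschitz
    w = np.zeros(d)
    v = w.copy()  # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        _, grad = logistic_loss_grad(v, X, y)
        # Gradient step on the smooth loss, then proximal step on the penalty.
        w_next = soft_threshold(v - step * grad, step * lam)
        # Nesterov momentum update.
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        v = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w
```

Calling `fista_l1_logreg(X, y, lam=0.05)` on a feature matrix `X` of shape `(n_samples, n_features)` returns a weight vector in which larger `lam` drives more coefficients to exactly zero. Swapping in a different penalty only requires replacing `soft_threshold` with that penalty's proximal operator, which is the design point that makes the proximal gradient framework a natural fit for structured regularizers.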