基于跨模态的图像浏览与推荐关键技术研究

Research on Key Techniques for Cross-modal Image Browsing and Recommendation

【Author】 李杨 (Li Yang)

【Advisor】 张玥杰 (Zhang Yuejie)

【Author information】 Fudan University, Computer Application Technology, 2014, Master's degree

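【Abstract (translated from Chinese)】 With the rapid development of the Internet and continuing improvements in image acquisition technology, the rate at which images are produced online has grown explosively. How to better organize, manage, and retrieve image data has therefore become a widely discussed problem. As image data grows ever faster, browsing and recommendation algorithms based purely on semantic information or purely on visual information cannot solve this problem well. This thesis therefore focuses on methods for image browsing and recommendation based on cross-modal information, and covers three main aspects: construction of an image cross-modal association network; image clustering based on cross-modal information; and personalized image recommendation based on cross-modal information. For the cross-modal association network, the semantic association network and the visual association network are the two main components. The semantic association network combines within-level association relations, based on the co-occurrence probability of concepts in a multi-label image dataset, with the inherent hierarchical relations defined by WordNet. The visual association network is built by extracting visual features from images to obtain the visual associations between image sets. The two networks are then fused to form the cross-modal association network. For image clustering based on cross-modal information, the thesis first considers feature extraction for visual and semantic information: visual features are obtained with LLC linear coding of SIFT features and SPM hierarchical feature extraction; semantic features are obtained by defining a co-occurrence association network that depends on the dataset's annotations and using it to apply TF-IDF-optimized encoding to each image's semantic annotations. CCA is then used to fuse the two feature sets of different dimensionality across modalities, providing cross-modal features for image clustering. For personalized image recommendation based on cross-modal information, the thesis describes the design framework of the algorithm and the design of each part. First comes multi-modal association mining over the cross-modal association network, which relies on that network to establish an association mining mechanism for multi-modal information. Next comes multi-modal association fusion based on a user personalization model: for the candidate image set produced by association mining, a cross-modal graph model is built from semantic and visual associations, and the user personalization model is introduced during graph construction. Last comes an association recommendation algorithm based on random walk, which establishes in the cross-modal graph model a browsing and recommendation mode that imitates human associative thinking. The aim is for the algorithm to capture the user's points of interest in images more accurately and to give users a better image browsing and recommendation experience.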

【Abstract】 With the rapid development of the Internet and the continuous improvement of image capturing technology, the amount of available image data has grown explosively. How to better organize, manage, and acquire image data has therefore become an important problem. Image collections keep growing faster, yet neither purely semantic nor purely visual algorithms solve this problem completely. This thesis therefore focuses on image browsing and recommendation based on cross-modal information, which includes both semantic and visual information. It involves three parts: the construction of an image cross-modal association network; image clustering based on cross-modal information; and personalized image recommendation based on cross-modal information.

The image cross-modal association network has two components: a semantic association network and a visual association network. For the semantic network, we take the co-occurrence probability between concepts in image data with multiple annotations as the flat inter-concept semantic association, and the inherent correlation defined by WordNet as the hierarchical inter-concept semantic association. For the visual network, we obtain the visual relationships between images by extracting features from their visual information. Finally, we form the cross-modal association network by fusing the visual and semantic networks.

For image clustering based on cross-modal information, we first need feature extraction methods for visual and semantic information. Visual features are extracted with the LLC and SPM methods based on SIFT features. Semantic features are obtained by defining a semantic co-occurrence association network and using it together with TF-IDF to optimize the encoding of annotations. We then use a CCA-based method to fuse the different features into a cross-modal feature, and finally cluster images on that cross-modal feature.

For personalized image recommendation based on cross-modal information, the thesis designs an algorithm framework with three parts. The first is cross-modal association mining, which relies on the cross-modal association network to establish a multi-modal association mining mechanism. The second is the fusion of multi-modal information based on a user personalization model, which uses the image candidates generated by association mining to construct the cross-modal association graph model. The third is a recommendation algorithm based on random walk, which establishes a browsing and recommendation model in the cross-modal graph that simulates the human associative thinking mechanism. The purpose is to let the algorithm capture users' interest in images more accurately and to provide a better image browsing experience.
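
The abstract does not give the formulation of the random-walk recommendation step, so the following is only a minimal sketch of how such a step could look, not the thesis's actual algorithm. It assumes that the semantic and visual association graphs are given as image-by-image similarity matrices, that they are fused by a simple weighted sum, and that the user personalization model reduces to a preference vector used as the restart distribution of a random walk with restart. The function names (`fuse_graphs`, `random_walk_with_restart`), the parameters `alpha` and `restart`, and the toy similarity values are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def fuse_graphs(semantic: Matrix, visual: Matrix, alpha: float = 0.5) -> Matrix:
    """Blend two image-by-image similarity matrices with a weighted sum.

    ASSUMPTION: linear blending is one possible fusion rule, chosen here
    only for illustration.
    """
    n = len(semantic)
    return [
        [alpha * semantic[i][j] + (1.0 - alpha) * visual[i][j] for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(
    weights: Matrix,
    preference: List[float],
    restart: float = 0.3,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> List[float]:
    """Score every node by a random walk that restarts at the user's preferences."""
    n = len(weights)
    total = sum(preference)
    pref = [p / total for p in preference]  # restart distribution, sums to 1

    # Row-normalise edge weights into transition probabilities.
    # A node without outgoing edges jumps back to the preference distribution.
    transition = []
    for row in weights:
        row_sum = sum(row)
        transition.append([w / row_sum for w in row] if row_sum > 0 else pref[:])

    scores = pref[:]
    for _ in range(max_iter):
        updated = [
            restart * pref[j]
            + (1.0 - restart) * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:  # stop once the scores no longer move
            break
    return scores


# Toy data (hypothetical): four images with symmetric similarities.
semantic = [
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.3, 0.1],
    [0.1, 0.3, 0.0, 0.6],
    [0.0, 0.1, 0.6, 0.0],
]
visual = [
    [0.0, 0.6, 0.3, 0.1],
    [0.6, 0.0, 0.2, 0.2],
    [0.3, 0.2, 0.0, 0.4],
    [0.1, 0.2, 0.4, 0.0],
]

fused = fuse_graphs(semantic, visual, alpha=0.5)
# The user is currently viewing image 0, so all restart mass sits there.
scores = random_walk_with_restart(fused, preference=[1.0, 0.0, 0.0, 0.0])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this sketch, images that are strongly linked to the currently viewed image in either modality receive higher scores, which is one way to approximate the "associative" browsing behaviour the abstract describes. A larger `restart` keeps recommendations closer to the user's current interest, while a smaller one lets the walk drift further through the graph.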

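  • 【Online publication contributor】 Fudan University
  • 【Online publication year and issue】 2016, Issue 03
  • 【Classification number】 TP391.41; TP391.3
  • 【Citation count】 2
  • 【Download count】 173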