Text Data Compression and Storage Algorithm Based on a Time Series Model

【Authors】 WENG Yuan-han; LI Nan

【Corresponding author】 LI Nan

【Affiliations】 College of Economics and Management, Nanjing University of Aeronautics and Astronautics; School of Science, Nanjing Tech University (the record's English author line gives the second affiliation as "School of Economics and Management, Nanjing Technology University")

【Abstract】 To reduce the volume of historical text data and improve the efficiency of text data compression and storage, a text data compression and storage algorithm based on a time series model is proposed. Wavelet threshold denoising is used to estimate and remove errors and noise in the text data. From the perspective of text data features, the features are described in detail and the combination and inheritance relationships between feature types are defined in order to build the time series model. The preprocessed text data are then converted by the time series model into binary-coded bytes of similar structure, the redundant parts of the result are compressed with an XOR operation, and the compressed data are stored in the corresponding database, which completes the compression and storage of the text data. Simulation results show that the proposed algorithm effectively improves compression performance and yields better text data compression and storage results.
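The abstract names two concrete building blocks: wavelet threshold denoising, and XOR-based removal of redundancy between structurally similar binary records. The sketch below illustrates both in generic form; it is not the authors' implementation. The one-level Haar transform, the soft-threshold rule, the fixed-width records, the zero-run encoding after the XOR step, and every function name (`haar_soft_denoise`, `compress`, `decompress`, and so on) are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def haar_soft_denoise(signal, threshold):
    """One-level Haar transform, soft-threshold the details, then invert."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    # Soft thresholding: shrink every detail coefficient toward zero.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, shrunk):
        out.extend(((a + d) / root2, (a - d) / root2))
    return out


def xor_bytes(left, right):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(left, right))


def rle_zeros(data):
    """Encode each run of zero bytes as (0x00, run_length); copy other bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def unrle_zeros(data):
    """Inverse of rle_zeros."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            out += bytes(data[i + 1])  # bytes(n) yields n zero bytes
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def compress(records):
    """XOR each fixed-width record with its predecessor, then pack zero runs."""
    if not records:
        raise ValueError("no records to compress")
    width = len(records[0])
    if any(len(r) != width for r in records):
        raise ValueError("all records must have the same width")
    deltas = [records[0]]
    deltas += [xor_bytes(p, c) for p, c in zip(records, records[1:])]
    return rle_zeros(b"".join(deltas))


def decompress(blob, width):
    """Rebuild the original records from the packed XOR deltas."""
    raw = unrle_zeros(blob)
    records = []
    for i in range(0, len(raw), width):
        delta = raw[i:i + width]
        records.append(delta if not records else xor_bytes(records[-1], delta))
    return records


# Hypothetical fixed-width log lines that differ in only a few positions.
sample = [
    b"2023-07-01 temp=21.5",
    b"2023-07-01 temp=21.7",
    b"2023-07-02 temp=21.7",
]
packed = compress(sample)
assert decompress(packed, len(sample[0])) == sample
assert len(packed) < sum(len(r) for r in sample)
```

Because consecutive records share most of their bytes, the XOR deltas are dominated by zeros, which is what makes the subsequent packing step effective. A real system would also need a rule for records of differing length and a persistence layer for the packed output, neither of which the abstract describes.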

【Funding】 National Natural Science Foundation of China (71473119)
  • 【Source】 Journal of Jilin University (Engineering and Technology Edition), 2023, Issue 07
  • 【Classification codes】 O211.61; TP391.1