Because Web short texts are brief and share few words with one another, many out-of-vocabulary (OOV) words appear, which makes text classification more difficult. To solve this problem, a new general framework based on word embedding similarity is proposed. First, word embeddings are learned from unlabeled data with an unsupervised method. Second, each OOV word is extended with similar words from the training data, found by computing the similarities between word embeddings. The comparis...
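
The second step (extending OOV words with similar training-vocabulary words) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes cosine similarity as the similarity measure, and the function name `extend_oov`, the parameters `top_k` and `min_sim`, and the toy embedding values are all hypothetical choices made for illustration. It also assumes that the OOV word is kept and its neighbours are appended, which is only one possible reading of "extend".

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def extend_oov(tokens, train_vocab, embeddings, top_k=1, min_sim=0.5):
    """Append to each OOV token its most similar training-vocabulary words.

    tokens      -- list of words in one short text
    train_vocab -- set of words seen in the labeled training data
    embeddings  -- dict mapping word -> vector, learned from unlabeled data
    top_k       -- assumed cap on the number of neighbours added per OOV word
    min_sim     -- assumed similarity threshold below which nothing is added
    """
    extended = []
    for tok in tokens:
        extended.append(tok)
        # In-vocabulary words need no extension; words without an
        # embedding cannot be compared, so they are left unchanged.
        if tok in train_vocab or tok not in embeddings:
            continue
        scored = [
            (cosine(embeddings[tok], embeddings[w]), w)
            for w in train_vocab
            if w in embeddings
        ]
        scored = [(s, w) for s, w in scored if s >= min_sim]
        # Highest similarity first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        extended.extend(w for _, w in scored[:top_k])
    return extended


# Toy, hand-written embeddings (illustrative values only, not real data).
embeddings = {
    "film": [0.9, 0.1, 0.0],
    "movie": [0.8, 0.2, 0.1],
    "goal": [0.0, 0.9, 0.3],
    "great": [0.3, 0.3, 0.9],
}
train_vocab = {"movie", "goal", "great"}

# "film" is OOV with respect to the training vocabulary.
extended = extend_oov(["great", "film"], train_vocab, embeddings)
print(extended)  # ['great', 'film', 'movie']
```

In this toy example "film" does not occur in the training vocabulary, so its nearest in-vocabulary neighbour "movie" is appended, giving a classifier trained only on the labeled data a feature it already knows.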