Video action recognition is an important part of intelligent video analysis. In recent years, deep learning methods, especially the two-stream convolutional neural network, have achieved state-of-the-art performance. However, most methods simply sample frames uniformly, which may lose the information that falls within the sampling intervals. We propose a segmentation method and a key-frame extraction method for video action recognition, and combine them with a multi-temporal-scale two-stream network. Our fram...