Video Action Recognition Via Neural Architecture Searching
Deep neural networks have achieved great success in video analysis and understanding. However, designing a high-performance neural architecture requires substantial effort and expertise. In this paper, we make the first attempt to let an algorithm automatically design neural networks for video action recognition tasks. Specifically, a spatio-temporal network is developed in a differentiable space modeled by a directed acyclic graph, so that a gradient-based strategy can be used to search for an optimal architecture. Nonetheless, the search is computationally expensive, since evaluating each architecture candidate is still costly. To alleviate this issue, for the video input, we introduce a temporal segment approach that reduces the computational cost without losing global video information. For the architecture, we search in an efficient space built from pseudo 3D operators. Experiments show that, under the training-from-scratch protocol on the challenging UCF101 dataset, our architecture outperforms popular neural architectures, surprisingly with only around one percent of the parameters of its manually designed counterparts.
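The two cost-saving ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sample_segment_indices`, `softmax`, `mixed_op`), the centre-frame rule, and the toy scalar operators are all assumptions made for illustration. The first function samples one frame from each of several equal spans so that the whole video is covered at a fixed cost. The second shows the usual continuous relaxation behind gradient-based search, where each edge of the directed acyclic graph outputs a softmax-weighted sum of its candidate operators.

```python
import math
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of `num_segments` equal spans of a video.

    With rng=None the centre frame of each span is used (deterministic);
    otherwise one frame is drawn uniformly at random inside each span.
    Hypothetical helper, not taken from the paper.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    indices = []
    for k in range(num_segments):
        start = (k * num_frames) // num_segments
        end = ((k + 1) * num_frames) // num_segments  # exclusive bound
        if rng is None:
            indices.append((start + end - 1) // 2)
        else:
            indices.append(rng.randrange(start, end))
    return indices


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, candidate_ops, alphas):
    """Relaxed edge: softmax(alphas)-weighted sum of every candidate operator.

    In a real search the candidates would be tensor operators (for example
    pseudo 3D convolutions) and `alphas` would be learned by gradient descent;
    here they are plain Python callables and floats (an assumption).
    """
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, candidate_ops))


if __name__ == "__main__":
    # Four frames spread over a 100-frame clip.
    print(sample_segment_indices(100, 4))
    # Three toy operators on a scalar: identity, zero, and doubling.
    ops = [lambda v: v, lambda v: 0.0, lambda v: 2.0 * v]
    print(mixed_op(3.0, ops, [0.0, 0.0, 0.0]))
```

With the centre-frame rule, `sample_segment_indices(100, 4)` returns `[12, 37, 62, 87]`. After a search of this kind, a discrete architecture is typically read off by keeping the operator with the largest weight on each edge.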
Authors

Wei Peng
Xiaopeng Hong
Guoying Zhao
Other
Sample Sizes (N=): None
Inserted: 07/10/19 06:04PM
Words Total: 3,796
Words Unique: 1,384
Tweets
arxiv_cscv: Video Action Recognition Via Neural Architecture Searching https://t.co/sb8UAy4b0L
arxiv_cscv: Video Action Recognition Via Neural Architecture Searching https://t.co/sb8UAxMA9d
arxivml: "Video Action Recognition Via Neural Architecture Searching", Wei Peng, Xiaopeng Hong, Guoying Zhao https://t.co/JOfeDocDnB
Memoirs: Video Action Recognition Via Neural Architecture Searching. https://t.co/f2CVqCLqLa
arxiv_cs_LG: Video Action Recognition Via Neural Architecture Searching. Wei Peng, Xiaopeng Hong, and Guoying Zhao https://t.co/TbZgO8EQHK
BrundageBot: Video Action Recognition Via Neural Architecture Searching. Wei Peng, Xiaopeng Hong, and Guoying Zhao https://t.co/63TuR66EEs