Top 10 arXiv Papers Today in Computer Vision and Pattern Recognition


0.0 Mikeys
#1. MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution
Dong Gong, Mingkui Tan, Qinfeng Shi, Anton van den Hengel, Yanning Zhang
Total variation (TV) regularization has proven effective for a range of computer vision tasks through its preferential weighting of sharp image edges. Existing TV-based methods, however, often suffer from the over-smoothing issue and solution bias caused by the homogeneous penalization. In this paper, we consider addressing these issues by applying inhomogeneous regularization on different image components. We formulate the inhomogeneous TV minimization problem as a convex quadratic constrained linear programming problem. Relying on this new model, we propose a matching pursuit based total variation minimization method (MPTV), specifically for image deconvolution. The proposed MPTV method is essentially a cutting-plane method, which iteratively activates a subset of nonzero image gradients, and then solves a subproblem focusing on those activated gradients only. Compared to existing methods, MPTV is less sensitive to the choice of the trade-off parameter between data fitting and regularization. Moreover, the inhomogeneity of MPTV...
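For readers unfamiliar with the regularizer this abstract builds on, the following minimal Python sketch computes the anisotropic total variation of a small grayscale image, that is, the sum of absolute differences between horizontally and vertically adjacent pixels. It illustrates only the standard homogeneous TV penalty, not the MPTV method itself; the function name and the toy images are hypothetical.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D list of pixel values."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            # Horizontal neighbour difference.
            if c + 1 < cols:
                tv += abs(image[r][c + 1] - image[r][c])
            # Vertical neighbour difference.
            if r + 1 < rows:
                tv += abs(image[r + 1][c] - image[r][c])
    return tv


# Hypothetical toy images: constant, one sharp edge, and noise-like.
flat = [[1, 1, 1], [1, 1, 1]]
edge = [[0, 0, 1], [0, 0, 1]]
noisy = [[0, 1, 0], [1, 0, 1]]

print(total_variation(flat), total_variation(edge), total_variation(noisy))
```

A single clean edge contributes far less to the penalty than pixel-level oscillation, which is why TV minimization tends to preserve edges while suppressing noise.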
Tweets
arxivml: "MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution", Dong Gong, Mingkui Tan, Qinfen… https://t.co/kUmxE21QpM
arxiv_cscv: MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution https://t.co/s0AtHVYucD
arxiv_cscv: MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution https://t.co/s0AtHWg54b
ComputerPapers: MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution. https://t.co/dvmBMHMYmM
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 14615
Unique Words: 3335

0.0 Mikeys
#2. Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discrete Cosine Transform
Md Tahmid Hossain, Shyh Wei Teng, Dengsheng Zhang, Suryani Lim, Guojun Lu
State-of-the-art CNN models for image classification are found to be highly vulnerable to image quality degradation. It is observed that even a small amount of distortion introduced in an image in the form of noise or blur severely hampers the performance of these CNN architectures. Most of the work in the literature strives to mitigate this problem simply by fine-tuning a pre-trained model on mutually exclusive or union sets of distorted training data. This iterative fine-tuning process with all possible types of distortion is exhaustive and struggles to handle unseen distortions. In this work, we propose DCT-Net, a Discrete Cosine Transform based module integrated into a deep network which is built on top of VGG16. The proposed DCT module operates during training and discards input information based on DCT coefficients which represent the contribution of sampling frequencies. We show that this approach enables the network to be trained in one go without having to generate training data with different types of expected...
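The idea of discarding input information in the DCT domain can be sketched in a few lines of Python. The abstract does not say which coefficients DCT-Net drops or by what rule, so the low-pass truncation below, the 1D signal, and all function names are assumptions made purely for illustration.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II of a 1D sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + (2.0 / n)
        * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k) for k in range(1, n))
        for i in range(n)
    ]


def keep_low_frequencies(signal, keep):
    """Zero every DCT coefficient with index >= keep, then reconstruct.

    Assumption: a plain low-pass rule, chosen only for illustration.
    """
    coeffs = dct2(signal)
    truncated = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
    return idct2(truncated)


# Hypothetical image row with one outlier pixel.
row = [10.0, 12.0, 9.0, 11.0, 30.0, 10.0, 11.0, 9.0]
smoothed = keep_low_frequencies(row, keep=3)
print([round(v, 2) for v in smoothed])
```

Keeping every coefficient reproduces the input, while keeping only the first one collapses the row to its mean; intermediate values of `keep` trade detail for robustness to high-frequency distortion.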
Tweets
BrundageBot: Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discrete Cosine Transform. Md Tahmid Hossain, Shyh Wei Teng, Dengsheng Zhang, Suryani Lim, and Guojun Lu https://t.co/sQHkGfIqXT
arxiv_cscv: Distortion Robust Image Classification using Deep Convolutional Neural Network with Discrete Cosine Transform https://t.co/foFnanoM2Y
nmfeeds: [CV] https://t.co/EVAczcw2dF Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discre...
nmfeeds: [O] https://t.co/EVAczcw2dF Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discret...
arxiv_cscv: Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discrete Cosine Transform https://t.co/foFnanoM2Y
ComputerPapers: Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discrete Cosine Transform. https://t.co/2MFIUId2qA
arxivml: "Distortion Robust Image Classification with Deep Convolutional Neural Network based on Discrete Cosine Transform",… https://t.co/Y9iB5cwzlc
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 5339
Unique Words: 1884

0.0 Mikeys
#3. Characterising epithelial tissues using persistent entropy
N. Atienza, L. M. Escudero, M. J. Jimenez, M. Soriano-Trigueros
In this paper, we apply persistent entropy, a novel topological statistic, to the characterization of images of epithelial tissues. We find that persistent entropy is able to summarize topological and geometric information encoded by α-complexes and persistent homology. After applying some statistical tests, we confirm the existence of significant differences among the studied tissues.
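As a rough illustration of the statistic named in this abstract, persistent entropy is commonly defined as the Shannon entropy of the normalized bar lengths of a persistence barcode. The Python sketch below follows that definition; the barcodes are hypothetical toy data, and computing real barcodes from α-complexes of tissue images is not shown.

```python
import math


def persistent_entropy(barcode):
    """Shannon entropy of the normalized bar lengths of a persistence barcode.

    Assumes a non-empty list of finite (birth, death) pairs.
    """
    lengths = [death - birth for birth, death in barcode if death > birth]
    total = sum(lengths)
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Hypothetical barcodes: four equal bars versus one dominant bar.
uniform = [(0.0, 1.0), (0.5, 1.5), (2.0, 3.0), (1.0, 2.0)]
skewed = [(0.0, 9.7), (0.0, 0.1), (0.0, 0.1), (0.0, 0.1)]

print(persistent_entropy(uniform), persistent_entropy(skewed))
```

Equal bar lengths give the maximum value, log of the number of bars, while a barcode dominated by one long bar scores much lower, so the statistic condenses the shape of a barcode into a single number that can be compared across samples.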
Tweets
M157q_News_RSS: Characterising epithelial tissues using persistent entropy. (arXiv:1810.05835v1 [eess.IV]) https://t.co/FAwygy4mGi In this paper, we apply p
arxivml: "Characterising epithelial tissues using persistent entropy", N. Atienza, L.M. Escudero, M.J. Jimenez, M. Soriano-T… https://t.co/B5Xg2Lq1n8
arxiv_cscv: Characterising epithelial tissues using persistent entropy https://t.co/k8LzlTLxjr
ComputerPapers: Characterising epithelial tissues using persistent entropy. https://t.co/J03nRr5M26
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 3507
Unique Words: 1464

0.0 Mikeys
#4. Crowd disagreement of medical images is informative
Veronika Cheplygina, Josien P. W. Pluim
Classifiers for medical image analysis are often trained with a single consensus label, based on combining the labels from experts or crowds. However, disagreement between annotators may be informative, and thus removing it may not be the best strategy. As a proof of concept, we predict whether a skin lesion from the ISIC 2017 dataset is a melanoma or not, based on crowd annotations of visual characteristics of that lesion. We compare using the mean annotations, illustrating consensus, to standard deviations and other distribution moments, illustrating disagreement. We show that the mean annotations perform best, but that the disagreement measures are still informative. We also make the crowd annotations used in this paper available at https://figshare.com/s/5cbbce14647b66286544.
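The contrast between consensus and disagreement features can be sketched as follows. This is a minimal Python illustration, not the authors' pipeline: the lesion names, the score scale, and the choice of population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def crowd_features(annotations):
    """Return (mean, standard deviation) of one lesion's crowd scores."""
    return mean(annotations), pstdev(annotations)


# Hypothetical crowd scores for one visual characteristic of two lesions.
lesion_scores = {
    "lesion_a": [1, 1, 1, 1],  # annotators agree
    "lesion_b": [0, 2, 0, 2],  # annotators disagree
}

features = {name: crowd_features(scores) for name, scores in lesion_scores.items()}
print(features)
```

Both lesions have the same mean score, so a consensus-only feature cannot tell them apart, whereas the standard deviation separates them. That is the kind of extra signal the abstract refers to as disagreement.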
Tweets
arxiv_org: Crowd disagreement of medical images is informative. https://t.co/cVLSFcTa0M https://t.co/ZuV4IslHak
vcheplygina: I'll kick off with the paper I presented at #miccailabels workshop. We look at predicting melanoma in skin lesion images, based only on visual assessments from the crowd https://t.co/6OpvVniv3H #MICCAI2018
vcheplygina: Tomorrow at #MICCAILABELS #MICCAI2018 I will also present some own work on crowdsourcing for melanoma classification (preprint https://t.co/6OpvVniv3H ), as well as an gamified app for crowdsourcing annotations, in collaboration with @v_j_khan
vcheplygina: @NoelCodella Perfect! I'll drop you an email closer to the date. Some background here: https://t.co/6OpvVniv3H
HubBucket: RT @arxiv_org: Crowd disagreement of medical images is informative. https://t.co/cVLSFcTa0M https://t.co/ZuV4IslHak
RexDouglass: RT @arxiv_org: Crowd disagreement of medical images is informative. https://t.co/cVLSFcTa0M https://t.co/ZuV4IslHak
HubMedX: RT @arxiv_org: Crowd disagreement of medical images is informative. https://t.co/cVLSFcTa0M https://t.co/ZuV4IslHak
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 2280
Unique Words: 904

0.0 Mikeys
#5. Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision
Sanjeel Parekh, Alexey Ozerov, Slim Essid, Ngoc Duong, Patrick Pérez, Gaël Richard
We tackle the problem of audiovisual scene analysis for weakly-labeled data. To this end, we build upon our previous audiovisual representation learning framework to perform object classification in noisy acoustic environments and integrate audio source enhancement capability. This is made possible by a novel use of non-negative matrix factorization for the audio modality. Our approach is founded on the multiple instance learning paradigm. Its effectiveness is established through experiments over a challenging dataset of music instrument performance videos. We also show encouraging visual object localization results.
Tweets
BrundageBot: Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision. Sanjeel Parekh, Alexey Ozerov, Slim Essid, Ngoc Duong, Patrick Pérez, and Gaël Richard https://t.co/hFoIENGm8H
arxiv_cscv: Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision https://t.co/6f4xrd5LdG
arxiv_cscv: Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision https://t.co/6f4xrcO9P6
Soul: Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision. https://t.co/YxY1znEVs1
arxivml: "Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision", … https://t.co/w6XxD9R1z5
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 6
Total Words: 4350
Unique Words: 1792

0.0 Mikeys
#6. Pose Estimation for Objects with Rotational Symmetry
Enric Corona, Kaustav Kundu, Sanja Fidler
Pose estimation is a widely explored problem, enabling many robotic tasks such as grasping and manipulation. In this paper, we tackle the problem of pose estimation for objects that exhibit rotational symmetry, which are common in man-made and industrial environments. In particular, our aim is to infer poses for objects not seen at training time, but for which their 3D CAD models are available at test time. Previous work has tackled this problem by learning to compare captured views of real objects with the rendered views of their 3D CAD models, by embedding them in a joint latent space using neural networks. We show that sidestepping the issue of symmetry in this scenario during training leads to poor performance at test time. We propose a model that reasons about rotational symmetry during training by having access to only a small set of symmetry-labeled objects, while exploiting a large collection of unlabeled CAD models. We demonstrate that our approach significantly outperforms a naively trained neural network on a new pose...
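To see why ignoring symmetry can hurt, consider how pose error is measured for an object that looks identical after a turn of 2π/n about one axis. The Python sketch below shows one common symmetry-aware angular error; the abstract does not describe the authors' model at this level, so the function, its name, and the single-axis setting are assumptions for illustration only.

```python
import math


def symmetric_angle_error(predicted, target, order):
    """Smallest absolute angle (radians) between two in-plane rotations
    for an object that looks identical after a turn of 2*pi/order."""
    period = 2.0 * math.pi / order
    diff = (predicted - target) % period
    return min(diff, period - diff)


# A quarter turn is a large error for an asymmetric object (order 1)
# but no error at all for an object with 4-fold symmetry.
print(symmetric_angle_error(math.pi / 2, 0.0, order=1))
print(symmetric_angle_error(math.pi / 2, 0.0, order=4))
```

A loss that penalizes the raw angle difference would punish a network for predicting a pose that is visually indistinguishable from the labeled one, which is one way a symmetry-unaware training setup can go wrong.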
Tweets
arxivml: "Pose Estimation for Objects with Rotational Symmetry", Enric Corona, Kaustav Kundu, Sanja Fidler https://t.co/VzqdnzCLEE
nmfeeds: [CV] https://t.co/Gu4Sawl9qS Pose Estimation for Objects with Rotational Symmetry. Pose estimation is a widely explored pr...
nmfeeds: [O] https://t.co/Gu4Sawl9qS Pose Estimation for Objects with Rotational Symmetry. Pose estimation is a widely explored pro...
ComputerPapers: Pose Estimation for Objects with Rotational Symmetry. https://t.co/9EI1W9KWF2
mvaldenegro: RT @arxiv_cscv: Pose Estimation for Objects with Rotational Symmetry https://t.co/x9CEop3aah
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 10590
Unique Words: 2766

0.0 Mikeys
#7. Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks
Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy
The training of many existing end-to-end steering angle prediction models heavily relies on steering angles as the supervisory signal. Without learning from much richer contexts, these methods are susceptible to the presence of sharp road curves, challenging traffic conditions, strong shadows, and severe lighting changes. In this paper, we considerably improve the accuracy and robustness of predictions through heterogeneous auxiliary networks feature mimicking, a new and effective training method that provides us with much richer contextual signals apart from steering direction. Specifically, we train our steering angle predictive model by distilling multi-layer knowledge from multiple heterogeneous auxiliary networks that perform related but different tasks, e.g., image segmentation or optical flow estimation. As opposed to multi-task learning, our method does not require expensive annotations of related tasks on the target set. This is made possible by applying contemporary off-the-shelf networks on the target set and mimicking...
Tweets
BrundageBot: Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks. Yuenan Hou, Zheng Ma, Chunxiao Liu, and Chen Change Loy https://t.co/n6Jbz4eCUV
arxivml: "Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks", Yuenan Hou, Zheng Ma, Chunxiao Liu… https://t.co/lwGYJYA3KP
arxiv_cscv: Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks https://t.co/Cm0qIvYlPL
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 6618
Unique Words: 1995

0.0 Mikeys
#8. Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings
Zhongwei Xie, Lin Li, Xian Zhong, Luo Zhong
Image-to-video person re-identification identifies a target person by a probe image from quantities of pedestrian videos captured by non-overlapping cameras. Despite the great progress achieved, it's still challenging to match in the multimodal scenario, i.e. between image and video. Currently, state-of-the-art approaches mainly focus on the task-specific data, neglecting the extra information on the different but related tasks. In this paper, we propose an end-to-end neural network framework for image-to-video person re-identification by leveraging cross-modal embeddings learned from extra information. Concretely speaking, cross-modal embeddings from image captioning and video captioning models are reused to help learned features be projected into a coordinated space, where similarity can be directly computed. Besides, training steps from fixed model reuse approach are integrated into our framework, which can incorporate beneficial information and eventually make the target networks independent of existing models. Apart from that, our...
Tweets
arxivml: "Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings", Zhongwei Xie, Lin Li, Xian Zhong, Luo … https://t.co/uKBpkbv1X0
nmfeeds: [CV] https://t.co/xmHPpnlj1Z Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings. Image-to-video per...
nmfeeds: [O] https://t.co/xmHPpnlj1Z Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings. Image-to-video pers...
arxiv_cscv: Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings https://t.co/A4AbHqjrMW
arxiv_cscv: Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings https://t.co/A4AbHqB2Eu
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 5664
Unique Words: 1813

0.0 Mikeys
#9. Domain Randomization for Scene-Specific Car Detection and Pose Estimation
Rawal Khirodkar, Donghyun Yoo, Kris M. Kitani
We address the issue of domain gap when making use of synthetic data to train a scene-specific object detector and pose estimator. While previous works have shown that the constraints of learning a scene-specific model can be leveraged to create geometrically and photometrically consistent synthetic data, care must be taken to design synthetic content which is as close as possible to the real-world data distribution. In this work, we propose to solve domain gap through the use of appearance randomization to generate a wide range of synthetic objects to span the space of realistic images for training. An ablation study of our results is presented to delineate the individual contribution of different components in the randomization process. We evaluate our method on the VIRAT, UA-DETRAC, and EPFL-Car datasets, where we demonstrate that using scene-specific, domain-randomized synthetic data is better than fine-tuning off-the-shelf models on limited real data.
Tweets
sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/WNIn69DDzS
sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/K41B2Tc9KX
sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/JalukH1rNr
arxivml: "Domain Randomization for Scene-Specific Car Detection and Pose Estimation", Rawal Khirodkar, Donghyun Yoo, Kris M.… https://t.co/QhF2bMC6t2
arxiv_cscv: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/7vAnkjHpc3
ComputerPapers: Domain Randomization for Scene-Specific Car Detection and Pose Estimation. https://t.co/HRUztf0NUQ
mir_k: RT @sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/K41B2Tc9KX
cghosh_: RT @sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/JalukH1rNr
cghosh_: RT @sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/WNIn69DDzS
keylinker: RT @arxiv_cscv: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/7vAnkjHpc3
ReedRoof: RT @sim2realAIorg: Domain Randomization for Scene-Specific Car Detection and Pose Estimation https://t.co/4p8YFHdTfY https://t.co/JalukH1rNr
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 4793
Unique Words: 1688

0.0 Mikeys
#10. No-Frills Human-Object Interaction Detection: Factorization, Appearance and Layout Encodings, and Training Techniques
Tanmay Gupta, Alexander Schwing, Derek Hoiem
We show that with an appropriate factorization, and encodings of layout and appearance constructed from outputs of pretrained object detectors, a relatively simple model outperforms more sophisticated approaches on human-object interaction detection. Our model includes factors for detection scores, human and object appearance, and coarse (box-pair configuration) and optionally fine-grained layout (human pose). We also develop training techniques that improve learning efficiency by: (i) eliminating train-inference mismatch; (ii) rejecting easy negatives during mini-batch training; and (iii) using a ratio of negatives to positives that is two orders of magnitude larger than existing approaches while constructing training mini-batches. We conduct a thorough ablation study to understand the importance of different factors and training techniques using the challenging HICO-Det dataset.
Tweets
BrundageBot: No-Frills Human-Object Interaction Detection: Factorization, Appearance and Layout Encodings, and Training Techniques. Tanmay Gupta, Alexander Schwing, and Derek Hoiem https://t.co/aK3xsf12ru
arxivml: "No-Frills Human-Object Interaction Detection: Factorization, Appearance and Layout Encodings, and Training Techniq… https://t.co/JevFiSYqCJ
ComputerPapers: No-Frills Human-Object Interaction Detection: Factorization, Appearance and Layout Encodings, and Training Techniques. https://t.co/CS5qFjvjPw
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 7190
Unique Words: 2150

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 72,893 papers.
