Top 5 Arxiv Papers Today in Sound


2.047 Mikeys
#1. Towards Robust Voice Pathology Detection
Pavol Harar, Zoltan Galaz, Jesus B. Alonso-Hernandez, Jiri Mekyska, Radim Burget, Zdenek Smekal
Automatic objective non-invasive detection of pathological voice based on computerized analysis of acoustic signals can play an important role in early diagnosis, progression tracking and even effective treatment of pathological voices. In search towards such a robust voice pathology detection system we investigated 3 distinct classifiers within supervised learning and anomaly detection paradigms. We conducted a set of experiments using a variety of input data such as raw waveforms, spectrograms, mel-frequency cepstral coefficients (MFCC) and conventional acoustic (dysphonic) features (AF). In comparison with previously published works, this article is the first to utilize combination of 4 different databases comprising normophonic and pathological recordings of sustained phonation of the vowel /a/ unrestricted to a subset of vocal pathologies. Furthermore, to our best knowledge, this article is the first to explore gradient boosted trees and deep learning for this application. The following best classification performances...
more | pdf | html
Figures
Tweets
arxiv_cs_LG: Towards Robust Voice Pathology Detection. Pavol Harar, Zoltan Galaz, Jesus B. Alonso-Hernandez, Jiri Mekyska, Radim Burget, and Zdenek Smekal https://t.co/MoKSylmnHa
Memoirs: Towards Robust Voice Pathology Detection. https://t.co/R1YzxwWNMp
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 6
Total Words: 7924
Unqiue Words: 2866

2.047 Mikeys
#2. Learning Complex Basis Functions for Invariant Representations of Audio
Stefan Lattner, Monika Dörfler, Andreas Arzt
Learning features from data has shown to be more successful than using hand-crafted features for many machine learning tasks. In music information retrieval (MIR), features learned from windowed spectrograms are highly variant to transformations like transposition or time-shift. Such variances are undesirable when they are irrelevant for the respective MIR task. We propose an architecture called Complex Autoencoder (CAE) which learns features invariant to orthogonal transformations. Mapping signals onto complex basis functions learned by the CAE results in a transformation-invariant "magnitude space" and a transformation-variant "phase space". The phase space is useful to infer transformations between data pairs. When exploiting the invariance-property of the magnitude space, we achieve state-of-the-art results in audio-to-score alignment and repeated section discovery for audio. A PyTorch implementation of the CAE, including the repeated section discovery method, is available online.
more | pdf | html
Figures
None.
Tweets
arxiv_cs_LG: Learning Complex Basis Functions for Invariant Representations of Audio. Stefan Lattner, Monika Dörfler, and Andreas Arzt https://t.co/0hEDO1Etc3
Memoirs: Learning Complex Basis Functions for Invariant Representations of Audio. https://t.co/KWGVXZnwmo
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unqiue Words: 0

2.021 Mikeys
#3. Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion
Rajib Rana, Siddique Latif, Sara Khalifa, Raja Jurdak
Despite the widespread use of supervised deep learning methods for affect recognition from speech, they are severely limited by the lack of a sufficient amount of labelled speech data. Considering the abundant availability of unlabelled data, this paper proposed a semi-supervised model that can effectively utilise the unlabelled data in multi-task learning way in order to improve the performance of speech emotion recognition. The proposed model adversarialy learns a shared representation for two auxiliary tasks along with emotion identification as the main task. We consider speaker and gender identification as auxiliary tasks in order to operate the model on any large audio corpus. We demonstrate that in a scenario with limited labelled training samples, one can significantly improve the performance of a supervised classification task by simultaneously training with additional auxiliary tasks having an availability of large amount of data. The proposed model is rigorously evaluated for both categorical and dimensional...
more | pdf | html
Figures
Tweets
BrundageBot: Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion. Rajib Rana, Siddique Latif, Sara Khalifa, and Raja Jurdak https://t.co/jCRPfO6y0f
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 8060
Unqiue Words: 2501

2.017 Mikeys
#4. Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams
Harishchandra Dubey, Abhijeet Sangwan, John Hansen
Speaker diarization determines who spoke and when? in an audio stream. In this study, we propose a model-based approach for robust speaker clustering using i-vectors. The ivectors extracted from different segments of same speaker are correlated. We model this correlation with a Markov Random Field (MRF) network. Leveraging the advancements in MRF modeling, we used Toeplitz Inverse Covariance (TIC) matrix to represent the MRF correlation network for each speaker. This approaches captures the sequential structure of i-vectors (or equivalent speaker turns) belonging to same speaker in an audio stream. A variant of standard Expectation Maximization (EM) algorithm is adopted for deriving closed-form solution using dynamic programming (DP) and the alternating direction method of multiplier (ADMM). Our diarization system has four steps: (1) ground-truth segmentation; (2) i-vector extraction; (3) post-processing (mean subtraction, principal component analysis, and length-normalization) ; and (4) proposed speaker clustering. We employ...
more | pdf | html
Figures
Tweets
arxivml: "Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams", Harishchandra Dubey, … https://t.co/Xh4sjXbQGB
arxiv_cs_LG: Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams. Harishchandra Dubey, Abhijeet Sangwan, and John Hansen https://t.co/CN3VKjmUA3
Memoirs: Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams. https://t.co/7KxfluaHF8
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 4398
Unqiue Words: 1634

1.975 Mikeys
#5. Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
Siddique Latif, Junaid Qadir, Muhammad Bilal
Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by the differences in the distributions of training and test data. These differences become more apparent when training and test data belong to different languages, which cause a significant performance gap between the validation and test scores. It is imperative to build more robust models that can fit in practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of using GAN is motivated by their great success in learning the underlying data distribution. The proposed model is designed in such a way that can learn language invariant representations without requiring target-language data labels. We evaluate our proposed model on four different language emotional datasets, including an Urdu-language dataset to also incorporate alternative languages for which labelled data is difficult...
more | pdf | html
Figures
None.
Tweets
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unqiue Words: 0

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 158,360 papers.

Search
Sort results based on if they are interesting or reproducible.
Interesting
Reproducible
Categories
All
Astrophysics
Cosmology and Nongalactic Astrophysics
Earth and Planetary Astrophysics
Astrophysics of Galaxies
High Energy Astrophysical Phenomena
Instrumentation and Methods for Astrophysics
Solar and Stellar Astrophysics
Condensed Matter
Disordered Systems and Neural Networks
Mesoscale and Nanoscale Physics
Materials Science
Other Condensed Matter
Quantum Gases
Soft Condensed Matter
Statistical Mechanics
Strongly Correlated Electrons
Superconductivity
Computer Science
Artificial Intelligence
Hardware Architecture
Computational Complexity
Computational Engineering, Finance, and Science
Computational Geometry
Computation and Language
Cryptography and Security
Computer Vision and Pattern Recognition
Computers and Society
Databases
Distributed, Parallel, and Cluster Computing
Digital Libraries
Discrete Mathematics
Data Structures and Algorithms
Emerging Technologies
Formal Languages and Automata Theory
General Literature
Graphics
Computer Science and Game Theory
Human-Computer Interaction
Information Retrieval
Information Theory
Machine Learning
Logic in Computer Science
Multiagent Systems
Multimedia
Mathematical Software
Numerical Analysis
Neural and Evolutionary Computing
Networking and Internet Architecture
Other Computer Science
Operating Systems
Performance
Programming Languages
Robotics
Symbolic Computation
Sound
Software Engineering
Social and Information Networks
Systems and Control
Economics
Econometrics
General Economics
Theoretical Economics
Electrical Engineering and Systems Science
Audio and Speech Processing
Image and Video Processing
Signal Processing
General Relativity and Quantum Cosmology
General Relativity and Quantum Cosmology
High Energy Physics - Experiment
High Energy Physics - Experiment
High Energy Physics - Lattice
High Energy Physics - Lattice
High Energy Physics - Phenomenology
High Energy Physics - Phenomenology
High Energy Physics - Theory
High Energy Physics - Theory
Mathematics
Commutative Algebra
Algebraic Geometry
Analysis of PDEs
Algebraic Topology
Classical Analysis and ODEs
Combinatorics
Category Theory
Complex Variables
Differential Geometry
Dynamical Systems
Functional Analysis
General Mathematics
General Topology
Group Theory
Geometric Topology
History and Overview
Information Theory
K-Theory and Homology
Logic
Metric Geometry
Mathematical Physics
Numerical Analysis
Number Theory
Operator Algebras
Optimization and Control
Probability
Quantum Algebra
Rings and Algebras
Representation Theory
Symplectic Geometry
Spectral Theory
Statistics Theory
Mathematical Physics
Mathematical Physics
Nonlinear Sciences
Adaptation and Self-Organizing Systems
Chaotic Dynamics
Cellular Automata and Lattice Gases
Pattern Formation and Solitons
Exactly Solvable and Integrable Systems
Nuclear Experiment
Nuclear Experiment
Nuclear Theory
Nuclear Theory
Physics
Accelerator Physics
Atmospheric and Oceanic Physics
Applied Physics
Atomic and Molecular Clusters
Atomic Physics
Biological Physics
Chemical Physics
Classical Physics
Computational Physics
Data Analysis, Statistics and Probability
Physics Education
Fluid Dynamics
General Physics
Geophysics
History and Philosophy of Physics
Instrumentation and Detectors
Medical Physics
Optics
Plasma Physics
Popular Physics
Physics and Society
Space Physics
Quantitative Biology
Biomolecules
Cell Behavior
Genomics
Molecular Networks
Neurons and Cognition
Other Quantitative Biology
Populations and Evolution
Quantitative Methods
Subcellular Processes
Tissues and Organs
Quantitative Finance
Computational Finance
Economics
General Finance
Mathematical Finance
Portfolio Management
Pricing of Securities
Risk Management
Statistical Finance
Trading and Market Microstructure
Quantum Physics
Quantum Physics
Statistics
Applications
Computation
Methodology
Machine Learning
Other Statistics
Statistics Theory
Feedback
Online
Stats
Tracking 158,360 papers.