Learning Complex Basis Functions for Invariant Representations of Audio
Learning features from data has been shown to be more successful than using hand-crafted features for many machine learning tasks. In music information retrieval (MIR), features learned from windowed spectrograms are highly variant to transformations like transposition or time shift. Such variances are undesirable when they are irrelevant for the MIR task at hand. We propose an architecture called the Complex Autoencoder (CAE), which learns features invariant to orthogonal transformations. Mapping signals onto the complex basis functions learned by the CAE yields a transformation-invariant "magnitude space" and a transformation-variant "phase space". The phase space is useful for inferring transformations between data pairs. By exploiting the invariance property of the magnitude space, we achieve state-of-the-art results in audio-to-score alignment and repeated section discovery for audio. A PyTorch implementation of the CAE, including the repeated section discovery method, is available online.
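The magnitude/phase split described in the abstract can be illustrated with a fixed, analytical complex basis. The sketch below uses the Fourier basis as a stand-in for the learned one: projecting a signal onto complex exponentials gives a magnitude that is invariant to circular time shift and a phase that changes with the shift. This is only a minimal illustration of the principle, not the CAE itself, and all names and the choice of basis are assumptions.

```python
import numpy as np

# Fourier basis as an analytical example of complex basis functions;
# the CAE would learn such bases from data instead.
n = 64
t = np.arange(n)
k = np.arange(1, 9)[:, None]            # 8 basis frequencies
A = np.cos(2 * np.pi * k * t / n)       # real parts of the basis
B = -np.sin(2 * np.pi * k * t / n)      # imaginary parts of the basis

def encode(x):
    """Split a signal into magnitude (invariant) and phase (variant)."""
    real, imag = A @ x, B @ x
    return np.hypot(real, imag), np.arctan2(imag, real)

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
mag, phase = encode(x)
mag_shifted, phase_shifted = encode(np.roll(x, 5))

# The magnitudes agree despite the shift; the phase difference
# between the two encodings is what encodes the shift itself.
assert np.allclose(mag, mag_shifted)
```

In the same spirit, the paper's phase space can be used to infer the transformation between a data pair: here the per-frequency phase difference grows linearly with the circular shift.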
Authors


Stefan Lattner
Monika Dörfler
Andreas Arzt
Tweets
deeplearnmusic: Very happy that we won (one of the four) Best Paper Awards at the #ismir2019 conference in Delft, NL! 😃🍾😃🎉😃 "Learning Complex Basis Functions for Invariant Representations of Audio" Paper: https://t.co/HtQDpNZpoy Talk: https://t.co/tDiMEZnQTF Code: https://t.co/kAuY1JNFcI https://t.co/anFIIFcwM6
deeplearnmusic: This is how an artificial Neural Network looks from inside, after watching a washing machine for too long. 😵🤩 Check out the paper, and the code I just published on GitHub! #ismir2019 Paper: https://t.co/R1GPSM06ts https://t.co/kAuY1JNFcI