Top 7 arXiv Papers Today in Information Retrieval


2.018 Mikeys
#1. Learning from Bandit Feedback: An Overview of the State-of-the-art
Olivier Jeunen, Dmytro Mykhaylov, David Rohde, Flavian Vasile, Alexandre Gilotte, Martin Bompaire
In machine learning we often try to optimise a decision rule that would have worked well over a historical dataset; this is the so-called empirical risk minimisation principle. In the context of learning from recommender system logs, applying this principle becomes a problem because we do not observe the reward of decisions we did not take. In order to handle this "bandit-feedback" setting, several Counterfactual Risk Minimisation (CRM) methods have been proposed in recent years that attempt to estimate the performance of different policies on historical data. Through importance sampling and various variance reduction techniques, these methods allow more robust learning and inference than classical approaches. It is difficult to accurately estimate the performance of policies that frequently perform actions that were infrequently taken in the past, and a number of different types of estimators have been proposed. In this paper, we review several methods, based on different off-policy estimators, for learning from bandit...
Figures
None.
Tweets
BrundageBot: Learning from Bandit Feedback: An Overview of the State-of-the-art. Olivier Jeunen, Dmytro Mykhaylov, David Rohde, Flavian Vasile, Alexandre Gilotte, and Martin Bompaire https://t.co/MFF2FunK1t
arxivml: "Learning from Bandit Feedback: An Overview of the State-of-the-art", Olivier Jeunen, Dmytro Mykhaylov, David Rohde… https://t.co/kzNCtdx1tg
StatsPapers: Learning from Bandit Feedback: An Overview of the State-of-the-art. https://t.co/RI5smYQzez
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 6
Total Words: 0
Unique Words: 0

2.011 Mikeys
#2. BPMR: Bayesian Probabilistic Multivariate Ranking
Nan Wang, Hongning Wang
Multi-aspect user preferences are attracting wider attention in recommender systems, as they enable more detailed understanding of users' evaluations of items. Previous studies show that incorporating multi-aspect preferences can greatly improve the performance and explainability of recommendation. However, as recommendation is essentially a ranking problem, there is no principled solution for ranking multiple aspects collectively to enhance the recommendation. In this work, we derive a multi-aspect ranking criterion. To maintain the dependency among different aspects, we propose to use a vectorized representation of multi-aspect ratings and develop a probabilistic multivariate tensor factorization framework (PMTF). The framework naturally leads to a probabilistic multi-aspect ranking criterion, which generalizes the single-aspect ranking to a multivariate fashion. Experiment results on a large multi-aspect review rating dataset confirmed the effectiveness of our solution.
Figures
None.
Tweets
Memoirs: BPMR: Bayesian Probabilistic Multivariate Ranking. https://t.co/ozUjRG5dcR
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

2.003 Mikeys
#3. Document classification methods
Madjid Khalilian, Shiva Hassanzadeh
Information that users collect across different fields requires appropriate management and organization so that it is structured in a standard way and can be retrieved quickly and easily. Document classification is a conventional method for separating text by subject across scientific texts, web pages, and digital libraries. Different methods and techniques have been proposed for document classification, each with advantages and deficiencies. In this paper, several unsupervised and supervised document classification methods are studied and compared.
Figures
None.
Tweets
arxiv_cs_LG: Document classification methods. Madjid Khalilian and Shiva Hassanzadeh https://t.co/3659mmeN1N
atsushieeeee: A review paper on document categorization https://t.co/I3OIzIijC7
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 9256
Unique Words: 2084

2.001 Mikeys
#4. Variational Bayesian Context-aware Representation for Grocery Recommendation
Zaiqiao Meng, Richard McCreadie, Craig Macdonald, Iadh Ounis
Grocery recommendation is an important recommendation use-case, which aims to predict which items a user might choose to buy in the future, based on their shopping history. However, existing methods only represent each user and item by single deterministic points in a low-dimensional continuous space. In addition, most of these methods are trained by maximizing the co-occurrence likelihood with a simple Skip-gram-based formulation, which limits the expressive ability of their embeddings and the resulting recommendation performance. In this paper, we propose the Variational Bayesian Context-Aware Representation (VBCAR) model for grocery recommendation, which is a novel variational Bayesian model that learns the user and item latent vectors by leveraging basket context information from past user-item interactions. We train our VBCAR model based on the Bayesian Skip-gram framework coupled with the amortized variational inference so that it can learn more expressive latent representations that integrate both the non-linearity and...
Figures
None.
Tweets
BrundageBot: Variational Bayesian Context-aware Representation for Grocery Recommendation. Zaiqiao Meng, Richard McCreadie, Craig Macdonald, and Iadh Ounis https://t.co/4yEJTfFgrn
arxiv_cs_LG: Variational Bayesian Context-aware Representation for Grocery Recommendation. Zaiqiao Meng, Richard McCreadie, Craig Macdonald, and Iadh Ounis https://t.co/o1JVypZP8J
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 0
Unique Words: 0

2.001 Mikeys
#5. Characterizing and Predicting Repeat Food Consumption Behavior for Just-in-Time Interventions
Yue Liu, Helena Lee, Palakorn Achananuparp, Ee-Peng Lim, Tzu-Ling Cheng, Shou-De Lin
Human beings are creatures of habit. In their daily life, people tend to repeatedly consume similar types of food items over several days and occasionally switch to consuming different types of items when the consumptions become overly monotonous. However, the novel and repeat consumption behaviors have not been studied in food recommendation research. More importantly, the ability to predict daily eating habits of individuals is crucial to improve the effectiveness of food recommender systems in facilitating healthy lifestyle change. In this study, we analyze the patterns of repeat food consumptions using large-scale consumption data from a popular online fitness community called MyFitnessPal (MFP), conduct an offline evaluation of various state-of-the-art algorithms in predicting the next-day food consumption, and analyze their performance across different demographic groups and contexts. The experiment results show that algorithms incorporating the exploration-and-exploitation and temporal dynamics are more effective in...
Tweets
arxiv_cs_LG: Characterizing and Predicting Repeat Food Consumption Behavior for Just-in-Time Interventions. Yue Liu, Helena Lee, Palakorn Achananuparp, Ee-Peng Lim, Tzu-Ling Cheng, and Shou-De Lin https://t.co/ouhT51ISh1
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 6
Total Words: 9332
Unique Words: 2731

1.997 Mikeys
#6. Fast Search with Poor OCR
Taivanbat Badamdorj, Adiel Ben-Shalom, Nachum Dershowitz, Lior Wolf
The indexing and searching of historical documents have garnered attention in recent years due to massive digitization efforts of important collections worldwide. Pure textual search in these corpora is a problem since optical character recognition (OCR) is infamous for performing poorly on such historical material, which often suffers from poor preservation. We propose a novel text-based method for searching through noisy text. Our system represents words as vectors, projects queries and candidates obtained from the OCR into a common space, and ranks the candidates using a metric suited to nearest-neighbor search. We demonstrate the practicality of our method on typewritten German documents from the WWII era.
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 4
Total Words: 4382
Unique Words: 1742

1.975 Mikeys
#7. Understanding the Information needs of Social Scientists in Germany
Dagmar Kern, Daniel Hienert
The information needs of social science researchers are manifold and have been studied in almost every decade since the 1950s. With this paper, we contribute to this series and present the results of three studies. We asked 367 social science researchers in Germany about their information needs and identified needs in different categories: literature, research data, measurement instruments, support for data analysis, support for data collection, variables in research data, software support, networking/cooperation, and illustrative material. The search for literature and research data is still the main information need, with more than three-quarters of our participants expressing needs in these categories. With comprehensive lists of altogether 154 concrete information needs, even those that are only expressed by one participant, we contribute to the holistic understanding of the information needs of social science researchers of today.
Figures
None.
Github
None.
Youtube
None.
Other stats
Sample Sizes: None.
Authors: 2
Total Words: 0
Unique Words: 0

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored in real time based on how verifiable they are (as determined by their GitHub repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter: @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 192,930 papers.
