We present the design and methodology for the large-scale hybrid paper
recommender system used by Microsoft Academic. The system provides
recommendations for approximately 160 million English research papers and
patents. Our approach handles incomplete citation information while also
alleviating the cold-start problem that often affects other recommender
systems. We use the Microsoft Academic Graph (MAG), titles, and available
abstracts of research papers to build a recommendation list for all documents,
thereby combining co-citation and content-based approaches. Tuning system
parameters also allows for blending and prioritization of each approach which,
in turn, allows us to balance paper novelty versus authority in recommendation
results. We evaluate the generated recommendations via a user study of 40
participants, with over 2400 recommendation pairs graded, and discuss the
quality of the results using P@10 and nDCG scores. We see that there is a
strong correlation between participant scores and the similarity rankings
produced...
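
As a minimal illustration of the two metrics named above, the following Python sketch computes P@k and nDCG@k from a list of graded relevance judgments given in ranked order. The function names, the `threshold` parameter, and the log2-discounted form of DCG are assumptions made for this example; the grading scale and the exact metric variants used in the study are not stated here.

```python
import math


def precision_at_k(relevances, k=10, threshold=1):
    """Fraction of the top-k ranked items whose grade is >= threshold.

    `relevances` holds participant grades in ranked order (best-ranked first).
    `threshold` is a hypothetical cut-off for counting an item as relevant.
    """
    top = relevances[:k]
    return sum(1 for grade in top if grade >= threshold) / k


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain with the common log2(rank + 1) discount."""
    return sum(grade / math.log2(i + 2) for i, grade in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists items in descending grade order yields an nDCG of 1.0, while placing highly graded items lower in the list reduces the score.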

We propose the use of beamplots, which can be produced using the R
package BibPlots and Web of Science (WoS) downloads, as a preferred
alternative to h-index values for assessing individual researchers.
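
For context on the metric that beamplots are proposed to replace, the sketch below computes an h-index from a list of per-paper citation counts. It is an illustrative Python example only: it is not part of BibPlots, and the function name and sample citation counts are invented for this illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Two hypothetical publication records with different citation profiles.
record_a = [10, 8, 5, 4, 3]
record_b = [25, 8, 5, 3, 3]
```

The whole record is reduced to a single integer, so the distribution of citations across papers and over time is not visible in the value itself.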

Nowadays, Machine Learning (ML) is seen as the universal solution to improve
the effectiveness of information retrieval (IR) methods. However, while
mathematics is a precise and accurate science, it is usually expressed through
imprecise and less accurate descriptions, contributing to the relative dearth of
machine learning applications for IR in this domain. Generally, mathematical
documents communicate their knowledge in an ambiguous, context-dependent, and
informal language. Given recent advances in ML, it seems natural to apply
ML techniques to represent and retrieve mathematics semantically. In this work,
we apply popular text embedding techniques to the arXiv collection of STEM
documents and explore how these techniques fail to properly capture mathematics
from that corpus. We also investigate the missing aspects that
would allow mathematics to be learned by computers.

