Top 10 Arxiv Papers Today in Data Structures And Algorithms


2.104 Mikeys
#1. Randomisation Algorithms for Large Sparse Matrices
Kai Puolamäki, Andreas Henelius, Antti Ukkonen
In many domains it is necessary to generate surrogate networks, e.g., for hypothesis testing of different properties of a network. Furthermore, generating surrogate networks typically requires that different properties of the network is preserved, e.g., edges may not be added or deleted and the edge weights may be restricted to certain intervals. In this paper we introduce a novel efficient property-preserving Markov Chain Monte Carlo method termed CycleSampler for generating surrogate networks in which (i) edge weights are constrained to an interval and node weights are preserved exactly, and (ii) edge and node weights are both constrained to intervals. These two types of constraints cover a wide variety of practical use-cases. The method is applicable to both undirected and directed graphs. We empirically demonstrate the efficiency of the CycleSampler method on real-world datasets. We provide an implementation of CycleSampler in R, with parts implemented in C.
more | pdf | html
Figures
None.
Tweets
ComputerPapers: Randomisation Algorithms for Large Sparse Matrices. https://t.co/Z0IoNzPghY
Github

CycleSampler

Repository: cyclesampler
User: edahelsinki
Language: R
Stargazers: 0
Subscribers: 3
Forks: 0
Open Issues: 0
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 9147
Unqiue Words: 2114

2.009 Mikeys
#2. Secretary Ranking with Minimal Inversions
Sepehr Assadi, Eric Balkanski, Renato Paes Leme
We study a twist on the classic secretary problem, which we term the secretary ranking problem: elements from an ordered set arrive in random order and instead of picking the maximum element, the algorithm is asked to assign a rank, or position, to each of the elements. The rank assigned is irrevocable and is given knowing only the pairwise comparisons with elements previously arrived. The goal is to minimize the distance of the rank produced to the true rank of the elements measured by the Kendall-Tau distance, which corresponds to the number of pairs that are inverted with respect to the true order. Our main result is a matching upper and lower bound for the secretary ranking problem. We present an algorithm that ranks $n$ elements with only $O(n^{3/2})$ inversions in expectation, and show that any algorithm necessarily suffers $\Omega(n^{3/2})$ inversions when there are $n$ available positions. In terms of techniques, the analysis of our algorithm draws connections to linear probing in the hashing literature, while our lower...
more | pdf | html
Figures
None.
Tweets
DO: Secretary Ranking with Minimal Inversions. https://t.co/K8utC1kZDj
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 11286
Unqiue Words: 2553

2.009 Mikeys
#3. Prophet Inequalities for Independent Random Variables from an Unknown Distribution
José R. Correa, Paul Dütting, Felix Fischer, Kevin Schewior
A central object in optimal stopping theory is the single-choice prophet inequality for independent, identically distributed random variables: Given a sequence of random variables $X_1,\dots,X_n$ drawn independently from a distribution $F$, the goal is to choose a stopping time $\tau$ so as to maximize $\alpha$ such that for all distributions $F$ we have $\mathbb{E}[X_\tau] \geq \alpha \cdot \mathbb{E}[\max_tX_t]$. What makes this problem challenging is that the decision whether $\tau=t$ may only depend on the values of the random variables $X_1,\dots,X_t$ and on the distribution $F$. For quite some time the best known bound for the problem was $\alpha\geq1-1/e\approx0.632$ [Hill and Kertz, 1982]. Only recently this bound was improved by Abolhassani et al. [2017], and a tight bound of $\alpha\approx0.745$ was obtained by Correa et al. [2017]. The case where $F$ is unknown, such that the decision whether $\tau=t$ may depend only on the values of the first $t$ random variables but not on $F$, is equally well motivated (e.g., [Azar...
more | pdf | html
Figures
None.
Tweets
DO: Prophet Inequalities for Independent Random Variables from an Unknown Distribution. https://t.co/bxk4XaNx6b
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 14832
Unqiue Words: 2274

2.004 Mikeys
#4. Communication-Optimal Distributed Dynamic Graph Clustering
Chun Jiang Zhu, Tan Zhu, Kam-Yiu Lam, Song Han, Jinbo Bi
We consider the problem of clustering graph nodes over large-scale dynamic graphs, such as citation networks, images and web networks, when graph updates such as node/edge insertions/deletions are observed distributively. We propose communication-efficient algorithms for two well-established communication models namely the message passing and the blackboard models. Given a graph with $n$ nodes that is observed at $s$ remote sites over time $[1,t]$, the two proposed algorithms have communication costs $\tilde{O}(ns)$ and $\tilde{O}(n+s)$ ($\tilde{O}$ hides a polylogarithmic factor), almost matching their lower bounds, $\Omega(ns)$ and $\Omega(n+s)$, respectively, in the message passing and the blackboard models. More importantly, we prove that at each time point in $[1,t]$ our algorithms generate clustering quality nearly as good as that of centralizing all updates up to that time and then applying a standard centralized clustering algorithm. We conducted extensive experiments on both synthetic and real-life datasets which...
more | pdf | html
Figures
None.
Tweets
ComputerPapers: Communication-Optimal Distributed Dynamic Graph Clustering. https://t.co/4WGb7FASxD
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 9476
Unqiue Words: 2545

2.004 Mikeys
#5. Computing Quartet Distance is Equivalent to Counting 4-Cycles
Bartłomiej Dudek, Paweł Gawrychowski
The quartet distance is a measure of similarity used to compare two unrooted phylogenetic trees on the same set of $n$ leaves, defined as the number of subsets of four leaves related by a different topology in both trees. After a series of previous results, Brodal et al. [SODA 2013] presented an algorithm that computes this number in $\mathcal{O}(nd\log n)$ time, where $d$ is the maximum degree of a node. Our main contribution is a two-way reduction establishing that the complexity of computing the quartet distance between two trees on $n$ leaves is the same, up to polylogarithmic factors, as the complexity of counting 4-cycles in an undirected simple graph with $m$ edges. The latter problem has been extensively studied, and the fastest known algorithm by Vassilevska Williams [SODA 2015] works in $\mathcal{O}(m^{1.48})$ time. In fact, even for the seemingly simpler problem of detecting a 4-cycle, the best known algorithm works in $\mathcal{O}(m^{4/3})$ time, and a conjecture of Yuster and Zwick implies that this might be optimal....
more | pdf | html
Figures
None.
Tweets
ComputerPapers: Computing Quartet Distance is Equivalent to Counting 4-Cycles. https://t.co/84UygnqcrO
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 15314
Unqiue Words: 2843

2.003 Mikeys
#6. Submodular Optimization Over Streams with Inhomogeneous Decays
Junzhou Zhao, Shuo Shang, Pinghui Wang, John C. S. Lui, Xiangliang Zhang
Cardinality constrained submodular function maximization, which aims to select a subset of size at most $k$ to maximize a monotone submodular utility function, is the key in many data mining and machine learning applications such as data summarization and maximum coverage problems. When data is given as a stream, streaming submodular optimization (SSO) techniques are desired. Existing SSO techniques can only apply to insertion-only streams where each element has an infinite lifespan, and sliding-window streams where each element has a same lifespan (i.e., window size). However, elements in some data streams may have arbitrary different lifespans, and this requires addressing SSO over streams with inhomogeneous-decays (SSO-ID). This work formulates the SSO-ID problem and presents three algorithms: BasicStreaming is a basic streaming algorithm that achieves an $(1/2-\epsilon)$ approximation factor; HistApprox improves the efficiency significantly and achieves an $(1/3-\epsilon)$ approximation factor; HistStreaming is a streaming...
more | pdf | html
Figures
None.
Tweets
nmfeeds: [O] https://t.co/dowvvw7B1s Submodular Optimization Over Streams with Inhomogeneous Decays. Cardinality constrained submod...
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 9249
Unqiue Words: 2089

2.002 Mikeys
#7. Quantum-inspired low-rank stochastic regression with logarithmic dependence on the dimension
András Gilyén, Seth Lloyd, Ewin Tang
We construct an efficient classical analogue of the quantum matrix inversion algorithm (HHL) for low-rank matrices. Inspired by recent work of Tang, assuming length-square sampling access to input data, we implement the pseudoinverse of a low-rank matrix and sample from the solution to the problem $Ax=b$ using fast sampling techniques. We implement the pseudo-inverse by finding an approximate singular value decomposition of $A$ via subsampling, then inverting the singular values. In principle, the approach can also be used to apply any desired "smooth" function to the singular values. Since many quantum algorithms can be expressed as a singular value transformation problem, our result suggests that more low-rank quantum algorithms can be effectively "dequantised" into classical length-square sampling algorithms.
more | pdf | html
Figures
None.
Tweets
lukOlejnik: Interesting result. A specific algorithm for a quantum computer ("low-rank stochastic regression") found to be efficiently solved by non-quantum computers. Not yet understood well which problems are best solved with quantum computers. https://t.co/ugGX81ga6m https://t.co/pr8CAPK6uy
QuantumMemeing: Fig. 1.26: A meme [1]. [1] Quantum Computing Memes for QMA-Complete Teens, Studies in Ancient Greek and Syriac Memeing (2018). Ewin Tang destroying the hopes and dreams of QMLers everywhere. Check out the paper here: https://t.co/jNo9XaPsRI https://t.co/oQwzeVkLyS
fgrosshans: .@ewintang is indeed on her way to dequantize everything: with https://t.co/EZ2M6ccYjk, HHL is dequantized. By the way, sorry for the first name typo and the gender error in the cited tweet. https://t.co/gAQrtarWea
yuyu_hf: また量子情報◯すマンの新作だ、、、 https://t.co/Ebas1imqAV
fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
ComputerPapers: Quantum-inspired low-rank stochastic regression with logarithmic dependence on the dimension. https://t.co/jDZCT9f2us
ewintang: New twin papers on quantum-inspired algorithms for low-rank matrix inversion: https://t.co/3T6Xx9G3hy and https://t.co/tUPCxbq0UA
octonion: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
makoto0218ne56: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
spidermanzano: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
Lukasaoz: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
nick_farina: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
postquantum: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
katelovesneuro: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
RichFelker: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
durumcrustulum: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
rdviii: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
gejikeiji: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
rbtcollins: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
iKodack: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
jpdowling: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
kamakiri_ys: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
MGimenoSegovia: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
aquintex: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
henryquantum: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
BulentKIzILtan: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
amy8492: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
matt_reagor: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
rayohauno: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
cocori_aqua: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
cocori_aqua: RT @yuyu_hf: また量子情報◯すマンの新作だ、、、 https://t.co/Ebas1imqAV
ilyaraz2: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
TilmaLabs: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
yoshi_and_aki: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
yoshi_and_aki: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
quantumbtc: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
AlyTarekIbrahim: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
ywyamashiro: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
wjmzbmr1: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
world_fantasia: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
akinori_kawachi: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
gshartnett: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
Deepneuron: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
DhammaKimpara: RT @QuantumHazzard: Wow -- the famous HHL algorithm doesn't actually need a quantum computer! https://t.co/5ZduTl0vcZ Another quantum algor…
ons_yy: RT @fgksk: うぉ。第三弾か。https://t.co/oMrsU9OB02
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 5438
Unqiue Words: 1428

2.0 Mikeys
#8. A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest
Neil Olver, Frans Schalekamp, Suzanne van der Ster, Leen Stougie, Anke van Zuylen
We give a 2-approximation algorithm for the Maximum Agreement Forest problem on two rooted binary trees. This NP-hard problem has been studied extensively in the past two decades, since it can be used to compute the rooted Subtree Prune-and-Regraft (rSPR) distance between two phylogenetic trees. Our algorithm is combinatorial and its running time is quadratic in the input size. To prove the approximation guarantee, we construct a feasible dual solution for a novel linear programming formulation. In addition, we show this linear program is stronger than previously known formulations, and we give a compact formulation, showing that it can be solved in polynomial time
more | pdf | html
Figures
None.
Tweets
ComputerPapers: A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest. https://t.co/h5pRHTzcWo
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 17622
Unqiue Words: 2562

2.0 Mikeys
#9. Robustness of spectral methods for community detection
Ludovic Stephan, Laurent Massoulié
The present work is concerned with community detection. Specifically, we consider a random graph drawn according to the stochastic block model~: its vertex set is partitioned into blocks, or communities, and edges are placed randomly and independently of each other with probability depending only on the communities of their two endpoints. In this context, our aim is to recover the community labels better than by random guess, based only on the observation of the graph. In the sparse case, where edge probabilities are in $O(1/n)$, we introduce a new spectral method based on the distance matrix $D^{(l)}$, where $D^{(l)}_{ij} = 1$ iff the graph distance between $i$ and $j$, noted $d(i, j)$ is equal to $\ell$. We show that when $\ell \sim c\log(n)$ for carefully chosen $c$, the eigenvectors associated to the largest eigenvalues of $D^{(l)}$ provide enough information to perform non-trivial community recovery with high probability, provided we are above the so-called Kesten-Stigum threshold. This yields an efficient algorithm for...
more | pdf | html
Figures
None.
Tweets
ComputerPapers: Robustness of spectral methods for community detection. https://t.co/OYBNeoh6gW
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 8125
Unqiue Words: 1814

1.968 Mikeys
#10. Vectorized Character Counting for Faster Pattern Matching
Roman Snytsar
Many modern sequence alignment tools implement fast string matching using the space efficient data structure called FM-index. The succinct nature of this data structure presents unique challenges for the algorithm designers. In this paper, we explore the opportunities for parallelization of the exact and inexact matches and present an efficient SIMD solution for the Occ portion of the algorithm. Our implementation computes all eight Occ values required for the inexact match algorithm step in a single pass. We showcase the algorithm performance in a multi-core genome aligner and discuss effects of the memory prefetch.
more | pdf | html
Figures
None.
Tweets
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 3451
Unqiue Words: 1156

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 57,756 papers.

Search
Sort results based on if they are interesting or reproducible.
Interesting
Reproducible
Categories
All
Astrophysics
Cosmology and Nongalactic Astrophysics
Earth and Planetary Astrophysics
Astrophysics of Galaxies
High Energy Astrophysical Phenomena
Instrumentation and Methods for Astrophysics
Solar and Stellar Astrophysics
Condensed Matter
Disordered Systems and Neural Networks
Mesoscale and Nanoscale Physics
Materials Science
Other Condensed Matter
Quantum Gases
Soft Condensed Matter
Statistical Mechanics
Strongly Correlated Electrons
Superconductivity
Computer Science
Artificial Intelligence
Hardware Architecture
Computational Complexity
Computational Engineering, Finance, and Science
Computational Geometry
Computation and Language
Cryptography and Security
Computer Vision and Pattern Recognition
Computers and Society
Databases
Distributed, Parallel, and Cluster Computing
Digital Libraries
Discrete Mathematics
Data Structures and Algorithms
Emerging Technologies
Formal Languages and Automata Theory
General Literature
Graphics
Computer Science and Game Theory
Human-Computer Interaction
Information Retrieval
Information Theory
Machine Learning
Logic in Computer Science
Multiagent Systems
Multimedia
Mathematical Software
Numerical Analysis
Neural and Evolutionary Computing
Networking and Internet Architecture
Other Computer Science
Operating Systems
Performance
Programming Languages
Robotics
Symbolic Computation
Sound
Software Engineering
Social and Information Networks
Systems and Control
Economics
Econometrics
General Economics
Theoretical Economics
Electrical Engineering and Systems Science
Audio and Speech Processing
Image and Video Processing
Signal Processing
General Relativity and Quantum Cosmology
General Relativity and Quantum Cosmology
High Energy Physics - Experiment
High Energy Physics - Experiment
High Energy Physics - Lattice
High Energy Physics - Lattice
High Energy Physics - Phenomenology
High Energy Physics - Phenomenology
High Energy Physics - Theory
High Energy Physics - Theory
Mathematics
Commutative Algebra
Algebraic Geometry
Analysis of PDEs
Algebraic Topology
Classical Analysis and ODEs
Combinatorics
Category Theory
Complex Variables
Differential Geometry
Dynamical Systems
Functional Analysis
General Mathematics
General Topology
Group Theory
Geometric Topology
History and Overview
Information Theory
K-Theory and Homology
Logic
Metric Geometry
Mathematical Physics
Numerical Analysis
Number Theory
Operator Algebras
Optimization and Control
Probability
Quantum Algebra
Rings and Algebras
Representation Theory
Symplectic Geometry
Spectral Theory
Statistics Theory
Mathematical Physics
Mathematical Physics
Nonlinear Sciences
Adaptation and Self-Organizing Systems
Chaotic Dynamics
Cellular Automata and Lattice Gases
Pattern Formation and Solitons
Exactly Solvable and Integrable Systems
Nuclear Experiment
Nuclear Experiment
Nuclear Theory
Nuclear Theory
Physics
Accelerator Physics
Atmospheric and Oceanic Physics
Applied Physics
Atomic and Molecular Clusters
Atomic Physics
Biological Physics
Chemical Physics
Classical Physics
Computational Physics
Data Analysis, Statistics and Probability
Physics Education
Fluid Dynamics
General Physics
Geophysics
History and Philosophy of Physics
Instrumentation and Detectors
Medical Physics
Optics
Plasma Physics
Popular Physics
Physics and Society
Space Physics
Quantitative Biology
Biomolecules
Cell Behavior
Genomics
Molecular Networks
Neurons and Cognition
Other Quantitative Biology
Populations and Evolution
Quantitative Methods
Subcellular Processes
Tissues and Organs
Quantitative Finance
Computational Finance
Economics
General Finance
Mathematical Finance
Portfolio Management
Pricing of Securities
Risk Management
Statistical Finance
Trading and Market Microstructure
Quantum Physics
Quantum Physics
Statistics
Applications
Computation
Methodology
Machine Learning
Other Statistics
Statistics Theory
Feedback
Online
Stats
Tracking 57,756 papers.