Top 10 Arxiv Papers Today in Data Structures And Algorithms


2.005 Mikeys
#1. Communication-Efficient Weighted Sampling and Quantile Summary for GBDT
Ziyue Huang, Ke Yi
Gradient boosting decision tree (GBDT) is a powerful and widely-used machine learning model, which has achieved state-of-the-art performance in many academic areas and production environment. However, communication overhead is the main bottleneck in distributed training which can handle the massive data nowadays. In this paper, we propose two novel communication-efficient methods over distributed dataset to mitigate this problem, a weighted sampling approach by which we can estimate the information gain over a small subset efficiently, and distributed protocols for weighted quantile problem used in approximate tree learning.
more | pdf | html
Figures
None.
Tweets
arxiv_in_review: #NeurIPS2019 Communication-Efficient Weighted Sampling and Quantile Summary for GBDT. (arXiv:1909.07633v1 [cs\.DS]) https://t.co/OAufxyP3AI
arxiv_cs_LG: Communication-Efficient Weighted Sampling and Quantile Summary for GBDT. Ziyue Huang and Ke Yi https://t.co/DR9TXGXQ0l
okateim: 2019/09/18 [5] Communication-Efficient Weighted Sampling and Quantile Summary for GBDT (https://t.co/0Kq4FR7IWO)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 3029
Unqiue Words: 954

2.002 Mikeys
#2. Efficient Computation of Multi-Modal Public Transit Traffic Assignments using ULTRA
Jonas Sauer, Dorothea Wagner, Tobias Zündorf
We study the problem of computing public transit traffic assignments in a multi-modal setting: Given a public transit timetable, an additional unrestricted transfer mode (in our case walking), and a set of origin-destination pairs, we aim to compute the utilization of every vehicle in the network. While it has been shown that considering unrestricted transfers can significantly improve journeys, computing such journeys efficiently remains algorithmically challenging. Since traffic assignments require the computation of millions of shortest paths, using a multi-modal network has previously not been feasible. A recently proposed approach (ULTRA) enables efficient algorithms with UnLimited TRAnsfers at the cost of a short preprocessing phase. In this work we combine the ULTRA approach with a state-of-the-art assignment algorithm, making multi-modal assignments practical. Careful algorithm engineering results in a new public transit traffic assignment algorithm that even outperforms the algorithm it builds upon, while enabling...
more | pdf | html
Figures
Tweets
okateim: 2019/09/19 [1] Efficient Computation of Multi-Modal Public Transit Traffic Assignments using ULTRA (https://t.co/sELT8ceL5r)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 9920
Unqiue Words: 2376

2.002 Mikeys
#3. When Maximum Stable Set can be solved in FPT time
Édouard Bonnet, Nicolas Bousquet, Stéphan Thomassé, Rémi Watrigant
Maximum Independent Set (MIS for short) is in general graphs the paradigmatic $W[1]$-hard problem. In stark contrast, polynomial-time algorithms are known when the inputs are restricted to structured graph classes such as, for instance, perfect graphs (which includes bipartite graphs, chordal graphs, co-graphs, etc.) or claw-free graphs. In this paper, we introduce some variants of co-graphs with parameterized noise, that is, graphs that can be made into disjoint unions or complete sums by the removal of a certain number of vertices and the addition/deletion of a certain number of edges per incident vertex, both controlled by the parameter. We give a series of FPT Turing-reductions on these classes and use them to make some progress on the parameterized complexity of MIS in $H$-free graphs. We show that for every fixed $t \geqslant 1$, MIS is FPT in $P(1,t,t,t)$-free graphs, where $P(1,t,t,t)$ is the graph obtained by substituting all the vertices of a four-vertex path but one end of the path by cliques of size $t$. We also...
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/19 [2] When Maximum Stable Set can be solved in FPT time (https://t.co/jVQJtfpZUT)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 23281
Unqiue Words: 3518

2.002 Mikeys
#4. Deterministic algorithms for the Lovasz Local Lemma: simpler, more general, and more parallel
David G. Harris
The Lovasz Local Lemma (LLL) is a keystone principle in probability theory, guaranteeing the existence of configurations which avoid a collection $\mathcal B$ of "bad" events which are mostly independent and have low probability. In its simplest "symmetric" form, it asserts that whenever a bad-event has probability $p$ and affects at most $d$ bad-events, and $e p d < 1$, then a configuration avoiding all $\mathcal B$ exists. A seminal algorithm of Moser & Tardos (2010) gives nearly-automatic randomized algorithms for most constructions based on the LLL. However, deterministic algorithms have lagged behind. We address three specific shortcomings of the prior deterministic algorithms. First, our algorithm applies to the LLL criterion of Shearer (1985); this is more powerful than alternate LLL criteria and also removes a number of nuisance parameters and leads to cleaner and more legible bounds. Second, we provide parallel algorithms with much greater flexibility in the functional form of of the bad-events. Third, we provide a...
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/19 [4] Deterministic algorithms for the Lovasz Local Lemma: simpler, more general, and more parallel (https://t.co/Go8ufqr0qz)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 0
Unqiue Words: 0

2.002 Mikeys
#5. Hypergraph partitions
Alexander Mishchenko, Vladimir Manuilov, Chao You, Han Yang
We suggest a reduction of the combinatorial problem of hypergraph partitioning to a continuous optimization problem.
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/19 [3] Hypergraph partitions (https://t.co/anQgvpll3Q)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 1625
Unqiue Words: 737

1.997 Mikeys
#6. A heuristic use of dynamic programming to upperbound treewidth
Hisao Tamaki
For a graph $G$, let $\Omega(G)$ denote the set of all potential maximal cliques of $G$. For each subset $\Omega$ of $\Omega(G)$, let $\tw(G, \Omega)$ denote the smallest $k$ such that there is a tree-decomposition of $G$ of width $k$ whose bags all belong to $\Omega$. Bouchitt\'{e} and Todinca observed in 2001 that $\tw(G, \Omega(G))$ is exactly the treewidth of $G$ and developed a dynamic programming algorithm to compute it. Indeed, their algorithm can readily be applied to an arbitrary non-empty subset $\Omega$ of $\Omega(G)$ and computes $\tw(G, \Omega)$, or reports that it is undefined, in time $|\Omega||V(G)|^{O(1)}$. This efficient tool for computing $\tw(G, \Omega)$ allows us to conceive of an iterative improvement procedure for treewidth upper bounds which maintains, as the current solution, a set of potential maximal cliques rather than a tree-decomposition. We design and implement an algorithm along this approach. Experiments show that our algorithm vastly outperforms previously implemented heuristic algorithms for treewidth.
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/18 [4] A heuristic use of dynamic programming to upperbound treewidth (https://t.co/3k9HmG76qp)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 0
Unqiue Words: 0

1.997 Mikeys
#7. Streaming PTAS for Constrained k-Means
Dishant Goyal, Ragesh Jaiswal, Amit Kumar
We generalise the results of Bhattacharya et al. (Journal of Computing Systems, 62(1):93-115, 2018) for the list-$k$-means problem defined as -- for a (unknown) partition $X_1, ..., X_k$ of the dataset $X \subseteq \mathbb{R}^d$, find a list of $k$-center sets (each element in the list is a set of $k$ centers) such that at least one of $k$-center sets $\{c_1, ..., c_k\}$ in the list gives an $(1+\varepsilon)$-approximation with respect to the cost function $\min_{\textrm{permutation } \pi} \left[ \sum_{i=1}^{k} \sum_{x \in X_i} ||x - c_{\pi(i)}||^2 \right]$. The list-$k$-means problem is important for the constrained $k$-means problem since algorithms for the former can be converted to PTAS for various versions of the latter. Following are the consequences of our generalisations: - Streaming algorithm: Our $D^2$-sampling based algorithm running in a single iteration allows us to design a 2-pass, logspace streaming algorithm for the list-$k$-means problem. This can be converted to a 4-pass, logspace streaming PTAS for various...
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/18 [9] Streaming PTAS for Constrained k-Means (https://t.co/ISfJkRpCet)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 0
Unqiue Words: 0

1.997 Mikeys
#8. Combinatorial Algorithms for Edge-Disjoint $T$-Paths and Integer Free Multiflow
Satoru Iwata, Yu Yokoi
Let $G=(V,E)$ be a multigraph with a set $T\subseteq V$ of terminals. A path in $G$ is called a $T$-path if its ends are distinct vertices in $T$ and no internal vertices belong to $T$. In 1978, Mader showed a characterization of the maximum number of edge-disjoint $T$-paths. The original proof was not constructive, and hence it did not suggest an efficient algorithm. In this paper, we provide a combinatorial, deterministic algorithm for finding the maximum number of edge-disjoint $T$-paths. The algorithm adopts an augmenting path approach. More specifically, we introduce a novel concept of augmenting walks in auxiliary labeled graphs to capture a possible augmentation of the number of edge-disjoint $T$-paths. To design a search procedure for an augmenting walk, we introduce blossoms analogously to the matching algorithm of Edmonds (1965), while it is neither a special case nor a generalization of the present problem. When the search procedure terminates without finding an augmenting walk, the algorithm provides a certificate...
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/18 [1] Combinatorial Algorithms for Edge-Disjoint $T$-Paths and Integer Free Multiflow (https://t.co/kgmKgo276E)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 11710
Unqiue Words: 1824

1.997 Mikeys
#9. Regular Partitions and Their Use in Structural Pattern Recognition
Marco Fiorucci
Recent years are characterized by an unprecedented quantity of available network data which are produced at an astonishing rate by an heterogeneous variety of interconnected sensors and devices. This high-throughput generation calls for the development of new effective methods to store, retrieve, understand and process massive network data. In this thesis, we tackle this challenge by introducing a framework to summarize large graphs based on Szemer\'edi's Regularity Remma (RL), which roughly states that any sufficiently large graph can almost entirely be partitioned into a bounded number of random-like bipartite graphs. The partition resulting from the RL gives rise to a summary, which inherits many of the essential structural properties of the original graph. We first extend an heuristic version of the RL to improve its efficiency and its robustness. We use the proposed algorithm to address graph-based clustering and image segmentation tasks. In the second part of the thesis, we introduce a new heuristic algorithm which is...
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/18 [11] Regular Partitions and Their Use in Structural Pattern Recognition (https://t.co/Hko8KIEVNC)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 1
Total Words: 0
Unqiue Words: 0

1.997 Mikeys
#10. Object Reachability via Swaps under Strict and Weak Preferences
Sen Huang, Mingyu Xiao
The \textsc{Housing Market} problem is a widely studied resource allocation problem. In this problem, each agent can only receive a single object and has preferences over all objects. Starting from an initial endowment, we want to reach a certain assignment via a sequence of rational trades. We first consider whether an object is reachable for a given agent under a social network, where a trade between two agents is allowed if they are neighbors in the network and no participant has a deficit from the trade. Assume that the preferences of the agents are strict (no tie among objects is allowed). This problem is polynomially solvable in a star-network and NP-complete in a tree-network. It is left as a challenging open problem whether the problem is polynomially solvable when the network is a path. We answer this open problem positively by giving a polynomial-time algorithm. Then we show that when the preferences of the agents are weak (ties among objects are allowed), the problem becomes NP-hard even when the network is a path. In...
more | pdf | html
Figures
None.
Tweets
okateim: 2019/09/18 [6] Object Reachability via Swaps under Strict and Weak Preferences (https://t.co/pPlQbalhmo)
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 15776
Unqiue Words: 1939

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 192,929 papers.

Search
Sort results based on if they are interesting or reproducible.
Interesting
Reproducible
Categories
All
Astrophysics
Cosmology and Nongalactic Astrophysics
Earth and Planetary Astrophysics
Astrophysics of Galaxies
High Energy Astrophysical Phenomena
Instrumentation and Methods for Astrophysics
Solar and Stellar Astrophysics
Condensed Matter
Disordered Systems and Neural Networks
Mesoscale and Nanoscale Physics
Materials Science
Other Condensed Matter
Quantum Gases
Soft Condensed Matter
Statistical Mechanics
Strongly Correlated Electrons
Superconductivity
Computer Science
Artificial Intelligence
Hardware Architecture
Computational Complexity
Computational Engineering, Finance, and Science
Computational Geometry
Computation and Language
Cryptography and Security
Computer Vision and Pattern Recognition
Computers and Society
Databases
Distributed, Parallel, and Cluster Computing
Digital Libraries
Discrete Mathematics
Data Structures and Algorithms
Emerging Technologies
Formal Languages and Automata Theory
General Literature
Graphics
Computer Science and Game Theory
Human-Computer Interaction
Information Retrieval
Information Theory
Machine Learning
Logic in Computer Science
Multiagent Systems
Multimedia
Mathematical Software
Numerical Analysis
Neural and Evolutionary Computing
Networking and Internet Architecture
Other Computer Science
Operating Systems
Performance
Programming Languages
Robotics
Symbolic Computation
Sound
Software Engineering
Social and Information Networks
Systems and Control
Economics
Econometrics
General Economics
Theoretical Economics
Electrical Engineering and Systems Science
Audio and Speech Processing
Image and Video Processing
Signal Processing
General Relativity and Quantum Cosmology
General Relativity and Quantum Cosmology
High Energy Physics - Experiment
High Energy Physics - Experiment
High Energy Physics - Lattice
High Energy Physics - Lattice
High Energy Physics - Phenomenology
High Energy Physics - Phenomenology
High Energy Physics - Theory
High Energy Physics - Theory
Mathematics
Commutative Algebra
Algebraic Geometry
Analysis of PDEs
Algebraic Topology
Classical Analysis and ODEs
Combinatorics
Category Theory
Complex Variables
Differential Geometry
Dynamical Systems
Functional Analysis
General Mathematics
General Topology
Group Theory
Geometric Topology
History and Overview
Information Theory
K-Theory and Homology
Logic
Metric Geometry
Mathematical Physics
Numerical Analysis
Number Theory
Operator Algebras
Optimization and Control
Probability
Quantum Algebra
Rings and Algebras
Representation Theory
Symplectic Geometry
Spectral Theory
Statistics Theory
Mathematical Physics
Mathematical Physics
Nonlinear Sciences
Adaptation and Self-Organizing Systems
Chaotic Dynamics
Cellular Automata and Lattice Gases
Pattern Formation and Solitons
Exactly Solvable and Integrable Systems
Nuclear Experiment
Nuclear Experiment
Nuclear Theory
Nuclear Theory
Physics
Accelerator Physics
Atmospheric and Oceanic Physics
Applied Physics
Atomic and Molecular Clusters
Atomic Physics
Biological Physics
Chemical Physics
Classical Physics
Computational Physics
Data Analysis, Statistics and Probability
Physics Education
Fluid Dynamics
General Physics
Geophysics
History and Philosophy of Physics
Instrumentation and Detectors
Medical Physics
Optics
Plasma Physics
Popular Physics
Physics and Society
Space Physics
Quantitative Biology
Biomolecules
Cell Behavior
Genomics
Molecular Networks
Neurons and Cognition
Other Quantitative Biology
Populations and Evolution
Quantitative Methods
Subcellular Processes
Tissues and Organs
Quantitative Finance
Computational Finance
Economics
General Finance
Mathematical Finance
Portfolio Management
Pricing of Securities
Risk Management
Statistical Finance
Trading and Market Microstructure
Quantum Physics
Quantum Physics
Statistics
Applications
Computation
Methodology
Machine Learning
Other Statistics
Statistics Theory
Feedback
Online
Stats
Tracking 192,929 papers.