We present new techniques for automatically constructing probabilistic
programs for data analysis, interpretation, and prediction. These techniques
work with probabilistic domain-specific data modeling languages that capture
key properties of a broad class of data generating processes, using Bayesian
inference to synthesize probabilistic programs in these modeling languages
given observed data. We provide a precise formulation of Bayesian synthesis for
automatic data modeling that identifies sufficient conditions for the resulting
synthesis procedure to be sound. We also derive a general class of synthesis
algorithms for domain-specific languages specified by probabilistic
context-free grammars and establish the soundness of our approach for these
languages. We apply the techniques to automatically synthesize probabilistic
programs for time series data and multivariate tabular data. We show how to
analyze the structure of the synthesized programs to compute, for key
qualitative properties of interest, the probability that the...

more |
pdf
| html
None.

BrundageBot:
Bayesian Synthesis of Probabilistic Programs for Automatic Data Modeling. Feras A. Saad, Marco F. Cusumano-Towner, Ulrich Schaechtle, Martin C. Rinard, and Vikash K. Mansinghka https://t.co/dAUvfwDbOB

SciFi:
Bayesian Synthesis of Probabilistic Programs for Automatic Data Modeling. https://t.co/E2j0wVcaFM

None.

None.

Sample Sizes : None.

Authors: 5

Total Words: 21264

Unqiue Words: 4721

A computer program is a set of electronic instructions executed from within
the computer memory by the computer central processing unit. Its purpose is to
control the functionalities of the computer allowing it to perform various
tasks. Basically, a computer program is written by humans using a programming
language. A programming language is the set of grammatical rules and vocabulary
that governs the correct writing of a computer program. In practice, the
majority of the existing programming languages are written in English-speaking
countries and thus they all use the English language to express their syntax
and vocabulary. However, many other programming languages were written in
non-English languages, for instance, the Chinese BASIC, the Chinese Python, the
Russian Rapira, and the Arabic Loughaty. This paper discusses the design and
implementation of a new programming language, called Phoenix. It is a
General-Purpose, High-Level, Imperative, Object-Oriented, and Compiled Arabic
programming language that uses the Arabic language...

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 1

Total Words: 0

Unqiue Words: 0

Inference metaprogramming enables effective probabilistic programming by
supporting the decomposition of executions of probabilistic programs into
subproblems and the deployment of hybrid probabilistic inference algorithms
that apply different probabilistic inference algorithms to different
subproblems. We introduce the concept of independent subproblem inference (as
opposed to entangled subproblem inference in which the subproblem inference
algorithm operates over the full program trace) and present a mathematical
framework for studying convergence properties of hybrid inference algorithms
that apply different Markov-Chain Monte Carlo algorithms to different parts of
the inference problem. We then use this formalism to prove asymptotic
convergence results for probablistic programs with inference metaprogramming.
To the best of our knowledge this is the first asymptotic convergence result
for hybrid probabilistic inference algorithms defined by (subproblem-based)
inference metaprogramming.

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 3

Total Words: 0

Unqiue Words: 0

Common programming tools, like compilers, debuggers, and IDEs, crucially rely
on the ability to analyse program code to reason about its behaviour and
properties. There has been a great deal of work on verifying compilers and
static analyses, but far less on verifying dynamic analyses such as program
slicing. Recently, a new mathematical framework for slicing was introduced in
which forward and backward slicing are dual in the sense that they constitute a
Galois connection. This paper formalises forward and backward dynamic slicing
algorithms for a simple imperative programming language, and formally verifies
their duality using the Coq proof assistant.

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 2

Total Words: 12488

Unqiue Words: 2542

One of the main purposes of a computer is automation. In fact, automation is
the technology by which a manual task is performed with minimum or zero human
assistance. Over the years, automation has proved to reduce operation cost and
maintenance time in addition to increase system productivity, reliability, and
performance. Today, most computerized automation are done by a computer program
which is a set of instructions executed from within the computer memory by the
computer central processing unit to control the computers various operations.
This paper proposes a compiler program that automates the validation and
translation of input documents written in the Arabic language into XML output
files that can be read by a computer. The input document is by nature
unstructured and in plain-text as it is written by people manually; while, the
generated output is a structured machine-readable XML file. The proposed
compiler program is actually a part of a bigger project related to digital
government and is meant to automate the...

more |
pdf
| html
None.

None.

Sample Sizes : None.

Authors: 1

Total Words: 2891

Unqiue Words: 983

Effective program synthesis requires a way to minimise the number of
candidate programs being searched. A type signature, for example, places some
small restrictions on the structure of potential candidates. We introduce and
motivate a distilled program synthesis problem where a type signature is the
only machine-readable information available, but does not sufficiently minimise
the search space. To address this, we develop a system of property relations
that can be used to flexibly encode and query information that was not
previously available to the synthesiser. Our experience using these tools has
been positive: by encoding simple properties and by using a minimal set of
synthesis primitives, we have been able to synthesise complex programs in novel
contexts

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 2

Total Words: 1776

Unqiue Words: 802

We revisit occurrence typing, a technique to refine the type of variables
occurring in type-cases and, thus, capture some programming patterns used in
untyped languages. Although occurrence typing was tied from its inception to
set-theoretic types-union types, in particular-it never fully exploited the
capabilities of these types. Here we show how, by using set-theoretic types, it
is possible to develop a general typing framemork that encompasses and
generalizes several aspects of current occurrence typing proposals and that can
be applied to tackle other problems such as the inference of intersection types
for functions and the optimization of the compilation of gradually typed
languages.

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 4

Total Words: 32402

Unqiue Words: 3986

Concolic testing is a test generation technique which works effectively by
integrating random testing generation and symbolic execution. Existing concolic
testing engines focus on numeric programs. Heap-manipulating programs make
extensive use of complex heap objects like trees and lists. Testing such
programs is challenging due to multiple reasons. Firstly, test inputs for such
program are required to satisfy non-trivial constraints which must be specified
precisely. Secondly, precisely encoding and solving path conditions in such
programs are challenging and often expensive. In this work, we propose the
first concolic testing engine called CSF for heap-manipulating programs based
on separation logic. CSF effectively combines specification-based testing and
concolic execution for test input generation. It is evaluated on a set of
challenging heap-manipulating programs. The results show that CSF generates
valid test inputs with high coverage efficiently. Furthermore, we show that CSF
can be potentially used in combination with...

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 4

Total Words: 9446

Unqiue Words: 2670

Guarded Kleene Algebra with Tests (GKAT) is a variation on Kleene Algebra
with Tests (KAT) that arises by restricting the union ($+$) and iteration ($*$)
operations from KAT to predicate-guarded versions. We develop the (co)algebraic
theory of GKAT and show how it can be efficiently used to reason about
imperative programs. In contrast to KAT, whose equational theory is
PSPACE-complete, we show that the equational theory of GKAT is (almost) linear
time. We also provide a full Kleene theorem and prove completeness for an
analogue of Salomaa's axiomatization of Kleene Algebra.

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 6

Total Words: 24768

Unqiue Words: 4335

The problem of resolving virtual method and interface calls in
object-oriented languages has been a long standing challenge to the program
analysis community. The complexities are due to various reasons, such as
increased levels of class inheritance and polymorphism in large programs. In
this paper, we propose a new approach called type flow analysis that represent
propagation of type information between program variables by a group of
relations without the help of a heap abstraction. We prove that regarding the
precision on reachability of class information to a variable, our method
produces results equivalent to that one can derive from a points-to analysis.
Moreover, in practice, our method consumes lower time and space usage, as
supported by the experimental results.

more |
pdf
| html
None.

None.

None.

Sample Sizes : None.

Authors: 2

Total Words: 0

Unqiue Words: 0

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

*Tracking 158,360 papers.*

Sort results based on if they are interesting or reproducible.

Interesting

Reproducible