Top 10 Biorxiv Papers Today in Bioinformatics


2.065 Mikeys
#1. Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index
Ali Ghaffaari, Tobias Marschall
Motivation: Sequence graphs are versatile data structures that are, for instance, able to represent the genetic variation found in a population and to facilitate genome assembly. Read mapping to sequence graphs constitutes an important step for many applications and is usually done by first finding exact seed matches, which are then extended by alignment. Existing methods for finding seed hits prune the graph in complex regions, leading to a loss of information especially in highly polymorphic regions of the genome. While such complex graph structures can indeed lead to a combinatorial explosion of possible alleles, the query set of reads from a diploid individual realizes only two alleles per locus - a property that is not exploited by extant methods. Results: We present the Pan-genome Seed Index (PSI), a fully-sensitive hybrid method for seed finding, which takes full advantage of this property by combining an index over selected paths in the graph with an index over the query reads. This enables PSI to find all seeds while...
Figures
None.
Tweets
biorxivpreprint: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/iKAt0IzihP #bioRxiv
biorxiv_bioinfo: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/XYIi4wwOGP #biorxiv_bioinfo
razoralign: PSI : Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/2MC0ncPq8b https://t.co/J6sBRxIkRl
claczny: RT @biorxiv_bioinfo: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/XYIi4wwOGP #biorxiv_bioinfo
GUILLAUMEGAUTRE: RT @biorxiv_bioinfo: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/XYIi4wwOGP #biorxiv_bioinfo
TaherMun: RT @biorxiv_bioinfo: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/XYIi4wwOGP #biorxiv_bioinfo
hdeshmuk: RT @biorxiv_bioinfo: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/XYIi4wwOGP #biorxiv_bioinfo
leandro_ishi: RT @biorxiv_bioinfo: Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index https://t.co/XYIi4wwOGP #biorxiv_bioinfo
Github

Pan-genome Seed Index

Repository: psi
User: cartoonist
Language: C++
Stargazers: 0
Subscribers: 4
Forks: 0
Open Issues: 8
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 9184
Unique Words: 2453

2.032 Mikeys
#2. Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers
Brandin Grindstaff, Makenzie E Mabry, Paul D Blischak, Micheal Quinn, J Chris Pires
Premise of the study: Environmentally controlled facilities, such as growth chambers, are essential tools for experimental research. Automated remote monitoring of such facilities with low-cost hardware can greatly improve both the reproducibility and the accurate maintenance of their conditions. Methods and Results: Using a Raspberry Pi computer, open-source software, environmental sensors, and a camera, we developed a cost-effective system for monitoring growth chamber conditions, which we have called GMpi. Coupled with our software, GMpi_Pack, our setup automates sensor readings, photography, alerts when conditions fall out of range, and data transfer to cloud storage services. Conclusions: The GMpi offers low-cost access to environmental data logging, improving reproducibility of experiments, as well as reinforcing the stability of controlled environmental facilities. The device is also flexible and scalable, allowing customization and expansion to include other features such as machine vision.
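The out-of-range alerting described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the sensor names, the acceptable ranges, and the `out_of_range` helper are hypothetical and are not taken from GMpi_Pack.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Limits:
    """Inclusive acceptable range for one sensor channel."""
    low: float
    high: float


# Hypothetical limits for a growth chamber; not values from GMpi_Pack.
LIMITS: Dict[str, Limits] = {
    "temperature_c": Limits(18.0, 26.0),
    "humidity_pct": Limits(40.0, 70.0),
}


def out_of_range(reading: Dict[str, float],
                 limits: Dict[str, Limits] = LIMITS) -> List[str]:
    """Return one alert message per channel whose value is outside its limits.

    Channels without configured limits are ignored.
    """
    alerts = []
    for name, value in reading.items():
        lim = limits.get(name)
        if lim is None:
            continue
        if not (lim.low <= value <= lim.high):
            alerts.append(f"{name}={value} outside [{lim.low}, {lim.high}]")
    return alerts


if __name__ == "__main__":
    # A reading with an overheated chamber produces a single alert.
    print(out_of_range({"temperature_c": 29.5, "humidity_pct": 55.0}))
```

In a real deployment the returned messages would be handed to whatever notification channel the system uses; that part is left out here because the abstract does not specify it.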
Figures
Tweets
biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bioRxiv
biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #biorxiv_bioinfo
surt_lab: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
BCHEPPdepthead: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
phylogeo: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
SpicyBotrytis: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
Yvan2935: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
Young_Eukaryote: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
helenbrabs: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
jnmaloof: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
nutseba: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
LFaino: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
Adam_Mott: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
Dani_M_Stevens: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
poppingjun: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
itsjeffreyy76: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
hdeshmuk: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
TongZhou2017: RT @biorxivpreprint: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YgH1cSRrqc #bio…
cnachteg: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
FredZhou91: RT @biorxiv_bioinfo: Affordable Remote Monitoring of Plant Growth and Facilities using Raspberry Pi Computers https://t.co/YcgJEBVVXB #bio…
Github

A growth chamber sensing system using the Raspberry Pi as a platform

Repository: GMpi
User: BrandinGrindstaff
Language: Python
Stargazers: 3
Subscribers: 3
Forks: 2
Open Issues: 0
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 6767
Unique Words: 1922

2.025 Mikeys
#3. Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding
Zhi-Jie Cao, Lin Wei, Shen Lu, De-Chang Yang, Ge Gao
A large amount of single-cell RNA sequencing data produced by various technologies is accumulating rapidly. An efficient cell querying method facilitates integrating existing data and annotating new data. Here we present a novel cell querying method, Cell BLAST, based on deep generative modeling, together with a well-curated reference database and a user-friendly Web interface at http://cblast.gao-lab.org, as an accurate and robust solution to large-scale cell querying.
Figures
Tweets
biorxivpreprint: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/2TdmAUpfdj #bioRxiv
biorxiv_bioinfo: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/lFPlwwOW9P #biorxiv_bioinfo
razoralign: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/SuimjdGnjy https://t.co/meEYsu8XEA
kpaszkiewicz: RT @biorxiv_bioinfo: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/lFPlwwOW9P #biorxiv_bio…
PLURplus: RT @biorxivpreprint: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/2TdmAUpfdj #bioRxiv
artofbiology: RT @biorxivpreprint: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/2TdmAUpfdj #bioRxiv
ansuman90: RT @biorxivpreprint: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/2TdmAUpfdj #bioRxiv
TongZhou2017: RT @biorxiv_bioinfo: Cell BLAST: Searching large-scale scRNA-seq database via unbiased cell embedding https://t.co/lFPlwwOW9P #biorxiv_bio…
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 12962
Unique Words: 3490

2.013 Mikeys
#4. COMICS: A pipeline for the composite identification of selection across multiple genomic scans using Invariant Coordinate Selection in R
Joel T Nelson, Omar E Cornejo
Identifying loci that are under selection versus those that are evolving neutrally is a common challenge in evolutionary genetics. Moreover, with the increase in sequence data, genomic studies have begun to incorporate the use of multiple methods to identify candidate loci under selection. Composite methods are usually implemented to transform the data into a multi-dimensional scatter where outliers are identified using a distance metric, the most common being Mahalanobis distance. However, studies have shown that the power of Mahalanobis distance reduces as the number of dimensions increases. Because the number of methods for detecting selection continues to grow, this is an undesirable feature of Mahalanobis distance. Other composite methods such as invariant coordinate selection (ICS) have proven to be a robust method for identifying outliers in multi-dimensional space; though, this method has not been implemented for genomic data. Here we use simulated genomic data to test the performance of ICS in identifying outlier loci from...
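For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of Mahalanobis-distance outlier scoring over several per-locus statistics. It is not the COMICS pipeline (which is written in R and uses ICS); the function name, the simulated data, and the choice of a pseudo-inverse are assumptions made for illustration.

```python
import numpy as np


def mahalanobis_distances(scores: np.ndarray) -> np.ndarray:
    """Mahalanobis distance of each row from the column-wise mean.

    scores has shape (n_loci, n_statistics): one row per locus, one column
    per selection scan. A pseudo-inverse is used so that a singular
    covariance matrix does not raise an error (an illustrative choice).
    """
    centered = scores - scores.mean(axis=0)
    cov = np.atleast_2d(np.cov(scores, rowvar=False))
    inv_cov = np.linalg.pinv(cov)
    # Quadratic form x^T S^-1 x for every row at once.
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return np.sqrt(np.maximum(d2, 0.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    neutral = rng.normal(size=(500, 3))      # simulated neutral loci
    selected = np.array([[8.0, 8.0, 8.0]])   # one extreme locus
    d = mahalanobis_distances(np.vstack([neutral, selected]))
    print("index of most extreme locus:", int(np.argmax(d)))
```

Loci would then be ranked by distance and the upper tail flagged as candidates; where to place that cutoff is a separate decision not covered by this sketch.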
Figures
Tweets
biorxivpreprint: COMICS: A pipeline for the composite identification of selection across multiple genomic scans using Invariant Coordinate Selection in R https://t.co/xWjcr54WxB #bioRxiv
biorxiv_bioinfo: COMICS: A pipeline for the composite identification of selection across multiple genomic scans using Invariant Coordinate Selection in R https://t.co/pwt0Am5eEH #biorxiv_bioinfo
Github

ICS pipeline for genomic data using COMICS

Repository: COMICS
User: JTNelsonWSU
Language: R
Stargazers: 0
Subscribers: 1
Forks: 0
Open Issues: 1
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 5386
Unique Words: 1839

2.011 Mikeys
#5. On Transformative Adaptive Activation Functions in Neural Networks for Gene Expression Inference
Vladimir Kunc, Jiri Klema
Motivation: Gene expression profiling was made cheaper by the NIH LINCS program, which profiles only approximately 1,000 selected landmark genes and uses them to reconstruct the whole profile. The D-GEX method employs neural networks to infer the whole profile. However, the original D-GEX can be further significantly improved. Results: We have analyzed the D-GEX method and determined that the inference can be improved using a logistic sigmoid activation function instead of the hyperbolic tangent. Moreover, we propose a novel transformative adaptive activation function that improves the gene expression inference even further and which generalizes several existing adaptive activation functions. Our improved neural network achieves an average mean absolute error of 0.1340, which is a significant improvement over our reimplementation of the original D-GEX, which achieves an average mean absolute error of 0.1637.
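The idea of an adaptive activation that generalizes fixed ones can be sketched as follows. The parameterisation used here, alpha * f(beta * x + gamma) + delta around an inner function f, is an assumption for illustration and is not stated in the abstract; in a network the four parameters would be learned per neuron, whereas here they are plain arguments. The mean absolute error helper shows the metric the abstract reports.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


def adaptive_activation(x, alpha=1.0, beta=1.0, gamma=0.0, delta=0.0,
                        inner=sigmoid):
    """Scale and shift the input and output of an inner activation.

    Assumed form: alpha * inner(beta * x + gamma) + delta.
    With the defaults this is just the inner activation.
    """
    return alpha * inner(beta * x + gamma) + delta


def mean_absolute_error(y_true, y_pred) -> float:
    """Average of |y_true - y_pred| over all entries."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 7)
    # tanh(x) = 2 * sigmoid(2x) - 1, so one parameter setting recovers tanh.
    as_tanh = adaptive_activation(x, alpha=2.0, beta=2.0, delta=-1.0)
    print(np.allclose(as_tanh, np.tanh(x)))
```

The identity used in the demo is exact, which is why a single adaptive unit of this form can represent both the sigmoid and the hyperbolic tangent.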
Figures
None.
Tweets
biorxivpreprint: On Transformative Adaptive Activation Functions in Neural Networks for Gene Expression Inference https://t.co/49gMLyjFFb #bioRxiv
biorxiv_bioinfo: On Transformative Adaptive Activation Functions in Neural Networks for Gene Expression Inference https://t.co/F8LLSH0a1S #biorxiv_bioinfo
razoralign: On Transformative Adaptive Activation Functions in Neural Networks for Gene Expression Inference https://t.co/H95iZobPa0
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 2
Total Words: 6837
Unique Words: 2034

2.01 Mikeys
#6. MethylCal: Bayesian calibration of methylation levels
Eguzkine Ochoa, Verena Zuber, Nora Fernandez-Jimenez, Jose Ramon Bilbao, Graeme R. Clark, Eamonn R. Maher, Leonardo Bottolo
Bisulfite amplicon sequencing has become the primary choice for single-base methylation quantification of multiple targets in parallel. The main limitation of this technology is a preferential amplification of an allele and strand in the PCR due to methylation state. This effect, known as "PCR bias", causes inaccurate estimation of the methylation levels and calibration methods based on standard controls have been proposed to correct for it. Here, we present a Bayesian calibration tool, MethylCal, which can analyse jointly all CpGs within a DMR or CpG island, avoiding "one-at-a-time" CpG calibration. This enables more precise modeling of the methylation levels observed in the standard controls. It also provides accurate predictions of the methylation levels not considered in the controlled experiment, a feature that is paramount in the derivation of the corrected methylation degree. We tested the proposed method on eight independent assays (two CpG islands and six imprinting DMRs) and demonstrated its benefits, including the...
Figures
None.
Tweets
biorxivpreprint: MethylCal: Bayesian calibration of methylation levels https://t.co/YXqL2NFuhI #bioRxiv
biorxiv_bioinfo: MethylCal: Bayesian calibration of methylation levels https://t.co/MtiZXOMeMy #biorxiv_bioinfo
hectorleoh: RT @biorxiv_bioinfo: MethylCal: Bayesian calibration of methylation levels https://t.co/MtiZXOMeMy #biorxiv_bioinfo
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 7
Total Words: 10710
Unique Words: 2622

2.009 Mikeys
#7. Self-Organizing Map Methodology for Sorting Differential Expression Data of MMP-9 Inhibition
Rachel St Clair, Michael Teti, Ania Knapinska, Gregg Fields, William Hahn, Elan Barenholtz
An unsupervised machine-learning model, based on a self-organizing map (SOM), was employed to extract suggested target genes from DESeq2 differential expression analysis data. Such methodology was tested on matrix metalloproteinase 9 (MMP-9) inhibitors. The model generated information on several novel gene hits that may be regulated by MMP-9, suggesting the self-organizing map method may serve as a useful analytic tool in degradomics research for further differential expression data analysis. Original data was generated from a previous study, which consisted of quantitative measures in changes of levels of gene expression from 32,000 genes in four different conditions of stimulated T-cells treated with an MMP-9 inhibitor. Since intracellular targets of MMP-9 are not yet well characterized, the functional enrichment analysis program, WebGestalt, was used for validation of the SOM identified regulated genes. The proposed data analysis method indicated MMP-9's prominent role in biological regulatory and metabolic processes as major...
Figures
Tweets
biorxivpreprint: Self-Organizing Map Methodology for Sorting Differential Expression Data of MMP-9 Inhibition https://t.co/SQ07Gtc43G #bioRxiv
biorxiv_bioinfo: Self-Organizing Map Methodology for Sorting Differential Expression Data of MMP-9 Inhibition https://t.co/113UIdT5EA #biorxiv_bioinfo
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 6
Total Words: 2304
Unique Words: 973

2.008 Mikeys
#8. A generic multivariate framework for the integration of microbiome longitudinal studies with other data types
Antoine Bodein, Olivier Chapleur, Arnaud Droit, Kim-Anh Le Cao
Simultaneous profiling of biospecimens using different technological platforms enables the study of many data types, encompassing microbial communities, omics and meta-omics as well as clinical or chemistry variables. Reduction in costs now enables longitudinal or time course studies on the same biological material or system. The overall aim of such studies is to investigate relationships between these longitudinal measures in a holistic manner to further decipher the link between molecular mechanisms and microbial community structures, or host-microbiota interactions. However, analytical frameworks enabling an integrated analysis between microbial communities and other types of biological, clinical or phenotypic data are still in their infancy. The challenges include few time points that may be unevenly spaced and unmatched between different data types, a small number of unique individual biospecimens and high individual variability. Those challenges are further exacerbated by the inherent characteristics of microbial...
Figures
Tweets
biorxivpreprint: A generic multivariate framework for the integration of microbiome longitudinal studies with other data types https://t.co/tPzKNnsp6j #bioRxiv
biorxiv_bioinfo: A generic multivariate framework for the integration of microbiome longitudinal studies with other data types https://t.co/oguxVn7wE2 #biorxiv_bioinfo
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 9336
Unique Words: 2746

2.006 Mikeys
#9. Prediction of inter-residue contacts with DeepMetaPSICOV in CASP13
Shaun Mathew Kandathil, Joe G Greener, David T Jones
In this article, we describe our efforts in contact prediction in the CASP13 experiment. We employed a new deep learning-based contact prediction tool, DeepMetaPSICOV (or DMP for short), together with new methods and data sources for alignment generation. DMP evolved from MetaPSICOV and DeepCov and combines the input feature sets used by these methods as input to a deep, fully convolutional residual neural network. We also improved our method for multiple sequence alignment generation and included metagenomic sequences in the search. We discuss successes and failures of our approach and identify areas where further improvements may be possible. DMP is freely available at: https://github.com/psipred/DeepMetaPSICOV.
Figures
Tweets
razoralign: Prediction of inter-residue contacts with DeepMetaPSICOV in CASP13 https://t.co/RBINDVoEXm
Github

Deep ResNet-based protein contact prediction

Repository: DeepMetaPSICOV
User: psipred
Language: C
Stargazers: 2
Subscribers: 2
Forks: 0
Open Issues: 0
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 3
Total Words: 6221
Unique Words: 2057

2.004 Mikeys
#10. Automated reconstruction of all gene histories in large bacterial pangenome datasets and search for co-evolved gene modules with Pantagruel
Florent Lassalle, Xavier Didelot, Elita Jauneikaite, Philippe Veber
The availability of bacterial pangenome data grows exponentially, requiring efficient new methods of analysis. Currently popular approaches for the fast comparison of genomes have the drawback of not being based on explicit evolutionary models of diversification. Making sense of bacterial genome evolution, notably in the accessory genome, however requires taking into account the complex processes by which the genomes evolve. Here we present the Pantagruel bioinformatic software pipeline, which enables the construction of a complete bacterial pangenome database geared towards the inference of gene evolution scenarios using gene tree/species tree reconciliation. Pantagruel is a modular pipeline that combines state-of-the-art external software with unique new methods. It can be executed with no supervision to perform a standard pangenome analysis, or be configured by advanced users to integrate methods of choice. A relational database underlies its data structure, allowing efficient retrieval of the large-scale data generated by...
Figures
Tweets
Github

a pipeline for reconciliation of phylogenetic histories within a bacterial pangenome

Repository: pantagruel
User: flass
Language: Python
Stargazers: 5
Subscribers: 3
Forks: 0
Open Issues: 1
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 6176
Unique Words: 2028

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 100,376 papers.
