lens, align.

Long is the time, but the true comes to pass.

Quadrilateral.

2018-06-03 03:03:03 | Science News


□ Edward Witten / "Notes on Some Entanglement Properties of Quantum Field Theory":

>> https://arxiv.org/pdf/1803.04993.pdf

The main goal is to explain how to deal with entanglement when -- as in quantum field theory -- it is a property of the algebra of observables and not just of the states.

The infinite-dimensional case becomes essentially different from a finite-dimensional matrix algebra when one considers the behavior of Δ_Ψ^{is} when s is no longer real. For a matrix algebra, there is no problem; Δ_Ψ^{iz} = exp(iz log Δ_Ψ) is an entire matrix-valued function of z. In quantum field theory, Δ_Ψ is unbounded and the analytic properties of Δ_Ψ^{iz} χ for a state χ depend very much on χ.
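For orientation, here is the finite-dimensional statement in formulas (a standard reminder, not a quotation from the notes): if H = H_A ⊗ H_B and Ψ has reduced density matrices ρ_A and ρ_B, then

\Delta_\Psi = \rho_A \otimes \rho_B^{-1},
\qquad
\Delta_\Psi^{iz} = e^{iz \log \Delta_\Psi} = \rho_A^{iz} \otimes \rho_B^{-iz},

so each factor is the exponential of a finite matrix and Δ_Ψ^{iz} is entire in z. In quantum field theory the algebra does not tensor-factorize this way and Δ_Ψ is unbounded, which is why the analytic continuation becomes state-dependent.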




□ Edward Witten / "A Mini-Introduction To [Quantum] Information Theory":

>> https://arxiv.org/pdf/1805.11965.pdf

Basic properties of the classical Shannon entropy and the quantum von Neumann entropy are described, along with related concepts such as classical and quantum relative entropy, conditional entropy, and mutual information.



How many bits of information can Alice send to Bob by sending a quantum system X with a k-dimensional Hilbert space H? Alice cannot encode more than log k bits of classical information in a k-dimensional quantum state, though it takes strong subadditivity (or an equivalent) to prove this.
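A quick numerical illustration of the log k bound (a minimal R sketch, not from the paper; the entropy is computed in nats and divided by log 2 to get bits):

von_neumann_entropy <- function(rho) {
  p <- eigen(rho, symmetric = TRUE, only.values = TRUE)$values
  p <- p[p > 1e-12]            # drop numerically zero eigenvalues
  -sum(p * log(p))             # S(rho) = -Tr(rho log rho), in nats
}

k <- 4
rho <- diag(rep(1 / k, k))     # maximally mixed state on a k-dimensional Hilbert space
von_neumann_entropy(rho) / log(2)   # = log2(k) = 2 bits, the maximum Alice can encode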




□ Stochastic Zeroth-order Optimization via Variance Reduction method:

>> https://arxiv.org/pdf/1805.11811v1.pdf

The authors propose a novel Stochastic Zeroth-order method with Variance Reduction under Gaussian smoothing (SZVR-G) and establish its complexity for optimizing non-convex problems. With variance reduction over both the sample space and the search space, the complexity of the algorithm is sublinear in d and strictly better than that of current approaches, in both the smooth and non-smooth cases. SZVR-G is more efficient than both RGF and RSG on a canonical logistic-regression problem, and the authors successfully apply it to a real black-box adversarial-attack problem involving high-dimensional zeroth-order optimization.
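For context, a minimal R sketch of the Gaussian-smoothing zeroth-order gradient estimator that methods like SZVR-G build on (this is the generic estimator, not the paper's variance-reduced algorithm; f, mu and n_dirs are illustrative choices):

zo_gradient <- function(f, x, mu = 1e-3, n_dirs = 10) {
  # average of (f(x + mu*u) - f(x)) / mu * u over random Gaussian directions u
  d <- length(x)
  g <- numeric(d)
  for (i in seq_len(n_dirs)) {
    u <- rnorm(d)
    g <- g + (f(x + mu * u) - f(x)) / mu * u
  }
  g / n_dirs
}

# Example: plain zeroth-order gradient descent on a toy quadratic.
f <- function(x) sum(x^2)
x <- rep(1, 5)
for (t in 1:200) x <- x - 0.05 * zo_gradient(f, x)
f(x)   # should approach 0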




□ Root-cause Analysis for Time-series Anomalies via Spatiotemporal Graphical Modeling in Distributed Complex Systems:

>> https://arxiv.org/pdf/1805.12296v1.pdf

The authors formulate sequential state switching (S3, based on the free-energy concept of a restricted Boltzmann machine, RBM) and artificial anomaly association (A3, a classification framework using deep neural networks, DNN). The S3 and A3 approaches achieve high accuracy in root-cause analysis under both pattern-based and node-based fault scenarios, in addition to successfully handling multiple nominal operating modes.
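As a rough illustration of the quantity S3 relies on, here is the free energy of a binary RBM in R (a minimal sketch under assumed notation, not the authors' code; W, b_vis and b_hid are toy parameters):

rbm_free_energy <- function(v, W, b_vis, b_hid) {
  # v: binary visible vector; W: n_hidden x n_visible weight matrix
  hidden_term <- sum(log1p(exp(b_hid + as.vector(W %*% v))))
  -sum(b_vis * v) - hidden_term       # F(v) = -b_vis' v - sum_j log(1 + exp(b_hid_j + W_j v))
}

set.seed(1)
W <- matrix(rnorm(3 * 5, sd = 0.1), nrow = 3)     # 3 hidden, 5 visible units
rbm_free_energy(v = rbinom(5, 1, 0.5), W, b_vis = rep(0, 5), b_hid = rep(0, 3))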






□ Dimension Reduction and Visualization for Single-copy Alignments via Generalized PCA:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/04/338442.full.pdf

The method applies multiple correspondence analysis (MCA) directly to the sequence characters. p-dimensional single-copy DNA can be transformed into coordinates in genetic space, analogous to the way in which diploid DNA is transformed via PCA. The new vectors are ordered by the amount of variability explained by each 'principal dimension'. Often the first few dimensions are used to visualise points in the new transformed space.
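A minimal R sketch of the general idea (assumed workflow with a hypothetical toy alignment, not the authors' implementation): one-hot encode the aligned characters and take principal components of the indicator matrix.

seqs <- c("ACGTAC", "ACGTTC", "TCGTAC", "ACGAAC")        # hypothetical toy alignment
chars <- do.call(rbind, strsplit(seqs, ""))              # samples x alignment positions
onehot <- do.call(cbind, lapply(seq_len(ncol(chars)), function(j)
  outer(chars[, j], c("A", "C", "G", "T"), "==") * 1L))  # indicator coding per position
pc <- prcomp(onehot, center = TRUE, scale. = FALSE)      # 'principal dimensions'
pc$x[, 1:2]                                              # coordinates of each sequence in genetic space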




□ GenotypeTensors: Efficient Neural Network Genotype Callers:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/05/338780.full.pdf

Clairvoyante is described as a convolutional network and therefore also must convert alignment data to three-dimensional tensors. The authors hypothesize that as-yet-unnoticed software implementation problems in the available code bases, and/or insufficient hyper-parameter tuning for Clairvoyante, could be responsible for the observed differences in model performance, rather than these being driven only by differences in model architecture and in the representation of aligned reads in each genomic context.




□ A fast mrMLM algorithm for multi-locus genome-wide association studies:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/07/341784.full.pdf

The mrMLM algorithm is accelerated by using the GEMMA idea together with matrix transformations and identities. The target functions and derivatives in vector/matrix form for each marker scan are transformed into simple expressions that are easy and efficient to evaluate during the optimization step. All potentially associated QTNs with P-values ≤ 0.01 are then evaluated in a multi-locus model by the LARS algorithm and/or EM-Empirical Bayes.
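A minimal R sketch of that two-stage flow on simulated data (not the mrMLM/FASTmrMLM code; the single-marker scan here is a plain linear model rather than the mixed model, and the 'lars' package stands in for the multi-locus step):

library(lars)

set.seed(1)
n <- 200; m <- 500
X <- matrix(rbinom(n * m, 2, 0.3), n, m)                # toy 0/1/2 genotypes
y <- drop(X[, c(10, 50)] %*% c(0.8, -0.6)) + rnorm(n)   # two causal markers

# Stage 1: single-marker scan, keep candidates with P <= 0.01
p_vals <- apply(X, 2, function(g) summary(lm(y ~ g))$coefficients["g", 4])
keep <- which(p_vals <= 0.01)

# Stage 2: evaluate retained QTN candidates jointly with LARS
fit <- lars(X[, keep, drop = FALSE], y, type = "lar")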




□ SVCollector: Optimized sample selection for validating and long-read resequencing of structural variants:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/08/342386.full.pdf

In topN mode, SVCollector picks the samples with the largest number of SVs, irrespective of whether those SVs are shared with other samples. In greedy mode, it finds a set of samples that collectively contain the largest number of distinct variants. The authors assessed the performance of SVCollector on 4,424 human genomes from the Center for Common Disease Genetics (CCDG) freeze 1 dataset, comprising 425,500 SVs identified with SURVIVOR.
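The two selection modes are easy to sketch in R over a binary sample-by-SV matrix (toy data, not the SVCollector implementation):

top_n_mode <- function(M, n) {
  order(rowSums(M), decreasing = TRUE)[seq_len(n)]     # samples with the most SVs overall
}

greedy_mode <- function(M, n) {
  chosen <- integer(0)
  covered <- rep(FALSE, ncol(M))
  for (i in seq_len(n)) {
    gain <- rowSums(M[, !covered, drop = FALSE])       # new (still uncovered) SVs per sample
    gain[chosen] <- -1                                  # never re-pick a sample
    best <- which.max(gain)
    chosen <- c(chosen, best)
    covered <- covered | (M[best, ] == 1)
  }
  chosen
}

set.seed(2)
M <- matrix(rbinom(20 * 100, 1, 0.1), nrow = 20)       # 20 samples, 100 SVs
top_n_mode(M, 5); greedy_mode(M, 5)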




□ An Implementation of Empirical Bayesian Inference and Non-Null Bootstrapping for Threshold Selection and Power Estimation in Multiple and Single Statistical Testing:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/08/342964.full.pdf

The new implementation eliminates the need for parameter tuning (in particular by using AIK for GMD fitting) and allows the method to be used in a broader range of conditions. Importantly, the statistical power is explicitly estimated and made available for inference. The authors use an implementation of EBI based on non-parametric test statistics, Gaussian mixture models and null bootstrapping. This implementation readily handles one-sample, two-sample and correlation problems in multi-dimensional data with arbitrary distributions.
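A minimal R sketch of the empirical-Bayes two-group idea (an assumed illustration, not the authors' implementation; the theoretical null is used here in place of a bootstrapped null, and pi0 is fixed by hand):

set.seed(3)
z <- c(rnorm(900), rnorm(100, mean = 3))     # toy test statistics, 10% non-null
dens <- density(z)                           # estimate of the marginal density f(z)
f0 <- dnorm(dens$x)                          # null density; a bootstrap/permutation null could replace it
pi0 <- 0.9                                   # null proportion, assumed known for the sketch
lfdr <- approx(dens$x, pmin(pi0 * f0 / dens$y, 1), xout = z)$y
selected <- which(lfdr < 0.2)                # threshold chosen for illustration only
selected                                     # mostly indices > 900, i.e. the simulated non-null block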




□ Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data:

>> https://link.springer.com/article/10.1007%2Fs00438-018-1454-1

Co-fuse can compare two or more groups to identify significantly over-represented recurrent fusion genes associated with a particular group, using a combination of pattern mining and statistical analysis. The Recursive Partitioning and Regression Trees (rpart) algorithm is used to further prioritise the recurrent fusion genes obtained from the two-group comparison.
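A minimal R sketch of the prioritisation step with rpart (hypothetical fusion indicators and group labels, not the Co-fuse data or code):

library(rpart)

set.seed(4)
fusions <- data.frame(
  group      = factor(rep(c("A", "B"), each = 50)),
  BCR_ABL1   = rbinom(100, 1, rep(c(0.6, 0.1), each = 50)),   # hypothetical fusion present/absent
  ETV6_RUNX1 = rbinom(100, 1, rep(c(0.1, 0.5), each = 50))
)
fit <- rpart(group ~ ., data = fusions, method = "class")
fit$variable.importance        # fusions ranked by how well they separate the two groups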




□ DataPackageR: Reproducible data preprocessing, standardization and sharing using R/Bioconductor for collaborative data analysis:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/08/342907.full.pdf

The principle behind the tool is that it remains a lightweight and non-intrusive framework that easily plugs into most R-based data analytic workflows. It places few restrictions on the user code, so most existing scripts can be ported to use the package. It also builds boilerplate roxygen documentation for the R objects specified in the .yml, computes checksums of stored R objects and version-tags the entire data set collection.

# Excerpt from DataPackageR; is_r_package here comes from the rprojroot package.
yml_find <- function(path) {
  path <- normalizePath(path)
  config_yml <- is_r_package$find_file("datapackager.yml", path = path)
  if (!file.exists(config_yml)) {
    stop("Can't find a datapackager.yml config at ",
         dirname(config_yml),
         call. = FALSE)
  }
  # (remainder of the function is omitted in this excerpt)
}






□ GLnexus: joint variant calling for large cohort sequencing:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/11/343970.full.pdf

Joint calling with the cohort sharded across compute nodes will enable GLnexus to scale up to any N foreseeable with the gVCF/pVCF data model for short-read sequencing. There is a standalone open-source version of GLnexus as well as a DNAnexus cloud-native deployment supporting very large projects, which has been employed for cohorts of >240,000 exomes and >22,000 whole genomes.






□ Reconciling Multiple Genes Trees via Segmental Duplications and Losses:

>> https://arxiv.org/pdf/1806.03988v1.pdf

The problem is polynomial-time solvable when δ ≤ λ (via LCA-mapping), while if δ > λ the problem is NP-hard, even when λ = 0 and a single gene tree is given, settling a long-standing open problem on the complexity of the reconciliation problem. On the positive side, the authors give a fixed-parameter algorithm for the problem, where the parameters are δ/λ and the number d of segmental duplications, with time complexity O(⌈δ/λ⌉^d · n · δ/λ).






□ φ-evo: A program to evolve phenotypic models of biological networks:

>> http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006244

The authors illustrate the predictive power of φ-evo by first recovering the asymmetrical structure of the lac operon regulation from an objective function with symmetrical constraints. Simulations are run in a deterministic mode, and both Euler and Runge-Kutta integrators are available in the program. An option to run the equations in a stochastic mode using a τ-leaping algorithm (a biochemical numerical generalization of the Langevin equation) is also included.
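A minimal R sketch of the τ-leaping idea for a toy one-species birth-death network (not φ-evo's implementation; k, g and tau are illustrative):

tau_leap <- function(x0, k = 10, g = 0.1, tau = 0.1, n_steps = 1000) {
  x <- numeric(n_steps + 1); x[1] <- x0
  for (t in seq_len(n_steps)) {
    births <- rpois(1, k * tau)                # Poisson number of firings per reaction channel
    deaths <- rpois(1, g * x[t] * tau)
    x[t + 1] <- max(x[t] + births - deaths, 0)
  }
  x
}

traj <- tau_leap(x0 = 0)
mean(tail(traj, 500))    # fluctuates around the deterministic steady state k / g = 100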




□ SSCC: a computational framework for rapid and accurate clustering of large-scale single cell RNA-seq data:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/11/344242.full.pdf

Simpler single-cell RNA-seq data clustering (sscClust) is a package implementing multiple functionalities that are basic procedures in single-cell RNA-seq data analysis, including variable-gene identification, dimension reduction, and clustering on the reduced data. The few positive ΔNMI values were attributed to poor clustering accuracy when using all cells, most of which were related to the k-medoids algorithm. To eliminate the influence of the choice of clustering algorithm altogether, the authors added an oracle-clustering algorithm.
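The basic pipeline reads roughly like this in R (an assumed sketch on toy data, not the sscClust code; cluster::pam stands in for the k-medoids step):

library(cluster)

set.seed(5)
expr <- matrix(rnorm(300 * 1000), nrow = 300)          # 300 cells x 1000 genes, toy data
expr[1:150, 1:20] <- expr[1:150, 1:20] + 2             # two artificial cell groups

vargenes <- order(apply(expr, 2, var), decreasing = TRUE)[1:100]   # variable-gene identification
reduced  <- prcomp(expr[, vargenes], center = TRUE)$x[, 1:10]      # dimension reduction
clusters <- pam(reduced, k = 2)$clustering                         # k-medoids on reduced data
table(clusters, rep(1:2, each = 150))                              # recovers the two groups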






□ BALLI: Bartlett-Adjusted Likelihood-based LInear Model Approach for Identifying Differentially Expressed Gene with RNA-seq Data:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/12/344929.full.pdf




□ DIvERGE, which precisely mutates long genomic segments up to 1,000,000 times faster than non-targeted regions:

>> http://www.pnas.org/content/early/2018/05/30/1801646115

Directed evolution of multiple genomic loci allows the prediction of antibiotic resistance.




□ Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/13/345876.full.pdf

The recovered reads contain more genetic variants than the previously aligned reads, indicating that divergence between the personal genome and the reference genome plays a role in the false-negative non-alignment problem. The majority of genes with a >1-fold change in expression after recovery belong to the pseudogene category, indicating that pseudogene expression can be substantially affected by the false-negative non-alignment problem.






□ Matchmaker Exchange now connects seven genomic matchmakers and two knowledge sources.

>> http://www.matchmakerexchange.org/i_am_a_clinician_laboratory.html

Have a candidate gene? Enter your case into one of the connected databases which allows you to query the Matchmaker Exchange network for a match.







□ AQ-seq: Accurate quantification of microRNAs and their variants:

>> https://www.biorxiv.org/content/biorxiv/early/2018/06/05/339606.full.pdf

AQ-seq diminishes the ligation bias of sRNA-seq. AQ-seq detects miRNAs of low abundance and reliably defines the terminal sequences of miRNAs that go undetected with the conventional sRNA-seq method. AQ-seq incorporates RNA spike-in controls consisting of 30 exogenous RNA molecules. Use of the spike-ins allows the authors to monitor ligation bias and detection sensitivity.




□ Common Disease Is More Complex Than Implied by the Core Gene Omnigenic Model:

>> https://www.cell.com/cell/fulltext/S0092-8674(18)30714-1






□ RAMODO: Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection:

>> https://arxiv.org/pdf/1806.04808.pdf

RAMODO learns a representation function f(·) to map D-dimensional input objects into an M-dimensional space, with M ≪ D. RAMODO unifies representation learning and outlier detection to learn a small set of features tailored for random distance-based detectors. Experiments on eight real-world ultrahigh-dimensional data sets show that REPEN, the instantiation of RAMODO, enables a random distance-based detector to obtain significantly better AUC performance with a two-order-of-magnitude speedup, and that it leverages less than 1% labeled data to achieve up to 32% AUC improvement.
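For context, a minimal R sketch of the random distance-based scoring that RAMODO/REPEN learns representations for (an assumed LeSiNN-style nearest-neighbour-to-a-random-subsample score on toy data, not the paper's code):

random_distance_score <- function(X, subsample = 8, n_ensembles = 50) {
  n <- nrow(X)
  total <- numeric(n)
  for (e in seq_len(n_ensembles)) {
    idx <- sample(n, subsample)                       # small random subsample
    for (i in seq_len(n)) {
      d <- sqrt(colSums((t(X[idx, , drop = FALSE]) - X[i, ])^2))
      d <- d[idx != i]                                # ignore distance to itself
      total[i] <- total[i] + min(d)                   # nearest-neighbour distance within the subsample
    }
  }
  total / n_ensembles                                 # averaged over the ensemble
}

set.seed(6)
X <- rbind(matrix(rnorm(200 * 10), ncol = 10),        # 200 inliers
           matrix(rnorm(5 * 10, mean = 4), ncol = 10))  # 5 outliers
scores <- random_distance_score(X)
order(scores, decreasing = TRUE)[1:5]                 # should surface rows 201-205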




