Research

We develop informatics methods for high-throughput analysis of measurement data from next-generation sequencers and mass spectrometers. The volume of electronic data measured in biology has grown steadily in recent years, so that even processing large biological datasets in a standard way is already a challenge. Moreover, integrating data of different dimensions and finding relationships that have traditionally been difficult to model requires applying the latest advances in big data analysis and machine learning (data science). We are conducting the following research.

(1) Epitranscriptome Analysis Using Nanopore Sequencers

Growing understanding of how RNA is chemically modified after transcription, and of how these modifications are involved in fundamental processes such as splicing, export, translation, and phase transitions, has formed a new field of research called epitranscriptomics. Because conventional methods cannot distinguish multiple RNA modifications simultaneously and have problems with accuracy and sensitivity, analysis of RNA modifications using nanopore sequencers has been attempted. In nanopore sequencing, an RNA molecule passes directly through a nanoscale hole, and the electric current through the hole is measured as it changes with the resistance of the molecule; in principle, RNA modifications can be detected in addition to the RNA base sequence. However, analyzing large amounts of complex nanopore signal data requires combining deep learning with large-scale parallel processing, and combining cloud computing with GPUs is one solution.
Part of our algorithm is published in the following paper.

Ueda, H. nanoDoc: RNA modification detection using Nanopore raw reads with Deep One-Class Classification.
bioRxiv 2020.09.13.295089 (2020) doi:10.1101/2020.09.13.295089.

GitHub

(2) Information Analysis for Cancer Genomics

The ability to comprehensively detect somatic mutations in cancer with next-generation sequencers has made it possible to use sequencing not only in research but also in clinical applications. However, cancer samples often have low tumor cell purity (tumor percentage), which makes analysis difficult. We have developed algorithms that accurately estimate somatic mutations, copy number variations, and tumor purity in cancer samples, even from noisy data, and these algorithms are used in research at several organizations, including RCAST.

karkinos download
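
To illustrate why low tumor purity makes mutation analysis difficult, the following is a minimal sketch of the textbook relationship between tumor purity and the expected variant allele frequency (VAF) of a clonal somatic mutation. It is not the karkinos algorithm; the function names and parameters (`expected_vaf`, `purity_from_vaf`, `mutant_copies`, `tumor_copy_number`, `normal_copy_number`) are hypothetical and chosen only for this example, and the model assumes a mutation present in every tumor cell and absent from normal cells.

```python
def expected_vaf(purity, mutant_copies=1, tumor_copy_number=2, normal_copy_number=2):
    """Expected VAF of a clonal somatic mutation in a tumor/normal mixture.

    Assumption (illustrative model only): every tumor cell carries
    `mutant_copies` mutated alleles out of `tumor_copy_number` total alleles,
    and normal cells carry `normal_copy_number` unmutated alleles.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    mutant_alleles = purity * mutant_copies
    total_alleles = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return mutant_alleles / total_alleles


def purity_from_vaf(vaf, mutant_copies=1, tumor_copy_number=2, normal_copy_number=2):
    """Invert expected_vaf: estimate purity from an observed VAF."""
    denominator = mutant_copies - vaf * (tumor_copy_number - normal_copy_number)
    if denominator <= 0:
        raise ValueError("VAF is not consistent with the given copy numbers")
    return vaf * normal_copy_number / denominator


if __name__ == "__main__":
    # A heterozygous mutation in a diploid region at several purities.
    for purity in (0.8, 0.4, 0.1):
        print(f"purity={purity:.1f} -> expected VAF={expected_vaf(purity):.3f}")
```

Under this model a heterozygous mutation in a diploid region has an expected VAF of half the purity, so at low purity the signal approaches the level of sequencing noise, and copy number changes alter the relationship further. This is one reason mutations, copy number, and purity are usually estimated together.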

The University of Tokyo Oncopanel also uses our laboratory’s programs.

Clinical Research Support Center, The University of Tokyo Hospital

(3) Bioinformatics Using Data Science

To find biological meaning and relationships in large amounts of genomic data, the data must be aggregated and distributed on a large scale. In anticipation of future large-scale cloud computing operations, we are developing a platform for analyzing biological information using standard cloud-based distributed technologies such as Hadoop/Spark and deep learning libraries.

VoltMR download

(4) Others

Through collaborative research, we are conducting a wide range of bioinformatics research, particularly on HLA-binding neoantigens and the analysis of single-cell RNA data.

Laboratory for Systems Biology and Medicine,
Research Center for Advanced Science and Technology

〒153-8904
Building 4, 121, Komaba Research Campus,
4-6-1 Komaba, Meguro-ku, Tokyo

Copyright © Biological Data Science