Biological Data Science
November 9 - 12, 2022

You must register for the meeting in order to submit abstracts.
After registering, you will receive a web link for abstract submission by email.
You may copy and paste your abstract from Word, Google Docs, or Notepad; abstracts are limited to ~2900 characters.

Program information: An electronic version of the program abstract book will be sent three working days prior to the first day of the meeting, and hard copies will be available for collection upon your arrival at Cold Spring Harbor Laboratory. First-night and keynote speakers are informed of their session date and time; otherwise, program information is available only upon release of the electronic version of the abstract book. We do this to maximize interactions by encouraging participants to stay for the duration of the meeting.

Please check your email for talk length, poster instructions (in-person and virtual), and how to have your poster printed at CSHL for collection upon arrival.

Abstract Status

Presenting Author

Abstract Title

Presentation

Aggarwal, Manu

Scaling persistent homology to large biological datasets

virtual poster

Aguilera, Joseph L

X marks the spot—Targeting an active chromatin domain to the X-chromosome

poster

Ahlmann-Eltze, Constantin

Differential embedding analysis of multi-condition single-cell datasets

poster

Ahmed, Omar J

khoice—Choose k-mers or MEMs based on their discriminatory power for your data

poster

Albanese, Roberto

De novo functional transcriptomics with RNA-seq and Ribo-seq

poster

ALGAZLAN, ALHANOF S

The association of telomere length with childhood stress

virtual poster

Awan, Ahmed H

Galaxy tool search—Past, present and future

poster

Ay, Ahmet

Machine learning and network analysis of the general anxiety disorder for patients in the UK Biobank database suggests novel circadian genotypes predictive of anxiety

talk

Ayub, Shanza

A multi-view non-linear latent space model to learn novel cell states from highly multiplexed imaging data

talk

Baharav, Tavor Z

A statistical reference-free genomic algorithm subsumes common workflows and enables novel discovery

talk

Bai, Yu

Image guided construction of a common coordinate framework for spatial transcriptome data

virtual poster

Baker, Daniel N

isoseq3—A complete end-to-end workflow for single-cell applications using long reads

poster

Bancarz, Iain

Djerba—Generating clinical genome interpretation reports for cancer

virtual poster

Bartom, Elizabeth T

Automated cancer cell line identification from RNA-seq data

poster

Biederstedt, Evan

Cell Annotation Platform (CAP)—A centralized platform for defining cell types for the human cell atlas and beyond

talk

Blise, Katie

Machine learning and single-cell spatial analyses predict immunotherapy response and clinical outcome in pancreatic ductal adenocarcinoma

poster

Bobo, Dean M

Hill numbers at the edge of a pandemic—Rapid SARS-CoV-2 surveillance using clinical, pooled, or wastewater sequence as a sensor for population change

poster

Bollas, Audrey E

A machine learning approach to detect somatic variants in tumor RNA-seq

poster

Bonnie, Jessica K

DandD—Utilizing “Delta delta” (Δδ) to quantify novel contributions from genomes

talk

Borsari, Beatrice

transferQTL—Expanding existing expression-QTL catalogs across human tissues by leveraging chromatin data

poster

Bredikhin, Danila

Scalable analysis of chromatin features in single cells

poster

Brown, Samuel S

Clinical irradiation dose and the tumour microenvironment—A spatial transcriptomics investigation

poster

Bzikadze, Andrey

TandemAligner—A new parameter-free framework for fast sequence alignment

poster

Capraz, Tumay

Feature selection by replicate reproducibility and non-redundancy

poster

CARLUER, Jean-Baptiste

Full epistatic maps retrieve part of missing heritability and improve phenotypic predictions

poster

Casill, Alyssa

Decoding “undruggable” drug targets using SpliceCore®, a machine learning platform for RNA therapeutics development

poster

Chao, Kuan-Hao

The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual

poster

Chavez-Fuentes, Joselyn C

Local and cloud-based analysis of spatial transcriptomics with Giotto suite

poster

Chen, Can

Teasing out missing reactions in genome-scale metabolic networks through deep learning

talk

Chen, Chao B

Deep topological analysis of spatial transcriptomics for contextualization of high-resolution cellular interactions

poster

Cheng, Haoyu

An integrated algorithm for robust and cost-effective telomere-to-telomere genome assembly

talk

Chetnik, Kelsey

Accurate segmentation of airways in human lung imaging mass cytometry data using a U-Net model

poster

Cho, Hyeon Jin

Large scale stochastic modeling of heterogeneous cell populations

poster

Conceicao, Izabela M

Differential expression of long non-coding RNA isoforms in human cells treated with Metformin

poster

Corrada Bravo, Héctor

The Single Cell Hub at Genentech, or, how to manage and query data from 100M cells

poster

Cresswell-Clay, Evan

Protein contact prediction with expectation reflection

poster

de Lange, Nikola

Identification of cell-type and time point specific gene regulatory networks with DREMflow

poster

De, Subhajyoti

A computational framework for identifying host-microbiome interactions in vivo at single cell resolution

poster

Deards, Gabriel

Identifying novel neuro-immune metabolomic signatures from ROS/MAP brain samples using deconvoluted immune cell counts

poster

Demetci, Pinar

Simultaneous alignment of cells and genomic features of single-cell multi-omics datasets with co-optimal transport

poster

Devoto, Audra

Efficient metatranscriptome assembly enables recovery of novel microbial proteins

poster

Dey, Anubha

iMCLAPE—A multi-class classifier for epistatic interaction in cancer

virtual poster

Dinalankara, Wikum

Scoring chromatin aberrant potential of tumors through gene co-expression networks associated with chromatin interaction

poster

Dunn, Tim

A cloud-based pipeline for analysis of FHIR and long-read data

virtual poster

Duren, Zhana

Inference and analysis of gene regulatory networks from single cell multi-omics data

poster

Eagles, Nicholas

Benchmarking spot-level cell-type deconvolution methods using Visium immunofluorescence benchmark data on the human dorsolateral prefrontal cortex

poster

Ellington, Caleb

Sample-specific contextualized graphical models using clinical and molecular data reveal transcriptional network heterogeneity across 7000 tumors

talk

Erdogdu, Beril

Detecting differential transcript usage in complex diseases

poster

Fan, Xiao

SHINE—Protein language model based pathogenicity prediction for short inframe insertion and deletion variants

poster

Fansler, Mervin M

Analysis of a genome-wide Perturb-seq dataset enables discovery of gene-specific regulators of alternative polyadenylation

poster

Farmer, Rohit

HDStIM—An unsupervised high-dimensional approach for analyzing stimulation responses at the single cell level in mass or flow cytometry

virtual poster

Feng, Claudia

A genome-wide dependency map of trans gene regulation in human pluripotent cells

poster

Fisher, Jennifer

Divergent disease signatures for signature reversion drug repurposing reveal promising drug candidates for low-survival human cancers

poster

Gao, Jiahao

Identifying transcription factors whose binding is strongly affected by genomic variants

poster

Garma, Hailey R

Application of machine learning to identify markers of response in flow cytometry data

poster

Ge, Xijin

iDEP—An interactive, reproducible platform for analyzing RNA-Seq data

poster

Ge, Yuchen

Clustering of high-dimensional RNA-seq data to identify glioblastoma subtypes

poster

Ghosh, Rikhiya

Pathway-Gene selectoR (PaGR)—A new feature selection algorithm in genomic datasets

poster

Goecks, Jeremy

A web-based software resource for interactive analysis of multiplex tissue imaging datasets

talk

Goeke, Jonathan

Detection of m6A from direct RNA sequencing using a multiple instance learning framework

virtual poster

Guare, Lindsay A

Clinical phenotyping of endometriosis using a genotype-first approach—A data-driven strategy to complex disease subtyping

talk

Guruvayurappan, Karthik

Measuring enhancer-enhancer interactions by analysis of single-cell CRISPR perturbations

talk

Hammelman, Jennifer L

pandFX—A python framework for integrating and interpreting large-scale high-dimensional screens with transcriptome readouts

poster

Hansen, Kasper D

Inference of natural selection on epigenetic marks—Implications for gene regulation and germline mutation rates

poster

Hansen, Kasper D

Pumping the brakes on RNA velocity—Understanding and interpreting RNA velocity estimates

poster

Hao, Yuhan

Dictionary learning for integrative, multimodal, and massively scalable single-cell analysis

poster

Harbort, Christopher J

Integrating host and microbiome genetics to identify determinants of typhoid fever

virtual poster

He, Zitong

A deep learning model for predicting Alu exonization events

poster

Heinen, Tobias

DeepCellRegMap—An interpretable deep learning framework for mapping context-specific genetic effects from population-scale single-cell sequencing data

talk

Heisler, Lawrence

The OICR Genome Sequencing Informatics Analysis Service offerings

poster

Hicks, Stephanie C

Scalable identification of spatially variable genes using nearest-neighbor Gaussian processes

talk

Hinthorn, Samuel

Leveraging machine learning to predict and characterize cellular senescence in tissue and cell culture

poster

Holt, James M

Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing

poster

Hong, Spencer

Unlocking the history of the Human Genome Project through deep learning

poster

Huang, Kuan-lin

Personalized prediction of cancer risk and treatment option based on the full genome architecture

poster

Huuki-Myers, Louise A

Integrated single cell and unsupervised spatial transcriptomic analysis defines molecular anatomy of the human dorsolateral prefrontal cortex

poster

Jain, Atishay

Scalable and memory efficient segmentation of large microscopy images using graph-based neural networks

talk

Jangi, Radhika

Dynamic genetic regulation of gene expression in multiple differentiation trajectories using embryoid bodies

poster

Jansen, Camden S

Inferring transcriptional mis-regulation in T-LGL leukemia through integration of highly dimensional multi-omic data

poster

Jenike, Katharine M

Establishing a Solanum pan-genome to dissect dynamics of paralog evolution

talk

Jithesh, Puthen Veettil V

PoPGx—Pharmacogenomics analysis of population-scale genome data

virtual poster

Joshi, Suraj

Scalable inference of tumor evolution in multi-region bulk tumor sequencing using phylogenetic analysis

poster

Kalhor, Reza

Quantitative fate mapping—Reconstructing dynamics of complex progenitor fields using lineage barcoding

talk

Kamali, Kaivan

AskGalaxy—A public service for predicting analysis resource requirements

talk

Karasikov, Mikhail

Searching in nucleotide archives at Petabase scale with MetaGraph

poster

Karlsson, Elinor

Leveraging base pair mammalian constraint to understand mammalian evolution and human disease

talk

Kars, Meltem Ece

A comprehensive knowledgebase of known and predicted genetic variants associated with COVID-19 severity

poster

Kelley, David

Sequence-based modeling of single-cell ATAC-seq using convolutional neural networks

talk

KESHARWANI, RUPESH K

STRspy-ing hidden variation in forensic DNA profiles with the MinION

poster

Khunsriraksakul, Chachrit

Multi-ancestry and multi-trait genome-wide association meta-analyses inform clinical risk prediction and drug repurposing for systemic lupus erythematosus

virtual poster

Khurana, Ekta

Decoding cancer epigenome and transcriptome from cell-free DNA

poster

Kille, Bryce

From minimizers to minmers—Correcting the bias in the winnowed Jaccard estimator

poster

Koren, Sergey

A complete diploid human genome

poster

Koster, Johannes

Rapidly configurable, portable, interactive visualization of arbitrary tabular results

talk

Kovaka, Sam

Visualization and analysis of nanopore RNA and DNA signal alignments for modification detection and more with Uncalled4

poster

Kozinova, Marya

SETD2 loss in renal carcinoma cells induces expression of peptides from retained introns

poster

Kucher, Natalie

Cloud-scale training and education with Galaxy in the NHGRI Analysis, Visualization, and Informatics Lab-space (AnVIL)

poster

Kurilshikov, Alexander

Gut microbiome and plasma metabolomic markers of pregnant mothers associated with infants’ growth trajectories

virtual poster

Lac, Le An (Leann)

scGMM-VGAE—A Gaussian mixture model-based variational graph autoencoder algorithm for clustering single-cell RNA-seq data

poster

Langer, Christoph C

A visual exploration and hypothesis testing tool for 3D genomics

poster

Lariviere, Delphine

Automated Reference Genome Assembly in Galaxy in collaboration with the Vertebrate Genome Project

talk

Laszloffy, Michael

Exploring challenges in reducing sequencing analysis turnaround time

poster

Lawson, Jonathan

Aligning NIH’s existing data use restrictions to the GA4GH DUO standard

poster

Leary, Allen

TCR-VALID learns a smooth and interpretable landscape for TCR generation and clustering

talk

Lee, Danielle E

A novel protocol for phenotype prediction from cultural heritage objects using Oxford Nanopore sequencing

poster

Lee, Hayan

Familial adenomatous polyposis epigenetic landscape as a precancer model of colorectal cancer

talk

Li, Mingyuan

Topic models reveal cellular contexts and eQTLs in embryoid bodies

poster

Li, Ruoxin

Representation learning of single cell populations identifies network structures associated with differential potential of iPSCs

poster

Li, Tianxiao

Predicting allele-specific variants from the neighboring nucleotide sequence context

poster

Li, Yan Chak

Integrating multimodal data through interpretable heterogeneous ensembles

poster

Li, Yangyang

DELTA—An annotator of structural variations based on deep learning

poster

Lin, Yingxin

Atlas-scale data integration for single-cell meta analysis

talk

Linderman, Michael

Deep metric learning for structural variant genotyping in genome sequencing data

poster

Liu, Changchang

A biophysically-motivated parametric approach to predict the dose response surface of drug combinations

poster

Liu, Jinlu

Shared differential clustering across single-cell RNA sequencing datasets with the hierarchical Dirichlet process

virtual poster

Liu, Lingjie

A unified probabilistic modeling framework for eukaryotic transcription based on nascent RNA sequencing data

poster

Liu, Yuelin

Single-cell methylation sequencing data reveals succinct metastatic migration histories and tumor progression models

talk

Lohia, Ruchi

A global high-density chromatin interaction network reveals functional long-range and trans-chromosomal relationships

talk

Love, Michael

Allelic expression imbalance of cells and isoforms

poster

Ma, Mingyang

Scaling rules for pandemics—Estimating infected fraction from identified cases for the SARS-CoV-2 pandemic

poster

Malachowska, Beata E

scRNA-seq of irradiated bone marrow

poster

Markov, Nikolay

Integrative analysis of longitudinal clinical and single-cell RNA-seq data predicts outcomes in patients with severe SARS-CoV-2 pneumonia

poster

Marti-Gomez, Carlos

gpmap-tools—A python library for visualizing complex genotype-phenotype maps

poster

Masarone, Sara

A deep learning approach to integrate different data modalities and improve patient stratification in trauma

poster

Masarone, Sara

Representation learning to effectively integrate and interpret omics data

talk

McGaughey, David M

PLAE web app enables powerful searching and multiple visualizations across one million unified single-cell ocular transcriptomes

poster

Meier-Scherling, Cecile

evoModes—A modeling framework for estimating tumor evolutionary modes from single-cell RNA sequencing data

poster

Mineeva, Olga

ResMiCo—Increasing the quality of metagenome-assembled genomes with deep learning

talk

Minkin, Ilia

Quality assessment of human splice site annotation based on conservation in multiple species

poster

Mishra, Priti

Plasma lipidomic signatures of body fat depots and their link with child metabolic health

virtual poster

Mo, Ziyi

Robust supervised machine learning for population genetic inference with domain adaptation

poster

Mosher, Stephen L

A unified computing environment for genomics data storage, management, and analysis—NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (AnVIL)

poster

Moss, Matthew A

A multimodal approach for characterizing progesterone receptor based transcriptional regulation in menstrual effluent derived endometrial stromal cells

poster

Mustafa, Harun

A modular multi-label framework for aligning sequences to large read set databases and (pan)genomes

talk

Musunuri, Rajeeva

Lancet2 – Performance improvements to Lancet somatic variant calling

poster

Negi, Soumya

Visualizing scRNA-seq data at billion cell scale using MetaCellViz

talk

Ni, Bohan

Modeling the effects of rare structural variants on gene expression across multiple tissues

poster

Olivieri, Julia E

Analysis of RNA processing directly from spatial transcriptomics data reveals previously unknown regulation

poster

Osmanbeyoglu, Hatice U

STAN, a computational framework for inferring spatially informed transcription factor activity networks

virtual poster

Ospina, Oscar E

Spatial assessment of the tumor immune microenvironment in immunotherapy-treated patients with ovarian cancer

poster

Ou, Shujun

Transposable element dynamics across the Solanum pan-genome

poster

Patro, Robert

Keeping k-mers in check—Building fast, small, and composable indices based on the De Bruijn graph

talk

Patterson, Andrew

Mutated processes predict immune checkpoint inhibitor therapy benefit in metastatic melanoma

poster

Phillippy, Adam M

The human genome is finally finished—What's next?

talk

Pickard, Joshua B

Deciphering multi-way interactions in the human genome

poster

Pickett, Brandon D

A panel of complete X chromosomes from ten diverse humans and six non-human primates

poster

Pinello, Luca

SIMBA—Building interpretable regulatory maps using graph-embedding on single-cell multiomics data

poster

Pranzatelli, Thomas J

Colocalization changes in autoimmune disease of a complex tissue

poster

Prummer, Michael

Cell type-specific gene co-expression modules define tumor heterogeneity in melanoma patients

poster

Przeworski, Molly

Causes and consequences of recombination hotspots in vertebrates

talk

Qin, Ke

DNA methylation profiling based sarcoma classifier with regularized generalized linear model

poster

Rahman Hera, Mahmudur

Debiasing FracMinHash and estimating average nucleotide identity (ANI) using FracMinHash

poster

Rahman, Amatur

Uncovering hidden assembly artifacts—When unitigs are not safe and bidirected graphs are not helpful

poster

Railey, Caylyn E

A systematic exploration of antisense long non-coding RNAs in the Brassicaceae

poster

Rao, Jingyou

Computational approaches for inferring gene regulation in in situ perturbation screens

talk

Ravi, Aditya Narayan

SuffixHash—Sequence alignment via variable-length substring matching

poster

Rendeiro, Andre F

Unsupervised discovery of tissue architecture in multiplexed imaging

virtual talk

Richardson, Reese A

A rationally designed tool to promote the investigation of under-investigated genes

poster

Rigoutsos, Isidore

IsomiRs, tRNA-derived fragments, and rRNA-derived fragments—Novel types of small RNAs with intriguing dependencies and great diagnostic, prognostic, and therapeutic potential

poster

Rohbeck, Martin

iCLVM—Enhanced contrastive models by multiple nonlinear regression

poster

Roy, Sumedha

Unbiased differential abundance analysis and novel biomarker discovery using FlowSOM clustering on clinical flow cytometry data

poster

Schalkamp, Ann-Kathrin

Seeing Parkinson's disease coming

poster

Schreiber, Jacob M

The ENCODE Imputation Challenge—A critical assessment of methods for cross-cell type imputation of epigenomic profiles

poster

Schuetz, Robert

CNVpred—Classifying pathogenic copy number variants with machine learning

poster

Schweickart, Annalise

Examining the human metabolome to assess metabolic rewiring of the ketogenic diet

poster

Sen, Shurjo K

Nudging genomics into the cloud through compute credits—Experiences from the NHGRI AnVIL’s AC2 and AC3 funding programs

poster

Sergushichev, Alexey

GESECA—Multi-conditional pathway enrichment analysis for bulk and single cell transcriptomics

poster

Sheffield, Nathan C

How can I share my project metadata? Improving interoperability of biological sample metadata with microservices and APIs

poster

Shiraishi, Yuichi

Systematic identification of splice-site creating variants from massive publicly available transcriptome sequencing data

poster

Shivakumar, Vikram

Signal-level adaptive sampling using HMMs and matching statistics

poster

Sick, Beate

Interpretable deep ensemble models for stroke patient outcome prediction

poster

Singh, Amartya

Uncovering intra-tumoral heterogeneity and mechanisms of response to treatment using single-cell biclustering

poster

Singh, Noor P

Tree Terminus—Creating transcript trees using inferential replicate counts

poster

Singh, Noor P

Tree-based differential testing using inferential replicate counts for RNASeq

poster

Smith, Shaleigh A

SpliceSlice™—Biomarker discovery and patient stratification using a splicing-centric approach

poster

Sommer, Markus J

Structure-guided isoform analysis for the human transcriptome

talk

Soupir, Alex C

High abundance and low spatial clustering of tumor infiltrating lymphocytes associate with survival of high grade serous ovarian cancer

poster

Stark, Stefan G

Learning single-cell perturbation responses using neural optimal transport

talk

Starostik, Margaret R

A machine learning approach to prioritize functionally relevant endogenous mRNA targets of piRNAs in C. elegans

poster

Stastny, Tiana H

Using machine learning to identify the best CRISPR-Cas9 targets for functional gene knockout

poster

Stoeger, Thomas

A data-driven and proven approach for unknomics to benefit junior investigators

poster

Subong, Bryan John J

Mutation analysis and protein-protein interaction analysis of nsp7 and nsp8 of SARS-CoV-2

virtual poster

Suderman, Keith

Benchmarking popular bioinformatics tools for the cloud

poster

Sullivan, Delaney K

Flexible preprocessing and reformatting of sequencing data

poster

Sweeten, Alexander

Mod.Plot—A rapid and interactive visualization of tandem repeats

poster

Tareen, Ammar

MAVE-NN 2—Flexible quantitative modeling for multiplex assays of variant effect

poster

Tate, Alex

tinyRNA—Precision analysis of small RNA-seq data with user-defined hierarchical selection rules

poster

Thompson, Jacqueline R

Single-cell RNA sequencing and machine learning reveal novel mechanisms driving activity-dependent neuronal maturation in the dentate gyrus

poster

Tio, Earvin

Exploration of the biopsychosocial mechanisms of suicidal ideation using a whole-person approach

poster

Turner, Tychele N

Precision genomics as a key component for the future of precision medicine

talk

Vock, Isaac W

bakR—Uncovering the kinetics of regulated gene expression with nucleotide recoding RNA-seq and Bayesian hierarchical modeling

poster

Vohringer, Harald

Single cell canonical correlation analysis—A generative probabilistic framework to integrate CITE-seq data

poster

Wagner, Justin

Active evaluation of a draft genome in a bottle benchmark for chromosomes X and Y

virtual poster

Wang, Zishan

Master regulators of protein abundance across six cancer types

poster

Warris, Sven

Genomics and high-throughput phenotyping

poster

Wright, Adam J

Evaluating the predictive accuracy of Reactome's curated biological pathways

poster

Xu, Feihong

An investigation of clinical states from EHR data

poster

Xue, Albert

DOTEARS—Causal structure learning using interventional data

talk

Yao, Sijie

SmartImpute—A targeted imputation framework for single-cell transcriptome data

poster

Yao, Yuelin

Model-free estimation of driver interactions across cancers

poster

Yeaton, Anna

DifferentialNeighborhoodEnrichment, a statistical method to model enrichments in cell-type specific neighborhoods between groups

poster

Yuan, Alex

Rigorous and general statistical tests for correlations between time series

poster

Yunusov, Dinar

Accurate counting of multi-mapping reads substantially improves single-cell RNA-seq gene quantification

talk

Zhang, Huizi

Bayesian modelling of transcriptional dynamics

virtual poster