Data Science and Engineering @ DCEG

epiSphere

Portable Data Science Applications for Cancer Precision Prevention. For open positions, see also the pdf. Prospective internship candidates are typically given a test project, which is then discussed in the selection interview.

Connect

Cancer Precision Prevention places an increasing focus on data-intensive platforms that can be reached, and engaged with, as consumer-facing digital applications. Ultimately, the emergence of a Learning Health Care System is driven by computational systems that orchestrate both medical records and consumer-facing services, from wearable sensors to genomics. A new generation of cohort studies, such as NCI/DCEG Connect, is being designed accordingly.

Confluence

BigData designates the computational aggregation of large volumes of diverse data and diverse analytical environments in order to enable comprehensive integrative analysis. Even more than the logistical challenges, BigData typically has to navigate complex governance and compliance landscapes, a task that can only be accomplished in Cloud Computing environments. Confluence is an international initiative aggregating data on 300k controls and 300k breast cancer cases.

FAIR Data Platform

The data platform developed for Confluence is being abstracted into a distributed FAIR data platform for cohort studies.

Commons

The ubiquitous technology stack here is the identification of novel algorithms and the design of Web Applications backed by Cloud-hosted APIs. EpiSphere seeks to integrate a multitude of health data streams, generated and consumed in real time, with the goal of contextualizing individual observations against reference BigData. This process defines the API ecosystems of Epidemiology Data Commons.

Digital Pathology (patterns)

epiPath, imageBox, Active Learning (in press)

Time series

Mortality tracker - J. Bioinformatics PMID:33135727

Wearables

MutationSignature (bioinformatics)

under development

Code

Open-source code repositories at github.com/episphere.

Who, Where

EpiSphere is a software engineering research project of the Data Science Group at the Division of Cancer Epidemiology and Genetics (DCEG) of the National Cancer Institute (NIH.NCI).