Methods
(async, static) convertMatrix(data, group_byopt, batch_sizeopt, genomeopt, tcgaopt) → {Promise.<Object>}
Converts input mutational data into a mutational spectrum matrix grouped by a specified field.
This function processes raw mutational data, extracts trinucleotide contexts, and aggregates
mutational spectra for each group. It supports both TCGA and non-TCGA input formats and processes
mutations in batches so that large datasets stay manageable. The resulting mutational spectrum matrix is formatted for downstream visualization
and analysis.
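As a rough illustration of what "extracting a trinucleotide context" can look like, the sketch below builds a key such as "A[C>A]A" from a three-base context and an alternate allele. The helper names `reverseComplement` and `contextKey` are hypothetical, and the pyrimidine normalisation follows the common SBS-96 convention; it is an assumption that convertMatrix behaves the same way internally.

```javascript
// Hypothetical helpers, not part of the documented API.
const COMPLEMENT = { A: "T", C: "G", G: "C", T: "A" };

// Reverse-complement a DNA string, e.g. "TGT" -> "ACA".
function reverseComplement(seq) {
  return seq.split("").reverse().map((base) => COMPLEMENT[base]).join("");
}

// Build an SBS-96 style key from a trinucleotide (reference base in the
// middle) and the alternate allele.
function contextKey(trinucleotide, alt) {
  let context = String(trinucleotide).toUpperCase();
  let altBase = String(alt).toUpperCase();
  if (!/^[ACGT]{3}$/.test(context) || !/^[ACGT]$/.test(altBase)) {
    throw new Error(`Unsupported context or allele: ${trinucleotide}, ${alt}`);
  }
  if (context[1] === altBase) {
    throw new Error("Reference and alternate alleles are identical");
  }
  // Assumption: purine references (A/G) are flipped to the opposite strand
  // so that the reference base is always a pyrimidine (C/T).
  if (context[1] === "A" || context[1] === "G") {
    context = reverseComplement(context);
    altBase = COMPLEMENT[altBase];
  }
  const [five, ref, three] = context;
  return `${five}[${ref}>${altBase}]${three}`;
}

console.log(contextKey("ACA", "A")); // A[C>A]A
console.log(contextKey("TGT", "T")); // A[C>A]A (purine reference flipped)
```

With this convention a G>A change and its complementary C>T change land in the same bin, which is how 96 channels can cover every single-base substitution.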
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
data | Array.<Array.<Object>> | | | Array of patient-level mutational data. Each patient's data is represented as an array of objects, where each object contains mutational details (e.g., chromosome, position, mutation type). |
group_by | string | <optional> | "Center" | Field to group data by (e.g., "Center" or "sample_id"). This field should exist in the input data. |
batch_size | number | <optional> | 100 | Number of mutations to process in parallel batches. Adjust this for memory management. |
genome | string | <optional> | "hg19" | Reference genome build. Defaults to "hg19" unless specified in the data. |
tcga | boolean | <optional> | false | Flag indicating whether the input data is in TCGA format. If true, expects TCGA-specific fields. |
Returns:
- A promise resolving to an object where each key is a group (e.g., "CNIC"),
and each value is a mutational spectrum object. The mutational spectrum object contains trinucleotide contexts as keys (e.g., "A[C>A]A") and counts as values.
- Type
- Promise.<Object>
Example
// Example input data
const data = [
[
{ chromosome: "1", start_position: "12345", reference_allele: "C", tumor_seq_allele2: "T", variant_type: "SNP", build: "hg19", Center: "CNIC" },
{ chromosome: "2", start_position: "67890", reference_allele: "G", tumor_seq_allele2: "A", variant_type: "SNP", build: "hg19", Center: "CNIC" }
],
[
{ chromosome: "3", start_position: "101112", reference_allele: "T", tumor_seq_allele2: "C", variant_type: "SNP", build: "hg19", Center: "OtherCenter" }
]
];
// Convert data to mutational spectra grouped by Center
const mutationalSpectra = await convertMatrix(data, "Center", 50, "hg19", false);
console.log(mutationalSpectra);
// Output (illustrative shape only; actual keys and counts depend on the input):
// {
// "CNIC": {
// "A[C>A]A": 9,
// "A[C>A]C": 7,
// "A[C>A]G": 6,
// ...
// },
// "OtherCenter": {
// "T[T>C]A": 15,
// "T[T>C]T": 8,
// ...
// }
// }
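The grouping and batching described for convertMatrix can be pictured with the minimal sketch below. It is not the library's implementation: `resolveContext` is a hypothetical stand-in for the per-mutation context lookup, `aggregateByGroup` is an illustrative name, and each demo record is assumed to already carry a `context` field.

```javascript
// Hypothetical stand-in for an asynchronous per-mutation lookup.
// Assumption: the context key is already attached to the record.
async function resolveContext(mutation) {
  return mutation.context;
}

// Count context keys per group, resolving mutations in batches.
async function aggregateByGroup(patients, groupBy = "Center", batchSize = 100) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const mutations = patients.flat();
  const spectra = {};
  for (let i = 0; i < mutations.length; i += batchSize) {
    const batch = mutations.slice(i, i + batchSize);
    // Only one batch of lookups is in flight at a time.
    const contexts = await Promise.all(batch.map(resolveContext));
    batch.forEach((mutation, j) => {
      const group = mutation[groupBy];
      if (!spectra[group]) spectra[group] = {};
      spectra[group][contexts[j]] = (spectra[group][contexts[j]] || 0) + 1;
    });
  }
  return spectra;
}

const demo = [
  [
    { Center: "CNIC", context: "A[C>T]G" },
    { Center: "CNIC", context: "A[C>T]G" },
  ],
  [{ Center: "OtherCenter", context: "T[T>C]A" }],
];

aggregateByGroup(demo, "Center", 2).then((spectra) => console.log(spectra));
// resolves to { CNIC: { "A[C>T]G": 2 }, OtherCenter: { "T[T>C]A": 1 } }
```

A smaller batch size keeps fewer lookups pending at once and takes more sequential rounds, which is the memory trade-off the batch_size parameter refers to.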
(static) convertMutationalSpectraIntoJSON(MAFfiles, mutSpec, sample_name, dataTypeopt) → {Array}
Converts mutational spectra into JSON objects for downstream processing or storage.
This function takes mutation annotation format (MAF) files, mutational spectra data,
and sample names, and outputs JSON objects representing the spectra in a structured format.
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
MAFfiles | Array | | | An array of arrays containing mutation annotation data for each sample. Each inner array represents a sample's mutation data, where each entry is an object with key-value pairs. |
mutSpec | Object | | | An object representing the mutational spectra. Keys are patient identifiers, and values are objects with mutation types as keys and counts as values. |
sample_name | string | | | The key in MAFfiles to be used as the sample identifier. |
dataType | string | <optional> | "WGS" | The sequencing strategy, e.g., "WGS" (Whole Genome Sequencing) or "WES" (Whole Exome Sequencing). |
Throws:
- If the number of MAF files and the number of mutational spectra do not match.
- Type
- Error
Returns:
- An array of arrays, where each inner array holds the JSON objects for one patient's mutational spectrum.
- Type
- Array
Examples
// Example input:
const MAFfiles = [
[{ sample_id: "Sample1", chromosome: "1", start_position: "12345", ... }],
[{ sample_id: "Sample2", chromosome: "2", start_position: "67890", ... }]
];
const mutSpec = {
Patient1: { "A[C>A]A": 10, "A[C>A]C": 5, "A[C>A]G": 8, ... },
Patient2: { "A[C>A]A": 7, "A[C>A]C": 4, "A[C>A]G": 6, ... }
};
};
const sample_name = "sample_id";
const result = convertMutationalSpectraIntoJSON(MAFfiles, mutSpec, sample_name, "WES");
console.log(result);
// Example output:
[
[
{ sample: "Sample1", strategy: "WES", profile: "SBS", matrix: 96, mutationType: "C>A", mutations: 10 },
{ sample: "Sample1", strategy: "WES", profile: "SBS", matrix: 96, mutationType: "C>G", mutations: 5 },
...
],
[
{ sample: "Sample2", strategy: "WES", profile: "SBS", matrix: 96, mutationType: "C>A", mutations: 7 },
{ sample: "Sample2", strategy: "WES", profile: "SBS", matrix: 96, mutationType: "C>G", mutations: 4 },
...
]
];
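For readers who want to see the flattening step in isolation, here is a minimal sketch under stated assumptions. `spectraToRecords` is a hypothetical name, the sample identifier is assumed to come from the first record of each MAF, and spectrum keys are copied into `mutationType` unchanged; the real function may label mutation types differently.

```javascript
// Hypothetical sketch, not the library's implementation.
function spectraToRecords(mafFiles, mutSpec, sampleKey, dataType = "WGS") {
  const patients = Object.keys(mutSpec);
  // Mirrors the documented error condition: the counts must match.
  if (mafFiles.length !== patients.length) {
    throw new Error(
      `Expected ${patients.length} MAF files, received ${mafFiles.length}`
    );
  }
  return patients.map((patient, i) => {
    // Assumption: the first record of each MAF carries the sample identifier.
    const sample = mafFiles[i][0]?.[sampleKey];
    return Object.entries(mutSpec[patient]).map(([mutationType, mutations]) => ({
      sample,
      strategy: dataType,
      profile: "SBS",
      matrix: 96,
      mutationType, // copied from the spectrum key as-is
      mutations,
    }));
  });
}

const records = spectraToRecords(
  [[{ sample_id: "Sample1" }], [{ sample_id: "Sample2" }]],
  { Patient1: { "A[C>A]A": 10, "A[C>A]C": 5 }, Patient2: { "A[C>A]A": 7 } },
  "sample_id",
  "WES"
);
console.log(records[0][0].sample, records[0][0].mutations); // Sample1 10
```

Pairing MAFs and spectra by position is why the length check matters: a mismatch would silently attach counts to the wrong sample.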
(async, static) convertWGStoPanel(WgMAFs, panelDf) → {Array}
Converts Whole Genome Sequencing (WGS) mutation data into panel data by downsampling based on a BED file.
This function takes WGS mutation annotation files (MAFs) and a BED file defining panel regions, then filters
the WGS data to include only mutations within the specified panel regions.
Parameters:
Name | Type | Description |
---|---|---|
WgMAFs | Array | An array of WGS mutation data, where each element is an array representing mutations for a single sample. Each mutation record should be an object with fields such as `chromosome`, `start_position`, etc. |
panelDf | string \| Array | A BED file defining the regions of the panel or an array of arrays representing the panel regions. If a string is provided, it is treated as a file path to a BED file and read into memory. |
Returns:
- A promise (the function is async) resolving to an array of downsampled MAFs, where each element corresponds to a sample from the input WGS data,
filtered to include only mutations within the panel regions.
- Type
- Array
Example
// Example input:
const WgMAFs = [
[
{ chromosome: "1", start_position: 12345, Hugo_Symbol: "TP53", ... },
{ chromosome: "2", start_position: 67890, Hugo_Symbol: "BRCA1", ... },
...
],
[
{ chromosome: "1", start_position: 54321, Hugo_Symbol: "KRAS", ... },
{ chromosome: "2", start_position: 98765, Hugo_Symbol: "EGFR", ... },
...
],
];
const panelDf = "panel_regions.bed"; // Path to a BED file.
// Convert WGS data to panel data
const panelMAFs = await convertWGStoPanel(WgMAFs, panelDf);
// Example output:
// panelMAFs = [
// [
// { chromosome: "1", start_position: 12345, Hugo_Symbol: "TP53", ... },
// ...
// ],
// [
// { chromosome: "2", start_position: 98765, Hugo_Symbol: "EGFR", ... },
// ...
// ]
// ];
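The region filtering behind this kind of downsampling can be sketched as follows. The names `parseBed`, `normalizeChrom`, and `filterToPanel` are hypothetical, the sketch takes BED text rather than a file path to stay self-contained, and it assumes 1-based `start_position` values checked against 0-based, half-open BED intervals. Stripping a leading "chr" when comparing chromosome names is also an assumption.

```javascript
// Hypothetical helpers, not part of the documented API.

// Parse BED text into [chrom, start, end] triples, skipping header lines.
function parseBed(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && !/^(#|track|browser)/.test(line))
    .map((line) => {
      const [chrom, start, end] = line.split(/\s+/);
      return [chrom, Number(start), Number(end)];
    });
}

// Assumption: "chr1" and "1" refer to the same chromosome.
const normalizeChrom = (chrom) => String(chrom).replace(/^chr/i, "");

// Keep only mutations whose position falls inside a panel region.
function filterToPanel(mafs, regions) {
  return mafs.map((maf) =>
    maf.filter((mutation) => {
      const chrom = normalizeChrom(mutation.chromosome);
      const pos = Number(mutation.start_position);
      // BED start is 0-based and end is exclusive, so a 1-based position
      // is inside the region when start < pos <= end.
      return regions.some(
        ([rChrom, start, end]) =>
          normalizeChrom(rChrom) === chrom && pos > start && pos <= end
      );
    })
  );
}

const regions = parseBed("chr1\t12000\t13000\nchr2\t98000\t99000");
const panel = filterToPanel(
  [
    [
      { chromosome: "1", start_position: 12345 },
      { chromosome: "2", start_position: 67890 },
    ],
    [
      { chromosome: "1", start_position: 54321 },
      { chromosome: "2", start_position: 98765 },
    ],
  ],
  regions
);
console.log(panel.map((maf) => maf.length)); // [ 1, 1 ]
```

The linear scan over regions is fine for small panels; for large BED files an interval index per chromosome would be a more scalable choice.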