YAP1--MAML2 per ticket 1618 and pbta-histologies.tsv has been updated to include newly subtyped samples.
fusion_filtering resulted in a minor update to pbta-fusion-recurrent-fusion-bysample.tsv and pbta-fusion-recurrently-fused-genes-byhistology.tsv to correctly capture EWSR1::FLI1 recurrent fusions/genes (see PR review here).
data
└── release-v23-20230115
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── consensus_seg_with_status.tsv
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── fusion_summary_lgat_foi.tsv
├── independent-specimens.rnaseq.primary-plus-polya.tsv
├── independent-specimens.rnaseq.primary-plus-stranded.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_WXS.bed
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_strelka_mutect_WGS.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies-base.tsv
├── pbta-histologies.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mb-pathology-subtypes.tsv
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutation-tmb-all.tsv
├── pbta-snv-mutation-tmb-coding.tsv
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-scavenged-hotspots.maf.tsv.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
consensus_seg_with_status.tsv (from focal-cn-prep) was added to the release since it is being used in the tp53-nf1-score module.
pbta-histologies-base.tsv and pbta-histologies.tsv were updated to reflect the previous code changes to molecular-subtyping-MB 1207 and molecular-subtyping-LGAT, which were not previously propagated to the files.
OS_days and tumor_descriptor are documented in this V22-release PR comment.
data
└── release-v22-20220505
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── consensus_seg_with_status.tsv
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── fusion_summary_lgat_foi.tsv
├── independent-specimens.rnaseq.primary-plus-polya.tsv
├── independent-specimens.rnaseq.primary-plus-stranded.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-histologies-base.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-hotspot-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-scavenged-hotspots.maf.tsv.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
data
└── release-v21-20210820
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── fusion_summary_lgat_foi.tsv
├── independent-specimens.rnaseq.primary-plus-polya.tsv
├── independent-specimens.rnaseq.primary-plus-stranded.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-histologies-base.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-hotspot-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-scavenged-hotspots.maf.tsv.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
data
└── release-v20-20210726
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── fusion_summary_lgat_foi.tsv
├── independent-specimens.rnaseq.primary-plus-polya.tsv
├── independent-specimens.rnaseq.primary-plus-stranded.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-histologies-base.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-hotspot-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-scavenged-hotspots.maf.tsv.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
pbta-mb-pathology-subtypes.tsv containing pathology report molecular subtypes for medulloblastoma to be used as a “truth” set for medulloblastoma classifier testing per #746
data
└── release-v19-20210423
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── fusion_summary_lgat_foi.tsv
├── independent-specimens.rnaseq.primary-plus-polya.tsv
├── independent-specimens.rnaseq.primary-plus-stranded.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-histologies-base.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-scavenged-hotspots.maf.tsv.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
pbta-histologies-base.tsv file, which contains neither molecular_subtype nor integrated_diagnosis and will be used for molecular subtyping modules in conjunction with data releases
pathology_diagnosis to CBTN pathology_diagnosis and move original diagnosis to pathology_free_text_diagnosis
cohort to “CBTN”
data
└── release-v18-20201123
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── fusion_summary_lgat_foi.tsv
├── independent-specimens.rnaseq.primary-plus-polya.tsv
├── independent-specimens.rnaseq.primary-plus-stranded.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-histologies-base.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
aliquot_id, parent_aliquot_id, reported_gender, race, ethnicity, tumor_descriptor, primary_site, age_at_diagnosis_days, pathology_diagnosis, integrated_diagnosis, short_histology, broad_histology, Notes, OS_days, OS_status, age_last_update_days, seq_center, cancer_predispositions, molecular_subtype
integrated_diagnosis, Notes, short_histology and broad_histology: If the sample was subtyped, values were updated based on the new subtyping; if the pathology_diagnosis changed between v16 and v17, all values were set to NA and will be updated in the next release; all remaining samples were set to the values from v16.
glioma_brain_region to CNS_region, updated logic and entities, and assigned regions for all tumors
pathology_free_text_diagnosis and cohort_participant_id
BS_QB84TBA3 and BS_S2TA8R29 (previously NA)
BS_MVYA262V sample_id 7316-3213 was changed from A05265 to A05233
pbta-snv-consensus-mutation-tmb-coding.tsv post bug fix, etc. per #727 and #728
data
└── release-v17-20200908
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── tcga-snv-consensus-snv.maf.tsv.gz
├── tcga-snv-mutation-tmb-all.tsv
├── tcga-snv-mutation-tmb-coding.tsv
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
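The v17 note on integrated_diagnosis, Notes, short_histology and broad_histology describes a three-way rule. The following is a minimal Python sketch of that rule for a single sample, not the code used to build the release. The function name, the dictionary shape, the use of None for NA, and the order in which the cases are checked are all assumptions made for illustration.

```python
# Columns the release note lists as governed by this rule.
RULE_COLUMNS = ("integrated_diagnosis", "Notes", "short_histology", "broad_histology")


def resolve_v17_values(v16_row, v17_pathology_diagnosis, subtyped_values=None):
    """Return v17 values for RULE_COLUMNS for a single sample.

    v16_row: dict of the sample's v16 histology values (hypothetical shape).
    v17_pathology_diagnosis: the sample's pathology_diagnosis in v17.
    subtyped_values: dict of new values if the sample was subtyped, else None.
    None in the result stands in for NA.
    """
    # Case 1: the sample was subtyped, so the new subtyping wins.
    # (Checking this case first is an assumption about precedence.)
    if subtyped_values is not None:
        return {col: subtyped_values.get(col) for col in RULE_COLUMNS}
    # Case 2: pathology_diagnosis changed between v16 and v17, so set NA.
    if v17_pathology_diagnosis != v16_row.get("pathology_diagnosis"):
        return {col: None for col in RULE_COLUMNS}
    # Case 3: every remaining sample carries its v16 values forward.
    return {col: v16_row.get(col) for col in RULE_COLUMNS}


# Illustrative, made-up sample (not taken from the release files).
v16_example = {
    "pathology_diagnosis": "Ependymoma",
    "integrated_diagnosis": "Ependymoma",
    "Notes": None,
    "short_histology": "Ependymoma",
    "broad_histology": "Ependymal tumor",
}
unchanged = resolve_v17_values(v16_example, "Ependymoma")
changed = resolve_v17_values(v16_example, "Medulloblastoma")
```

Here `unchanged` keeps the v16 values, while `changed` has every governed column set to None until a later release fills it in.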
## archived release
gencode.v19.basic.exome.hg38liftover.100bp_padded.bed
pbta-fusion-putative-oncogenic.tsv
pbta-fusion-recurrently-fused-genes-byhistology.tsv
pbta-fusion-recurrently-fused-genes-bysample.tsv
analyses/fusion-summary
fusion_summary_ewings_foi.tsv
data
└── release-v16-20200320
├── consensus_seg_annotated_cn_autosomes.tsv.gz
├── consensus_seg_annotated_cn_x_and_y.tsv.gz
├── data-files-description.md
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── fusion_summary_ewings_foi.tsv
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── md5sum.txt
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus-gistic.zip
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-fusion-starfusion.tsv.gz
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-histologies.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-mend-qc-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── pbta-sv-manta.tsv.gz
├── pbta-tcga-manifest.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── release-notes.md
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── intersected_whole_exome_agilent_designed_120_AND_tcga_6k_genes.Gh38.bed
├── intersected_whole_exome_agilent_plus_tcga_6k_AND_tcga_6k_genes.Gh38.bed
├── tcga_6k_genes.targetIntervals.Gh38.bed
├── tcga_6k_genes.targetIntervals.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.Gh38.bed
├── whole_exome_agilent_1.1_refseq_plus_3_boosters.targetIntervals.bed
├── whole_exome_agilent_designed_120.targetIntervals.Gh38.bed
├── whole_exome_agilent_designed_120.targetIntervals.bed
├── whole_exome_agilent_plus_tcga_6k.targetIntervals.Gh38.bed
└── whole_exome_agilent_plus_tcga_6k.targetIntervals.bed
pbta-histologies.tsv to:
data
└── release-v15-20200228
├── release-notes.md
├── data-files-description.md
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── md5sum.txt
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-consensus-gistic.zip
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-starfusion.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-histologies.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-sv-manta.tsv.gz
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-mend-qc-manifest.tsv
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet.bed
├── intersect_strelka_mutect_WGS.bed
├── fusion_summary_embryonal_foi.tsv
├── fusion_summary_ependymoma_foi.tsv
├── gencode.v19.basic.exome.hg38liftover.100bp_padded.bed
├── consensus_seg_annotated_cn_autosomes.tsv.gz
└── consensus_seg_annotated_cn_x_and_y.tsv.gz
- `analyses/fusion-summary` to include all RNA biospecimens in new `pbta-histologies.tsv` file without fusion calls. Files updated:
- `cnv_consensus.tsv` file per this comment.
- `exon` to `cds` per #440 and this comment:
- `pbta-histologies.tsv` to add embryonal `molecular_subtypes` per #251 using results here and high-grade glioma (HGG) `molecular_subtypes` per #249 and this commit.
  - `glioma_brain_region` (if sample was not previously classified as glioma)
  - `Notes` (to concatenate old notes for `disease_type_new` and `molecular_subtype` changes and current changes based on OpenPBTA subtyping)
  - `disease_type_new`
    - `DMG` and rest were “High-grade glioma”. (Previously, DMGs were both HGG and DIPG, but following WHO 2016 nomenclature, we stick with DMG instead of DIPG now that we have subtypes).
    - `molecular_subtype` contains `ETMR`, these became `Embryonal tumor with multilayer rosettes` while the rest became `CNS Embryonal Tumor`
  - `short_histology`
    - `HGAT`
    - `molecular_subtype` contains `ETMR`, these became `ETMR` while the rest became `Embryonal Tumor`
  - `broad_histology`
    - `Diffuse astrocytic and oligodendroglial tumor`
    - `Embryonal Tumor`
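The embryonal relabeling rule listed above can be sketched in Python. This is an illustration only, not the project's actual code: the function name `relabel_embryonal` and the example subtype strings are hypothetical, "contains" is read as a plain substring match, and assigning `Embryonal Tumor` as `broad_histology` to every embryonal sample is an assumption drawn from the layout of the list.

```python
def relabel_embryonal(molecular_subtype):
    """Return the three histology labels for one embryonal sample.

    Rule from the release notes: if molecular_subtype contains "ETMR",
    use the ETMR labels; otherwise use the generic embryonal labels.
    A missing subtype (None) is treated as "does not contain ETMR" (assumption).
    """
    is_etmr = "ETMR" in (molecular_subtype or "")
    return {
        "disease_type_new": (
            "Embryonal tumor with multilayer rosettes"
            if is_etmr
            else "CNS Embryonal Tumor"
        ),
        "short_histology": "ETMR" if is_etmr else "Embryonal Tumor",
        # Assumed to apply to all embryonal samples.
        "broad_histology": "Embryonal Tumor",
    }


if __name__ == "__main__":
    # Hypothetical subtype strings, used only to exercise both branches.
    for subtype in ["ETMR", "hypothetical-other-subtype", None]:
        print(subtype, "->", relabel_embryonal(subtype))
```

The HGG rule is left out of the sketch because the condition that separates `DMG` from the remaining high-grade gliomas is not stated in the text above.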
data
└── release-v14-20200203
├── release-notes.md
├── data-files-description.md
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── md5sum.txt
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-consensus.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-cnv-consensus-gistic.zip
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-starfusion.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-histologies.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-sv-manta.tsv.gz
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-mend-qc-manifest.tsv
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── intersect_cds_lancet_strelka_mutect_WGS.bed
├── intersect_cds_lancet.bed
├── intersect_strelka_mutect_WGS.bed
├── fusion_summary_embryonal_foi.tsv
└── fusion_summary_ependymoma_foi.tsv
- `analyses/fusion-summary`, subsetted for RNA biospecimens in new `pbta-histologies.tsv` file. Files added:
- `cnv_consensus.tsv` file from #128.
- `WXS.hg38.lancet.400bp_padded.bed` file.
- `pbta-histologies.tsv` to remove RNA-Seq samples listed above, propagate medulloblastoma `molecular_subtypes` per #379, harmonize “Diagnosis” to “Initial CNS Tumor”, fix PNOC003 `seq_center` for RNA-Seq samples to “TGEN”, harmonize `ethnicity` to match CBTTC data in KidsFirst:

| Old ethnicity | New ethnicity |
|---|---|
| Non-hispanic | Not Hispanic or Latino |
| Unknown | Unavailable/Not Reported |
| Not Reported | Unavailable/Not Reported |
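The ethnicity harmonization in the table above amounts to a value lookup. A minimal Python sketch follows; the names `ETHNICITY_MAP` and `harmonize_ethnicity` are hypothetical, and passing through values that are not in the table is an assumption, since the notes do not say how other values were handled.

```python
# Old value -> harmonized value, copied from the table above.
ETHNICITY_MAP = {
    "Non-hispanic": "Not Hispanic or Latino",
    "Unknown": "Unavailable/Not Reported",
    "Not Reported": "Unavailable/Not Reported",
}


def harmonize_ethnicity(value):
    """Return the harmonized label; unlisted values pass through unchanged (assumption)."""
    return ETHNICITY_MAP.get(value, value)


if __name__ == "__main__":
    for old in ["Non-hispanic", "Unknown", "Not Reported", "Not Hispanic or Latino"]:
        print(f"{old} -> {harmonize_ethnicity(old)}")
```

Keeping the mapping in one dictionary means the lookup and the table stay easy to compare line by line.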
data
└── release-v13-20200116
├── release-notes.md
├── data-files-description.md
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── WXS.hg38.lancet.400bp_padded.bed
├── md5sum.txt
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-cnv-cnvkit-gistic.zip
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-starfusion.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-histologies.tsv
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-sv-manta.tsv.gz
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
├── pbta-fusion-recurrently-fused-genes-bysample.tsv
├── pbta-tcga-snv-lancet.vep.maf.gz
├── pbta-tcga-snv-strelka2.vep.maf.gz
├── pbta-tcga-snv-mutect2.vep.maf.gz
├── pbta-tcga-manifest.tsv
├── pbta-mend-qc-results.tar.gz
├── pbta-mend-qc-manifest.tsv
├── pbta-star-log-final.tar.gz
├── pbta-star-log-manifest.tsv
├── cnv_consensus.tsv
├── intersect_exon_lancet_strelka_mutect_WGS.bed
├── intersect_exon_WXS.bed
├── intersect_strelka_mutect_WGS.bed
├── fusion_summary_embryonal_foi.tsv
└── fusion_summary_ependymoma_foi.tsv
- `data-file-descriptions.md` with data release to better track file types, origins, and workflows per #334 and #336
- `WGS.hg38.mutect2.unpadded.bed` to `WGS.hg38.mutect2.vardict.unpadded.bed` to better reflect usage
- `pbta-histologies.tsv` to add new RNA-Seq samples listed above, #222 harmonize disease separators, and reran medulloblastoma classifier using V12 RSEM fpkm collapsed files
data
└── release-v12-20191217
├── CHANGELOG.md
├── data-file-descriptions.md
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.vardict.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── md5sum.txt
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-starfusion.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-histologies.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-sv-manta.tsv.gz
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-snv-consensus-mutation.maf.tsv.gz
├── pbta-snv-consensus-mutation-tmb-all.tsv
├── pbta-snv-consensus-mutation-tmb-coding.tsv
├── pbta-fusion-recurrently-fused-genes-byhistology.tsv
└── pbta-fusion-recurrently-fused-genes-bysample.tsv
data
└── release-v11-20191126
├── CHANGELOG.md
├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
├── StrexomeLite_hg38_liftover_100bp_padded.bed
├── WGS.hg38.lancet.300bp_padded.bed
├── WGS.hg38.lancet.unpadded.bed
├── WGS.hg38.mutect2.unpadded.bed
├── WGS.hg38.strelka2.unpadded.bed
├── WGS.hg38.vardict.100bp_padded.bed
├── WXS.hg38.100bp_padded.bed
├── md5sum.txt
├── pbta-cnv-cnvkit.seg.gz
├── pbta-cnv-controlfreec.tsv.gz
├── pbta-fusion-arriba.tsv.gz
├── pbta-fusion-starfusion.tsv.gz
├── pbta-fusion-putative-oncogenic.tsv
├── pbta-gene-counts-rsem-expected_count.polya.rds
├── pbta-gene-counts-rsem-expected_count.stranded.rds
├── pbta-gene-expression-kallisto.polya.rds
├── pbta-gene-expression-kallisto.stranded.rds
├── pbta-gene-expression-rsem-fpkm.polya.rds
├── pbta-gene-expression-rsem-fpkm.stranded.rds
├── pbta-histologies.tsv
├── pbta-isoform-counts-rsem-expected_count.polya.rds
├── pbta-isoform-counts-rsem-expected_count.stranded.rds
├── pbta-snv-lancet.vep.maf.gz
├── pbta-snv-mutect2.vep.maf.gz
├── pbta-snv-strelka2.vep.maf.gz
├── pbta-snv-vardict.vep.maf.gz
├── pbta-sv-manta.tsv.gz
├── independent-specimens.wgs.primary-plus.tsv
├── independent-specimens.wgs.primary.tsv
├── independent-specimens.wgswxs.primary-plus.tsv
├── independent-specimens.wgswxs.primary.tsv
├── pbta-gene-expression-rsem-fpkm-collapsed.polya.rds
├── pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
├── pbta-gene-expression-rsem-tpm.polya.rds
├── pbta-gene-expression-rsem-tpm.stranded.rds
├── pbta-isoform-expression-rsem-tpm.polya.rds
├── pbta-isoform-expression-rsem-tpm.stranded.rds
├── pbta-snv-consensus-mutation.maf.tsv.gz
└── pbta-snv-consensus-mutation-tmb.tsv
### release-v9-20191105
- release date: 2019-11-05
- status: available
- changes:
  - Updated RNA-Seq FPKM merge files
    - added geneIDs that were missed in previous release
- folder structure:
data
└── release-v9-20191105
    ├── CHANGELOG.md
    ├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
    ├── StrexomeLite_hg38_liftover_100bp_padded.bed
    ├── WGS.hg38.lancet.300bp_padded.bed
    ├── WGS.hg38.lancet.unpadded.bed
    ├── WGS.hg38.mutect2.unpadded.bed
    ├── WGS.hg38.strelka2.unpadded.bed
    ├── WGS.hg38.vardict.100bp_padded.bed
    ├── WXS.hg38.100bp_padded.bed
    ├── md5sum.txt
    ├── pbta-cnv-cnvkit.seg.gz
    ├── pbta-cnv-controlfreec.tsv.gz
    ├── pbta-fusion-arriba.tsv.gz
    ├── pbta-fusion-starfusion.tsv.gz
    ├── pbta-gene-counts-rsem-expected_count.polya.rds
    ├── pbta-gene-counts-rsem-expected_count.stranded.rds
    ├── pbta-gene-expression-kallisto.polya.rds
    ├── pbta-gene-expression-kallisto.stranded.rds
    ├── pbta-gene-expression-rsem-fpkm.polya.rds
    ├── pbta-gene-expression-rsem-fpkm.stranded.rds
    ├── pbta-histologies.tsv
    ├── pbta-isoform-counts-rsem-expected_count.polya.rds
    ├── pbta-isoform-counts-rsem-expected_count.stranded.rds
    ├── pbta-snv-lancet.vep.maf.gz
    ├── pbta-snv-mutect2.vep.maf.gz
    ├── pbta-snv-strelka2.vep.maf.gz
    ├── pbta-snv-vardict.vep.maf.gz
    ├── pbta-sv-manta.tsv.gz
    ├── independent-specimens.wgs.primary-plus.tsv
    ├── independent-specimens.wgs.primary.tsv
    ├── independent-specimens.wgswxs.primary-plus.tsv
    └── independent-specimens.wgswxs.primary.tsv
### release-v8-20191104
- release date: 2019-11-04
- status: available
- changes:
  - Updated clinical file
    - fixed error in `primary_site`: [#214](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/214)
  - Updated ControlFreeC TSV file
    - added `tumor_ploidy` and updated `genotype` to `segment_genotype`: [PR comment](https://github.com/AlexsLemonade/OpenPBTA-analysis/pull/216#discussion_r341868007)
  - Updated RNA-Seq FPKM merge files
    - fixed ID merge error: [#221](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/221)
- folder structure: same as release-v9-20191105
### release-v7-20191031
- release date: 2019-10-31 :jack_o_lantern:
- status: available
- changes:
  - update `molecular_subtype` for the clinical file
  - Add ControlFreeC CNV merged TSV file. [data format](https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/doc/format/controlfreec-tsv.md)
  - Add ControlFreeC ploidy to clinical file.
  - Add sample lists for analyses requiring unique patient specimens. [issues/155](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/155)
    - independent-specimens.wgs.primary-plus.tsv
    - independent-specimens.wgswxs.primary-plus.tsv
    - independent-specimens.wgs.primary.tsv
    - independent-specimens.wgswxs.primary.tsv
- folder structure: same as release-v9-20191105
### release-v6-20191030
- release date: 2019-10-30
- status: available
- changes:
  - Clinical file updates:
    - Missing `aliquot_id` and `sample_id` added
    - Updated `broad_composition` to `cell line` for WGS samples denoted as Cell line
    - Removed duplicate `BS_4M0ZMCDC` with wrong age at diagnosis
    - Add `cohort` column for CBTTC or PNOC003 samples
    - Tumor specimens missing `composition` were changed to `Solid Tumor`
    - Blood specimens missing `primary_site` were changed to `Peripheral Whole Blood`
    - Updated `age_at_diagnosis` to earliest age reported (same age used in OS calculations)
    - Updated `OS_days` and `OS_status` based on updated clinical data
    - Added `cancer_predispositions` information
    - Added `seq_center` (could not add `seq_instrument` at this time due to multiple entries for BS_IDs)
    - Harmonized `Diagnosis` and `Initial CNS Tumor` for `tumor_descriptor` field
    - Changed `Relapse` sample to `Progressive` (DIPG sample truly progressive, not relapse)
    - Add tumor purity derived from Theta2 (`normal_fraction` and `tumor_fraction`)
    - Add `glioma_brain_region` for low- and high-grade gliomas
  - SV:
    - Removed LUMPY data, as additional benchmarking to remove normal SVs needs to be done. We may not include this in a future release.
  - SNV:
    - Re-ran BS_7KR13R3P using targeted panel bed files; removed WXS calls from MAFs
    - Added WXS calls to all MAFs
    - Added targeted panel bed and padded bed files
  - CNV:
    - Re-ran ControlFreeC and CNVkit with optional BAF inputs; added Theta2 purity correction to CNVkit
    - Added copy number to CNVkit and removed ControlFreeC seg file
- folder structure:
data
└── release-v6-20191030
    ├── CHANGELOG.md
    ├── StrexomeLite_Targets_CrossMap_hg38_filtered_chr_prefixed.bed
    ├── StrexomeLite_hg38_liftover_100bp_padded.bed
    ├── WGS.hg38.lancet.300bp_padded.bed
    ├── WGS.hg38.lancet.unpadded.bed
    ├── WGS.hg38.mutect2.unpadded.bed
    ├── WGS.hg38.strelka2.unpadded.bed
    ├── WGS.hg38.vardict.100bp_padded.bed
    ├── WXS.hg38.100bp_padded.bed
    ├── md5sum.txt
    ├── pbta-cnv-cnvkit.seg.gz
    ├── pbta-fusion-arriba.tsv.gz
    ├── pbta-fusion-starfusion.tsv.gz
    ├── pbta-gene-counts-rsem-expected_count.polya.rds
    ├── pbta-gene-counts-rsem-expected_count.stranded.rds
    ├── pbta-gene-expression-kallisto.polya.rds
    ├── pbta-gene-expression-kallisto.stranded.rds
    ├── pbta-gene-expression-rsem-fpkm.polya.rds
    ├── pbta-gene-expression-rsem-fpkm.stranded.rds
    ├── pbta-histologies.tsv
    ├── pbta-isoform-counts-rsem-expected_count.polya.rds
    ├── pbta-isoform-counts-rsem-expected_count.stranded.rds
    ├── pbta-snv-lancet.vep.maf.gz
    ├── pbta-snv-mutect2.vep.maf.gz
    ├── pbta-snv-strelka2.vep.maf.gz
    ├── pbta-snv-vardict.vep.maf.gz
    └── pbta-sv-manta.tsv.gz
### release-v5-20190924
- release date: 2019-09-24
- status: available
- changes:
  - [Separated RNA-Seq files](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/121):
    - Created separate RDS files for stranded and polyA RNA-Seq samples
  - [new RNA-Seq counts files](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/14):
    - Added RSEM count matrices for genes and transcripts
  - [new ARRIBA file](https://github.com/AlexsLemonade/OpenPBTA-analysis/pull/92#discussion_r324873300):
    - Add `annots` column header which was removed during FusionAnnotator run
  - [new SNV files](https://github.com/AlexsLemonade/OpenPBTA-analysis/pull/114):
    - Add Lancet VEP-annotated MAF
    - Add VarDict VEP-annotated MAF
  - [new BED interval files](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/3):
    - Add WXS BED (same file used for each variant caller)
    - Add WGS BED files for each variant caller
    - Methods described [here](https://github.com/AlexsLemonade/OpenPBTA-manuscript/blob/master/content/03.methods.md).
- folder structure:
data
└── release-v5-20190924
    ├── CHANGELOG.md
    ├── md5sum.txt
    ├── WGS.hg38.lancet.300bp_padded.bed
    ├── WGS.hg38.lancet.unpadded.bed
    ├── WGS.hg38.mutect2.unpadded.bed
    ├── WGS.hg38.strelka2.unpadded.bed
    ├── WGS.hg38.vardict.100bp_padded.bed
    ├── WXS.hg38.100bp_padded.bed
    ├── pbta-cnv-cnvkit.seg.gz
    ├── pbta-cnv-controlfreec.seg.gz
    ├── pbta-fusion-arriba.tsv.gz
    ├── pbta-fusion-starfusion.tsv.gz
    ├── pbta-histologies.tsv
    ├── pbta-snv-mutect2.vep.maf.gz
    ├── pbta-snv-strelka2.vep.maf.gz
    ├── pbta-sv-lumpy.tsv.gz
    ├── pbta-sv-manta.tsv.gz
    ├── pbta-gene-expression-kallisto.polya.rds
    ├── pbta-gene-expression-kallisto.stranded.rds
    ├── pbta-gene-expression-rsem-fpkm.polya.rds
    ├── pbta-gene-expression-rsem-fpkm.stranded.rds
    ├── pbta-gene-counts-rsem-expected_count.polya.rds
    ├── pbta-gene-counts-rsem-expected_count.stranded.rds
    ├── pbta-isoform-counts-rsem-expected_count.polya.rds
    ├── pbta-isoform-counts-rsem-expected_count.stranded.rds
    ├── pbta-snv-lancet.vep.maf.gz
    └── pbta-snv-vardict.vep.maf.gz
### release-v4-20190909
- release date: 2019-09-10
- status: available
- changes:
  - [Clinical V4](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/95):
    - De-duplicated `BS_6DCSD5Y6` in the clinical file by removing row with wrong age
    - Added units to `age_at_diagnosis` column (`age_at_diagnosis_days`)
  - [new RSEM file](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/93):
    - Remove QC-failed samples from V3 RSEM.
    - Add `BS_XM1AHBDJ`, which should be included but was missing from the V3 RSEM
  - [new LUMPY file](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/97):
    - Add data for 628 more subjects that were missing from V3 LUMPY
- folder structure:
data
└── release-v4-20190909
    ├── CHANGELOG.md
    ├── md5sum.txt
    ├── pbta-cnv-cnvkit.seg.gz
    ├── pbta-cnv-controlfreec.seg.gz
    ├── pbta-fusion-arriba.tsv.gz
    ├── pbta-fusion-starfusion.tsv.gz
    ├── pbta-gene-expression-kallisto.rds
    ├── pbta-gene-expression-rsem.fpkm.rds
    ├── pbta-histologies.tsv
    ├── pbta-snv-mutect2.vep.maf.gz
    ├── pbta-snv-strelka2.vep.maf.gz
    ├── pbta-sv-lumpy.tsv.gz
    ├── pbta-sv-manta.tsv.gz
    └── README.md
## archived release
### release-v3-20190829
- release date: 2019-08-29
- status: available
- changes:
  - Arriba fusions run with https://github.com/FusionAnnotator/CTAT_HumanFusionLib/wiki#red-herrings-fusion-pairs-that-may-not-be-relevant-to-cancer-and-potential-false-positives to match STAR-fusion results
  - Clinical data sheet, added information for:
    - germline_sex_estimate
    - overall survival (days)
    - overall survival status (living/deceased)
    - RNA_library type
    - edited previous sex variable to be reported_gender
- folder structure: same as `release-v4-20190909`
### release-v2-20190809
- release date: 2019-08-09
- status: available
- changes:
  - removed QC-failed RNA-Seq samples
    - alignment rate < 60%, n=
    - wrong strandedness samples, n=
  - removed samples with tumor/normal genotype mismatch
  - removed consent-failed samples
  - clinical information updated
    - added and harmonized sample ID and diagnosis for PNOC003 samples
    - added medulloblastoma subtype information
  - CNV
    - added seg files for CNV data
    - added CNVkit results
  - SV
    - added LUMPY results
    - added annotated SV files
- folder structure:
data
└── release-v2-20190809
    ├── CHANGELOG.md
    ├── md5sum.txt
    ├── pbta-cnv-cnvkit.seg.gz
    ├── pbta-cnv-controlfreec.seg.gz
    ├── pbta-fusion-arriba.tsv.gz
    ├── pbta-fusion-starfusion.tsv.gz
    ├── pbta-gene-expression-kallisto.rds
    ├── pbta-gene-expression-rsem.fpkm.rds
    ├── pbta-histologies.tsv
    ├── pbta-snv-mutect2.vep.maf.gz
    ├── pbta-snv-strelka2.vep.maf.gz
    ├── pbta-sv-lumpy.tsv.gz
    ├── pbta-sv-manta.tsv.gz
    └── README.md
### release-v1-20190730
- release date: 2019-07-30
- status: not available, as it contains consent-failed and QC-failed samples
- changes: initial push
data
└── release-v1-20190730
    ├── arriba.fusions.tsv.gz
    ├── clinical.tsv
    ├── controlfreec.cnv.tsv.gz
    ├── kallisto.abundance.tsv.gz
    ├── kallisto.genes.list
    ├── manta-sv.tsv.gz
    ├── mutect2.maf.gz
    ├── rsem.genes.list
    ├── rsem.genes.tsv.gz
    ├── rsem.isoforms.tsv.gz
    ├── star-fusion.fusions.tsv.gz
    ├── strelka2.maf.gz
    └── tumor-normal-pair.tsv
```