This section describes the input data formats and output formats of TransAtlasDB.
TransAtlasDB accepts input data from the software tools used for differential expression analysis.
The current prototype accepts outputs from the Tuxedo Suite (TopHat and Cufflinks).
The sample information, or sample metadata, is the reference point for the corresponding RNA-seq results and is therefore important for archiving and retrieving the various transcriptome analyses. TransAtlasDB preferably accepts sample information through the FAANG sample submission spreadsheet template for BioSamples. Details are provided here.
The FAANG sample submission spreadsheet template provides a detailed questionnaire for each sample, so the database system was modeled to accept the FAANG Excel template. Only the Animal and Specimen sheets are required, with the Animal-‘Sample Name’, Animal-‘Organism’, Specimen-‘Sample Name’ and Specimen-‘Organism Part’ columns filled in. In addition, the Specimen-‘Derived from’ value of each sample should match the corresponding Animal-‘Sample Name’.
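The required-field rules above can be checked before a spreadsheet is submitted. The following is a minimal sketch, not part of TransAtlasDB: it assumes each sheet has already been loaded into a list of dictionaries keyed by column title, and the function name `validate_sheets` is hypothetical.

```python
# Hypothetical pre-submission check; not TransAtlasDB code.
# Each sheet is assumed to be a list of dicts keyed by the column titles.
REQUIRED = {
    "Animal": ["Sample Name", "Organism"],
    "Specimen": ["Sample Name", "Organism Part"],
}


def _cell(row, column):
    """Return a cell as a stripped string, treating missing or None as empty."""
    return str(row.get(column) or "").strip()


def validate_sheets(sheets):
    """Return a list of problems; an empty list means the sheets pass."""
    problems = []
    for sheet, columns in REQUIRED.items():
        rows = sheets.get(sheet)
        if not rows:
            problems.append(f"{sheet}: sheet is missing or empty")
            continue
        for i, row in enumerate(rows, start=1):
            for column in columns:
                if not _cell(row, column):
                    problems.append(f"{sheet} row {i}: '{column}' is empty")

    # Every Specimen 'Derived from' must name an existing Animal 'Sample Name'.
    animal_names = {_cell(r, "Sample Name") for r in sheets.get("Animal") or []}
    for i, row in enumerate(sheets.get("Specimen") or [], start=1):
        parent = _cell(row, "Derived from")
        if not parent or parent not in animal_names:
            problems.append(
                f"Specimen row {i}: 'Derived from' ('{parent}') "
                "matches no Animal 'Sample Name'"
            )
    return problems


# Invented example data for illustration only.
sheets = {
    "Animal": [{"Sample Name": "A1", "Organism": "Gallus gallus"}],
    "Specimen": [
        {"Sample Name": "S1", "Organism Part": "liver", "Derived from": "A1"},
        {"Sample Name": "S2", "Organism Part": "", "Derived from": "A9"},
    ],
}
problems = validate_sheets(sheets)
for problem in problems:
    print(problem)
```

In this invented example the second Specimen row is reported twice: once for the empty ‘Organism Part’ and once because ‘A9’ does not appear in the Animal sheet.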
||Sample identification number|
||Description of sample|
||Animal identification number|
||Name of tissue|
||Person's first name|
||Person's middle initial|
||Person's last name|
TransAtlasDB outputs user-defined queries as a tab-delimited table.
This table is the default output format and can be opened in most text editors and in spreadsheet or statistics tools such as Microsoft Excel, R and JMP.
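Because the output is a plain tab-delimited table, it can also be read programmatically. The sketch below uses Python's standard `csv` module and assumes the table starts with a header row; the column names (`sample`, `gene`, `fpkm`) and values are invented for illustration, since the real columns depend on the query.

```python
import csv
import io


def read_tab_table(handle):
    """Parse a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Invented example content; a real export would be opened with open(path, newline="").
example = "sample\tgene\tfpkm\nS1\tGAPDH\t152.4\nS2\tGAPDH\t98.1\n"
rows = read_tab_table(io.StringIO(example))

# All values arrive as strings, so numeric columns need explicit conversion.
mean_fpkm = sum(float(row["fpkm"]) for row in rows) / len(rows)
print(rows[0]["sample"], mean_fpkm)
```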
Aside from the tab-delimited format for exporting results, the variant information can be generated as a VCF output using the
Predicted functional annotations and sample metadata are added to the INFO field of the VCF file under the keys “CSQ” and “MTD”, respectively.
Within each key, data fields are separated by “|”; the order of the fields is given in the VCF header.
VCFs produced by TransAtlasDB follow the standard VCF version 4 file format, and can be used for further downstream analysis or visualization using various variant viewers such as the University of California Santa Cruz (UCSC) Genome Browser, JBrowse, Integrative Genomics Viewer (IGV), and other programs that accept VCF files.
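The “CSQ” and “MTD” entries in the INFO field can be decoded by pairing the “|”-separated values with the field order from the header. The sketch below is an illustration only. It assumes the header lists the field names after the text `Format: ` inside the INFO description, a convention used by some annotation tools; the exact header wording written by TransAtlasDB is not shown here and may differ. The function names and all example values are hypothetical.

```python
import re


def field_order(header_lines, key):
    """Find the '|'-separated field names for an INFO key in the VCF header.

    Assumes the description ends with 'Format: name1|name2|...'.
    """
    for line in header_lines:
        if line.startswith(f"##INFO=<ID={key},"):
            match = re.search(r'Format: ([^"]+)"', line)
            if match:
                return match.group(1).strip().split("|")
    return []


def parse_info(info, header_lines, keys=("CSQ", "MTD")):
    """Decode the requested INFO keys into lists of {field: value} dicts."""
    decoded = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key in keys:
            names = field_order(header_lines, key)
            # VCF separates multiple entries under one key with ','.
            decoded[key] = [
                dict(zip(names, item.split("|"))) for item in value.split(",")
            ]
    return decoded


# Invented header and INFO column for illustration only.
header = [
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Annotation. Format: Consequence|Gene">',
    '##INFO=<ID=MTD,Number=.,Type=String,Description="Metadata. Format: Sample|Tissue">',
]
info = "DP=20;CSQ=missense_variant|TP53;MTD=S1|liver"
decoded = parse_info(info, header)
print(decoded["CSQ"][0]["Gene"], decoded["MTD"][0]["Tissue"])
```

Reading the field order from the header rather than hard-coding it keeps the parser valid if the set or order of fields changes between exports.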
Please click the menu items to navigate through this repository. If you have questions, comments or bug reports, please email me directly.
Thank you very much for your help and support!