Coding DNA sequences are provided in one file per species, named by the assembly accession, containing the annotated transcripts. Transcripts were annotated with CESAR2.0 using the human (hg38) genome as a reference, the 120-way alignment and human ENSEMBL96 annotations. If more than one transcript of a gene had exactly the same annotated coding sequence, the sequence is provided for only one of those transcripts. FASTA headers have the form ENSEMBL-TRANSCRIPT-ID|GENE-NAME; the gene name is included only if it was available from BioMart. Sequences are plain DNA sequences and may contain frame-shifting deletions/insertions or premature stop codons.
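
Because the sequences may contain frameshifts or premature stop codons, it can help to screen them before translating. The following Python sketch is only an illustration, not part of the annotation pipeline: it reads FASTA records, splits each header into transcript ID and optional gene name, and flags sequences whose length is not a multiple of three or that carry an in-frame stop codon before the last codon. The function names (parse_fasta, split_header, flag_cds) are hypothetical, the standard genetic code is assumed, and the reading frame is assumed to start at the first base of each sequence.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Stop codons of the standard genetic code (assumption: nuclear genes).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_header(header: str) -> Tuple[str, Optional[str]]:
    """Split 'TRANSCRIPT-ID|GENE-NAME' into its parts.

    The gene name is None when the header holds only a transcript ID.
    """
    transcript_id, _, gene_name = header.partition("|")
    return transcript_id, (gene_name or None)


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, Optional[str], str]]:
    """Yield (transcript_id, gene_name, sequence) for each FASTA record."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield split_header(header) + ("".join(chunks),)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield split_header(header) + ("".join(chunks),)


def flag_cds(seq: str) -> Dict[str, bool]:
    """Report simple integrity flags for one coding sequence.

    frame_intact:   sequence length is a multiple of three.
    premature_stop: a stop codon occurs in frame 0 before the last
                    complete codon.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    internal = codons[:-1]  # the last complete codon may be the real stop
    return {
        "frame_intact": len(seq) % 3 == 0,
        "premature_stop": any(c in STOP_CODONS for c in internal),
    }
```

To screen a file, open it and pass the file handle to parse_fasta, then call flag_cds on each sequence. Note that after a frameshift the downstream codons are read in a shifted frame, so a flagged stop codon may be a consequence of an upstream insertion or deletion rather than an independent mutation.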