Amino acid sequences of the annotated transcripts are provided in one file per
species, with the assembly accession as the file name. Transcripts were 
annotated with CESAR2.0 using the human (hg38) genome as a reference, the
120way-alignment and human ENSEMBL96 annotations. If more than one transcript
of a gene had exactly the same annotated coding sequence, only the sequence
for one of these transcripts is provided.

FASTA headers have the form ENSEMBL-TRANSCRIPT-ID|GENE-NAME; the gene name is
included only if it was available from BioMart. Amino acid sequences are given
in one-letter code. Special characters:

X: frame-shifting insertion or deletion
*: stop codon
?: no aligned sequence, assembly gaps in the sequence, or
   a low sequencing quality score
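
The header layout and special characters described above could be read with
a short script along the following lines. This is only a sketch, not part of
the data release: the function names, the transcript IDs and the gene name in
the example are made up, and it assumes that a header without a gene name
consists of the transcript ID alone.

```python
from collections import Counter

# Special characters as defined in this README.
SPECIAL = {"X": "frameshift", "*": "stop codon", "?": "missing/low quality"}


def parse_header(header):
    """Split '>TRANSCRIPT-ID|GENE-NAME' into (transcript_id, gene_name).

    gene_name is None when the header has no '|' part (assumed layout for
    transcripts without a gene name).
    """
    header = header.lstrip(">").strip()
    transcript_id, _, gene_name = header.partition("|")
    return transcript_id, (gene_name or None)


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_special(sequence):
    """Count each special character in one amino acid sequence."""
    counts = Counter(sequence)
    return {char: counts[char] for char in SPECIAL}


# Made-up records for illustration; real files are named by assembly accession.
example = """>ENST00000000001|GENE1
MKT?XLL
AAG*
>ENST00000000002
MPP??
""".splitlines()

records = [
    (parse_header(header), sequence, count_special(sequence))
    for header, sequence in read_fasta(example)
]
for (transcript_id, gene_name), sequence, special in records:
    print(transcript_id, gene_name, len(sequence), special)
```

To run this on a real file, pass an open file handle to read_fasta in place
of the example lines. Note that X has the meaning given above (frame-shifting
insertion or deletion), not the generic "unknown amino acid" that many tools
assume.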