If you would like us to add a new genome to CHOPCHOP, you will need to submit a genome assembly file (FASTA) and gene annotation file (GFF3) corresponding to your organism of interest. Please note that we add new genomes in batches approximately every 3 months. To ensure the successful upload of your genome, please complete the following tasks before submission:
- Ensure your gene annotation file is a valid GFF3 file. To validate your file, please run gff3ToGenePred (one of the UCSC Genome Browser command-line utilities) and ensure there are no errors. You will need to run this program in the terminal (e.g. "./gff3ToGenePred octopus.gff3 octopus.gp", then check that the octopus.gp file is populated with gene locations). If you are unsure how to do this, please ask someone (e.g. a bioinformatician) to help you.
- Check that the ID attribute in the GFF3 file corresponds to the identifier you want to use for looking up genes. For instance, in this line (ID=gene6;Name=NCU09906;gbkey=Gene;gene_biotype=protein_coding;locus_tag=NCU09906) of a GFF3 file, the gene name that will be searchable in CHOPCHOP is “gene6”. If you would prefer it to be the value in the Name attribute (“NCU09906”), please let us know.
- Ensure that the chromosome names are identical across the GFF3 and FASTA files.
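The last two checks above can also be done with a short script. The following is only a minimal Python sketch, not part of CHOPCHOP: the function names and the toy data are made up for illustration, and it assumes that a FASTA sequence name is the text after ">" up to the first whitespace. It reports chromosome names that appear in only one of the two files and lists the ID and Name attributes of each gene feature, so you can see which identifier would be searchable.

```python
import io


def fasta_names(handle):
    """Collect sequence names from FASTA headers.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gff3_summary(handle):
    """Return (seqids, genes) from a GFF3 file.

    seqids: set of values found in column 1.
    genes:  list of (ID, Name) tuples for rows whose type (column 3) is 'gene'.
    """
    seqids = set()
    genes = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; stop reading features
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature row
        seqids.add(cols[0])
        if cols[2] == "gene":
            attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
            genes.append((attrs.get("ID"), attrs.get("Name")))
    return seqids, genes


# Toy data, invented for this example. With real files, replace the StringIO
# objects with open("your_genome.fasta") and open("your_annotation.gff3").
demo_fasta = io.StringIO(">chr1 example description\nACGT\n>chr2\nGGCC\n")
demo_gff3 = io.StringIO(
    "##gff-version 3\n"
    "chr1\texample\tgene\t1\t4\t.\t+\t.\tID=gene6;Name=NCU09906;gbkey=Gene\n"
    "chrII\texample\tgene\t1\t4\t.\t-\t.\tID=gene7;Name=hypothetical_gene\n"
)

fa_names = fasta_names(demo_fasta)
gff_names, gene_ids = gff3_summary(demo_gff3)

print("Only in GFF3:", sorted(gff_names - fa_names))
print("Only in FASTA:", sorted(fa_names - gff_names))
for gene_id, gene_name in gene_ids:
    print("ID =", gene_id, "| Name =", gene_name)
```

If both "Only in" lists are empty, the chromosome names agree. A name that appears only in the FASTA file may simply be a sequence without annotated genes, whereas a name that appears only in the GFF3 file points to a mismatch that should be fixed before submission. This script does not replace the gff3ToGenePred validation described in the first point.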
Once you have completed these steps, please send us the following information by email:
- Full species name (as you would like it to be listed on CHOPCHOP).
- Assembly name (e.g. GCA_001661325.1 or v2.0).
- Whether the "ID" or the "Name" attribute should be used for the gene IDs (see point 2 above).
- Whether you validated your GFF3 file (see point 1 above).
- Whether you checked that your chromosome names match (see point 3 above).
- Please send the FASTA and GFF3 files by email, Dropbox or Google Drive. Please avoid links that expire.