GENERAL ADVANCED OPTIONS
Gene identifiers: Gene IDs are retrieved from the ENSEMBL (ensGene) and RefSeq (refGene) tables from the UCSC genome browser where possible. We provide the following species-specific gene IDs:
Aedes aegypti (AaegL5)
Aedes albopictus (GCA_001876365.2)
Anopheles funestus (AfunF1)
Aedes aegypti (AaegL3)
Anopheles gambiae (anoGam1)
Anopheles gambiae (AgamP4)
Acyrthosiphon pisum (Acyr_2.0)
Anopheles stephensi (Indian, AsteI2)
Anopheles stephensi (SDA-500, AsteS1)
Bactrocera cucurbitae (ASM80634v1)
Bactrocera dorsalis (ASM78921v2)
Bactrocera oleae (GCA_001188975.2)
Bombyx mori (release 31)
Ceratitis capitata (Ccap1.0)
Culex quinquefasciatus (CpipJ2)
Diaphorina citri (GCF_000475195.1)
Drosophila melanogaster (dm3)
Drosophila melanogaster (dm6)
Drosophila suzukii (Dsuz 1.0)
Mayetiola destructor (v1.0)
Ixodes scapularis (IscaW1)
Lutzomyia longipalpis (LlonJ1)
Manduca sexta (Mansex1.0)
Nasonia vitripennis (Nvit1.0)
Nasonia vitripennis (Nvit2.1)
Phlebotomus papatasi (Ppapl1)
Papilio polytes (Ppolytes v1)
Rhodnius prolixus (RproC3)
Astatotilapia burtoni (AstBur1.0)
Astyanax mexicanus (AstMex102)
Bos taurus (Btau 5.0)
Chinese Hamster Ovary (CHO-K1)
Canis familiaris (CanFam3)
Ciona intestinalis (KH2012)
Callithrix jacchus (C_jacchus3.2.1)
Cyprinodon variegatus (GCF_000732505.1)
Danio rerio (danRer10/GRCz10)
Danio rerio (danRer11/GRCz11)
Gasterosteus aculeatus (gasAcu1)
Gallus gallus (galGal4)
Gallus gallus (galGal5)
Homo sapiens (hg19/GRCh37)
Homo sapiens (hg38/GRCh38)
Larimichthys crocea (GCA_900246015.1)
Mus musculus (mm10/GRCm38)
Mustela putorius furo (MusPutFur1.0/musFur1)
Nothobranchius furzeri (NotFur1)
Oryctolagus cuniculus (GCA_000003625.1)
Oikopleura dioica (Odioica)
Oryzias latipes (oryLat2)
Oncorhynchus mykiss (Omyk_1.0)
Oreochromis niloticus (ASM185804v2)
Rattus norvegicus (rn6)
Salmo salar (GCF_000233375.1)
Sus scrofa (Sscrofa11.1)
Xenopus tropicalis (xenTro9.0/JGI v9.0)
Xenopus laevis (Xenla9.1/JGI v9.1)
Nematostella vectensis (nemVec1)
Mnemiopsis leidyi (GCA_000226015.1)
Lytechinus variegatus (Lvar_2.2)
Strongylocentrotus purpuratus (EchinoBase v3.1)
Biomphalaria glabrata (BglaB1)
Octopus bimaculoides (PRJNA270931)
Caenorhabditis elegans (ce10/WS220)
Pristionchus pacificus (TAU2011)
Anabaena (PCC 7120)
Azotobacter vinelandii DJ (NC_012560.1)
Bdellovibrio bacteriovorus (ASM19617v1)
Bacteroides fragilis 638R (FQ312004.1)
Bacteroides fragilis 9343 (NC_003228.3)
Bacillus RZ2MS9 (MJBF01)
Bacillus subtilis 168 (NC_000964.3)
Bacteroides thetaiotaomicron (FR901300.1)
Caulobacter crescentus (NA1000)
Cupriavidus malaysiensis USMAA1020 chr1 (CP017754.1)
Cupriavidus malaysiensis USMAA1020 chr2 (CP017754.1)
Escherichia coli (str. CFT073/ASM744v1)
Escherichia coli (str. K-12/MG1655)
Escherichia coli (Nissle 1917/CP007799.1)
Enterococcus faecalis (GCF_000007785.1)
Lactobacillus delbrueckii (ATCC 11842)
Lysobacter enzymogenes (NZ_CP013140.1)
Mycobacterium marinum M (ASM1834v1)
Mycobacterium tuberculosis H37Rv (ASM19595v2)
Pseudomonas aeruginosa (UCBPP-PA14)
Pantoea agglomerans (NZ_CP014129.1)
Pseudomonas cichorii JBC1 (PRJNA232591)
Rhodobacter sphaeroides (KD131)
Salmonella enterica subsp. (GCF_000022165.1)
Shigella flexneri 1c (CP020753)
Streptomyces platensis (DSM-40041)
Streptococcus pyogenes M1 GAS (GCA_000321355.1)
Treponema caldarium (GCF_000219725.1)
Aspergillus fumigatus (Af293)
Aspergillus niger (ATCC 1015)
Botrytis cinerea (ASM83294v1)
Candida albicans (A22)
Candida glabrata (CBS138)
Colletotrichum graminicola M1.001 (Colgr1)
Cochliobolus heterostrophus C5 (v2.0)
Candida tropicalis (MYA-3404)
Metarhizium anisopliae (MAN 1.0)
Neurospora crassa (GCA_000182925.2)
Ogataea polymorpha NCYC495 leu1.1 (Hanpo2)
Pochonia chlamydosporia (GCF_001653235.2)
Purpureocillium lilacinum (GCA_001653205.1)
Pichia pastoris (v2 CBS 7435)
Saccharomyces cerevisiae (sacCer3/S288c)
Schizosaccharomyces pombe (ASM294v2.30)
Yarrowia lipolytica (GCF_000002525.2)
Yarrowia lipolytica (W29)
Leishmania braziliensis (MHOM/BR/75/M2904)
Leishmania major Friedlin (LmjF.01)
Plasmodium berghei (ANKA v2.0)
Plasmodium cynomolgi (GCF_000321355.1)
Plasmodium falciparum (3D7 v3.0)
Trypanosoma brucei (TREU927)
Toxoplasma gondii GT1 (ToxoDB-28)
Toxoplasma gondii ME49 (ToxoDB-28)
Aquilegia coerulea (v3.1)
Arachis duranensis (Aradu 1.0)
Arachis ipaensis (Araip 1.0)
Arabidopsis thaliana (TAIR10)
Cicer arietinum (ASM33114v1)
Cannabis sativa (v2)
Chlorella sorokiniana (CSI2_1230)
Eucalyptus grandis (v1.0)
Emiliania huxleyi (CCMP1516)
Eutrema salsugineum (v1.0)
Eragrostis tef (1.1.2)
Glycine max (GCA_000004515.3)
Nannochloropsis oceanica (IMET1.v2)
Solanum lycopersicum (Slyc3.0)
Zea mays (B73 AGPv4)
Autographa californica nucleopolyhedrovirus (AcMNPV) (NC_001623.1)
Gallid herpesvirus 2 (str. CVI988, DQ530348.1)
Gallid herpesvirus 3 (str. SB-1, HQ840738.1)
Human herpesvirus 2 (str. 333)
Human herpesvirus 2 (str. HG52)
Human herpesvirus 2 (str. SD90e)
Human herpesvirus 4 (str. Akata)
Human herpesvirus 6A (U1102_X83413)
Human herpesvirus 6B (Z29_NC000898)
Meleagrid herpesvirus 1 strain FC126 (AF291866.1)
Vaccinia virus (str. WR/AY243312.1)
Naegleria gruberi (GCF_000004985.1)
Phaeodactylum tricornutum (ASM15095v2)
Tetrahymena thermophila (MAC)
Tetrahymena thermophila (MIC)
Target a specific region of the gene: You may choose to target: (i) only the coding region (default), (ii) the entire exonic sequence (including 5' and 3' UTR), (iii) splice sites, (iv) 5' UTR only, (v) 3' UTR only, or (vi) a specific exon/subset of exons. If you wish to target an intron, specify the genomic coordinates of the intron (max size = 20,000 bp).
Restrict targeting: When searching for target sites in a region of interest, in the default mode CHOPCHOP allows the sgRNA or one TALE to bind just outside of that region so that the cut (in most cases) still occurs within the targeted region. You can turn off this functionality.
Restriction company preference: Some users may choose to assess successful mutagenesis using restriction enzyme digestion. CHOPCHOP displays restriction sites at the target locus, and it allows you to restrict the search to restriction sites of enzymes from a particular company.
Note about pasting sequence: If you paste a DNA sequence into CHOPCHOP, the algorithm does not know where in the genome the sequence originated. It will therefore likely report a perfect 'off-target' that corresponds to the sequence's endogenous locus.
sgRNA length: Recent work (e.g. Fu et al., 2014) suggests that truncated sgRNAs may improve specificity. You can select different Cas9/Cpf1 sgRNA lengths or keep the standard 20 nt (default).
sgRNA PAM sequence: The Cas9 3’ PAM default is NGG; the Cpf1 5’ default is TTN. You can alternatively select a PAM from an orthologous type II CRISPR/Cas system (Fonfara et al., 2014) or enter a custom PAM.
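As an illustration of how a PAM written with IUPAC codes (NGG, TTN, NNAGAAW and so on) can be matched against a sequence, here is a minimal sketch. It is not CHOPCHOP's implementation: the function names `pam_to_regex` and `find_sites` and the `pam_side` parameter are hypothetical, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pam_to_regex(pam):
    """Translate an IUPAC PAM such as 'NGG' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_sites(seq, pam="NGG", guide_len=20, pam_side="3prime"):
    """Return (start, guide, pam) tuples found on the forward strand.

    pam_side="3prime" expects guide followed by PAM (Cas9-like);
    pam_side="5prime" expects PAM followed by guide (Cpf1-like).
    A zero-width lookahead is used so that overlapping sites are reported.
    """
    seq = seq.upper()
    pam_re = pam_to_regex(pam)
    if pam_side == "3prime":
        pattern = f"(?=([ACGT]{{{guide_len}}})({pam_re}))"
        return [(m.start(), m.group(1), m.group(2))
                for m in re.finditer(pattern, seq)]
    if pam_side == "5prime":
        pattern = f"(?=({pam_re})([ACGT]{{{guide_len}}}))"
        return [(m.start(), m.group(2), m.group(1))
                for m in re.finditer(pattern, seq)]
    raise ValueError("pam_side must be '3prime' or '5prime'")
```

For example, `find_sites(seq, pam="TTTN", pam_side="5prime")` would list Cpf1-style candidates with a 5' TTTN PAM. A complete search would also scan the reverse complement of the sequence.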
Self-complementarity: Recent data suggest that self-complementarity within the sgRNA, or between the sgRNA and the RNA backbone, can inhibit sgRNA efficiency (Thyme et al., 2016). This option searches for complementarity within the sgRNA, and between the sgRNA and either a standard backbone (AGGCTAGTCCGT), an extended backbone (AGGCTAGTCCGT, ATGCTGGAA) or a custom backbone. Some users replace the leading nucleotides of their sgRNA with “GG” for T7 transcription; check this box to search for complementarity with the GG replacement.
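The idea behind a self-complementarity search can be sketched as follows: take every short stretch of the sgRNA, compute its reverse complement, and check whether that reverse complement occurs further along the sgRNA or in a backbone sequence. This is a simplified illustration under stated assumptions, not CHOPCHOP's actual algorithm: the function names are hypothetical, the stem length of 4 nt is taken from the "longer than 3 nt" rule used in ranking, and no thermodynamic or GC filtering is applied.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_self_complementary_stems(guide, backbones=(), stem_len=4):
    """Count guide k-mers that could pair with the guide or a backbone.

    A k-mer is counted once if its reverse complement is found either
    downstream in the guide (non-overlapping) or in any backbone.
    """
    guide = guide.upper()
    backbones = [b.upper() for b in backbones]
    count = 0
    for i in range(len(guide) - stem_len + 1):
        rc = reverse_complement(guide[i:i + stem_len])
        # Only look downstream so each intra-guide stem is counted once.
        in_guide = rc in guide[i + stem_len:]
        in_backbone = any(rc in b for b in backbones)
        if in_guide or in_backbone:
            count += 1
    return count
```

Passing `backbones=("AGGCTAGTCCGT",)` would correspond to the standard backbone listed above, and `backbones=("AGGCTAGTCCGT", "ATGCTGGAA")` to the extended one.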
Method for determining off-targets in the genome: There are three options. According to Cong et al. (2013), single-base mismatches within the 11 bp immediately 5' of the PAM completely abolish cleavage by Cas9, whereas mismatches further upstream of the PAM retain cleavage activity. We have therefore created a uniqueness method that searches for mismatches only in the first 9 bp of the target (the PAM-distal end), since a mismatch closer to the PAM is predicted to prevent cleavage. According to Hsu et al. (2013), mismatches can be tolerated at any position except in the PAM. We have therefore created a second uniqueness method that searches for mismatches across all 20 bp upstream of the PAM. This is the default method.
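The difference between the two uniqueness methods described above can be sketched in a few lines. This is an illustration only, assuming a 20 nt guide written 5' to 3' with the PAM at its 3' end; the method labels "cong" and "hsu" and the function names are hypothetical and are not CHOPCHOP option names.

```python
def mismatch_positions(guide, site):
    """Return 0-based positions (from the 5' end) where guide and site differ."""
    return [i for i, (a, b) in enumerate(zip(guide, site)) if a != b]


def is_potential_off_target(guide, site, max_mismatches=3,
                            method="hsu", seed_len=11):
    """Decide whether a genomic site counts as a potential off-target.

    method="hsu":  mismatches are tolerated anywhere in the protospacer.
    method="cong": mismatches are tolerated only outside the PAM-proximal
                   seed (the last seed_len bases before the PAM).
    """
    guide, site = guide.upper(), site.upper()
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    positions = mismatch_positions(guide, site)
    if len(positions) > max_mismatches:
        return False
    if method == "hsu":
        return True
    if method == "cong":
        seed_start = len(guide) - seed_len  # 9 for a 20 nt guide
        return all(p < seed_start for p in positions)
    raise ValueError("method must be 'hsu' or 'cong'")
```

With a 20 nt guide and `seed_len=11`, the "cong" branch accepts mismatches only at positions 0 to 8, which corresponds to the first 9 bp mentioned above.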
Requirements for the 5' end of the sgRNA: Depending on the polymerase used for sgRNA synthesis, you may wish to limit the 5' end dinucleotides to, for example, 5' GN- (for the U6 promoter) or 5' GG- (for T7 polymerase).
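A 5' end requirement such as GN or GG amounts to a simple prefix check in which N matches any base. The sketch below shows one way to express it; the function name and the `requirement` parameter are hypothetical and only the wildcard N is supported.

```python
def meets_five_prime_requirement(guide, requirement="GN"):
    """Return True if the 5' end of the guide matches the requirement.

    'N' in the requirement matches any nucleotide; every other letter
    must match the guide exactly.
    """
    guide, requirement = guide.upper(), requirement.upper()
    if len(guide) < len(requirement):
        return False
    return all(r == "N" or r == g for r, g in zip(requirement, guide))
```

For instance, a guide starting with GA satisfies "GN" but not "GG".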
Efficiency score: We have implemented a number of different efficiency scores based on the current literature. They are normalized to the interval 0 to 1 and are displayed in the “Efficiency” column. The simplest efficiency score is “G20”, which prioritizes a guanine at position 20, just upstream of the PAM. The other methods use more sophisticated metrics (see references below).
Doench et al. 2014 - only for NGG PAM
Doench et al. 2016 - only for NGG PAM
Chari et al. 2015 - only for NGG and NNAGAAW PAMs in hg19 and mm10
Xu et al. 2015 - only for NGG PAM (default)
Distance between Cas9 guides: Cas9 nickases are effective within a range of distances. The default distance measured in (Shen et al., 2014) is 10-31 bp, but you can change this to your preference.
Maximum distance between off-targets: CHOPCHOP searches the genome for off-targets with 0, 1, 2 or 3 mismatches. In addition, in nickase mode each pair of sgRNAs is evaluated for off-targets where binding and cutting within a given distance (default: 100 bp) would result in a DSB. This distance can be changed to your preference.
5' PAM: Cpf1 recognizes a T-rich 5’ PAM. CHOPCHOP offers a 5’ TTTN PAM search option, 5’ TTN (default), or you can specify your own PAM.
Distance between TALENs: TALENs seem to be effective at cutting at a range of distances, but the architecture may depend upon the assembly kit being used. The default distance is 14-20 bp but you can change this to your preference.
Number of mismatches when searching off-targets: CHOPCHOP can search the genome for off-targets with 0, 1, 2 or 3 mismatches. The greater the number of mismatches, the more rigorous the calculation of off-target activity, but the longer it takes for the program to run.
RVD preference: Depending upon the assembly kit being used, you may choose to use either the repeat variable di-residue (RVD) 'NN' for guanines, or 'NH', which has been shown to bind guanines more specifically than 'NN' (Cong et al., 2012; Streubel et al., 2012).
In order to analyze whether Cas9, Cpf1 or TALENs have successfully cleaved the target locus, users may need to amplify the region of interest for further analysis by deep sequencing or a T7E1 assay. CHOPCHOP integrates primer design with sgRNA/TALEN target site design using Primer3. Primers are designed to amplify the target cut site, and are mapped against the genome to avoid off-targets producing amplicons of similar length. You can adjust the amplicon size, primer Tm, primer length, and the minimum distance between each primer and the target site.
INTERPRETING THE RESULTS
CHOPCHOP displays the results of the query in a dynamic visualization and interactive table. The dynamic visualization displays all of the target site options for the given region color-coded according to our scoring: green (best), amber (okay), and red (bad). In all cases the gene is displayed 5' to 3'. Click on a target in the visualization or an option in the table to be taken to an individual results page containing information about any off-targets and the restriction sites and primer designs for that region.
Off-targets: CHOPCHOP lists how many off-targets each target site has with 0, 1, 2 or 3 mismatches. Clicking on the result will reveal more information about the off-targets.
Ranking: Target sites are ranked according to: (i) efficiency score (Cas9 mode, see above),
(ii) number of off-targets and whether they have mismatches,
(iii) existence of self-complementarity regions longer than 3 nt (Thyme et al., 2016; the number indicates how many regions of self-complementarity are predicted),
(iv) GC-content (Cas9 mode): sgRNAs are most effective with a GC-content between 40 and 70% (Wang et al., 2014; Tsai et al., 2015),
(v) location of sgRNA within a gene (5’ (best) -> 3’ (worst)).
Because TALEN designs have few sequence restrictions, there are many options to choose from. CHOPCHOP therefore suppresses some output and groups similar TALEN pairs in a 'cluster'. The results table and visualization display only the highest-ranked member of each cluster, but other cluster members can be seen on the individual results pages.
More information about how we rank our results is available from the Scoring page.
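The GC-content criterion used in the ranking is straightforward to compute. The following sketch shows the calculation and the 40 to 70% check; the function names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
def gc_content(seq):
    """Return the GC-content of a sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc_in_preferred_range(seq, low=40.0, high=70.0):
    """Return True if GC-content lies within [low, high] percent."""
    return low <= gc_content(seq) <= high
```

A guide with 10 G/C bases out of 20 has a GC-content of 50% and falls within the preferred range, whereas a guide made up entirely of A and T does not.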
Individual results page: This provides (i) a visualization of the target site (with the predicted cut site in blue in Cas9/Cpf1 mode), (ii) primer designs (purple), (iii) restriction sites (green - unique in the amplicon, red - not unique), (iv) details about the off-targets (genomic location, number of mismatches and sequence of off-targets), (v) primer designs.