Instructions

GENERAL ADVANCED OPTIONS

Gene identifiers: Gene IDs are retrieved from the ENSEMBL (ensGene) and RefSeq (refGene) tables from the UCSC genome browser where possible. We provide the following species-specific gene IDs:

Mammals
B. taurus (Btau)
C. familiaris (CanFam3)
Chinese Hamster Ovary (CHO-K1)
H. sapiens (hg19/GRCh37)
H. sapiens (hg38/GRCh38)
M. putorius furo (MusPutFur1.0/musFur1)
M. musculus (mm10/GRCm38)
O. cuniculus (OryCun2.0)
R. norvegicus (rn6)
S. scrofa (Sscrofa11.1)

Vertebrates
A. burtoni (AstBur)
A. mexicanus (AstMex102)
C. variegatus
D. rerio (danRer10/GRCz10)
D. rerio (danRer11/GRCz11)
G. aculeatus (gasAcu1)
G. gallus (galGal4)
G. gallus (galGal5)
L. crocea (GCA_900246015.1)
N. furzeri (NotFur1)
O. niloticus (ASM185804v2)
O. latipes (oryLat2)
O. mykiss (Omyk_1.0)
S. salar (GCF_000233375.1)
X. tropicalis (XenTro9.0)
X. laevis (Xenla9.1)

Insects
Ac. pisum (Acyr_2.0)
An. funestus (AfunF1)
An. gambiae (anoGam1)
An. gambiae (AgamP4)
An. stephensi (AsteS1)
An. stephensi (AsteI2)
Ae. aegypti (AaegL3)
Ae. aegypti L5 (AaegL5)
B. cucurbitae (ASM80634v1)
B. dorsalis (ASM78921v2)
B. mori (release 31)
B. oleae (GCA_001188975.2)
C. capitata (Ccap1.0)
C. quinquefasciatus (CpipJ2)
D. citri (GCF_000475195.1)
D. melanogaster (dm3)
D. melanogaster (dm6)
D. suzukii (Version 1.0)
Hessian fly (Mayetiola_destructor v1.0)
L. longipalpis (LlonJ1)
N. vitripennis (Nvit1.0)
N. vitripennis (Nvit2.1)
M. sexta (Mansex1.0)
P. papatasi (Ppapl1)
Papilio polytes (Ppolytes v1)

Animalia
B. glabrata (BglaB1)
C. elegans (ce10/WS220)
C. intestinalis (KH2012)
I. scapularis (IscaW1)
L. variegatus (Lvar_2.2)
M. leidyi (GCA_000226015.1)
N. gruberi (GCF_000004985.1)
N. vectensis (Nemve1)
O. bimaculoides (PRJNA270931)
O. dioica (Odioica)
P. pacificus (TAU2011)
S. purpuratus (EchinoBase v3.1)

Plants
A. coerulea (v3.1)
A. duranensis (Aradu)
A. ipaensis (Araip)
A. thaliana (TAIR10)
C. arietinum (ASM33114v1)
C. sativa (v2)
E. grandis (v1.0)
E. tef (1.1.2)
E. salsugineum (v1.0)
S. lycopersicum (SL3.0)
Zea mays B73 (AGPv4)

Bacteria
Anabaena (PCC 7120)
A. vinelandii DJ (NC_012560.1)
Bacillus RZ2MS9 (MJBF01)
B. bacteriovorus HD100 (ASM19617v1)
B. fragilis 638R (FQ312004.1)
B. fragilis 9343 (NC_003228.3)
B. subtilis 168 (NC_000964.3)
B. thetaiotaomicron (FR901300.1)
C. crescentus (NA1000)
C. malaysiensis USMAA1020 chr1 (CP017754.1)
C. malaysiensis USMAA1020 chr2 (CP017754.1)
E. coli (str. K-12/MG1655)
E. coli Nissle 1917 (CP007799.1)
E. coli CFT073 (ASM744v1)
E. faecalis V583 (GCF_000007785.1)
L. delbrueckii (ATCC11842)
L. enzymogenes (NZ_CP013140.1)
M. marinum M (ASM1834v1)
M. tuberculosis H37Rv (ASM19595v2)
P. aeruginosa (UCBPP-PA14)
P. agglomerans (NZ_CP014129.1)
P. cichorii JBC1 (PRJNA232591)
R. sphaeroides (KD131)
S. enterica subsp. (GCF_000022165.1)
S. platensis (DSM-40041)
T. caldarium (GCF_000219725.1)

Fungi
A. niger (ATCC 1015)
A. fumigatus (Af293)
C. albicans (A22)
C. glabrata (CBS138)
C. graminicola M1.001 (Colgr1)
C. heterostrophus C5 (v2.0)
C. tropicalis (MYA-3404)
H. polymorpha NCYC495 leu1.1 (Hanpo2)
M. anisopliae (MAN 1.0)
N. crassa (GCA_000182925.2)
P. pastoris (CBS 7435)
P. chlamydosporia (GCF_001653235.2)
P. lilacinum (GCA_001653205.1)
S. cerevisiae (sacCer3/S288c)
S. pombe (ASM294v2.30)
Y. lipolytica (CLIB122)
Y. lipolytica (W29)

Parasites
L. major (Friedlin strain)
L. braziliensis (MHOM/BR/75/M2904)
P. berghei (ANKA v2.0)
P. falciparum (3D7 v3.0)
T. brucei (TREU927)
T. gondii GT1 (ToxoDB-28)
T. gondii ME49 (ToxoDB-28)

Others
AcMNPV (NC_001623.1)
C. sorokiniana (CSI2_1230)
Emiliania huxleyi (CCMP1516)
Gallid herpes (DQ530348.1)
Human herpesvirus 6A (U1102_X83413)
Human herpesvirus 6B (Z29_NC000898)
Human herpesvirus 4 (Akata)
Human herpesvirus 2 (333)
Human herpesvirus 2 (HG52)
Human herpesvirus 2 (SD90e)
HIV-1 (AF033819.3)
P. tricornutum (ASM15095v2)
T. thermophila (MAC)
T. thermophila (MIC)
Vaccinia virus WR (AY243312.1)

Target a specific region of the gene: You may choose to target: (i) only the coding region (default), (ii) the entire exonic sequence (including 5' and 3' UTR), (iii) splice sites, (iv) 5' UTR only, (v) 3' UTR only, or (vi) a specific exon/subset of exons. If you wish to target an intron, specify the genomic coordinates of the intron (max size = 20,000 bp).

Restrict targeting: When searching for target sites in a region of interest, in the default mode CHOPCHOP allows the sgRNA or one TALE to bind just outside of that region so that the cut (in most cases) still occurs within the targeted region. You can turn off this functionality.

Restriction company preference: Some users may choose to assess successful mutagenesis using restriction enzyme digestion. CHOPCHOP displays restriction sites at the target locus, and it allows you to restrict the search to restriction sites of enzymes from a particular company.

Note about pasting sequence: If you paste a DNA sequence into CHOPCHOP, the algorithm does not know where in the genome the sequence originated and will therefore likely find a perfect 'off-target' that corresponds to the sequence's endogenous locus.

CRISPR-SPECIFIC OPTIONS

sgRNA length: Truncated sgRNAs may improve specificity (Fu et al., 2014). You can select a different Cas9/Cpf1 sgRNA length or keep the standard 20 nt (default).

sgRNA PAM sequence: The Cas9 3’ PAM default is NGG; the Cpf1 5’ default is TTN. You can alternatively select a PAM from an orthologous type II CRISPR/Cas system (Fonfara et al., 2014) or enter a custom PAM.
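The interaction between sgRNA length and PAM choice can be sketched in a few lines of Python. This is a minimal illustration, not CHOPCHOP's implementation: the function names `pam_to_regex` and `find_targets` and the `pam_side` parameter are hypothetical, the search covers the forward strand only, and the PAM is expanded using the standard IUPAC nucleotide codes.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def pam_to_regex(pam):
    """Expand a PAM written with IUPAC codes (e.g. NGG) into a regex."""
    return "".join(IUPAC[base] for base in pam.upper())

def find_targets(seq, pam="NGG", guide_len=20, pam_side="3prime"):
    """Return (guide_start, guide, pam) tuples for forward-strand sites.

    pam_side="3prime" places the PAM after the guide (Cas9-style);
    pam_side="5prime" places it before the guide (Cpf1-style).
    """
    seq = seq.upper()
    pam_re = pam_to_regex(pam)
    hits = []
    if pam_side == "3prime":
        # Lookahead so that overlapping sites are all reported.
        pattern = "(?=([ACGT]{%d})(%s))" % (guide_len, pam_re)
        for m in re.finditer(pattern, seq):
            hits.append((m.start(), m.group(1), m.group(2)))
    else:
        pattern = "(?=(%s)([ACGT]{%d}))" % (pam_re, guide_len)
        for m in re.finditer(pattern, seq):
            hits.append((m.start() + len(pam), m.group(2), m.group(1)))
    return hits
```

For example, `find_targets(seq, pam="TTN", pam_side="5prime")` would list Cpf1-style sites, and passing `guide_len=18` would mimic a truncated sgRNA. A full search would also scan the reverse complement of the sequence.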

Self-complementarity: Recent data suggest that self-complementarity within the sgRNA, or between the sgRNA and the RNA backbone, can inhibit sgRNA efficiency (Thyme et al., 2016). This option searches for complementarity within the sgRNA, and between the sgRNA and either a standard backbone (AGGCTAGTCCGT), an extended backbone (AGGCTAGTCCGT, ATGCTGGAA) or a custom backbone. Some users replace the leading nucleotides of their sgRNA with “GG” for T7 transcription; check this box to search for complementarity with the GG replacement.
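One simple way to picture a self-complementarity check is to look for short stretches of the sgRNA whose reverse complement appears later in the sgRNA or in the backbone. The sketch below does exactly that and nothing more; it is an assumption about how such a search could work, not CHOPCHOP's actual algorithm. The function names and the `stem_len=4` default (chosen to mean "longer than 3 nt") are hypothetical; the backbone sequence is the standard one quoted above.

```python
STANDARD_BACKBONE = "AGGCTAGTCCGT"  # standard backbone quoted in the text

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def self_complementary_stems(guide, backbone=STANDARD_BACKBONE, stem_len=4):
    """List (position, kmer, partner) for every k-mer of the guide whose
    reverse complement occurs downstream in the guide or in the backbone.
    """
    guide = guide.upper()
    backbone = backbone.upper()
    hits = []
    for i in range(len(guide) - stem_len + 1):
        kmer = guide[i:i + stem_len]
        partner = reverse_complement(kmer)
        # Only look downstream of the k-mer so each stem is reported once
        # and a palindromic k-mer does not pair with itself.
        if partner in guide[i + stem_len:]:
            hits.append((i, kmer, "guide"))
        if partner in backbone:
            hits.append((i, kmer, "backbone"))
    return hits
```

To model the GG replacement for T7 transcription, one could call the function on `"GG" + guide[2:]` instead of `guide`. A realistic check would also consider loop length and base-pairing energy, which this sketch ignores.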

Cas9-SPECIFIC OPTIONS

Method for determining off-targets in the genome: There are three options. According to Cong et al. (2013), single-base mismatches up to 11 bp 5' of the PAM completely abolish cleavage by Cas9, whereas mismatches further upstream of the PAM retain cleavage activity. We have created a uniqueness method that searches for mismatches only in the first 9 bp, since a mismatch further towards the PAM motif is predicted to cause no cleavage. According to Hsu et al. (2013), mismatches can be tolerated at any position except in the PAM motif. We have therefore created a second uniqueness method that searches for mismatches only in the 20 bp upstream of the PAM. This is the default method.
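The difference between the two uniqueness methods can be illustrated with a small mismatch counter. This is a sketch under stated assumptions, not CHOPCHOP's code: `off_target_profile` and its `seed_len` parameter are hypothetical names, candidate sites are assumed to be pre-aligned protospacers of the same length as the guide (PAM excluded), and a site with any mismatch inside the PAM-proximal seed is simply not counted as an off-target.

```python
def mismatch_positions(guide, site):
    """0-based positions (from the 5' end) where guide and site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return [i for i, (a, b) in enumerate(zip(guide.upper(), site.upper()))
            if a != b]

def off_target_profile(guide, candidate_sites, max_mismatches=3, seed_len=0):
    """Count candidate sites with 0..max_mismatches mismatches.

    seed_len=0  : mismatches are tolerated anywhere in the guide
                  (the Hsu et al.-style method described above).
    seed_len=11 : a mismatch in the 11 PAM-proximal bases disqualifies the
                  site, so only mismatches in the first 9 bp of a 20 nt
                  guide are counted (the Cong et al.-style method).
    """
    seed_start = len(guide) - seed_len
    profile = {n: 0 for n in range(max_mismatches + 1)}
    for site in candidate_sites:
        positions = mismatch_positions(guide, site)
        if len(positions) > max_mismatches:
            continue
        if any(p >= seed_start for p in positions):
            continue  # predicted not to be cleaved under the seed model
        profile[len(positions)] += 1
    return profile
```

If the list of candidate sites includes the intended target itself, it is counted in the 0-mismatch bin.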

Requirements for the 5' end of the sgRNA: Depending on the polymerase used for sgRNA synthesis, you may wish to limit the 5' end dinucleotides to, for example, 5' GN- (for the U6 promoter) or 5' GG- (for T7 polymerase).

Efficiency score: We have implemented a number of different efficiency scores based on the current literature. They are normalized to a 0-1 interval and are displayed in the “Efficiency” column. The simplest efficiency score is “G20”, which prioritizes a guanine at position 20, just upstream of the PAM. The other methods use more sophisticated metrics (see references below).

Doench et al. 2014 - only for NGG PAM
Doench et al. 2016 - only for NGG PAM
Chari et al. 2015 - only for NGG and NNAGAAW PAMs in hg19 and mm10
Xu et al. 2015 - only for NGG PAM (default)
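As a concrete picture of the simplest score, “G20” can be thought of as a check on the last base of the guide. The sketch below assumes a binary score (1.0 for a G, 0.0 otherwise) and a guide written 5' to 3' with the PAM on its 3' side; the text above does not give the exact values CHOPCHOP assigns, and the function name `g20_score` is hypothetical.

```python
def g20_score(guide):
    """Return 1.0 if the PAM-proximal (last) base of the guide is G, else 0.0.

    Assumes the guide is written 5'->3' and the PAM follows its 3' end,
    so for a 20 nt guide the last base is position 20.
    """
    if not guide:
        raise ValueError("guide must not be empty")
    return 1.0 if guide[-1].upper() == "G" else 0.0
```

The published models listed above use many more sequence features than this single position.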

NICKASE-SPECIFIC OPTIONS

Distance between Cas9 guides: Cas9 nickases are effective within a range of distances. The default distance, taken from Shen et al. (2014), is 10-31 bp, but you can change this to your preference.
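Pairing guides for nickase mode amounts to a distance filter over candidate sites. The sketch below is illustrative only: it assumes that guides are given as hypothetical `(name, start, end, strand)` tuples with half-open coordinates, that the distance is the gap between the end of the upstream guide and the start of the downstream one, and that the two guides of a pair must lie on opposite strands. How CHOPCHOP actually measures the distance is not stated above.

```python
def nickase_pairs(guides, min_dist=10, max_dist=31):
    """Return (left_name, right_name, gap) for guide pairs on opposite
    strands whose gap lies within [min_dist, max_dist] bp.

    guides: iterable of (name, start, end, strand), end exclusive.
    """
    ordered = sorted(guides, key=lambda g: g[1])
    pairs = []
    for i, left in enumerate(ordered):
        for right in ordered[i + 1:]:
            gap = right[1] - left[2]
            if gap > max_dist:
                break  # later guides start even further away
            if left[3] != right[3] and gap >= min_dist:
                pairs.append((left[0], right[0], gap))
    return pairs
```

Changing `min_dist` and `max_dist` corresponds to adjusting the distance range in the option above.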

Maximum distance between off-targets: CHOPCHOP searches the genome for off-targets with 0, 1, 2 or 3 mismatches. In addition, in nickase mode each pair of sgRNAs is evaluated for off-targets where binding and cutting within a given distance (default: 100 bp) would result in a DSB. This distance can be changed to your preference.

Cpf1-SPECIFIC OPTIONS

5' PAM: Cpf1 recognizes a T-rich 5’ PAM. CHOPCHOP offers 5’ TTN (default) and 5’ TTTN PAM search options, or you can specify your own PAM.

TALEN-SPECIFIC OPTIONS

Distance between TALENs: TALENs seem to be effective at cutting at a range of distances, but the architecture may depend upon the assembly kit being used. The default distance is 14-20 bp but you can change this to your preference.

Number of mismatches when searching off-targets: CHOPCHOP can search the genome for off-targets with 0, 1, 2 or 3 mismatches. The greater the number of mismatches, the more rigorous the calculation of off-target activity, but the longer it takes for the program to run.

RVD preference: Depending upon the assembly kit being used, you may choose to use either the repeat variable di-residue (RVD) 'NN' for guanines, or 'NH', which has been shown to bind guanines more specifically than 'NN' (Cong et al., 2012; Streubel et al., 2012).

PRIMER OPTIONS

In order to analyze whether Cas9, Cpf1 or TALENs have successfully cleaved the target locus, users may need to amplify the region of interest for further analysis by deep sequencing or a T7E1 assay. CHOPCHOP integrates primer design with sgRNA/TALEN target site design using Primer3. Primers are designed to amplify the target cut site, and are mapped against the genome to avoid off-targets producing amplicons of similar length. You can adjust the amplicon size, primer Tm, primer length, and the minimum distance between each primer and the target site.

INTERPRETING THE RESULTS

CHOPCHOP displays the results of the query in a dynamic visualization and interactive table. The dynamic visualization displays all of the target site options for the given region color-coded according to our scoring: green (best), amber (okay), and red (bad). In all cases the gene is displayed 5' to 3'. Click on a target in the visualization or an option in the table to be taken to an individual results page containing information about any off-targets and the restriction sites and primer designs for that region.

Off-targets: CHOPCHOP lists how many off-targets each target site has with 0, 1, 2 or 3 mismatches. Clicking on the result will reveal more information about the off-targets.

Ranking: Target sites are ranked according to: (i) efficiency score (Cas9 mode, see above); (ii) number of off-targets and whether they have mismatches; (iii) existence of self-complementarity regions longer than 3 nt (Thyme et al., 2016; the number indicates how many regions of self-complementarity are predicted); (iv) GC content (Cas9 mode), since sgRNAs are most effective with a GC content between 40 and 70% (Wang et al., 2014; Tsai et al., 2015); and (v) location of the sgRNA within the gene, from 5’ (best) to 3’ (worst).
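One possible reading of these criteria is a lexicographic sort, in which each criterion only breaks ties left by the previous one. The sketch below follows that reading; it is an assumption made for illustration, and the exact ranking used by CHOPCHOP is described on the Scoring page. The dictionary keys (`efficiency`, `off_targets`, `self_comp`, `guide`, `position`) and function names are hypothetical.

```python
def gc_content(seq):
    """GC content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def rank_key(site):
    """Sort key: smaller tuples rank first.

    site["off_targets"] is a tuple of counts for 0, 1, 2 and 3 mismatches,
    so off-targets with fewer mismatches weigh more heavily.
    """
    gc_ok = 40.0 <= gc_content(site["guide"]) <= 70.0
    return (
        -site["efficiency"],         # higher efficiency first
        tuple(site["off_targets"]),  # fewer, more-mismatched off-targets first
        site["self_comp"],           # fewer self-complementary regions first
        not gc_ok,                   # GC content within 40-70% first
        site["position"],            # closer to the 5' end of the gene first
    )

def rank_sites(sites):
    """Return sites ordered from best to worst under rank_key."""
    return sorted(sites, key=rank_key)
```

With this key, two sites that tie on efficiency, off-targets and self-complementarity are separated by whether their GC content falls in the 40-70% window, and only then by position in the gene.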

Because TALEN designs have few sequence restrictions, there are many options to choose from. CHOPCHOP therefore suppresses some output and groups similar TALEN pairs into a 'cluster'. The results table and visualization display only the highest-ranked member of each cluster, but other cluster members can be seen on the individual results pages.

More information about how we rank our results is available from the Scoring page.

Individual results page: This provides (i) a visualization of the target site (with the predicted cut site in blue in Cas9/Cpf1 mode), (ii) primer designs (purple), (iii) restriction sites (green: unique in the amplicon; red: not unique), (iv) details about the off-targets (genomic location, number of mismatches and sequence of off-targets), (v) primer designs.