HINT domains and modules, the Superfamily
Inteins are part of the large HINT superfamily of domains, which process their host proteins post-translationally through protein splicing and self-cleavage (Dori-Bachash et al., 2009).
Intein HINT domains carry out the protein splicing of the flanking sequences of the precursor protein (Paulus, 2000).
Other identified classes in the HINT superfamily include the three types of bacterial intein-like (BIL) domains, Vint domains and Hog proteins, such as Hedgehog (Hh) proteins. Members of the HINT superfamily are found in all domains of life: Eubacteria, Archaea and Eukaryotes (Bürglin, 2008).
In different protein families, the HINT domain is involved in similar biochemical reactions but in different biological processes (Amitai et al., 2003).
Hogs and hedgehogs
Hogs are a class of HINT proteins that consist of an N-terminal Hint domain and a C-terminal adduct recognition region, abbreviated ARR (Bürglin, 2008). The function of the Hog domain is to rearrange the peptide bond at its N-terminus into a thioester bond (Porter et al., 1996).
The Hog class is found throughout eukaryotes, and one of its most prominent protein families is the Hedgehog (Hh) family. HINT domains are essential for the maturation of Hh proteins, and this has been speculated to be their role in other classes of Hog proteins as well (Porter et al., 1996).
Figure 7: Schematic drawing of the two-step mechanism of Hh autoprocessing, after Porter, J.A., Young, K.E., and Beachy, P.A., Cholesterol Modification of Hedgehog Signaling Proteins in Animal Development, doi: 10.1126/science.274.5285.255
Hh proteins contain an N-terminal Hedge (HhN) domain and a C-terminal Hog (HhC) domain, and the HhC is followed by a cholesterol-binding domain (Bürglin, 2008). The thioester bond formed by the Hog domain is cleaved by the nucleophilic attack of a cholesterol molecule held by the downstream binding domain (Porter et al., 1996).
Hedgehog proteins are synthesized as inactive precursors, and the autoprocessing pathway begins with an acyl rearrangement at the Cys at the beginning of HhC. The rearrangement is analogous to step 1 of the standard intein-mediated protein splicing pathway. After the nucleophilic attack by the hydroxyl group of cholesterol, the cholesterol is attached to the C-terminus of the signaling domain, which anchors the signaling domain to the cell surface (Porter et al., 1996).
Bacterial intein-like domains (BIL domains)
Although they are somewhat similar to inteins and Hogs, the bacterial intein-like domains differ from the other members of the HINT superfamily in sequence, host protein type and phylogenetic distribution. It has also been suggested that the biological roles of BIL domains differ from those of Hogs and inteins (Amitai et al., 2003).
BIL domains can catalyze protein splicing and the cleavage of their host proteins, which indicates a close relation to inteins. However, BIL domains are found integrated in non-conserved, non-essential and secreted proteins (Amitai et al., 2003). This distinguishes them from inteins, which are usually inserted into the active sites of essential proteins, such as those involved in DNA metabolism. It has been speculated that BIL activity is a mechanism for generating protein variability post-translationally, especially in secreted proteins (Dassa et al., 2004).
BIL domain types
BIL domains usually contain 130-155 residues (Amitai et al., 2003), and three types of BIL domains are currently known: A, B and, most recently, C. The types are classified by their characteristic sequence features (Dori-Bachash et al., 2009).
Phylogenetic and genomic analysis of BIL sequences suggests that they were positively selected for in different lineages (Amitai et al., 2003). Domain types A and B are common and appear in diverse bacterial divisions (Dassa et al., 2004).
Most BIL domains lack Cys, Ser or Thr at the +1 position, where diverse amino acid types have been found instead. These other amino acids usually cannot serve as a nucleophile, yet A-type BIL domains have been demonstrated to catalyze protein splicing even when the +1 residue lacks a thiol or hydroxyl group in its side chain (Amitai et al., 2003; Dassa et al., 2004). None of the known A-type BIL domains is followed by a Cys residue, and just 15% are followed by Ser or Thr (Dassa et al., 2004). Nearly all A-type domains have apparently functional protein-splicing active sites (Amitai et al., 2003).
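The +1 criterion described above can be expressed as a simple check on the sequence that follows a BIL domain. The sketch below is only an illustration: the function names, the one-letter sequence representation and the example flanks are assumptions made here, not part of the cited studies or of any existing tool.

```python
# Minimal sketch (hypothetical helper names, made-up example sequences).
# Cys carries a thiol group, Ser and Thr carry a hydroxyl group, so these
# are the residues that could act as the +1 nucleophile.
NUCLEOPHILIC_RESIDUES = frozenset("CST")


def has_nucleophilic_plus_one(c_flank: str) -> bool:
    """Return True if the first residue after the domain is Cys, Ser or Thr.

    `c_flank` is the one-letter amino acid sequence that directly follows
    the C-terminal end of the domain; its first letter is the +1 position.
    """
    if not c_flank:
        raise ValueError("the C-terminal flank must contain at least one residue")
    return c_flank[0].upper() in NUCLEOPHILIC_RESIDUES


def fraction_with_nucleophile(c_flanks: list[str]) -> float:
    """Return the fraction of flanks whose +1 residue is Cys, Ser or Thr."""
    if not c_flanks:
        raise ValueError("at least one flank is required")
    hits = sum(has_nucleophilic_plus_one(flank) for flank in c_flanks)
    return hits / len(c_flanks)


if __name__ == "__main__":
    # Invented flanks, used only to show how the helpers are called.
    example_flanks = ["SGA", "GLV", "AAK", "CPP"]
    for flank in example_flanks:
        print(flank, has_nucleophilic_plus_one(flank))
    print(fraction_with_nucleophile(example_flanks))
```

Applied to a set of real domain boundaries, a helper of this kind would reproduce counts such as the share of A-type domains followed by Ser or Thr, provided the domain ends have been annotated reliably.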
Asparagine cyclization can occur without transesterification by the flanking residue when an A-type BIL domain lacks a C-terminal flanking Thr or Ser. In this case, the peptide bond between the Asn and the following Thr has been shown to be cleaved (Amitai et al., 2003).
The type A BIL domains have been found in Proteobacteria, Cyanobacteria, Spirochaetes, Planctomycetes and Verrucomicrobia (Dassa et al., 2004).
The mechanism of B-type BIL domains is more analogous to canonical intein-mediated protein splicing, and they can autocatalytically cleave the N- or C-terminus of a protein (Dassa et al., 2004). The C-terminal end of B-type BIL domains has a conserved Cys, Ser or Thr position but lacks a conserved Asn or Gln residue (Amitai et al., 2003).
The C-type BIL domain is notably different from the previously described BIL domain types: so far it has been found only in predatory aerobic δ-proteobacteria, and it has many unique features (Dori-Bachash et al., 2009).
The most prominent feature is the tendency of the C-type domain to appear with other domains, especially a conserved putative predator-specific domain 1, abbreviated PPS-1 (Dori-Bachash et al., 2009). PPS-1 usually appears immediately upstream of the N-terminus of the C-type BIL domain. According to Dori-Bachash et al. (2009), because the C-type BIL and PPS-1 domains appear in different combinations in predatory bacteria, they are in fact modular and may therefore have different roles in different species.
The N-terminal cleavage mechanism mediated by the C-type BIL domain resembles that of other HINT domains. Conserved residues in the domain can catalyze an N-to-S acyl shift, which forms a labile thioester bond. This bond is susceptible to hydrolysis, which leads to cleavage at the N-terminal end of the BIL domain.
The post-translational catalytic activity of C-type BIL domains is speculated to regulate the localization and function of the N-terminal part of the host protein over the life cycle of the bacterium.
Since the purpose of BIL domains is thought to be generating variability in proteins post-translationally, the diversity that C-type domains create may be used for the recognition of prey cells by the predatory bacteria (Dori-Bachash et al., 2009).
The existence of a Pretoxin-HINT domain has been predicted from sequence studies, which place it in the HINT superfamily. Pretoxin-HINT domains are typically found in polymorphic toxin (PT) systems, N-terminal to the toxin module. This domain type is predicted to release the toxin domain through its autoproteolytic activity (Zhang et al., 2011).
If present in a PT system, the HINT domain occurs between the PT domains, for example PT-VENN or PT-TG, and the nuclease toxin domain. This location is thought to indicate that the HINT domain acts as a peptidase that undergoes an autoproteolytic cleavage reaction (Zhang et al., 2011).
Vint proteins are a class of proteins consisting of a HINT domain and a von Willebrand factor type A domain, abbreviated VWA. The VWA domain is located at the N-terminus of Vint proteins (Bürglin, 2008a).
The von Willebrand factor is a multimeric glycoprotein found most importantly in blood plasma and platelets (Sadler, 1991). Its type A domain is the prototype for a protein superfamily that contains at least 75 proteins similar in sequence (Colombatti and Bonaldo, 1991). These domains have been found in various molecules, such as other plasma proteins, multiple collagen types and the α-subunits of integrins (Edwards and Perkins, 1995). Despite similarities in ligand binding between different proteins containing an A domain, these proteins have evolved additional and more diverse recognition specificities (Ruggeri and Ware, 1993).